[emboss-dev] Commandline changes in EMBOSS applications
Jon Ison
jison at ebi.ac.uk
Mon Oct 3 07:16:23 UTC 2011
Morning
I think it depends on what's most important: maintaining the richness of the EMBOSS command line
(dependencies in default values) or compatibility with Galaxy or any other interface that can't
handle this. That's a tough one! I'm leaning towards the latter, but not if it makes many
applications really messy.
So I would add new options and remove old ones - bearing in mind that would need to be done for
all apps.
Cheers
Jon
> A question for our developer community...
>
> I am working through the GALAXY wrappers for EMBOSS applications. GALAXY
> has a very clean way to define command line applications which is close
> to EMBOSS's ACD definitions, so most applications are easy to define.
>
> I have problems where the default values in the ACD file depend on other
> values. Two examples from prettyplot illustrate the problem. In both
> cases, the current GALAXY definitions ignore these qualifiers.
>
> integer: residuesperline [
> default: "50"
> information: "Number of residues to be displayed on each line"
> ]
>
> integer: resbreak [
> information: "Residues before a space"
> default: "$(residuesperline)"
> expected: "Same as -residuesperline to give no breaks"
> ]
>
>
> The second qualifier defaults to the value of the first. GALAXY is
> unable to interpret this. It could be defined with a default of "50" for
> GALAXY, but I would prefer to remove this qualifier and add a new one
> "-blocksperline" with a default of 1. In this way the dependency
> disappears, and the results are cleaner.
>
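A minimal sketch of the dependency described above, in Python rather than ACD. The function names are hypothetical, and the rule that -blocksperline splits the line into equal blocks by integer division is an assumption; the message only states that a default of 1 gives no breaks.

```python
def resolve_resbreak(residuesperline=50, resbreak=None):
    """Current behaviour: resbreak falls back to residuesperline
    when the user does not set it (ACD default "$(residuesperline)")."""
    return residuesperline if resbreak is None else resbreak


def resbreak_from_blocks(residuesperline=50, blocksperline=1):
    """Proposed behaviour (assumed): derive the block size from an
    independent -blocksperline option, so no default depends on another."""
    if blocksperline < 1:
        raise ValueError("blocksperline must be at least 1")
    # Assumption: blocks are equal-sized; any remainder is ignored here.
    return residuesperline // blocksperline


if __name__ == "__main__":
    print(resolve_resbreak())            # dependent default
    print(resbreak_from_blocks())        # one block per line, no breaks
    print(resbreak_from_blocks(50, 5))   # five blocks per line
```

With the defaults, both functions give a block size equal to the line length, so the output has no breaks either way; only the second form can be described to an interface that needs fixed defaults.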
> The second example has a default calculated from sequence properties:
>
> float: plurality [
> information: "Plurality check value (totweight/2)"
> default: "@( $(sequences.totweight) / 2)"
> expected: "Half the total sequence weighting"
> ]
>
> This has a long history, back to the EGCG version of prettyplot where
> the command line options were extensions of a GCG program. The "weight"
> is by default 1.0 per sequence, but GCG format had a way to adjust
> weights in the input file. Plurality is nice in that it allows a
> definition of how many of the sequences should match.
>
> In this case, it seems easier to ignore the weight-based value and
> instead to define -percent 50.0, then multiply the total weight (or
> number of sequences) by 0.50 and get the same results.
>
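The equivalence claimed above can be checked with a short Python sketch. The function names are hypothetical and this is not how prettyplot is implemented; it only compares the current default (total weight divided by 2) with the proposed -percent 50.0 form.

```python
def plurality_from_weights(weights):
    """Current ACD default: half the total sequence weight."""
    return sum(weights) / 2


def plurality_from_percent(weights, percent=50.0):
    """Proposed form: a fixed percentage of the total weight."""
    return sum(weights) * percent / 100.0


if __name__ == "__main__":
    default_weights = [1.0] * 8        # weight 1.0 per sequence by default
    custom_weights = [0.5, 1.5, 2.0]   # illustrative adjusted weights
    for weights in (default_weights, custom_weights):
        print(plurality_from_weights(weights), plurality_from_percent(weights))
```

For any set of weights the two values agree when percent is 50.0, so the fixed default reproduces the current behaviour without a calculated default.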
> I am a little nervous about removing command line options because of the
> risk of breaking some interfaces.
>
> So:
>
> 1. Should I go ahead and add the new options?
> 2. Do I remove the old options so old wrappers, scripts, etc. break with
> "unknown qualifier -plurality"?
> 3. Or, do we keep the old options, declare them obsolete, object to
> their use but keep going?
>
> As option 3 would also complicate life for wrappers - anyone making new
> wrappers would most probably include the obsolete options - I prefer 1+2
> but I would appreciate some feedback.
>
> regards,
>
> Peter Rice
> EMBOSS Team
> _______________________________________________
> emboss-dev mailing list
> emboss-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss-dev
>