[emboss-dev] Commandline changes in EMBOSS applications

Jon Ison jison at ebi.ac.uk
Mon Oct 3 07:16:23 UTC 2011


I think it depends on what's most important: maintaining the richness of the EMBOSS command line
(dependencies in default values), or compatibility with Galaxy or any other interface that can't
handle this.  That's a tough one!  I'm leaning towards the latter, but not if it makes many
applications really messy.

So I would add the new options and remove the old ones (your 1+2), bearing in mind that this
would need to be done for all apps.
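
To make the trade-off concrete, here is a rough Python sketch (not EMBOSS code) of the two
prettyplot cases quoted below. The function names are made up for illustration, and the
reading of -blocksperline as "residues per line divided into that many blocks" is my
assumption about the proposal, not something that has been decided. -percent is taken from
the proposal as a percentage of the total weight.

```python
def old_defaults(totweight, residuesperline=50, resbreak=None, plurality=None):
    """Current ACD behaviour: unset qualifiers fall back to dependent defaults."""
    if resbreak is None:
        resbreak = residuesperline      # default: "$(residuesperline)"
    if plurality is None:
        plurality = totweight / 2       # default: "@( $(sequences.totweight) / 2)"
    return resbreak, plurality


def new_defaults(totweight, residuesperline=50, blocksperline=1, percent=50.0):
    """Proposed scheme: every default is a constant; the application derives the rest."""
    # Assumed meaning of -blocksperline: split each line into this many blocks.
    resbreak = residuesperline // blocksperline
    # -percent 50.0 gives the same value as totweight / 2.
    plurality = totweight * percent / 100.0
    return resbreak, plurality


# With nothing set on the command line, both schemes give the same values.
assert old_defaults(totweight=4.0) == new_defaults(totweight=4.0)
```

The point is that in the second form every default is a plain constant, which is what a
wrapper that cannot evaluate $(...) or @(...) expressions needs.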



> A question for our developer community...
> I am working through the GALAXY wrappers for EMBOSS applications. GALAXY
> has a very clean way to define command line applications which is close
> to EMBOSS's ACD definitions, so most applications are easy to define.
> I have problems where the default values in the ACD file depend on other
> values. Two examples from prettyplot illustrate the problem. In both
> cases, the current GALAXY definitions ignore these qualifiers.
>    integer: residuesperline [
>      default: "50"
>      information: "Number of residues to be displayed on each
>                    line"
>    ]
>    integer: resbreak [
>      information: "Residues before a space"
>      default: "$(residuesperline)"
>      expected: "Same as -residuesperline to give no breaks"
>    ]
> The second qualifier defaults to the value of the first. GALAXY is
> unable to interpret this. It could be defined with a default of "50" for
> GALAXY, but I would prefer to remove this qualifier and add a new one
> "-blocksperline" with a default of 1. In this way the dependency
> disappears, and the results are cleaner.
> The second value is a calculation from sequence properties:
>    float: plurality [
>      information: "Plurality check value (totweight/2)"
>      default: "@( $(sequences.totweight) / 2)"
>      expected: "Half the total sequence weighting"
>    ]
> This has a long history, back to the EGCG version of prettyplot where
> the command line options were extensions of a GCG program. The "weight"
> is by default 1.0 per sequence, but GCG format had a way to adjust
> weights in the input file. Plurality is nice in that it allows a
> definition of how many of the sequences should match.
> In this case, it seems easier to ignore the weight-based value and
> instead to define -percent 50.0, then multiply the total weight (or
> number of sequences) by 0.50 to get the same results.
> I am a little nervous about removing command line options because of the
> risk of breaking some interfaces.
> So:
> 1. Should I go ahead and add the new options?
> 2. Do I remove the old options so old wrappers, scripts, etc. break with
> "unknown qualifier -plurality"?
> 3. Or do we keep the old options, declare them obsolete, object to
> their use but keep going?
> As option 3 would also complicate life for wrappers - anyone making new
> wrappers would most probably include the obsolete options - I prefer 1+2
> but I would appreciate some feedback.
> regards,
> Peter Rice
