[emboss-dev] Commandline changes in EMBOSS applications

Peter Cock p.j.a.cock at googlemail.com
Thu Sep 29 15:13:46 UTC 2011


On Thu, Sep 29, 2011 at 4:03 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Thu, Sep 29, 2011 at 3:43 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>> A question for our developer community...
>>
>> I am working through the GALAXY wrappers for EMBOSS applications. GALAXY has
>> a very clean way to define command line applications which is close to
>> EMBOSS's ACD definitions, so most applications are easy to define.
>>
>> I have problems where the default values in the ACD file depend on other
>> values. Two examples from prettyplot illustrate the problem. In both cases,
>> the current GALAXY definitions ignore these qualifiers.
>>
>>  integer: residuesperline [
>>    default: "50"
>>    information: "Number of residues to be displayed on each
>>                  line"
>>  ]
>>
>>  integer: resbreak [
>>    information: "Residues before a space"
>>    default: "$(residuesperline)"
>>    expected: "Same as -residuesperline to give no breaks"
>>  ]
>>
>>
>> The second qualifier defaults to the value of the first. GALAXY is unable to
>> interpret this. It could be defined with a default of "50" for GALAXY, but I
>> would prefer to remove this qualifier and add a new one "-blocksperline"
>> with a default of 1. In this way the dependency disappears, and the results
>> are cleaner.
>>
>> The second value is a calculation from sequence properties:
>>
>>  float: plurality [
>>    information: "Plurality check value (totweight/2)"
>>    default: "@( $(sequences.totweight) / 2)"
>>    expected: "Half the total sequence weighting"
>>  ]
>>
>> This has a long history, back to the EGCG version of prettyplot where the
>> command line options were extensions of a GCG program. The "weight" is by
>> default 1.0 per sequence, but GCG format had a way to adjust weights in the
>> input file. Plurality is nice in that it allows a definition of how many of
>> the sequences should match.
>>
>> In this case, it seems easier to ignore the weight-based value and instead
>> to define -percent 50.0, then multiply the total weight (or number of
>> sequences) by 0.50 and get the same results.
>>
>> I am a little nervous about removing command line options because of the
>> risk of breaking some interfaces.
>>
>> So:
>>
>> 1. Should I go ahead and add the new options?
>> 2. Do I remove the old options so old wrappers, scripts, etc. break with
>> "unknown qualifier -plurality"?
>> 3. Or do we keep the old options, declare them obsolete, object to their
>> use but keep going?
>>
>> As option 3 would also complicate life for wrappers - anyone making new
>> wrappers would most probably include the obsolete options - I prefer 1+2 but
>> I would appreciate some feedback.
>>
>> regards,
>>
>> Peter Rice
>
> Hi Peter R,
>
> In theory you can use an optional integer parameter in Galaxy,
> with an empty default, meaning the user doesn't have to put in
> a value. You can then check this in the tool wrapper's XML
> <command> tag with Cheetah syntax to decide if you add
> the -switch value to the command string (with the user's value),
> or not (to get the EMBOSS default).
>
> Perhaps I have misunderstood, but I think it is supported in
> Galaxy although probably quite fiddly.
>

To try and clarify,

Have a look at the NCBI BLAST+ wrapper for blastn as a
related example, where max_hits is an integer option
defaulting to zero. This pre-dated Galaxy fixing optional
integer arguments - at the time the best you could do was
a default (here zero) which you could recognise.

In the <command> tag, I treat zero as meaning use the
defaults, i.e. don't add the -max switch to the command string:

#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
-max_target_seqs $adv_opts.max_hits
#end if

You should now be able to use a blank default, and in the
Cheetah if statement, check for non-blank. But it is the
same basic idea.
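
As a rough sketch of that blank-versus-set decision, here it is written
in plain Python rather than Cheetah, so it can be tried outside Galaxy.
The function name max_hits_args is made up for illustration, and the
assumption is that a blank form field arrives as an empty string:

```python
def max_hits_args(raw_value):
    """Return the extra command line arguments for a max_hits style field.

    raw_value is the text from the form field. Assumption: a blank
    field arrives as an empty (or whitespace-only) string.
    """
    text = str(raw_value).strip()
    if not text:
        # Blank: add nothing, so the tool falls back to its own default.
        return []
    # Non-blank: check it is an integer, then pass it after the switch.
    return ["-max_target_seqs", str(int(text))]


print(max_hits_args(""))
print(max_hits_args("25"))
```

The point is only that the switch is left off the command string
entirely when the field is blank, rather than being passed with some
placeholder value.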

Peter



