[Biopython-dev] Changing the details of the app wrapper private API

Peter biopython at maubp.freeserve.co.uk
Fri Dec 3 23:13:47 UTC 2010


On Fri, Dec 3, 2010 at 10:01 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>
>> What are your thoughts regarding the types=["file"] stuff? Should we
>> leave it, replace it with a boolean, or look at the subclass route?
>> Other useful subclasses as well as a _FileOption include things like
>> _IntegerOption and _FloatOption (although the latter is interesting
>> with things like "1e-10" which BLAST accepts for example, but not
>> all tools taking float arguments would like that).
>
> It might be easiest just to dump that. My initial idea behind this
> was that if you label things with their output type, they could be
> processed specifically downstream in a pipeline. This was probably
> way too ambitious, and it might be simpler to keep it lightweight.
> The application wrappers are working better as a simple way to
> specify a command line.

Hey Brad,

I think we may be talking a little bit at cross purposes. I'll try
to clarify (this may be interesting to the others too).

As you just suggested ("just dump that..."), earlier today I
removed the "input" and "output" labels given to the parameters
via the types argument. These were only used in the old
ApplicationResult object (deprecated, and just removed after
the release of Biopython 1.56). In addition to these two now
useless tags (input and output), there was one other tag, "file",
which is still present and used in most if not all of the wrappers.

It is being used for some important functionality - supporting
nasty filenames, in particular those with spaces in them. This
is more of an issue on Windows, where even the user's home
directory has spaces in it. Consider a silly example:

$ tool -input filename.fasta

If the filename has spaces, you must quote it:

$ tool -input "filename with spaces.fasta"

In general that works on Windows, Mac, Linux etc.

It means that users can do this:

cline = WrapperClass(input="filename with space.fasta")

or

cline = WrapperClass(input='filename with space.fasta')

or

cline.input = "filename with space.fasta"

etc

and the wrapper will know to add the quotes for them. This
all works as things stand (or at least, we have unit tests to
check some examples like this and I'm not aware of any
open issues). In order for that to happen the wrapper input
parameter would be defined with the following:

_Option(["-input", "input"], "input filename", types=["file"])

(That's with the new ordering of name list, description,
then optional args)

That special "file" entry in the otherwise unused types
argument triggers the automatic quoting/escaping done
by the function _escape_filename in Bio.Application.
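
To make the quoting step concrete, here is a rough sketch of the
kind of rule involved. It is not copied from Bio.Application; the
name quote_if_needed is hypothetical, and the rule (only quote when
there is a space, and never quote twice) is an assumption based on
the behaviour described above.

```python
def quote_if_needed(filename):
    """Return filename wrapped in double quotes if it contains a space.

    Hypothetical helper for illustration only; assumed behaviour,
    not the actual Biopython implementation.
    """
    if " " not in filename:
        # Nothing awkward here, so leave the filename alone.
        return filename
    if filename.startswith('"') and filename.endswith('"'):
        # The user already quoted it, so do not quote it again.
        return filename
    return '"%s"' % filename


# Build the command line from the silly example above.
command = "tool -input %s" % quote_if_needed("filename with spaces.fasta")
```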

Looking over the history, this functionality via the "file" tag
was added in early 2009, as part of Bug 2815,
http://bugzilla.open-bio.org/show_bug.cgi?id=2815

I want to keep this functionality, but change the current
interface - which is to use types=["file"] or the default of
types=[]. The simplest option is to replace it with a
boolean (e.g. filename=True, or auto_quote=True).
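
As a rough illustration of the boolean route, something like the
following could work. This is only a sketch: the class name
SketchOption, its attributes and its quoting rule are all
hypothetical stand-ins, not the real _Option class.

```python
class SketchOption(object):
    """Hypothetical stand-in for an option class with a filename flag."""

    def __init__(self, names, description, filename=False):
        self.names = names            # e.g. ["-input", "input"]
        self.description = description
        self.is_filename = filename   # replaces the types=["file"] tag
        self.value = None

    def __str__(self):
        """Return the command line fragment for this option."""
        if self.value is None:
            return ""
        value = str(self.value)
        already_quoted = value.startswith('"') and value.endswith('"')
        if self.is_filename and " " in value and not already_quoted:
            # Assumed rule: only filenames containing spaces get quoted.
            value = '"%s"' % value
        return "%s %s" % (self.names[0], value)


opt = SketchOption(["-input", "input"], "input filename", filename=True)
opt.value = "filename with space.fasta"
fragment = str(opt)
```

With filename=False (the default here) the value would be passed
through untouched, which matches what types=[] does today.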

Also, thinking ahead to Python 3, there may be issues
with converting filenames given as unicode strings into
byte strings suitable for use in command line strings.
In that case maybe filename=True is more future-proof
than auto_quote=True.

Since this is a private API, we don't have to worry much
about breaking backwards compatibility - that was a
good design choice back then, Brad.

Regards,

Peter

Hopefully that wasn't too long and boring ;-)


