[Biopython-dev] Bio.SeqIO

Peter biopython-dev at maubp.freeserve.co.uk
Sun Feb 25 12:50:19 UTC 2007


Michiel de Hoon wrote:
> Currently the format has to be in lower-case. It might be better to make 
> the format case-insensitive. So I won't have to remember whether it is 
> "fasta", "Fasta", or "FASTA".

If you use a form that is not lower case, a ValueError is raised, so it 
should be easy to see what has happened.  Does anyone else care either way?
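
To make the two options concrete, here is a rough sketch.  The function 
names and the KNOWN_FORMATS set are made up for this example and are not 
the actual Bio.SeqIO code; the strict version simply assumes that the 
ValueError comes from a plain lookup of the name as given.

```python
# Hypothetical set of format names, for illustration only.
KNOWN_FORMATS = {"fasta", "genbank", "clustal"}


def check_format_strict(format):
    """Accept the name only exactly as listed (i.e. lower case)."""
    if format not in KNOWN_FORMATS:
        raise ValueError("Unknown format %r" % format)
    return format


def check_format_relaxed(format):
    """Fold the case first, so "Fasta" and "FASTA" both mean "fasta"."""
    folded = format.lower()
    if folded not in KNOWN_FORMATS:
        raise ValueError("Unknown format %r" % format)
    return folded
```

With the strict version, "Fasta" fails with a ValueError; with the relaxed 
version it is treated as "fasta".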

> Three of the ValueErrors raised by WriteSequences and SequenceIterator 
> are actually TypeErrors:

Good point.  I have changed the type tests for the handle and format to 
raise TypeErrors.

> The "if not format" is actually not needed, since Python will complain 
> already if these functions are called without the correct number of 
> arguments.

I was actually trying to catch cases where format was supplied as None, 
or the empty string "".  I have moved this below the type check, so it 
is only checking for an empty string and will still raise a ValueError.
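
Roughly, the order of the checks is now as sketched below.  This is only 
an illustration: validate_format is a made-up helper name, the messages 
are placeholders, and the real code also tests the handle.

```python
def validate_format(format):
    """Hypothetical helper showing the order of the checks."""
    # Type check first: None, numbers etc. are the wrong type of argument.
    if not isinstance(format, str):
        raise TypeError("Need a string for the file format")
    # Only then the value check: a string, but an empty one.
    if not format:
        raise ValueError("Format required")
    return format
```

So passing None gives a TypeError, while passing "" gives a ValueError.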

> For an incorrect format argument, WriteSequences raises an 
> AssertionError. A ValueError (as in SequenceIterator) seems more 
> appropriate.

Agreed.  I hadn't noticed that remaining assertion.

> Also, it might be a good idea to print possible values for
> the format if the user passes an incorrect format.

At the moment this is a fairly short list, but it should grow in future.  
Doing this would make the functionality more discoverable.  It would 
also help where the user had tried another name for a supported format, 
e.g. "genpept" versus "genbank", or "clustalw" versus "clustal".
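
Something along these lines would do it.  Again this is just a sketch: 
the helper name, the wording of the message and the list of names passed 
in are assumptions for the example, not what Bio.SeqIO does today.

```python
def unknown_format_error(format, supported):
    """Build a ValueError that lists the accepted format names.

    Hypothetical helper; "supported" is any iterable of format names.
    """
    names = ", ".join(sorted(supported))
    return ValueError(
        "Unknown format %r. Supported formats: %s" % (format, names)
    )


# Example: the user tried "genpept" where "genbank" was expected.
error = unknown_format_error("genpept", ["fasta", "genbank", "clustal"])
```

Sorting the names keeps the message stable as more formats are added.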

> Btw, the docstring for SequenceIterator mentions guessing the file 
> format from the handle if the format is not specified.

Whoops.  Fixed.

Thanks for your attention to detail, Michiel.

Peter



