[Biopython-dev] SeqIO

Brad Chapman chapmanb at arches.uga.edu
Wed Sep 26 22:47:54 EDT 2001


Hi Thomas!

> ok - I rewrote and _committed_ the lost code for sequence conversion :-)

Glad it made it in okay :-)

> Brad and/or Andrew: could you check how we can use the GenBank and the SWISS
> parser in the SeqIO stuff?

Yeah, I spent some time looking through it (a few more comments are
below), and I think what I'd need to do for GenBank is just create a
converter that takes a SeqRecord object and turns it into a
GenBank.Record object. This way, I could just do str(the_record) to
get the output and re-use the output work I already did.
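
Roughly, the idea could be sketched like this. Note that
SimpleSeqRecord, SimpleGenBankRecord and seqrecord_to_genbank are
hypothetical stand-ins made up for illustration, not the real SeqRecord
or GenBank.Record API, and the printed layout is heavily simplified:

```python
from dataclasses import dataclass


@dataclass
class SimpleSeqRecord:
    """Hypothetical stand-in for a SeqRecord: id, description, sequence."""
    id: str
    description: str
    seq: str


@dataclass
class SimpleGenBankRecord:
    """Hypothetical stand-in for a record that knows how to print itself."""
    locus: str
    definition: str
    sequence: str

    def __str__(self):
        # Simplified output, not the full GenBank flat-file layout.
        lines = [
            "LOCUS       %s %d bp" % (self.locus, len(self.sequence)),
            "DEFINITION  %s" % self.definition,
            "ORIGIN",
            "        1 %s" % self.sequence.lower(),
            "//",
        ]
        return "\n".join(lines)


def seqrecord_to_genbank(record):
    """Copy the basic fields across; the feature table is ignored here."""
    return SimpleGenBankRecord(
        locus=record.id,
        definition=record.description,
        sequence=record.seq,
    )


rec = SimpleSeqRecord(id="AB000001", description="example sequence", seq="ATGC")
# The writer side is then nothing more than str() on the converted record.
genbank_text = str(seqrecord_to_genbank(rec))
print(genbank_text)
```

The point of this shape is that all the formatting stays in one place
(the record's __str__), so the converter only has to map fields.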

One big question I have is how many of the features you want to
try and retain in the conversion. So, for GenBank format, do you
want me to just write out the basic information (sequence, type,
etc.) and ignore the feature table, or do we want to somehow map the
features from format to format (e.g. EMBL <-> GenBank)?

If we want to think about feature conversion, this'll be tougher and
we'll need to think about converters between "similar" formats like
EMBL and GenBank.

> The current file for sequence format IO is SeqIO/generic.py ... (should
> definitely change name, maybe to SeqIO.py ?)

You could just change it to __init__.py, like in the other modules
(so we could do from Bio import SeqIO and get it).
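
As a small illustration of why __init__.py gives you that import, here
is a throwaway sketch. The package name demo_bio and the describe()
function are made up for the example (demo_bio is used instead of Bio
to avoid clashing with a real install):

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package tree: demo_bio/SeqIO/__init__.py
root = tempfile.mkdtemp()
seqio_dir = os.path.join(root, "demo_bio", "SeqIO")
os.makedirs(seqio_dir)

# Top-level package marker.
with open(os.path.join(root, "demo_bio", "__init__.py"), "w") as handle:
    handle.write("")

# The code that would otherwise sit in generic.py goes here instead.
with open(os.path.join(seqio_dir, "__init__.py"), "w") as handle:
    handle.write("def describe():\n    return 'conversion code lives here'\n")

# Make the temporary tree importable, then import the subpackage directly.
sys.path.insert(0, root)
importlib.invalidate_caches()
from demo_bio import SeqIO

message = SeqIO.describe()
print(message)
```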

I also had a couple of questions from looking at this:

=> Why are you duplicating SeqRecord in the SeqIO stuff instead of
just reusing it? I don't think I understand what you are talking
about with stripping newlines...

=> Is there a way to plug in a specialized converter for similar
formats, like I was talking about above with EMBL/GenBank? I think
Jeff suggested this earlier, and it seems like a good idea to me. I
guess right now you could subclass ReadSeq and define your own
Convert function, but maybe there is another way to do it.
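
One possible "other way", sketched under assumptions: a lookup table
keyed on the (source, target) format pair, falling back to a generic
converter when nothing special is registered. Every name here
(register_converter, convert, embl_to_genbank, the dict-based records)
is hypothetical and not part of the existing SeqIO code:

```python
# Hypothetical registry mapping (source format, target format) to a converter.
_CONVERTERS = {}


def register_converter(source, target, func):
    """Register a specialized converter for one format pair."""
    _CONVERTERS[(source.lower(), target.lower())] = func


def generic_convert(record):
    """Fallback: keep only the basic fields and drop the features."""
    return {"id": record["id"], "seq": record["seq"]}


def embl_to_genbank(record):
    """Specialized path for similar formats: carry the features across too."""
    converted = generic_convert(record)
    converted["features"] = list(record.get("features", []))
    return converted


def convert(record, source, target):
    """Use a registered converter if there is one, else the generic one."""
    func = _CONVERTERS.get((source.lower(), target.lower()), generic_convert)
    return func(record)


register_converter("embl", "genbank", embl_to_genbank)

record = {"id": "X1", "seq": "ATGC", "features": ["CDS 1..4"]}
with_features = convert(record, "embl", "genbank")
basic_only = convert(record, "embl", "fasta")
```

Compared with subclassing, this keeps the generic path untouched and
lets similar-format pairs opt in to richer conversion one at a time.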

Thanks for your work and code on this. Nice to see it progressing
along!

Brad
-- 
PGP public key available from http://pgp.mit.edu/


