[Biopython-dev] Output sequence files

Iddo Friedberg idoerg at cc.huji.ac.il
Sat May 26 23:44:57 EDT 2001


: > So maybe we just need a writer for each {database}.Record types, and a
: > to_fasta converter and writer in Tools.

: Isn't this what the SeqIO directory is for? I had always hoped to get SeqIO
: functionality similar to bioperl's.

DAMN! I thought I was missing out on something...
Did I also miss the existence of writers for {database}.Record types?

Sorry about that. I'll look into SeqIO and bioperl's equivalent, and see if
I can learn something. Thanks for clearing this up.
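
For illustration, the kind of to_fasta converter mentioned above could be
sketched roughly as follows. This is only a minimal sketch: the function name
to_fasta, its arguments, and the 60-column line width are assumptions made for
this example, not the SeqIO or bioperl interface.

```python
def to_fasta(identifier, description, sequence, width=60):
    """Format one record as FASTA text.

    Hypothetical helper for illustration only; the argument names and the
    default line width are assumptions, not an existing Biopython API.
    """
    # Header line: '>' followed by the identifier and an optional description.
    header = ">" + identifier
    if description:
        header += " " + description

    # Split the sequence into fixed-width lines.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]

    return "\n".join([header] + lines) + "\n"
```

A writer for a given {database}.Record type would then only need to pull an
identifier, a description and a sequence out of the record and hand them to a
function like this one.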

: > The problem arises from annotation. Do you think it's feasible to write a
: > good GenPept (that's the GenBank translation database) <--> SwissProt
: > converter that will preserve everything?
: >

: The gold standard for preserving information is if you can convert A to B
: and back to A, and have it come out exactly the same. That'll probably be
: possible for a lot of records, but many of them will not work. For example,
: GenBank locations are much richer than SwissProt ones, so complex location
: semantics that SwissProt doesn't handle will be lost.
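
The round-trip test described above (A to B and back to A) can be sketched
with a toy model. Everything here is an assumption made for illustration: the
dictionary layout, the function names, and the idea that the simpler format
keeps only one overall start and end. It is not how GenBank or SwissProt
records are actually represented.

```python
def rich_to_simple(record):
    """Toy A -> B converter: keep only the overall start and end.

    The internal structure of a multi-part location is dropped, which is
    the kind of loss the round-trip test is meant to expose.
    """
    spans = record["location"]
    return {"id": record["id"], "start": spans[0][0], "end": spans[-1][1]}


def simple_to_rich(record):
    """Toy B -> A converter: rebuild a single-span location."""
    return {"id": record["id"], "location": [(record["start"], record["end"])]}


def survives_round_trip(record):
    """Return True if A -> B -> A gives back an identical record."""
    return simple_to_rich(rich_to_simple(record)) == record


# A single-span location survives the round trip in this toy model.
simple = {"id": "X1", "location": [(1, 100)]}

# A joined, two-part location collapses to one span and does not survive.
joined = {"id": "X2", "location": [(1, 10), (20, 30)]}
```

Running survives_round_trip over a whole set of records would give a rough
count of how many convert cleanly and how many lose information.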

Actually, GenBank <--> SwissProt is probably the least convertible pair of
its kind. Many GenBank records hold the annotation for several CDSs, and a
GenBank sequence generally also holds untranslated regions, etc.
GenPept, the GenBank translation, is not much better: it holds coding
information and all sorts of other material that is irrelevant to SwissProt. And



Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg at cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
