[Biopython-dev] Accessing ExPASy through Bio.SwissProt / Bio.SeqIO

Michiel De Hoon mdehoon at c2b2.columbia.edu
Tue Dec 4 07:10:40 UTC 2007


Hi everybody,

I am still looking at the different pieces of code in Biopython that access
SwissProt. With Bio.SwissProt, we can access the SwissProt database as follows:

>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"]
# record is now a string containing the SwissProt record O23719

Another option is to pull out a Bio.SwissProt.SProt.Record object:
>>> from Bio.SwissProt import SProt
>>> s_parser = SProt.RecordParser()
>>> dictionary = SProt.ExPASyDictionary(parser=s_parser)
>>> record = dictionary["O23719"]
# record is now a Bio.SwissProt.SProt.Record object containing record O23719

A third option is to pull out a SeqRecord by using SeqIO:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"]
>>> from Bio import SeqIO
>>> import StringIO
>>> record = SeqIO.parse(StringIO.StringIO(record), "swiss").next()
# record is now a Bio.SeqRecord.SeqRecord object containing record O23719

Compare this to how we would read a Fasta file:
>>> from Bio import SeqIO
>>> handle = open("mydata.fa")  # "handle" avoids shadowing the builtin input()
>>> record = SeqIO.parse(handle, "fasta").next()

For consistency with Bio.SeqIO, it would make sense for ExPASyDictionary to
return handles instead of parsed objects. The examples would then look like:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"].read()
# record is now a string containing the SwissProt record O23719

To pull out a Bio.SwissProt.SProt.Record object:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> handle = dictionary["O23719"]
>>> record = SProt.parse(handle)
# record is now a Bio.SwissProt.SProt.Record object containing record O23719

To pull out a SeqRecord by using SeqIO:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> handle = dictionary["O23719"]
>>> from Bio import SeqIO
>>> record = SeqIO.parse(handle, "swiss").next()
# record is now a Bio.SeqRecord.SeqRecord object containing record O23719
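
To illustrate the consistency argument without touching the network, here is a
minimal sketch in present-day Python. The function first_id and the toy record
text are made up for illustration and are not part of Biopython. The only point
is that code written against a handle does not need to know whether the handle
came from a local file or from a remote lookup.

```python
import io
import os
import tempfile

# Toy text loosely shaped like a SwissProt flat-file record (not real data).
TOY_RECORD = (
    "ID   TOY_ENTRY   Reviewed;   3 AA.\n"
    "AC   X00000;\n"
    "SQ   SEQUENCE\n"
    "     MKV\n"
    "//\n"
)


def first_id(handle):
    """Return the entry name from the first ID line read from a handle."""
    for line in handle:
        if line.startswith("ID "):
            return line.split()[1]
    return None


# Handle backed by an in-memory string (a stand-in for a remote lookup).
from_memory = first_id(io.StringIO(TOY_RECORD))

# Handle backed by a file on disk.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(TOY_RECORD)
with open(tmp.name) as handle:
    from_file = first_id(handle)
os.remove(tmp.name)
```

Both calls go through exactly the same code path, which is the property a
handle-returning ExPASyDictionary would share with Bio.SeqIO.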

*If* we decide that ExPASyDictionary should return handles, *then* we don't
really need an ExPASyDictionary at all, since its behavior would be largely the
same as that of Bio.WWW.ExPASy.get_sprot_raw. In short, in my opinion
Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
Bio.WWW.ExPASy.get_sprot_raw already offers.
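
As a rough sketch of why the dictionary becomes redundant: a handle-returning
dictionary reduces to a thin wrapper whose item lookup simply calls the fetch
function. The names HandleDictionary and fake_fetch below are hypothetical;
fake_fetch only stands in for a function like get_sprot_raw and performs no
network access.

```python
import io


class HandleDictionary:
    """Hypothetical dictionary-style wrapper that returns handles."""

    def __init__(self, fetch):
        # fetch is any callable mapping an entry ID to a file-like handle.
        self._fetch = fetch

    def __getitem__(self, key):
        # The lookup adds nothing beyond calling the fetch function.
        return self._fetch(key)


def fake_fetch(entry_id):
    """Stand-in for a raw-record fetch function; returns an in-memory handle."""
    return io.StringIO("record text for %s\n" % entry_id)


dictionary = HandleDictionary(fake_fetch)
via_dictionary = dictionary["O23719"].read()
via_function = fake_fetch("O23719").read()
```

The two results are identical, so the only thing the wrapper contributes is
the square-bracket syntax.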

Any comments?

--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




