[Bioperl-l] GenBank XML format?

Dan Bolser dan.bolser at gmail.com
Wed Jan 28 19:13:47 UTC 2009


Hi,

Can BioPerl handle GenBank XML format? How about the various 'native'
database XML formats? Should I just fall back on GenBank text format
for the given division?

To be clear, I'm thinking about parsing dbEST data. The following
'queries' show the different formats:

# A 'native' ASN.1 format dbEST record
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucest&id=116038450&rettype=native

# A 'native' XML format dbEST record
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucest&id=116038450&rettype=native&retmode=xml

# GenBank text format dbEST record:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucest&id=116038450&rettype=gb&retmode=text

# GenBank XML format dbEST record:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucest&id=116038450&rettype=gb&retmode=xml
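
The four queries differ only in their rettype and retmode parameters. A
minimal Python sketch of building them, assuming db=nucest as in the
GenBank queries (the helper name efetch_url is made up for illustration
and is not part of BioPerl or of any NCBI client library):

```python
from urllib.parse import urlencode

# Base URL taken from the queries above.
EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, uid, rettype, retmode=None):
    """Build an efetch URL; retmode is left out when not given (hypothetical helper)."""
    params = [("db", db), ("id", uid), ("rettype", rettype)]
    if retmode is not None:
        params.append(("retmode", retmode))
    return EFETCH + "?" + urlencode(params)


# The GenBank text and GenBank XML queries for the same record.
gb_text = efetch_url("nucest", 116038450, "gb", "text")
gb_xml = efetch_url("nucest", 116038450, "gb", "xml")
```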


(If any of the above 'queries' fails with a Bad Gateway or some other
odd error, just try again in a couple of seconds.)

Note that setting db=nucleotide gives the same results as setting
db=nucest, so I guess I may as well use the BioPerl GenBank parser. The
reason I ask is that I have had trouble using XSLT on large documents,
and I wondered whether BioPerl uses some tricks to get round this (if it
can read these XML formats).
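
For comparison, one common way round the large-document problem is to
stream the XML record by record instead of loading the whole tree, which
is what an XSLT processor normally does. The Python sketch below shows
that idea with the standard library's iterparse. It is not a description
of what BioPerl does, and it assumes the GenBank XML layout of a GBSet
root holding GBSeq records with GBSeq_locus and GBSeq_sequence children;
the helper name iter_gbseq and the two sample records are made up for
illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_gbseq(source):
    """Yield (locus, sequence) for each GBSeq record as it is parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "GBSeq":
            # Read the fields before the record's children are discarded.
            yield elem.findtext("GBSeq_locus"), elem.findtext("GBSeq_sequence")
            # Drop this record's children so memory use stays roughly flat.
            elem.clear()


# Made-up sample in the assumed layout, not real dbEST data.
sample = (
    b"<GBSet>"
    b"<GBSeq><GBSeq_locus>X1</GBSeq_locus>"
    b"<GBSeq_sequence>acgt</GBSeq_sequence></GBSeq>"
    b"<GBSeq><GBSeq_locus>X2</GBSeq_locus>"
    b"<GBSeq_sequence>ttga</GBSeq_sequence></GBSeq>"
    b"</GBSet>"
)
records = list(iter_gbseq(io.BytesIO(sample)))
```

The same function accepts a filename or an open file, so a large
download can be processed without ever holding every record in memory.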

Thanks for any suggestions,

Dan.


