[Biopython-dev] GenBank parser -- first go

Andrew Dalke dalke at acm.org
Mon Dec 11 15:55:12 EST 2000


I was playing around with a different way to handle
the FEATURES section and came across this example
in IRO125195:

FEATURES             Location/Qualifiers
     source          1..1326
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="21"
                     /clone="IMAGE cDNA clone 125195"
                     /clone_lib="Soares fetal liver spleen 1NFLS"
                     /note="contains Alu repeat; likely to be be derived from
                     unprocessed nuclear RNA or genomic DNA; encodes putative
                     exons identical to FTCD; formimino transferase
                     cyclodeaminase; formimino transferase (EC 2.1.2.5)
                     /formimino tetrahydro folate cyclodeaminase (EC 4.3.1.4)"


See the "/formimino"?  I had thought that any line starting
with a '/' was a new qualifier, but it looks like you really do
have to parse the quotes as you go to tell when you are done.
While the quoted-quote checking (doubled "s) is doable with
a regular expression, it gets pretty complicated.
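
As a rough illustration of the quote-tracking idea above, here is a
minimal Python sketch. The names split_qualifiers and _unquote are
hypothetical and not part of any existing Biopython parser. It assumes
the leading indentation has already been stripped from each qualifier
line, and that a doubled quote ("") is never split across two lines.
Because a doubled quote adds two quote characters, an odd number of
quotes on a line means the quoted value was opened or closed there.

```python
def _unquote(value):
    """Strip the outer quotes and collapse doubled quotes ("" -> ")."""
    if value is None:
        return None
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1].replace('""', '"')
    return value


def split_qualifiers(lines):
    """Group qualifier lines into (key, value) pairs.

    A line starting with '/' only begins a new qualifier when we are
    not inside an open quoted value; otherwise it is a continuation.
    """
    qualifiers = []   # list of [key, raw value or None]
    in_quote = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if not in_quote and line.startswith("/"):
            key, sep, value = line[1:].partition("=")
            qualifiers.append([key, value if sep else None])
        elif qualifiers:
            # Continuation of the previous qualifier's value.
            qualifiers[-1][1] = (qualifiers[-1][1] or "") + " " + line
        else:
            raise ValueError("continuation line before any qualifier: %r" % line)
        # Doubled quotes contribute an even count, so only a real
        # opening or closing quote flips the state.
        if line.count('"') % 2 == 1:
            in_quote = not in_quote
    return [(key, _unquote(value)) for key, value in qualifiers]
```

With this approach the "/formimino ..." line in the record above is
folded into the /note value instead of being treated as a new
qualifier, since the quote opened by /note=" is still open when that
line is read.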

                    Andrew




