[Biopython-dev] Bio.GenBank.LocationParser chokes on misc_feature in Desulfurococcus kamchatkensis 1221n/NC_011766.gbk

Tim te Beek tim.te.beek at nbic.nl
Mon Jul 11 08:46:54 UTC 2011


The same error occurs when parsing
ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Saccharopolyspora_erythraea_NRRL_2338_uid62947/NC_009142.gbk.
The offending features are:

     misc_feature    order(2409324..2409326,2409399..2409401,2409528..2409533,
                     2409619..2409624,2409679..2409681,2409748..2409753,
                     2409754..2409759,2409835..2409837,join(2409886..2409890,
                     2409892..2409898),2409911..2409913,2409920..2409925)
                     /locus_tag="SACE_2218"
                     /note="active site"
                     /db_xref="CDD:119408"
     misc_feature    order(2409324..2409326,2409399..2409401,2409528..2409530)
                     /locus_tag="SACE_2218"
                     /note="catalytic tetrad; other site"
                     /db_xref="CDD:119408"

This could have something to do with the order() operator, but I'm not sure.
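
To help narrow this down, here is a rough sketch (plain Python, no Biopython
required) that checks whether a location string has a join() or order()
nested inside another join() or order(), as in the first feature above and
in the DKAM_1147 feature quoted below. The function name
has_nested_operator is made up for illustration, and the idea that nesting
is what trips the parser is only a guess: the second feature above is a
flat order() with no nesting.

```python
import re

# Operators whose nesting we want to detect. complement() is tracked only
# as a plain parenthesis, since complement(join(...)) is a common location.
OPERATORS = ("join", "order")


def has_nested_operator(location):
    """Return True if a join()/order() sits inside another join()/order().

    Hypothetical helper for illustration; it is not part of Biopython.
    """
    open_parens = []  # one flag per open "(": True if opened by join/order
    for match in re.finditer(r"(join|order|complement)?\(|\)", location):
        if match.group(0) == ")":
            if open_parens:
                open_parens.pop()
            continue
        is_operator = match.group(1) in OPERATORS
        if is_operator and any(open_parens):
            return True
        open_parens.append(is_operator)
    return False


if __name__ == "__main__":
    # join() nested inside order(), taken from the traceback in this thread
    print(has_nested_operator(
        "order(1078481..1078483,join(1078778,1078800..1078810))"))
    # flat order(), taken from the second SACE_2218 feature
    print(has_nested_operator(
        "order(2409324..2409326,2409399..2409401,2409528..2409530)"))
```

Running this over the location strings of a file's feature table would show
whether every feature that fails to parse has the nested shape, or whether
flat order() locations fail as well.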


On Mon, Jul 11, 2011 at 10:34, Tim te Beek <tim.te.beek at nbic.nl> wrote:
> When parsing ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Desulfurococcus_kamchatkensis_1221n_uid59133/NC_011766.gbk
> using SeqIO.read(genbank_file, 'genbank') I get the following
> stacktrace:
>
> ...
>     gbk_records = (SeqIO.read(genbank_file, 'genbank') for
> genbank_file in genbank_files)
>   File "/usr/local/lib/python2.6/dist-packages/Bio/SeqIO/__init__.py",
> line 604, in read
>     first = iterator.next()
>   File "/usr/local/lib/python2.6/dist-packages/Bio/SeqIO/__init__.py",
> line 532, in parse
>     for r in i:
>   File "/usr/local/lib/python2.6/dist-packages/Bio/GenBank/Scanner.py",
> line 440, in parse_records
>     record = self.parse(handle, do_features)
>   File "/usr/local/lib/python2.6/dist-packages/Bio/GenBank/Scanner.py",
> line 423, in parse
>     if self.feed(handle, consumer, do_features):
>   File "/usr/local/lib/python2.6/dist-packages/Bio/GenBank/Scanner.py",
> line 395, in feed
>     self._feed_feature_table(consumer, self.parse_features(skip=False))
>   File "/usr/local/lib/python2.6/dist-packages/Bio/GenBank/Scanner.py",
> line 347, in _feed_feature_table
>     consumer.location(location_string)
>   File "/usr/local/lib/python2.6/dist-packages/Bio/GenBank/__init__.py",
> line 975, in location
>     raise LocationParserError(location_line)
> Bio.GenBank.LocationParserError:
> order(1078481..1078483,join(1078778,1078800..1078810))
>
> The offending feature is:
> misc_feature    complement(order(1078481..1078483,join(1078778,
>                 1078800..1078810)))
>                 /locus_tag="DKAM_1147"
>                 /note="active site"
>                 /db_xref="CDD:73252"
>
> Could you look into whether this is a bug in the parser or in the input file?
>



