[Biopython-dev] skipping a bad record read in SeqIO

Peter biopython at maubp.freeserve.co.uk
Sun Jun 7 21:31:48 UTC 2009


On 6/7/09, Iddo Friedberg <idoerg at gmail.com> wrote:
> Here is the stack dump, coming from the file:
>
> ftp://ftp.ncbi.nih.gov/genbank/gbcon11.seq.gz
>
> The offender:
>
> ACCESSION   CH991540 ABGB01000000
>
> Syntax error at or near `Tokens('close_paren')' token
> Traceback (most recent call last):
>   File "./filter_seqs.py", line 108, in <module>
>     matching_seqs, non_matching_seqs = filter_sequences(open(inpath),
> match_pairs, condition,seq_format)
>    File "./filter_seqs.py", line 23, in filter_sequences
>     for seq_record in SeqIO.parse(in_handle,format):
>   File
> "/home/idoerg/biopy_cvs/biopython/Bio/GenBank/Scanner.py",
> line 420, in parse_records
>      record = self.parse(handle)
>   File
> "/home/idoerg/biopy_cvs/biopython/Bio/GenBank/Scanner.py",
> line 403, in parse
>     if self.feed(handle, consumer) :
>   File
> "/home/idoerg/biopy_cvs/biopython/Bio/GenBank/Scanner.py",
> line 381, in feed
>      self._feed_misc_lines(consumer, misc_lines)
>   File
> "/home/idoerg/biopy_cvs/biopython/Bio/GenBank/Scanner.py",
> line 1138, in _feed_misc_lines
>     consumer.contig_location(contig_location)
>   File
> "/home/idoerg/biopy_cvs/biopython/Bio/GenBank/__init__.py",
> line 987, in contig_location
>      self.location(content)
>   File
> "/home/idoerg/biopy_cvs/biopython/Bio/GenBank/__init__.py",
> line 684, in location
>     raise LocationParserError(location_line)
> Bio.GenBank.LocationParserError:
> join(complement(ABGB01000004.1:1..81568),gap(unk100),complement(ABGB01000012.1:1..1260),gap(unk100),ABGB01000013.1:1..1227,gap(unk100),ABGB01000011.1:1..1338,gap(unk100),complement(ABGB01000001.1:1..118303))
>

That looks like Bug 2745 to me - does the patch on that bug work for
you, and would you be happy storing the CONTIG line as a string?
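
In the meantime, here is a rough sketch of one way to skip a record that
fails to parse. As far as I can tell, once the iterator from SeqIO.parse
raises it cannot be resumed, so this splits the file at the "//" record
terminator lines and parses each chunk on its own. The helper names
(iter_record_chunks, parse_skipping_bad, genbank_from_text) are made up
for illustration and are not part of Biopython; SeqIO.read is the only
real Biopython call used, and I am assuming it copes with each chunk in
isolation.

```python
from io import StringIO


def iter_record_chunks(handle):
    """Yield the raw text of each record, split at '//' terminator lines."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []
    # Yield any trailing text that lacks a terminator, so nothing is lost silently.
    if "".join(lines).strip():
        yield "".join(lines)


def parse_skipping_bad(handle, parse_one, skipped, errors=(Exception,)):
    """Yield parse_one(chunk) for each record; on failure, log it and carry on.

    parse_one is any callable taking the record text (hypothetical hook).
    skipped is a list that receives (chunk_text, exception) pairs.
    """
    for chunk in iter_record_chunks(handle):
        try:
            yield parse_one(chunk)
        except errors as err:
            skipped.append((chunk, err))


def genbank_from_text(text):
    """Parse one GenBank record from a string (needs Biopython installed)."""
    from Bio import SeqIO  # imported lazily so the helpers above work without it
    return SeqIO.read(StringIO(text), "genbank")
```

Usage would be something like
"for rec in parse_skipping_bad(open(inpath), genbank_from_text, skipped):",
after which skipped holds the raw text of each offending record for a
bug report. This holds one record in memory at a time, which could be
large for CON files.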

Peter


