[Bioperl-l] Bug in genbank parsing: CONTIG gaps
Chris Fields
cjfields at uiuc.edu
Thu May 4 18:40:32 UTC 2006
Are you using the CONTIG record or the full GenBank file? I see
problems with both (using bioperl-live), and the two seem unrelated.
The full file seems to run a bit slowly because the full GenBank record is
huge (~55 MB), but the CONTIG file does exactly what you said (runs out of
memory).
Chris
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Michael Rogoff
> Sent: Tuesday, May 02, 2006 10:32 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bug in genbank parsing: CONTIG gaps
>
>
> I've encountered a pretty serious bug in Bio::SeqIO when parsing certain
> genbank files that contain CONTIG entries with gaps. One such record is
> NW_925173.
>
> When I try to parse this file using Bio::SeqIO::genbank, it will enter an
> infinite loop and spin until it runs out of memory.
>
> I'm pretty certain it relates to this bug:
> http://bugzilla.bioperl.org/show_bug.cgi?id=1319 which seems to indicate
> that genbank records with CONTIG gaps are not valid and can't be parsed.
> But this bug actually claims to be fixed, which is strange, since looking
> at the code for FTLocationFactory (where the loop is) it's still right
> there. I assume that this may be fixed in other contexts but is still not
> fixed in Bio::SeqIO::genbank? Or am I doing something wrong?
>
> I think that this should probably be filed as an open bug. I would think
> that even if bioperl isn't interested in parsing this type of file via
> SeqIO, certainly you'd want to ensure that no finite input file would send
> the parser into an infinite loop. Have others encountered this problem?
> Is there any plan to address it?
>
> Thanks very much for any information or help!
>
> -Mike
>
> P.S. I've played around with my version of FTLocationFactory and it seems
> to actually work and parse the gaps. I'm not sure if I've created other
> bugs or if it works in all cases, but at least the parser doesn't die. I
> also don't know that my hacky code is appropriate for putting back in to
> BioPerl, but I'm happy to provide it if someone wants to check it out
> and/or consider it for checkin.