[Bioperl-l] SeqIO::genbank crash special case
Ewan Birney
birney@ebi.ac.uk
Wed, 12 Dec 2001 07:23:08 +0000 (GMT)
On Tue, 11 Dec 2001 jdiggans@genelogic.com wrote:
> I recently came across a horribly mis-formatted GenBank record on our local
> copy that caused SeqIO::genbank to choke. I've fixed the problem in my
> local copy but was wondering if bioperl has a policy for what to do in
> bizarre use cases?
There is a philosophical point about whether we should throw or we should warn
in these cases. I guess we warn and then if someone sets the severity flag
we blow up. Hmmm.
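
A minimal sketch of that "warn by default, blow up if the severity flag is
set" idea, written as a standalone toy class rather than against the real
bioperl root object. The package name LenientParser, the 'strict' flag and
the complain method are all hypothetical names chosen for illustration; they
are not bioperl APIs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a parser object; not part of bioperl.
package LenientParser;

sub new {
    my ($class, %args) = @_;
    # 'strict' is an assumed name for the severity flag discussed above.
    my $self = { strict => $args{strict} ? 1 : 0, warnings => [] };
    return bless $self, $class;
}

# Warn by default; die instead when the severity flag is set.
sub complain {
    my ($self, $msg) = @_;
    die "EXCEPTION: $msg\n" if $self->{strict};
    push @{ $self->{warnings} }, $msg;
    warn "WARNING: $msg\n";
    return;
}

# Count the usable feature units, skipping any that came back undefined.
sub process {
    my ($self, @units) = @_;
    my $count = 0;
    for my $unit (@units) {
        if (defined $unit) {
            $count++;
        } else {
            $self->complain("feature unit undefined, skipping");
        }
    }
    return $count;
}

package main;

# Lenient mode: the bad unit is reported and the parse carries on.
my $lenient = LenientParser->new;
print "parsed ", $lenient->process('a', undef, 'b'), " units\n";

# Strict mode: the same input is treated as fatal.
my $strict = LenientParser->new(strict => 1);
my $ok = eval { $strict->process('a', undef, 'b'); 1 };
print $ok ? "no exception\n" : "died: $@";
```

With this shape the caller, not the parser, decides whether one bad feature
should end the whole parse.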
>
> The problem appears here:
>
> # to the last line read before returning
> my $ftunit = $self->_read_FTHelper_GenBank(\$buffer);
> # process ftunit
> $ftunit->_generic_seqfeature($seq);
>
> $ftunit is never tested to ensure it's defined before being used. In the
> event something happens in _read_FTHelper_GenBank (my current issue) the
> script ends up dying messily. I've patched mine to:
>
> # to the last line read before returning
> my $ftunit = $self->_read_FTHelper_GenBank(\$buffer);
>
> # process ftunit - if there is a problem, warn and skip this FT unit
> if( defined($ftunit) ) {
>     $ftunit->_generic_seqfeature($seq);
> } else {
>     $self->warn("Unexpected feature error - FTUnit undefined, skipping");
>     unless( ($buffer =~ /^\s{5,5}\S+/) or ($buffer =~ /^\S+/)) {
>         $buffer = $self->_readline;
>     }
> }
>
> Is it worth adding some version of this to genbank.pm to allow a parse to
> recover from a single poorly-formatted entry in a feature table? Or within
> the bioperl mentality 'should' this kind of error be considered something
> terminal?
>
> This particular record happened to have an oddly-placed carriage return in
> the middle of a feature range, completely confusing the
> _read_FTHelper_GenBank routine and returning undef which then had a sub
> called on it.
>
Ok. I am going to apply your patch.
> -j
>
> -------------------------------------------------
> James Diggans
> Bioinformatics Programmer
> Gene Logic, Inc.
> Phone: 301.987.1756
> FAX: 301.987.1701
>