[Bioperl-l] Exceptions thrown when parsing embl format

Ewan Birney birney@ebi.ac.uk
Wed, 20 Dec 2000 20:34:00 +0000 (GMT)


On Wed, 20 Dec 2000, Rob Ewing wrote:

> Hi,
> When parsing embl format sequence entries, I run into problems due to
> slight 'errors' in the embl format.

We are trying to get better error control and also fuzzy location
parsing into 0.7, but no one has really got this under control yet.


I realise it is a pain in the arse. GenBank/EMBL is a pain in the arse,
period (it is not their fault... legacy data...).


> 
> For example, I get the following error message:
> 
> -------------------- EXCEPTION --------------------
> MSG: Weird location line [1280..(1547.1700)] in reading GenBank
> CONTEXT: Error in uNKNOWN CONTEXT
> SCRIPT: e
> STACK: 
> Bio::SeqIO::FTHelper::_generic_seqfeature(160)
> Bio::SeqIO::embl::next_seq(296)
> main::-e(1)
> ---------------------------------------------------
> 
> when parsing an embl format entry that has the line :
> 
> FT   exon            1280..(1547.1700)
> 
> (I assume that the parser cannot figure out the start and end of this
> exon feature).
> How can I deal with this? Is there any way to prevent an exception
> being thrown and just move on to the next entry in the file? Or should
> I look at ways of excluding the 'bad' entries from the file?
> (All I want to do is convert a large embl format file to fasta format!)
> 
> thanks
> 
> Rob.
> 
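
For the plain EMBL-to-FASTA conversion, one possible workaround is to skip
the feature table altogether, since only the ID line and the sequence lines
matter for FASTA output. The following is a minimal sketch of that idea,
written as standalone Python rather than with bioperl. The function name
embl_to_fasta is made up for illustration, and the code assumes each entry
follows the usual flat-file layout (an ID line, an SQ line, sequence lines
ending in a running base count, and a // terminator).

```python
def embl_to_fasta(in_handle, out_handle, width=60):
    """Copy entries from an EMBL flat file to FASTA, ignoring FT lines.

    Hypothetical helper, not part of bioperl. Returns the number of
    entries written. Entries without an ID line are skipped.
    """
    count = 0
    entry_id = None       # name taken from the current entry's ID line
    in_sequence = False   # True between the SQ line and the // terminator
    chunks = []           # pieces of the current entry's sequence

    for line in in_handle:
        if line.startswith("ID"):
            # The first token after the line code is the entry name.
            fields = line[2:].split()
            entry_id = fields[0].rstrip(";") if fields else "unknown"
        elif line.startswith("SQ"):
            in_sequence = True
            chunks = []
        elif line.startswith("//"):
            # End of entry: write it out and reset the state.
            if entry_id is not None:
                seq = "".join(chunks)
                out_handle.write(">%s\n" % entry_id)
                for i in range(0, len(seq), width):
                    out_handle.write(seq[i:i + width] + "\n")
                count += 1
            entry_id = None
            in_sequence = False
            chunks = []
        elif in_sequence:
            # Keep only the letters; this drops the spaces between
            # blocks and the base count at the end of the line.
            chunks.append("".join(ch for ch in line if ch.isalpha()))
        # Every other line (FT, DE, OS, ...) is ignored on purpose, so
        # a location such as 1280..(1547.1700) is never parsed.
    return count
```

It could be driven with something like embl_to_fasta(open("in.embl"),
open("out.fasta", "w")), where both file names are placeholders. The
trade-off is that all feature information is thrown away, which is fine
for FASTA but not for anything that needs the annotation.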

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------