[Biojava-l] Encountered a Parsing Exception on Genbank

Cox, Greg gcox@netgenics.com
Wed, 27 Feb 2002 16:51:39 -0500


I've put in a fix that treats the missing type tag as optional.  This
should be backed out when NCBI releases a new version, but it will get people
through in the meantime.  The catch is that if a different field is
missing, the parser will fail silently.  I haven't seen any records like that; if
you do, please let me know.
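
For illustration only, the idea of an optional type field might look roughly
like the sketch below.  It is not the code that went into BioJava: the class
name LocusLineSketch, the method parseLocusLine, and the approach of counting
whitespace-separated tokens are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not BioJava code: a LOCUS-line tokenizer that
// treats the molecule type field as optional.
public class LocusLineSketch {

    /**
     * Splits a LOCUS line on whitespace and maps the tokens to field names.
     * Eight tokens means the type is present; seven tokens is ASSUMED to mean
     * that only the type is missing.
     */
    public static Map<String, String> parseLocusLine(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length != 7 && t.length != 8) {
            throw new IllegalArgumentException(
                "LOCUS line incorrectly tokenized [" + line + "]");
        }
        if (!"LOCUS".equals(t[0]) || !"bp".equals(t[3])) {
            throw new IllegalArgumentException(
                "Not a recognizable LOCUS line [" + line + "]");
        }
        boolean hasType = (t.length == 8);
        Map<String, String> fields = new LinkedHashMap<>();
        int i = 1;
        fields.put("name", t[i++]);
        fields.put("length", t[i++]);
        i++; // skip the "bp" unit token checked above
        if (hasType) {
            fields.put("type", t[i++]);
        }
        fields.put("topology", t[i++]);
        fields.put("division", t[i++]);
        fields.put("date", t[i]);
        return fields;
    }

    public static void main(String[] args) {
        // The LOCUS line from the reported record, which has no type field.
        String line = "LOCUS       AB030903                1441 bp"
                    + "            linear   VRT 15-AUG-2000";
        System.out.println(parseLocusLine(line));
    }
}
```

For the AB030903 line the returned map simply has no "type" entry.  Because
the sketch only counts tokens, a line that is missing some other single field
is accepted and its values land under the wrong names, which is the same
silent-failure catch described above.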

Greg

> -----Original Message-----
> From: Cox, Greg [mailto:gcox@netgenics.com]
> Sent: Wednesday, February 27, 2002 2:15 PM
> To: 'cantey.lg@pg.com'; biojava-l@biojava.org
> Subject: RE: [Biojava-l] Encountered a Parsing Exception on Genbank
> 
> 
> This record is malformed; it's something we've seen internally with the new
> version of GenBank.  It's missing the TYPE tag in columns 45-53.  I'd
> suggest:
> 1) Dummy up your record so it conforms to the GenBank spec.
> 2) Comment out the body of parseLocusLinePost127() if you don't need any
> information from there.
> 
> Greg
> 
> > -----Original Message-----
> > From: cantey.lg@pg.com [mailto:cantey.lg@pg.com]
> > Sent: Wednesday, February 27, 2002 1:28 PM
> > To: biojava-l@biojava.org
> > Subject: [Biojava-l] Encountered a Parsing Exception on Genbank
> > 
> > 
> > In attempting to parse the source file for GenBank accession number AB030903,
> > the following exception was encountered.  Any ideas would be greatly
> > appreciated!
> > ----------------------------------------------------------
> > 
> > org.biojava.bio.seq.io.ParseException: LOCUS line incorrectly tokenized [LOCUS       AB030903                1441 bp            linear   VRT 15-AUG-2000]
> >      at org.biojava.bio.seq.io.GenbankContext.parseLocusLinePost127(GenbankFormat.java:611)
> >      at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankFormat.java:521)
> >      at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankFormat.java:372)
> >      at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java, Compiled Code)
> >      at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
> > rethrown as org.biojava.bio.BioException: Could not read sequence
> >      at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
> >      at seq.UpdatedTestGenbank.main(UpdatedTestGenbank.java, Compiled Code)
> > 
> > Process exited with exit code 1.
> > 
> > Best Regards,
> > Larry Cantey
> > 
> > 
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> > 