[Biojava-l] Parsing Genbank-sequences from NCBI

Seth Johnson johnson.biotech at gmail.com
Sat Aug 12 17:17:57 UTC 2006


More problems with parsing nucleotide sequences from NCBI.  Apparently,
there's an odd dbxref tag on some of the sequences submitted by ATCC that
causes an exception.  I've run into two so far, but I'm sure there are more:

AA343569.1
AA325485.1
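
To illustrate how a value such as "ATCC (inhost):145151" could trip a
parser, here is a minimal sketch. Both patterns and the DbxrefSketch class
are hypothetical and written only for this example; they are not taken from
the BioJava source, so the real cause inside GenbankFormat may differ. The
assumption is that a strict "database:id" pattern forbids whitespace in the
database name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: these patterns are illustrative, not BioJava's own.
final class DbxrefSketch {
    // Strict form: the database name may not contain whitespace.
    static final Pattern STRICT = Pattern.compile("^(\\S+?):(\\S+)$");
    // Lenient form: the database name is everything before the first colon.
    static final Pattern LENIENT = Pattern.compile("^([^:]+):(\\S+)$");

    // Returns {database, id}, or null if the value does not match the pattern.
    static String[] split(Pattern pattern, String value) {
        Matcher m = pattern.matcher(value.trim());
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String value = "ATCC (inhost):145151";
        // The space in "ATCC (inhost)" defeats the strict pattern.
        System.out.println("strict matches: " + (split(STRICT, value) != null));
        String[] parts = split(LENIENT, value);
        System.out.println("lenient: " + parts[0] + " | " + parts[1]);
    }
}
```

If the real parser behaves like the strict pattern here, accepting any
characters other than a colon in the database name would be one way to let
these records through.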

Exceptions produced are as follows:
--------------------------------------------------------------
Trying to get: AA343569.1
org.biojava.bio.BioException: Failed to read Genbank sequence
        at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:157)
        at exonhit.parsers.EventParser.getSeqFromNCBI(EventParser.java:250)
        at exonhit.parsers.EventParser.insertRglrSE(EventParser.java:197)
        at exonhit.parsers.EventParser.createSpliceEvents(EventParser.java:105)
        at exonhit.parsers.EventParser.main(EventParser.java:310)
Caused by: org.biojava.bio.BioException: Could not read sequence
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:112)
        at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:153)
        ... 4 more
Caused by: org.biojava.bio.seq.io.ParseException: Bad dbxref found: ATCC (inhost):145151, accession:AA343569
        at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:438)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:109)
        ... 5 more
Java Result: -1
=========================================================
Trying to get: AA325485.1
org.biojava.bio.BioException: Failed to read Genbank sequence
        at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:157)
        at exonhit.parsers.EventParser.getSeqFromNCBI(EventParser.java:250)
        at exonhit.parsers.EventParser.insertRglrSE(EventParser.java:197)
        at exonhit.parsers.EventParser.createSpliceEvents(EventParser.java:105)
        at exonhit.parsers.EventParser.main(EventParser.java:312)
Caused by: org.biojava.bio.BioException: Could not read sequence
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:112)
        at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:153)
        ... 4 more
Caused by: org.biojava.bio.seq.io.ParseException: Bad dbxref found: ATCC (inhost):125990, accession:AA325485
        at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:438)
        at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:109)
        ... 5 more
Java Result: -1
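
In both traces the informative message sits at the bottom of the "Caused by"
chain. As a stopgap, calling code could walk that chain and skip records
whose root cause is the bad dbxref rather than aborting the whole run. The
sketch below is only an illustration: RootCauseSketch and its methods are
hypothetical names, and plain java.lang.Exception stands in for BioException
and ParseException so that it runs without BioJava on the classpath.

```java
// Hypothetical helper: names are illustrative and not part of BioJava.
final class RootCauseSketch {
    // Follows getCause() until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // True when the innermost message matches the one shown in the traces.
    static boolean isBadDbxref(Throwable t) {
        String message = rootCause(t).getMessage();
        return message != null && message.startsWith("Bad dbxref found");
    }

    public static void main(String[] args) {
        // Plain Exceptions stand in for the nested BioException/ParseException.
        Exception nested = new Exception("Failed to read Genbank sequence",
                new Exception("Could not read sequence",
                        new Exception("Bad dbxref found: ATCC (inhost):145151, accession:AA343569")));
        if (isBadDbxref(nested)) {
            System.out.println("Skipping record: " + rootCause(nested).getMessage());
        }
    }
}
```

In real code the check would go in the catch block around the
getRichSequence call, so that one unparseable record does not stop the
remaining accessions from being fetched.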
