[Biojava-l] Error during genbank parsing

Marcel Huntemann marcel.huntemann at gmail.com
Fri Apr 30 00:49:10 UTC 2010


Hi!

I get the following error during the parsing of a genbank file:

Exception in thread "main" org.biojava.bio.BioException: Could not read sequence
	at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
	at gov.doe.jgi.img.pangenomes.Controller.createGeneMap(Controller.java:303)
	at gov.doe.jgi.img.pangenomes.Controller.start(Controller.java:197)
	at gov.doe.jgi.img.pangenomes.Main.createAndStartController(Main.java:105)
	at gov.doe.jgi.img.pangenomes.Main.main(Main.java:35)
Caused by: org.biojava.bio.seq.io.ParseException:

A Exception Has Occurred During Parsing.
Please submit the details that follow to biojava-l at biojava.org or post a
bug report to http://bugzilla.open-bio.org/

Format_object=org.biojavax.bio.seq.io.GenbankFormat
Accession=null
Id=null
Comments=Bad locus line
Parse_block=LOCUS   NC_008711      4597686 bp      DNA circular
17-DEC-2009
Stack trace follows ....


	at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:322)
	at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
	... 4 more

No matter which genbank file I use, I always get this error (with a
different LOCUS line each time, of course). The strange thing is that this
used to work about half a year to a year ago. Now I wanted to use my
program again and I always get this error, although I didn't really change
anything in that code.
The only thing I can think of that's different since the last time I used
it (when it worked) is that I switched from a 32-bit Linux machine to a
64-bit Linux machine. But can that really cause it?
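
To look at the LOCUS line independently of BioJava, a minimal sketch in
plain Java could split it into fields and check for tab characters. The
class name LocusLineCheck and its helper methods are hypothetical, not part
of BioJava, and the sample line is the one from the error report joined
with a single space where the mail wrapped it (an assumption). NCBI flat
files normally carry eight whitespace-separated fields on this line (LOCUS,
name, length, bp, molecule type, topology, division, date), so a different
count may be worth a closer look; whether that is what the parser objects
to here is not confirmed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting a GenBank LOCUS line; uses no BioJava calls.
class LocusLineCheck {

    // Fields usually present on an NCBI LOCUS line:
    // LOCUS, name, length, bp, molecule type, topology, division, date.
    static final int USUAL_FIELD_COUNT = 8;

    /** Splits a line on runs of whitespace and returns the non-empty tokens. */
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<String>();
        for (String t : line.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Returns true if the line contains at least one tab character. */
    static boolean hasTab(String line) {
        return line.indexOf('\t') >= 0;
    }

    public static void main(String[] args) {
        // Line taken from the error report; the single space before the
        // date is an assumption, because the mail wrapped the line there.
        String line = "LOCUS   NC_008711      4597686 bp      DNA circular 17-DEC-2009";

        List<String> fields = tokens(line);
        for (int i = 0; i < fields.size(); i++) {
            System.out.println(i + ": " + fields.get(i));
        }
        System.out.println("field count: " + fields.size()
                + " (usual: " + USUAL_FIELD_COUNT + ")");
        System.out.println("contains tab: " + hasTab(line));
    }
}
```

Running the same check on a file that parsed correctly in the past would
show whether the two LOCUS lines differ in field count or spacing.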

Here's my code and how I use it:

for ( String taxonId : givenTaxonIds )
{
    gbkFile = new File( dirPath + taxonId + gbkSuffix );
    if ( ! gbkFile.exists() )
    {
        logr.fatal( "Couldn't find genbank file for taxonOID " + taxonId +
                "!\nI tried " + gbkFile.getPath() + ", but it doesn't exist!" );
        System.exit( 1 ); // non-zero exit status, since this is a failure
    }

    BufferedReader br = new BufferedReader( new FileReader( gbkFile ) );
    Namespace ns = RichObjectFactory.getDefaultNamespace();

    RichSequenceIterator seqs = RichSequence.IOTools.readGenbankDNA( br, ns );
    numberInGenome = 0;
    while ( seqs.hasNext() )
    {
        RichSequence contig = seqs.nextRichSequence();
        // Get genes and their positions
        Set<Feature> features = contig.getFeatureSet();
        positions = new ArrayList<int[]>();
        geneIds = new ArrayList<String>();

        for ( Feature richFeature : features )
        {
            if ( richFeature.getType().equals( "CDS" ) )
            {
                // Store min, max and strand of every CDS
                RichLocation loc = (RichLocation) richFeature.getLocation();
                position = new int[3];
                position[0] = loc.getMin();
                position[1] = loc.getMax();
                position[2] = loc.getStrand().intValue();
                Annotation a = richFeature.getAnnotation();
                // The gene ID is the part of the note after "="
                split = a.getProperty( "note" ).toString().split( "=" );
                geneIds.add( split[1].trim() );
                positions.add( position );
            }
            else if ( richFeature.getType().equals( "gene" ) )
            {
                // Genes are only kept if they are marked as pseudo
                Annotation a = richFeature.getAnnotation();
                if ( a.containsProperty( "pseudo" ) )
                {
                    RichLocation loc = (RichLocation) richFeature.getLocation();
                    position = new int[3];
                    position[0] = loc.getMin();
                    position[1] = loc.getMax();
                    position[2] = loc.getStrand().intValue();
                    split = a.getProperty( "note" ).toString().split( "=" );
                    geneIds.add( split[1].trim() );
                    positions.add( position );
                }
            }
        }
    }
}

Thanks for the help,
Marcel

P.S.: Also, the info on some of the biojava pages seems outdated. I got the
latest version from your svn trunk, and the GetStarted page says that one
just has to call ant to build it. But there's no build.xml in the biojava
folder. Instead there's a pom.xml, so I guess you switched to maven.
I bet a lot of people don't know how to deal with that and have no clue
what to do when the ant command doesn't work...




