[Biojava-l] Problem parsing Genbank files

Fenger, Doug dfenger at amylin.com
Thu Apr 8 17:52:19 EDT 2004


Hi,
I've been using biojava-1.3.1 to parse some Genbank files, with code modeled on the FilterEMBLBySpecies example in BioJava in Anger.  The problem is that some Genbank files throw an IllegalArgumentException.  I believe it happens when a Feature's Location is a single base position rather than a range (as in Genbank records AY197155 and M19699), like this:

     tRNA            <1

The problem goes away when I change it to

     tRNA            <1..457
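
To find which records contain such lines before handing them to the parser, something like the following could be used. It is only a rough sketch in plain Java, not BioJava code: the class name, method name, and regular expression are made up for illustration, and it assumes feature keys are indented by five spaces as in the excerpts above.

```java
import java.util.regex.Pattern;

public class FuzzyPointLocationCheck {

    // Assumed layout: five spaces, a feature key, whitespace, then a
    // location that is only "<N" or ">N" with no "..end" part.
    private static final Pattern SINGLE_FUZZY_POSITION =
        Pattern.compile("^ {5}(\\S+)\\s+([<>]\\d+)\\s*$");

    // Returns true if the line looks like a feature with a single fuzzy position.
    public static boolean isSingleFuzzyPosition(String line) {
        return SINGLE_FUZZY_POSITION.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "     tRNA            <1",
            "     tRNA            <1..457"
        };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(isSingleFuzzyPosition(samples[i]) + " : " + samples[i].trim());
        }
    }
}
```

This only flags candidate lines; it does not say whether the parser should accept them.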

I also get an error message if there's a newline inside a feature qualifier value, as in record L32753:

                     /note="
                     50 bp gap between spans; putative"
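
A similar plain-Java check could flag qualifier lines that end right after the opening quote, so the affected records can be listed up front. Again this is a sketch under assumptions: the class and method names are hypothetical, and the pattern only covers the case shown above, where the quoted value begins on the following line.

```java
import java.util.regex.Pattern;

public class QualifierNewlineCheck {

    // Assumed shape: leading whitespace, "/name=", an opening quote,
    // and nothing else on the line.
    private static final Pattern BARE_OPENING_QUOTE =
        Pattern.compile("^\\s+/[A-Za-z_]+=\"\\s*$");

    // Returns true if the qualifier value starts on the next line.
    public static boolean valueStartsOnNextLine(String line) {
        return BARE_OPENING_QUOTE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "                     /note=\"",
            "                     /note=\"50 bp gap between spans; putative\""
        };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(valueStartsOnNextLine(samples[i]) + " : " + samples[i].trim());
        }
    }
}
```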

Here's the code I've been using to test it:

import java.io.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.seq.io.*;

public class TestSeqIOTools {

    public static void main(String[] args) {

        if (args.length != 1) {
            System.out.println("Usage: java TestSeqIOTools <GenBank file name>");
            System.exit(1);
        }

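        try {
            BufferedReader fin = new BufferedReader(new FileReader(args[0]));
            SequenceIterator stream = SeqIOTools.readGenbank(fin);
            // Walk every record in the file; each one is parsed in nextSequence().
            while (stream.hasNext()) {
                Sequence seq = stream.nextSequence();
            }
            fin.close();
        } catch (Exception e) {
            System.err.println("Exception: " + e.getMessage());
            // Also print the trace so the exception type and origin are visible.
            e.printStackTrace();
        }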
    }
}

Thanks for any suggestions,
Doug

p.s.  I get similar errors if I use EMBL files instead.


