[Biojava-l] GenBank XML File Parse Error

Toralf Kirsten tkirsten at izbi.uni-leipzig.de
Fri Jan 23 10:35:47 EST 2004


Hi,
I need to extract data from GenBank XML files. I am using the BioJava API
for this, but the parser fails with the following error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 12
    at java.lang.String.substring(String.java:1477)
    at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankContext.java:621)
    at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankContext.java:263)
    at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java:144)
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)

rethrown as org.biojava.bio.BioException: Could not read sequence
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
    at de.izbi.gbm.logistics.GenBankBioJavaImporter.readFile(GenBankBioJavaImporter.java:41)
    at de.izbi.gbm.gui.GenBankBaseFrame.actionPerformed(GenBankBaseFrame.java:134)
    at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1764)
    at javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1817)
    at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:419)
    at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:257)
    at javax.swing.AbstractButton.doClick(AbstractButton.java:289)
    at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1109)
    at javax.swing.plaf.basic.BasicMenuItemUI$MouseInputHandler.mouseReleased(BasicMenuItemUI.java:943)
    at java.awt.Component.processMouseEvent(Component.java:5093)
    at java.awt.Component.processEvent(Component.java:4890)
    at java.awt.Container.processEvent(Container.java:1566)
    at java.awt.Component.dispatchEventImpl(Component.java:3598)
    at java.awt.Container.dispatchEventImpl(Container.java:1623)
    at java.awt.Component.dispatchEvent(Component.java:3439)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3450)
    at java.awt.LightweightDispatcher.processMouseEvent(Container.java:3165)
    at java.awt.LightweightDispatcher.dispatchEvent(Container.java:3095)
    at java.awt.Container.dispatchEventImpl(Container.java:1609)
    at java.awt.Window.dispatchEventImpl(Window.java:1585)
    at java.awt.Component.dispatchEvent(Component.java:3439)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:450)
    at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:197)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:144)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:136)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:99)
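
The top frames show String.substring failing inside the header-line processing. GenBank flat-file header lines keep the keyword in the first 12 columns, so one plausible reading is that a line shorter than 12 characters reached a fixed-width substring. The sketch below only illustrates that failure mode and a guarded alternative. It is not BioJava's GenbankContext code; the class name, method names, and sample lines are hypothetical.

```java
// Illustration only: hypothetical names, NOT the BioJava implementation.
final class HeaderKeywordSketch {
    // GenBank flat-file header lines reserve the first 12 columns for the keyword.
    static final int KEYWORD_WIDTH = 12;

    // Unguarded version: throws StringIndexOutOfBoundsException
    // when the line has fewer than KEYWORD_WIDTH characters.
    static String keywordUnchecked(String line) {
        return line.substring(0, KEYWORD_WIDTH).trim();
    }

    // Guarded version: never reads past the end of a short line.
    static String keywordChecked(String line) {
        int end = Math.min(line.length(), KEYWORD_WIDTH);
        return line.substring(0, end).trim();
    }

    public static void main(String[] args) {
        // A flat-file header line: keyword padded to 12 columns.
        System.out.println(keywordChecked("LOCUS       AB000001"));
        // A short line, such as an XML tag, is shorter than 12 characters.
        System.out.println(keywordChecked("<GBSeq>"));
        try {
            keywordUnchecked("<GBSeq>");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("short line rejected: " + e.getMessage());
        }
    }
}
```

The exact exception message depends on the Java version, so the sketch prints it rather than assuming its text.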



The program is quite simple. The user selects the path and file name with
a FileChooser component. I then open the file and use the Sequence and
Annotation classes, as shown in the method below, which is taken from a
class that extends a file class.

What I need is the sequence data of each GenBank entry (accession,
sequence, etc.) and the data for its features (start and end position,
subtype such as tRNA or CDS, etc.).

Any hints are welcome.
Thanks Tori

---------------------

// Excerpt from GenBankBioJavaImporter. The superclass (not shown) supplies
// setPath, setFileName, createInputFile, closeInputFile and fileReaderHandler.
// Imports needed by this excerpt:
//   java.util.NoSuchElementException,
//   org.biojava.bio.Annotation, org.biojava.bio.BioException,
//   org.biojava.bio.seq.Sequence, org.biojava.bio.seq.SequenceIterator,
//   org.biojava.bio.seq.io.SeqIOTools,
//   plus the Connection type (presumably java.sql.Connection)

  public GenBankBioJavaImporter(String path, String fileName, Connection genDbCon) {
    super();
    super.setPath(path);
    super.setFileName(fileName);
  }

  public boolean readFile() {
    if (!super.createInputFile()) return false;

    // read the GenBank file; fileReaderHandler is a BufferedReader
    SequenceIterator sequences = SeqIOTools.readGenbank(super.fileReaderHandler);

    // iterate through the sequences
    while (sequences.hasNext()) {
      try {
        Sequence seq = sequences.nextSequence();
        // do stuff with the sequence
        System.out.println("Info: " + seq.getName() + ", " + seq.getURN() + ", "
            + seq.countFeatures());
        Annotation anno = seq.getAnnotation();
        // anno.getProperty()
      } catch (BioException ex) {
        // not in GenBank format
        ex.printStackTrace();
        super.closeInputFile();
        return false;
      } catch (NoSuchElementException ex) {
        // request for more sequence when there isn't any
        ex.printStackTrace();
        super.closeInputFile();
        return false;
      }
    }
    super.closeInputFile();
    return true;
  }
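
Since the input is described as GenBank XML while the trace goes through GenbankFormat (which, as far as I know, reads the flat-file layout), it may be worth checking what the file actually starts with before handing the reader to a parser. The following is a minimal sketch using only the JDK. The class name, the Kind enum, and the 512-character peek size are assumptions made for illustration and are not part of BioJava.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper: guesses the file layout from its first characters.
final class GenBankFormatSniffer {
    enum Kind { XML, FLAT_FILE, UNKNOWN }

    // Number of characters to peek at; an arbitrary choice for this sketch.
    private static final int PEEK = 512;

    // Peeks at the start of the stream and then resets the reader,
    // so the caller can still parse from the beginning.
    static Kind sniff(BufferedReader in) throws IOException {
        char[] buf = new char[PEEK];
        in.mark(PEEK);
        int n = in.read(buf, 0, PEEK);
        in.reset();
        if (n <= 0) {
            return Kind.UNKNOWN; // empty input
        }
        String head = new String(buf, 0, n).trim();
        if (head.startsWith("<")) {
            return Kind.XML;       // XML declaration or root element
        }
        if (head.startsWith("LOCUS")) {
            return Kind.FLAT_FILE; // first keyword of a flat-file entry
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader xml = new BufferedReader(new StringReader("<?xml version=\"1.0\"?>\n<GBSet>\n"));
        BufferedReader flat = new BufferedReader(new StringReader("LOCUS       AB000001\n"));
        System.out.println(sniff(xml));
        System.out.println(sniff(flat));
    }
}
```

In readFile() such a check could run on fileReaderHandler right after createInputFile(), so that an XML file is reported clearly instead of failing inside the flat-file header parsing. A flat file that begins with other text before the first LOCUS line would be reported as UNKNOWN by this sketch.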







