[Biojava-l] Out of heap space during structure parsing.

Paul B tallpaulinjax at yahoo.com
Wed Mar 11 12:58:27 UTC 2009


Hi,
 
I am using BioJava 1.6.1 to parse PDB files on a machine with 2 GB of RAM. My development environment is NetBeans 6.5 with Java 1.6. My user-specific netbeans.conf file is attached; it sets the heap space to 1 GB. The relevant BioJava code is below:
 
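    try {
      pdbreader = new PDBFileReader();
      pdbreader.setPath(localFilePath);             // local directory holding the PDB files
      pdbreader.setAutoFetch(true);                 // download the file if it is not found locally
      struc = pdbreader.getStructureById(pdbCode);  // the OutOfMemoryError below is thrown here
    ...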
 
Using this code, I successfully parsed smaller PDB files such as 2BEG and 1Q80. When I then tried to parse a slightly larger file, 1FFK, I received this error on the 'struc =' line:
 
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.biojava.bio.alignment.NeedlemanWunsch.pairwiseAlignment(NeedlemanWunsch.java:411)
        at org.biojava.bio.alignment.NeedlemanWunsch.getAlignment(NeedlemanWunsch.java:315)
        at org.biojava.bio.structure.io.SeqRes2AtomAligner.align(SeqRes2AtomAligner.java:220)
        at org.biojava.bio.structure.io.SeqRes2AtomAligner.align(SeqRes2AtomAligner.java:140)
        at org.biojava.bio.structure.io.PDBFileParser.triggerEndFileChecks(PDBFileParser.java:2249)
        at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2155)
        at org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(PDBFileParser.java:2013)
        at org.biojava.bio.structure.io.PDBFileReader.getStructureById(PDBFileReader.java:439)
        at biojavatest.PdbDemo.grabPdbFileStruc(PdbDemo.java:105)
        at biojavatest.PdbDemo.runTest(PdbDemo.java:67)
        at biojavatest.PdbDemo.main(PdbDemo.java:58)
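
One thing that may be worth ruling out first: the heap size in netbeans.conf configures the IDE's own JVM, and a project run from the IDE may be launched in a separate JVM with its own (possibly much smaller) default heap. A minimal sketch for checking what the parsing process actually gets, using only the standard Runtime API; the class name HeapCheck is hypothetical:

```java
public class HeapCheck {

    /** Maximum heap the running JVM will attempt to use, in megabytes. */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run this the same way the parser is run (same project, same launcher).
        System.out.println("Max heap available to this JVM: " + maxHeapMb() + " MB");
    }
}
```

If the reported value is far below 1 GB, the limit would need to be raised for the program's JVM (for example with the standard -Xmx option) rather than for the IDE.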

Any suggestions? Is the problem specific to some irregularity in 1FFK, or does it lie in BioJava's parser implementation?
 
By the way, I am using BioJava simply as a parser, and I am then dumping the data into class objects of my own design and persisting them to a SQL Server database. As such, I don't need all the ATOM information held in memory. Perhaps there is a way to lazy load that information upon request?
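
To illustrate what "not holding all the ATOM information in memory" could look like, here is a rough sketch that bypasses BioJava entirely and streams ATOM/HETATM records one line at a time, assuming the standard fixed-column layout of PDB coordinate records. It is not equivalent to BioJava's Structure (no SEQRES alignment, no models, no alternate-location handling), and the names AtomStreamSketch, AtomHandler, and streamAtoms are hypothetical:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class AtomStreamSketch {

    /** Hypothetical callback; an implementation might persist each atom and then discard it. */
    public interface AtomHandler {
        void onAtom(String atomName, String resName, char chainId,
                    int resSeq, double x, double y, double z);
    }

    /**
     * Reads PDB text line by line and passes each ATOM/HETATM record to the handler.
     * Only the current line is held in memory. Returns the number of records seen.
     */
    public static int streamAtoms(Reader in, AtomHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // Coordinates end at column 54; shorter lines cannot be complete atom records.
            if (line.length() < 54) {
                continue;
            }
            if (!line.startsWith("ATOM  ") && !line.startsWith("HETATM")) {
                continue;
            }
            handler.onAtom(
                line.substring(12, 16).trim(),                      // atom name, columns 13-16
                line.substring(17, 20).trim(),                      // residue name, columns 18-20
                line.charAt(21),                                    // chain ID, column 22
                Integer.parseInt(line.substring(22, 26).trim()),    // residue number, columns 23-26
                Double.parseDouble(line.substring(30, 38).trim()),  // x, columns 31-38
                Double.parseDouble(line.substring(38, 46).trim()),  // y, columns 39-46
                Double.parseDouble(line.substring(46, 54).trim())); // z, columns 47-54
            count++;
        }
        return count;
    }
}
```

A handler that writes each atom to the database as it arrives would keep memory use roughly constant regardless of file size, at the cost of losing everything BioJava derives from the file as a whole.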
 
Is there a downloadable development version of BioJava that offers a more memory-efficient way of grabbing the data?
Thanks,
 
Paul
-------------- next part --------------
A non-text attachment was scrubbed...
Name: netbeans.conf
Type: application/octet-stream
Size: 1965 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20090311/62c4326c/attachment-0002.obj>

