[Biojava-l] A Simple Genbank Parser Runs Out of Memory

Schreiber, Mark mark.schreiber@agresearch.co.nz
Mon, 4 Mar 2002 08:25:25 +1300


Interesting ...

What was the file causing the problem? If it represented a very large
sequence, then it would require a large memory allocation while it was
being processed. If the required allocation was larger than the memory
available to the JVM, then garbage collection wouldn't help.
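
A rough way to test that theory is to compare the JVM's heap ceiling
with what is already in use. The sketch below is only illustrative: the
names HeapHeadroom, usedBytes, headroomBytes and fits are made up for
this example, and Runtime.maxMemory() only exists from Java 1.4 onward,
so on a 1.2.2 JVM only totalMemory() and freeMemory() are available.

```java
class HeapHeadroom {
    /** Heap bytes in use right now (total claimed by the JVM minus free). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper bound on how much more the heap could hold before hitting its ceiling. */
    static long headroomBytes() {
        // maxMemory() requires Java 1.4+; it reflects the -Xmx limit when one is set.
        return Runtime.getRuntime().maxMemory() - usedBytes();
    }

    /** True if a request of this size is no larger than the remaining headroom. */
    static boolean fits(long requestedBytes) {
        return requestedBytes <= headroomBytes();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("used:     " + usedBytes() / mb + " MB");
        System.out.println("headroom: " + headroomBytes() / mb + " MB");
        // 70 MB is just an illustrative request size, not a measured value.
        System.out.println("70 MB request fits: " + fits(70 * mb));
    }
}
```

If fits() says no for the size a single record needs, collecting
garbage cannot help; only a higher heap limit (or a parser that holds
less in memory at once) can.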

Mark

 Mark Schreiber
 Bioinformatics
 AgResearch Invermay
 PO Box 50034
 Mosgiel
 New Zealand
 
 PH: +64 3 489 9175
 

-----Original Message-----
From: cantey.lg@pg.com [mailto:cantey.lg@pg.com] 
Sent: Saturday, 2 March 2002 9:26 a.m.
To: biojava-l@biojava.org
Subject: [Biojava-l] A Simple Genbank Parser Runs Out of Memory


I created a simple parser based on the Genbank demo to test a proof of
concept. I ran it on a large Genbank source file (containing over
160,000 "sequences"). The program processed 34,170 sequences, then
crashed with a java.lang.OutOfMemoryError. I ran this on an NT machine
with 768 MB of RAM, using the 1.2.2 JVM.

The interesting thing about this is that I was able to watch the size of
the JVM. For the first 34,169 sequences it reached a steady state at
about 14 MB with normal expansion and contraction.  Then it processed a
sequence and the JVM size jumped up to 85 MB and crashed.  This scenario
was exactly reproducible.  To make sure it wasn't a data issue, I took
5 sequences before and after the "crashing" sequence and put them in a
separate file.  I was able to process this file with no problems.
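
To pin down which record triggers a jump like this, one option is to
sample heap usage after each record and look for the first large step.
A minimal sketch, assuming the samples are collected by the caller;
HeapJumpFinder, usedBytes and firstJump are hypothetical names and not
part of BioJava, and the sample values are invented to match the shape
described above:

```java
class HeapJumpFinder {
    /** Heap bytes in use right now (total claimed by the JVM minus free). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Returns the index of the first sample that exceeds the previous
     * sample by more than thresholdBytes, or -1 if there is no such step.
     */
    static int firstJump(long[] usedAfterEachRecord, long thresholdBytes) {
        for (int i = 1; i < usedAfterEachRecord.length; i++) {
            if (usedAfterEachRecord[i] - usedAfterEachRecord[i - 1] > thresholdBytes) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Made-up samples: steady near 14 MB, then a step up to 85 MB.
        long[] samples = {14 * mb, 15 * mb, 14 * mb, 85 * mb};
        System.out.println("first jump at record index " + firstJump(samples, 50 * mb));
        System.out.println("heap in use now: " + usedBytes() / mb + " MB");
    }
}
```

In a real run, usedBytes() would be called once per parsed sequence
inside the parse loop and the results stored for firstJump() to scan.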

The net result was that I was able to "solve" this problem by
pre-allocating a larger JVM.  However, I am concerned when I see the
JVM expand by 70 MB while it does a simple parse of a sequence.

Are any of you aware of garbage collection problems in JDK 1.2.2?

Any other ideas?

Thanks and Best Regards,
Larry Cantey


_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l