[Biojava-l] reading fasta file out of memory error

Scooter Willis HWillis at scripps.edu
Thu Jun 28 05:46:55 UTC 2012


Yes, look for the lazyread option in the API.
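
For illustration, entry-by-entry reading can be sketched in plain Java as below. This is a minimal sketch and not BioJava's API: the class name FastaStream and the method forEachRecord are hypothetical, and it assumes a well-formed FASTA file in which each record starts with a ">" header line. Only one record is held in memory at a time, so memory use should track the largest single sequence rather than the whole file.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of BioJava: streams FASTA records one at a time.
public class FastaStream {

    /**
     * Calls handler once per record with (header without '>', concatenated sequence).
     * Returns the number of records seen. Only the current record is buffered.
     */
    public static int forEachRecord(Reader in, BiConsumer<String, String> handler)
            throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String header = null;
        StringBuilder seq = new StringBuilder();
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                if (header != null) {
                    // A new header closes the previous record: hand it off.
                    handler.accept(header, seq.toString());
                    count++;
                }
                header = line.substring(1);
                seq.setLength(0); // reuse the buffer for the next record
            } else if (header != null) {
                seq.append(line); // sequence data may span several lines
            }
        }
        if (header != null) {
            // Flush the last record, which has no following header.
            handler.accept(header, seq.toString());
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Reads the file named in args[0], or a tiny built-in sample if none is given.
        try (Reader in = args.length > 0
                ? new FileReader(args[0])
                : new StringReader(">seq1\nACGT\nAC\n>seq2\nGG\n")) {
            long[] totalBases = {0};
            int n = forEachRecord(in, (h, s) -> totalBases[0] += s.length());
            System.out.println(n + " records, " + totalBases[0] + " bases");
        }
    }
}
```

Note that a single very large record (for example, a whole chromosome) would still be buffered in full, so this helps most when the file holds many moderately sized entries.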

----- Reply message -----
From: "Mic" <mictadlo at gmail.com>
To: "Daniel Asarnow" <dasarnow at gmail.com>
Cc: "biojava-l at lists.open-bio.org" <biojava-l at lists.open-bio.org>
Subject: [Biojava-l] reading fasta file out of memory error
Date: Wed, Jun 27, 2012 9:18 pm



Is it possible to read the file entry by entry rather than reading the whole
file into memory?


On Wed, Jun 27, 2012 at 5:44 PM, Daniel Asarnow <dasarnow at gmail.com> wrote:

> Hi,
> Have you tried increasing the size of the heap? You can use the -Xmx option
> to java, e.g. -Xmx2048m or higher.
>
> The GC overhead error is usually thrown when the constraints of the heap
> size force the JVM to spend too much time collecting garbage.
>
> -da
>
> On Wed, Jun 27, 2012 at 12:01 AM, Haluk Dogan <hlk.dogan at gmail.com> wrote:
>
> > Hi,
> >
> > I have a 1.8 GB FASTA file and I was trying to read it with the following
> > code, as suggested on the examples page.
> >
> > LinkedHashMap<String, DNASequence> seqs =
> > FastaReaderHelper.readFastaDNASequence(new File(args[0]));
> >
> > I don't get any error for small files, but it gives the following error
> > for big files.
> >
> > Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
> >    at java.util.Arrays.copyOf(Arrays.java:2746)
> >    at java.util.ArrayList.ensureCapacity(ArrayList.java:187)
> >    at org.biojava3.core.sequence.storage.ArrayListSequenceReader.setContents(ArrayListSequenceReader.java:187)
> >    at org.biojava3.core.sequence.template.AbstractSequence.<init>(AbstractSequence.java:88)
> >    at org.biojava3.core.sequence.DNASequence.<init>(DNASequence.java:81)
> >    at org.biojava3.core.sequence.io.DNASequenceCreator.getSequence(DNASequenceCreator.java:62)
> >    at org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:113)
> >    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:106)
> >    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:118)
> >
> >
> > Is there a more memory-efficient way to do this?
> >
> > Thanks in advance.
> >
> > --
> > HD
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >