[Biojava-l] reading fasta file out of memory error

Haluk Dogan hlk.dogan at gmail.com
Wed Jun 27 07:01:07 UTC 2012


Hi,

I have a 1.8 GB FASTA file and I was trying to read it with the following
code, as suggested on the examples page.

LinkedHashMap<String, DNASequence> seqs =
FastaReaderHelper.readFastaDNASequence(new File(args[0]));

I don't get any error for small files, but for big files it gives the
following error.

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOf(Arrays.java:2746)
    at java.util.ArrayList.ensureCapacity(ArrayList.java:187)
    at org.biojava3.core.sequence.storage.ArrayListSequenceReader.setContents(ArrayListSequenceReader.java:187)
    at org.biojava3.core.sequence.template.AbstractSequence.<init>(AbstractSequence.java:88)
    at org.biojava3.core.sequence.DNASequence.<init>(DNASequence.java:81)
    at org.biojava3.core.sequence.io.DNASequenceCreator.getSequence(DNASequenceCreator.java:62)
    at org.biojava3.core.sequence.io.FastaReader.process(FastaReader.java:113)
    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:106)
    at org.biojava3.core.sequence.io.FastaReaderHelper.readFastaDNASequence(FastaReaderHelper.java:118)


Is there a more memory-efficient way to read a file this large?
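
To make the question concrete, here is a minimal sketch of the kind of
approach that would avoid holding every sequence in memory at once: stream
the file and handle one record at a time. It assumes each record can be
processed independently and uses only java.io, not any BioJava API; the
class name FastaStream and the method forEachRecord are hypothetical names
made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.BiConsumer;

// Hypothetical helper (not part of BioJava): streams FASTA records so that
// only the record currently being read is held in memory.
public class FastaStream {

    // Calls handler.accept(header, sequence) once per record and returns
    // the number of records seen. The reader is closed when done.
    public static long forEachRecord(Reader in, BiConsumer<String, String> handler) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            String header = null;
            StringBuilder seq = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(">")) {
                    // A new header ends the previous record, if there was one.
                    if (header != null) {
                        handler.accept(header, seq.toString());
                        count++;
                    }
                    header = line.substring(1).trim();
                    seq.setLength(0); // reuse the buffer for the next record
                } else if (header != null) {
                    // Sequence data may span several lines; join them.
                    seq.append(line.trim());
                }
            }
            // Emit the final record, which has no following header line.
            if (header != null) {
                handler.accept(header, seq.toString());
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

It could be called as FastaStream.forEachRecord(new FileReader(args[0]),
(id, seq) -> ...), with the per-record work done inside the lambda. This
only helps if no single record is itself too large for the heap, since each
sequence is still built up as one String.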

Thanks in advance.

-- 
HD


