[Biojava-dev] [BioJava - Bug #3305] SequenceFileProxyLoader repeatedly opens file and never closes

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Mon Oct 24 20:31:58 UTC 2011


Issue #3305 has been updated by John Kern.


Hello, 

A small improvement to the patch would be to move the close() call into a finally block, so that the file is also closed when seek() or getSequence() throws an exception.

-jk
----------------------------------------
Bug #3305: SequenceFileProxyLoader repeatedly opens file and never closes
https://redmine.open-bio.org/issues/3305

Author: John May
Status: New
Priority: Normal
Assignee: biojava-dev list
Category: seq.io
Target version: live (SVN source)
URL: 


Brief: SequenceFileProxyLoader repeatedly reopens the file and never closes it, so it eventually throws a FileNotFoundException because too many files are open. This only occurs when requesting tens of thousands of sequences; with the provided fix, the proxy reader works as expected.
Operating System: OS X 10.6
JDK: 1.6

Exception:
<pre>
Exception in thread "main" org.biojava3.core.exceptions.FileAccessError: Error accessing /databases/uniprot/uniprot_sprot.fasta offset=42133810 sequenceLength=290 java.io.FileNotFoundException: /databases/uniprot/uniprot_sprot.fasta (Too many open files)
	at org.biojava3.core.sequence.loader.SequenceFileProxyLoader.init(SequenceFileProxyLoader.java:105)
	at org.biojava3.core.sequence.loader.SequenceFileProxyLoader.iterator(SequenceFileProxyLoader.java:246)
	at org.biojava3.core.sequence.template.AbstractSequence.iterator(AbstractSequence.java:583)
	at org.biojava3.core.sequence.template.SequenceMixin.toStringBuilder(SequenceMixin.java:158)
	at org.biojava3.core.sequence.template.SequenceMixin.toString(SequenceMixin.java:169)
	at org.biojava3.core.sequence.template.AbstractSequence.getSequenceAsString(AbstractSequence.java:521)
	at org.biojava3.core.sequence.io.FastaWriter.process(FastaWriter.java:103)
	at org.biojava3.core.sequence.io.FastaWriterHelper.writeProteinSequence(FastaWriterHelper.java:77)
	at org.biojava3.core.sequence.io.FastaWriterHelper.writeProteinSequence(FastaWriterHelper.java:59)
</pre>




How to reproduce (requires uniprot_sprot.fa):
<pre>
File sprotFasta = new File("path/to/uniprot_sprot.fa");
FastaReader<ProteinSequence, AminoAcidCompound> fastaReader =
        new FastaReader<ProteinSequence, AminoAcidCompound>(
                new FileInputStream(sprotFasta),
                new GenericFastaHeaderParser<ProteinSequence, AminoAcidCompound>(),
                new FileProxyProteinSequenceCreator(sprotFasta, new AminoAcidCompoundSet()));

Map<String, ProteinSequence> sprotMap = fastaReader.process();
FastaWriterHelper.writeProteinSequence(File.createTempFile("output", ".fa"), sprotMap.values()); // this is just to demonstrate the bug
</pre>


Fix (org.biojava3.core.sequence.loader.SequenceFileProxyLoader, lines 98-108); also see the diff file:
<pre>
    private boolean init() {
        try {
            RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r");
            randomAccessFile.seek(sequenceStartIndex);
            String sequence = sequenceParser.getSequence(randomAccessFile, sequenceLength);
            setContents(sequence);
            randomAccessFile.close(); // close file to prevent too many being open
        } catch (Exception e) {
            throw new FileAccessError("Error accessing " + file + " offset=" + sequenceStartIndex + " sequenceLength=" + sequenceLength + " " + e.toString());
        }
        return true;
    }
</pre>
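
The suggestion in the update above (moving close() into a finally block) could look roughly like the sketch below. It is a standalone illustration, not the actual BioJava patch: the class CloseInFinallySketch and the helper readSlice are hypothetical, and the raw byte read stands in for sequenceParser.getSequence(). It avoids try-with-resources because the report mentions JDK 1.6; on Java 7 or later that construct would give the same guarantee with less code.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;

public class CloseInFinallySketch {

    // Hypothetical helper (not part of BioJava): reads `length` bytes starting at `offset`.
    public static String readSlice(File file, long offset, int length) throws IOException {
        RandomAccessFile randomAccessFile = null;
        try {
            randomAccessFile = new RandomAccessFile(file, "r");
            randomAccessFile.seek(offset);
            byte[] buffer = new byte[length];
            randomAccessFile.readFully(buffer);
            return new String(buffer, "US-ASCII");
        } finally {
            // Runs on both the normal and the exceptional path,
            // so the file descriptor is always released.
            if (randomAccessFile != null) {
                randomAccessFile.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny FASTA-like file; the header line ">seq1\n" is 6 bytes long.
        File tmp = File.createTempFile("sketch", ".fa");
        tmp.deleteOnExit();
        FileWriter writer = new FileWriter(tmp);
        try {
            writer.write(">seq1\nMKVLAAGIVG\n");
        } finally {
            writer.close();
        }

        // Many repeated reads, each of which opens and closes the file.
        for (int i = 0; i < 20000; i++) {
            readSlice(tmp, 6, 10);
        }
        System.out.println(readSlice(tmp, 6, 10));
    }
}
```

With close() inside the try body, as in the fix above, an exception from seek() or the read would skip the close and still leak one descriptor per failure; the finally block covers that path as well.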


