[Bioperl-l] looks like a Bio::SeqIO error

Chris Fields cjfields at illinois.edu
Fri May 15 23:36:47 UTC 2009


On May 15, 2009, at 1:58 PM, Hilmar Lapp wrote:

> I think you're running up against an OS limit on the number of open  
> files, or the number of files in a directory. You can check (and  
> change) your limits with ulimit.
>
> The largefasta module is designed for reading in and handling large  
> (like, really large - whole-chromosome scale) sequences which, if  
> all held in memory, would exhaust the memory either immediately or  
> pretty quickly. So it stores them in temporary files. Most unix  
> systems will limit the number of files you can have open at any one  
> time.
>
> If your sequences in that file aren't huge, largefasta isn't the  
> module you want to use - just use the fasta parser, or if you need  
> random access to sequences in the file (do you?) then  
> Bio::DB::Fasta. Writing sequences to temporary files is a waste of  
> time if they fit into memory just fine.
>
> The odd thing is that you actually run up to the limit. Normally the  
> temporary files should be closed and deleted when the sequence  
> objects go out of scope (I think - should verify in the code of  
> course ...), so the fact that they aren't makes me suspect that the  
> code snippet that you presented isn't all that there is to it - are  
> you storing the sequences somewhere in a variable, such as in an  
> array or a hash table?
>
> 	-hilmar
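
For anyone who wants to check the limit mentioned above from a script
rather than the shell, here is a minimal sketch in Python. It is not part
of BioPerl, and the function name open_file_limits is made up for
illustration. It reads the per-process limit on open file descriptors,
which is the value `ulimit -n` reports on Unix systems.

```python
import resource


def open_file_limits():
    """Return (soft, hard) limits on open file descriptors for this process."""
    # RLIMIT_NOFILE is the limit that `ulimit -n` reports in a shell (Unix only).
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
```

The soft limit is the one a process actually hits; it can be raised up
to the hard limit without special privileges.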

This was a legit bug.  The DESTROY method in LargePrimarySeq removed  
only the files, not their directories.  I added a few extra lines to  
do that.
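
To illustrate the kind of cleanup involved, here is a rough sketch in
Python. It is not the actual Perl code in LargePrimarySeq; the class
TempBackedSeq and the file name seq.tmp are hypothetical, and the
one-directory-per-object layout is an assumption made for the example.
The point is only that a destructor which deletes the temporary file
must also delete the directory that held it, or one empty directory is
left behind per object.

```python
import os
import tempfile


class TempBackedSeq:
    """Hypothetical stand-in for a sequence object backed by a temp file."""

    def __init__(self, residues):
        # Assumption for this sketch: each object gets its own temporary
        # directory holding a single file.
        self.tmpdir = tempfile.mkdtemp()
        self.path = os.path.join(self.tmpdir, "seq.tmp")
        with open(self.path, "w") as fh:
            fh.write(residues)

    def close(self):
        # Removing only the file would leave an empty directory behind.
        if os.path.exists(self.path):
            os.remove(self.path)
        # The extra step: remove the now-empty directory as well.
        if os.path.isdir(self.tmpdir):
            os.rmdir(self.tmpdir)

    def __del__(self):
        # Rough analogue of Perl's DESTROY; close() is safe to call twice.
        self.close()
```

After close() runs (or the object is garbage-collected), neither the
file nor its directory should remain on disk.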

chris


