[Bioperl-l] looks like a Bio::SeqIO error

fungazid fungazid at yahoo.com
Fri May 15 20:55:50 UTC 2009


Hilmar, I believe your suspicions are wrong. The proof: changing -format to
'Fasta' instead of 'largefasta' in:
    Bio::SeqIO->new(-file => $fileIn, -format => 'Fasta')
solved my problem (as was suggested, this is probably not the right method
to use, but it works).
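
For anyone hitting the same error, here is a minimal sketch of the read loop
with the plain fasta parser. The sub name fasta_lengths and the idea of
collecting lengths are made up for illustration and are not from my actual
script; the sketch assumes BioPerl is installed and that each sequence fits
in memory. It keeps only plain scalars, so no sequence object outlives its
loop iteration.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper (not part of BioPerl): read a multi-FASTA file with
# the in-memory 'fasta' parser and return a hash ref of id => length.
sub fasta_lengths {
    my ($path) = @_;

    # 'fasta' keeps each record in memory; unlike 'largefasta' it does not
    # back every sequence with temporary files.
    my $seqin = Bio::SeqIO->new(-file => $path, -format => 'fasta');

    my %length_of;
    while (my $seqobj = $seqin->next_seq) {
        my $seq = $seqobj->seq;    # full sequence as a plain string
        $length_of{ $seqobj->display_id } = length $seq;
        # $seqobj goes out of scope here; only the id and length are kept
    }
    $seqin->close;

    return \%length_of;
}
```

Calling fasta_lengths($fileIn) returns a hash reference keyed by sequence
id. If random access to individual sequences is needed instead, Bio::DB::Fasta
(as Hilmar mentions below) is the module to look at.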


Hilmar Lapp wrote:
> 
> I think you're running up against an OS limit on the number of open  
> files, or the number of files in a directory. You can check (and  
> change) your limits with ulimit.
> 
> The largefasta module is designed for reading in and handling large
> (like, really large - whole-chromosome scale) sequences which, if all
> held in memory, would exhaust the memory either immediately or pretty
> quickly. So it stores them in temporary files. Most Unix systems will
> limit the number of files you can have open at any one time.
> 
> If your sequences in that file aren't huge, largefasta isn't the  
> module you want to use - just use the fasta parser, or if you need  
> random access to sequences in the file (do you?) then Bio::DB::Fasta.  
> Writing sequences to temporary files is a waste of time if they fit  
> into memory just fine.
> 
> The odd thing is that you actually run up to the limit. Normally the
> temporary files should be closed and deleted when the sequence objects
> go out of scope (I think - should verify in the code of course ...),
> so the fact that they aren't makes me suspect that the code snippet
> you presented isn't all there is to it - are you storing the
> sequences somewhere in a variable, such as in an array or a hash table?
> 
> 	-hilmar
> 
> On May 15, 2009, at 9:05 AM, fungazid wrote:
> 
>>
>> Hello,
>>
>> I hope this is the right address for bioperl programming issues.
>> Bioperl saves me a lot of time (not to re-invent the wheel), but there
>> are some extremely irritating problems (I would change the code myself
>> if I knew how).
>>
>> I am trying to read a file (~20MB) containing multiple fasta sequences:
>> >a
>> AGTAGTGAGTGCGCTGA.........
>> >b
>> GCGCTGAAGTAGTGAGT.......
>> >c
>> AGTAGTGAGTGCGCTGA.........
>> >d...........
>>
>> with the following lines:
>>
>> my $seqin = Bio::SeqIO->new('-format' => 'largefasta', '-file' => $file1);
>>
>> LOOP1: while ( my $seqobj1 = $seqin->next_seq() )
>> {
>>     ......
>>     my $seq = $seqobj1->subseq(1, $seqobj1->length);
>>     .......
>> }
>>
>>
>> This works right for the first ~30000 contig sequences but then the
>> following message appears:
>>
>> Error in tempdir() using /tmp/XXXXXXXXXX: Could not create directory
>> /tmp/6eS92VzVjm: Too many links at /usr/share/perl5/Bio/Root/IO.pm line 744
>> DESTROY() mysql_insert obj
>> destroying HANDLE
>>
>> What to do ??? (this is only one of several Bioperl-related bugs that
>> I'm experiencing)
>>
>>
>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/looks-like-a-Bio%3A%3ASeqIO-error-tp23559474p23559474.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

-- 
View this message in context: http://www.nabble.com/looks-like-a-Bio%3A%3ASeqIO-error-tp23559474p23567169.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.