[Bioperl-l] StandAloneFasta and Too many open files

Dimitar Kenanov dimitark at bii.a-star.edu.sg
Tue May 11 04:24:13 UTC 2010

Hi Chris,
thank you for the information. I checked it out.
I wrote to you and to the list about this as well: to you on 16.04.2010 and
to the list on 23.04.2010. There I explained that I had modified the module.
Now I pass it the 'O' option, but the option is not passed on to the actual
program executed by system. Instead, I append my desired output file with
"> $output" to the parameter line passed to system. I attached the modified
version of the module to the email mentioned above.
I have been digging through the module a bit more. I found this at line 359:
  unless( $outfile ) {
         open(FASTARUN, "$para |") || $self->throw($@);#original
         $object=Bio::SearchIO->new(-fh=>\*FASTARUN, #original
         } else {

And here is another one, used when the 'O' option is set, at line 371:
$object = Bio::SearchIO->new(-file=>$self->O,

Maybe the problem is here, because I didn't see a 'close' for these
filehandles anywhere. I can test this and report back whether I was right.
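
To illustrate what I mean, here is a minimal sketch (not the module's actual
code). It assumes the leak comes from pipe filehandles that are opened on
every run and never closed. The helper name run_and_capture is hypothetical,
and 'echo' only stands in for the real search program. Each call opens a
lexical pipe handle and closes it before returning, so the loop can go past
the per-process limit on open files (often 1024 on Linux):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: run a command through a pipe
# and return its output lines. The lexical handle is closed explicitly,
# so repeated calls do not accumulate open file descriptors.
sub run_and_capture {
    my ($cmd) = @_;
    open(my $fh, '-|', $cmd) or die "Cannot run '$cmd': $!";
    my @lines = <$fh>;
    # For a pipe, close() also waits for the child and sets $?.
    close($fh) or warn "Command '$cmd' exited with status $?";
    return @lines;
}

# Run more commands than a common soft limit on open files.
my $count = 0;
for my $i (1 .. 1200) {
    my @out = run_and_capture("echo run $i");
    $count += scalar @out;
}
print "captured $count lines\n";
```

If the close is left out and the handles are kept alive (for example, stored
in objects that are never destroyed), each run would hold on to one more
descriptor until the limit is reached.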


On 05/11/2010 11:57 AM, Chris Fields wrote:
> Addendum to that last post.
> On May 10, 2010, at 10:04 PM, Chris Fields wrote:
>> On May 10, 2010, at 9:03 PM, Dimitar Kenanov wrote:
>>> Hi guys,
>>> yesterday I got the following error:
>>>    'Too many open files at /usr/lib64/perl5/site_perl/5.10.0/Bio/Tools/Run/Alignment/StandAloneFasta.pm line 380'
>>> from the following code:
>>> ------------
>>>    my $ssout="my_seq_out.txt";
>>>    print "SS:$tquery:\n:$tseq:\n";
>>>    my @sargs=(
>>>        'q' =>  '',
>>>        'E' =>  '1',
>>>        'w' =>  '100',
>>>        'O' =>  "$ssout",
>>>        'program' =>  "ssearch36",
>>>        );
>>>    my $fac_ss=Bio::Tools::Run::Alignment::StandAloneFasta->new(@sargs);
>>>    $fac_ss->library($tmpseq);
>>>    my @sreport=$fac_ss->run($tqtmp);
>>> foreach my $sr (@sreport){
>>>      while(my $result=$sr->next_result){
>>>          while(my $hit=$result->next_hit){
>>>              while(my $hsp=$hit->next_hsp){
>>>                  my $iden=$hsp->frac_identical;
>>>                  $rv3=$iden;
>>> #               print "IDEN:$iden:$rv1\n";
>>>              }
>>>          }
>>>      }
>>> }
>>> --------------------
>>> I am running that code over several thousand HSPs: for each one I get the sequence and then run 'ssearch36' with it against another sequence. I dug around in the StandAloneFasta module but couldn't find where the problem is. There must be many open filehandles somewhere, but I don't know where. I checked the module but couldn't find any such filehandles. Maybe the problem is in the base modules.
>>> I also checked my script for filehandles left open and found none. I did find that I can actually close SeqIO streams with '$stream->close', which I hadn't seen in the web documentation. So something positive came out of this :) I closed all my SeqIO streams and still had the same problem.
>>> Next i commented out the above code and rewrote my script into the following:
>>> --------------
>>>    my $ssout="my_seq_out.txt";
>>>    my @sargs=("ssearch36 -q -E 1 -d 1 $tqtmp $tmpseq>  $ssout");
>>>    system(@sargs) == 0 or die "system @sargs failed: $?";
>>>    my $sreport=Bio::SearchIO->new(-file =>  $ssout, -format =>  'fasta');
>>>    while(my $result=$sreport->next_result){
>>> #    print Dumper($result);
>>>        while(my $hit=$result->next_hit){
>>>            while(my $hsp=$hit->next_hsp){
>>>            my $iden=$hsp->frac_identical;
>>>            $rv3=$iden;
>>> #            print "IDEN:$iden:$rv1\n";
>>>            }
>>>        }
>>>    }
>>> ---------------
>>> Fortunately this code got rid of the 'Too many open files' error. So the problem was indeed coming from the module or the modules behind it.
>>> I have also read that one can raise the limit on how many files can be open on the system, but I didn't want to mess with that for now because I don't know what the implications could be.
>>> OK, that is it. I just wanted to share my experience and report the problem.
>>> Cheers
>>> Dimitar
>> Seems this is hitting the system ulimit somehow, but it's not immediately apparent how that's happening unless you are caching the IO objects somehow.  Can you file this as a bug, maybe with a fuller test script?  Might give us something to check against.
>> chris
> Dimitar,
> I think Peter answered this before; it might indicate that the problem is actually the use of the 'O' option for output.  We can look at possibly just capturing STDOUT instead, but we may not support the use of 'O' if it's as buggy as indicated.
> http://groups.google.com/group/bioperl-l/msg/25c17748d1ac6ef4
> chris

Dimitar Kenanov
Postdoctoral research fellow
Protein Sequence Analysis Group
Bioinformatics Institute
A*STAR, Singapore
tel: +65 6478 8514
