[Bioperl-l] bl2seq hang and its performance

Liu Haifeng lhaifeng at dso.org.sg
Fri Dec 12 01:49:04 EST 2003


Hi all,

I noticed that one of my programs, written using bioperl-1.2.3, runs very
slowly and consumes a huge amount of memory, and I suspect that this is due
to the call to bl2seq in the program.  So I wrote the small program below
(it runs bl2seq on each sequence from a fasta file against itself) to see
whether that is true:


#!/usr/bin/perl -w
use Bio::SeqIO;
use Bio::Tools::Blast;
use Bio::Tools::Run::StandAloneBlast;
use Bio::Tools::BPlite;

my $infile    = shift;       # fasta file given on the command line
my $sno       = 0;           # running sequence counter
my $blastalgo = "blastp";    # blastp, blastx, tblastn, tblastx
my $pin = Bio::SeqIO->new('-file' => $infile, '-format' => 'Fasta');
while (my $proseq = $pin->next_seq()) {
    $sno++;
    print "bl2seq $sno ..............................\n";
    my @params  = ('program' => $blastalgo);
    my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
    $factory->io->_io_cleanup();
    # compare the sequence against itself
    my $report = $factory->bl2seq($proseq, $proseq);
    while (my $hsp = $report->next_feature) {
        # only need the first hsp
        $report->close();
        last;
    }
    undef $report;
}
print "running is over\n";

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The program runs fine on a small fasta file.  However, when I give it a
fasta file of around 2.6M containing 10,000 protein sequences, the program
hangs when it compares the 1782nd sequence.  I also noticed that the program
had consumed 12M of memory by that time.  I searched the archive and found
that similar bl2seq problems have been reported before.  However, they
should have been solved in the latest version.

Can anyone give me some clues on how to improve the performance of calling
bl2seq?
Thank you.

Regards
Haifeng Liu




