[Bioperl-l] Standaloneblastplus: update

dimitark at bii.a-star.edu.sg dimitark at bii.a-star.edu.sg
Fri Sep 13 08:30:32 UTC 2013


Hi guys,
I managed to solve my problem by modifying StandAloneBlastPlus.pm
and BlastMethods.pm.

In 'sub run()' in BlastMethods.pm I added a TEMPDIR option:

sub run {
     my $self = shift;
     my @args = @_;
     # DIMITAR: added $tempdir so i can pass a tempdir for each thread i create
     my ($method, $query, $outfile, $outformat, $method_args, $tempdir)
         = $self->_rearrange( [qw(
                                   METHOD
                                   QUERY
                                   OUTFILE
                                   OUTFORMAT
                                   METHOD_ARGS
                                   TEMPDIR
                                   )], @args);

Then at line 261 in BlastMethods.pm I pass the tempdir along:

     $blast_args{-query} = $self->_fastize($query,$tempdir);

Then in StandAloneBlastPlus.pm, in _fastize():

    sub _fastize {
     my $self = shift;
     my $data = shift;
     my $tempdir=shift; # <--- ADDED THIS


And further changed here:

     my $fh = File::Temp->new(TEMPLATE => 'DBDXXXXXXXXXX',
                              UNLINK   => 0,
                              DIR      => $tempdir, # <--- CHANGED HERE
                              SUFFIX   => '.fas');
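
The effect of DIR can be tried outside BioPerl with core modules only. The following is a minimal sketch, not the BioPerl code itself: write_query_fasta is a hypothetical helper that simply mirrors the File::Temp call above, and the sequence IDs and sequences are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of BioPerl): write one FASTA record to a
# temp file inside the given directory, mirroring the File::Temp call above.
sub write_query_fasta {
    my ($tempdir, $id, $seq) = @_;
    my $fh = File::Temp->new(
        TEMPLATE => 'DBDXXXXXXXXXX',
        UNLINK   => 0,            # keep the file after $fh goes out of scope
        DIR      => $tempdir,     # the per-worker directory
        SUFFIX   => '.fas',
    );
    my $name = $fh->filename;
    print {$fh} ">$id\n$seq\n";
    close $fh or die "Cannot close $name: $!";
    return $name;
}

# One private directory per worker; CLEANUP removes it at program exit.
my $dir_a  = tempdir(CLEANUP => 1);
my $dir_b  = tempdir(CLEANUP => 1);
my $file_a = write_query_fasta($dir_a, 'seq1', 'ACGT');
my $file_b = write_query_fasta($dir_b, 'seq2', 'TTGA');
print "$file_a\n$file_b\n";
```

Each worker's query file ends up in its own directory, so two workers never touch the same location.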


Well, it's quite a dirty workaround, but it works fine. Now I can do the following:
   In my script I can start several threads that share the same
factory, and for each thread I create a separate TEMPDIR in which the
temp .fas file (holding the query) is created. That way I can make better
use of my CPU threads.
For example: instead of running a single blast with 40 CPU threads
that processes a fasta file with 250K seqs, I can now start 5 instances
of blast processing 50K seqs each, with each instance using 8 CPU
threads.
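
The split described above can be sketched roughly as follows. This is only an illustration under assumptions: split_into_chunks is a hypothetical helper, the 23 made-up IDs stand in for the 250K sequences, and no BLAST is actually run. Each job just gets its own records, its own temp directory and its share of the CPU threads.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX qw(ceil);

# Hypothetical helper: split a list of records into at most $n_parts
# consecutive chunks of near-equal size.
sub split_into_chunks {
    my ($records, $n_parts) = @_;
    my $size = ceil(scalar(@$records) / $n_parts) || 1;
    my @chunks;
    for (my $i = 0; $i < @$records; $i += $size) {
        my $last = $i + $size - 1;
        $last = $#$records if $last > $#$records;
        push @chunks, [ @{$records}[$i .. $last] ];
    }
    return @chunks;
}

my @ids       = map { "seq$_" } 1 .. 23;   # stand-in for the sequences in the fasta file
my $instances = 5;
my $total_cpu = 40;
my $cpu_each  = int($total_cpu / $instances);   # threads per blast instance

my @jobs;
for my $chunk (split_into_chunks(\@ids, $instances)) {
    push @jobs, {
        records => $chunk,
        tempdir => tempdir(CLEANUP => 1),   # private directory for this job
        threads => $cpu_each,
    };
}
printf "%d jobs, %d threads each\n", scalar @jobs, $cpu_each;
```

With the patched run(), each job's tempdir would then be handed over through the new TEMPDIR argument; how the thread count is passed to blast is left out here.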

I did this because:

a) When I ran several instances of blast, they all created their
temp files in the same directory. Even though the temp file names contain
a random mix of characters, some weird errors were still happening and some
blast instances broke.

b) I noticed that when I process a large fasta file, blast
starts well but gets slower over time, i.e. slower with each
seq being blasted. The further down the fasta file, the slower the blast.

If someone else is interested in this kind of functionality, I suppose
I can edit the files further so that the change is cleaner and consistent
throughout. Also, right now the tempdir must be given explicitly; I could
change that like this:

if(! $tempdir){
    $tempdir=$self->db_dir;
}

which would default to DB_DIR as before. Or some other way that
achieves the same.
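
The fallback can be checked in isolation with a small stand-in object. This is a sketch under assumptions: My::MockFactory and resolve_tempdir are hypothetical names, the paths are made up, and only db_dir is modelled, so it shows the defaulting logic rather than the real factory.

```perl
use strict;
use warnings;

# Minimal stand-in for the factory; only db_dir is modelled here.
package My::MockFactory;
sub new    { my ($class, %args) = @_; return bless {%args}, $class }
sub db_dir { return $_[0]{db_dir} }

# Hypothetical helper showing the fallback: use the caller's directory
# when one is given, otherwise the factory's db_dir.
sub resolve_tempdir {
    my ($self, $tempdir) = @_;
    $tempdir = $self->db_dir unless defined $tempdir && length $tempdir;
    return $tempdir;
}

package main;

my $factory = My::MockFactory->new(db_dir => '/data/blastdb');
print $factory->resolve_tempdir('/scratch/thread1'), "\n";   # explicit dir is kept
print $factory->resolve_tempdir(undef), "\n";                # falls back to db_dir
```

Checking for defined-and-non-empty rather than plain truth avoids treating a directory literally named "0" as missing, which the shorter `if(! $tempdir)` test would do.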

Cheers
D.



