[Bioperl-l] BlastPlus usage inquiry

Chris Fields cjfields at illinois.edu
Fri Apr 2 19:41:16 UTC 2010


Glad to see it's not just me.  I asked the same question the other day
and got a response (or non-response, depending on how you look at it),
posted below.  This occurs whether or not I use the BioPerl BLAST+
modules, with -num_threads set to 4 or 8.

Interestingly, legacy BLAST with -a works just fine (I'm using a dual
quad-core box running Ubuntu Karmic 64-bit, 48G RAM).  It may be the
database size, as they imply, but I would like to see this behavior
documented.  It's certainly unexpected.

chris

NCBI's response, followed by my post:
------------------------------------------------
Hello, 

In our 2.2.23+ linux x64 tests, threading is working. It may be that the
database is small enough not to invoke more than one thread, or that top
is not the best way to analyze this in blast+. With a large enough
search, user time should decrease with increasing cpu. Note that only
the search phases of the algorithms are multi-threaded; the traceback is
not. 
Best regards,
Wayne

_~___~___~__~__~_~
Wayne Matten, PhD
NCBI Public Services
mattenw at mail.nih.gov
------------------------------------------------
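One way to act on the suggestion above, rather than relying on a single
'top' snapshot, is to time complete runs at several thread counts and
compare wall-clock time with the CPU time the child process consumed.
The following is a minimal sketch only, not anything NCBI or BioPerl
provides: the function names are made up for illustration, and the
blastp path, database and query are simply the ones from the command
line quoted below. A CPU/wall ratio that stays near 1 at every thread
count would suggest that only one core is being kept busy.

```python
import os
import subprocess
import time


def build_blastp_cmd(threads, db="~/blastdb/apis_v2_aa", query="seqs.aa",
                     blastp="/opt/local/blast-plus/bin/blastp"):
    """Return the argument list for one blastp run (paths are assumptions)."""
    return [
        blastp,
        "-num_threads", str(threads),
        "-db", os.path.expanduser(db),
        "-evalue", "1e-5",
        "-num_descriptions", "10",
        "-num_alignments", "10",
        "-query", query,
    ]


def time_command(cmd, stdout_path=os.devnull):
    """Run cmd to completion; return (wall_seconds, child_cpu_seconds)."""
    before = os.times()
    start = time.perf_counter()
    with open(stdout_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    # children_user/children_system cover child processes that have been
    # waited for (always zero on Windows).
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return wall, cpu


def compare_thread_counts(counts=(1, 2, 4, 8)):
    """Time one blastp run per thread count; return (threads, wall, cpu) rows."""
    rows = []
    for n in counts:
        wall, cpu = time_command(build_blastp_cmd(n), "out.blastp.%d" % n)
        rows.append((n, wall, cpu))
    return rows
```

Calling compare_thread_counts() on a machine with BLAST+ installed runs
the search once per thread count. Since only the search phase is said to
be multi-threaded, the ratio would not be expected to reach the full
thread count even when threading works.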
On Mar 31, 2010, at 6:12 PM, Chris Fields wrote: 
> I'm using BLAST+ and can't tell if threading is working properly (it
> doesn't appear to be, if I can trust 'top').  With legacy BLAST I set
> -a to 4 (I'm running a dual quad-core Ubuntu box, 9.10 64-bit) and can
> see the %CPU cranked up to 350-400% with top, but with BLAST+ this
> never exceeds 100%.
> 
> Below is the actual command line:
> 
> /opt/local/blast-plus/bin/blastp -num_threads 4 \
>     -db ~/blastdb/apis_v2_aa -evalue 1e-5 \
>     -num_descriptions 10 -num_alignments 10 \
>     -query seqs.aa > out.blastp 2> error.log &
> 
> Snapshot of 'top':
> 
> Tasks: 292 total,   1 running, 291 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.0%us,  0.2%sy,  0.0%ni, 98.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%
> Mem:  49556108k total, 46352552k used,  3203556k free,   556224k buffers
> Swap: 29302536k total,    16852k used, 29285684k free, 42244628k cached
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 18195 cjfields  20   0  353m  19m  12m S  100  0.0   1:43.68 blastp
> 10536 cjfields  20   0  217m  18m  10m S    2  0.0   0:12.80 gnome-terminal
> 
> 
> chris

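A related check, on Linux only, is whether a running blastp process has
created worker threads at all. The sketch below reads the "Threads:"
field from /proc/<pid>/status; the helper names are made up for
illustration. A count above 1 shows that threads exist, not that they
are busy, so it complements timing or a per-thread view such as
'top -H' and does not replace them.

```python
def parse_thread_count(status_text):
    """Extract the integer after 'Threads:' from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid):
    """Return the number of threads the kernel reports for pid (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_thread_count(handle.read())
```

For the snapshot above, thread_count(18195) would be called while that
blastp process is still running.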

On Fri, 2010-04-02 at 15:15 -0400, Mark A. Jensen wrote:
> Hi Ross --
> Add the following option:
> 
> -num_threads 7
> 
> and see if it picks up the other CPUs--
> cheers Mark
> ----- Original Message ----- 
> From: "Ross KK Leung" <ross at cuhk.edu.hk>
> To: "'Janine Arloth'" <janine.arloth at googlemail.com>; 
> <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, March 31, 2010 5:28 PM
> Subject: [Bioperl-l] BlastPlus usage inquiry
> 
> 
> > Dear all,
> >
> > I know it is inappropriate to raise this question in bioperl but as I
> > received no better response from NCBI and so have to ask in this group
> > (because finally I'll use bioperl to call blastplus). I have already been
> > using the latest blastplus (the command is blastn directly) and found the
> > problem of running slow and inability to run in a parallel/multithread
> > manner.
> >
> >
> > Previously I was using non blastplus version 2.2.22 with the command
> > blastall -p blastn -a 8 etc.
> >
> > With similar arguments as below except the word size was 12, my shell script
> > for the same input and database finishes almost instantly. I notice that
> > except word size and min raw gapped score were changed by me, nothing
> > appears to differ from the previous version parameters. Moreover, when I top
> > my process, I find it uses only one CPU instead of 7.
> >
> > What may be the problem for the script that makes the job running for a day
> > and still hasn't finished?
> >
> > blastn -query $1 -db $2 -out $1_$2.xml -num_threads 7 -word_size 4 \
> >     -gapopen 3 -gapextend 1 -penalty -2 -outfmt 5 -xdrop_ungap 30 \
> >     -xdrop_gap 30 -xdrop_gap_final 30 -min_raw_gapped_score 10
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >




