[BioRuby] Remote Blast

Naohisa GOTO ngoto at gen-info.osaka-u.ac.jp
Fri Jul 4 14:05:41 UTC 2008


Hi,

On Fri, 4 Jul 2008 13:13:45 +0100
Anthony Underwood <email2ants at gmail.com> wrote:

> Hi all, is remote blast broken for bioruby?

No.
The script below works fine with both bioruby-1.2.1 and CVS HEAD.
#---------------------------------------------------
  require 'bio'

  seq = Bio::Sequence::NA.new(
    "atgcagctctttgtccgcgcccaggagctacacaccttcgaggtgaccggccaggaaacg
     gtcgcccagatcaaggctcatgtagcctcactggagggcattgccccggaagatcaagtc
     gtgctcctggcaggcgcgcccctggaggatgaggccactctgggccagtgcggggtggag
     gccctgactaccctggaagtagcaggccgcatgcttggaggtaaagtccatggttccctg
     gcccgtgctggaaaagtgagaggtcagactcctaaggtggccaaacaggagaagaagaag
     aagaagacaggtcgggctaagcggcggatgcagtacaaccggcgctttgtcaacgttgtg
     cccacctttggcaagaagaagggccccaatgccaactcttaa")

  p "-m 7"

  remote = Bio::Blast.remote('blastn', 'genes-nt', '-e 0.01')
  blobj = remote.query(seq)
  blobj.each_hit do |hit|
    puts "#{hit.target_def} #{hit.evalue}"
  end

  p "-m 8"

  remote8 = Bio::Blast.remote('blastn', 'genes-nt', '-m 8 -e 0.01')
  blobj = remote8.query(seq)
  blobj.each_hit do |hit|
    puts "#{hit.target_def} #{hit.evalue}"
  end
#---------------------------------------------------

> I have some code
> 
> fasta_sequences.each do |seq_obj|
>    blast_report = remote_blast(seq_obj.seq, 'blastn', 'nr-nt')
>   ..........
> end
> 
> this calls the method
> 
> def remote_blast(seq_obj, program, db = 'nr-nt', options = '', server  
> = 'genomenet')
>    # create BLAST factory object
>    factory = Bio::Blast.remote(program, db, '-m 8' + options, server)
>    report = factory.query(seq_obj)
> end
> 
> This fails to return anything - just times out

A BLAST search against the 'nr-nt' database usually takes a very long time.
In the Bio::Blast#exec_genomenet method, the timeout is extended to 600 sec,
but if the calculation takes longer than that, a timeout error will occur.

The effective timeout can be shorter than the above value, depending on
your network, because your network administrators may limit the maximum
timeout value in their routers and/or proxies.
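
If an occasional timeout is acceptable, the query can be retried a few
times. Below is a minimal sketch, not something BioRuby provides:
with_timeout_retry is a hypothetical helper name, and it assumes the
timeout surfaces as Timeout::Error (or a subclass of it), which may
differ depending on your Ruby and BioRuby versions.

```ruby
require 'timeout'

# Hypothetical helper (not part of BioRuby): runs the given block and
# retries it when Timeout::Error is raised, up to max_attempts times.
# After the last failed attempt the error is re-raised to the caller.
def with_timeout_retry(max_attempts = 3, wait = 0)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue Timeout::Error
    raise if attempts >= max_attempts
    sleep(wait)   # pause before the next attempt
    retry
  end
end

# Possible usage with the script above (assumes 'bio' is loaded and
# seq is defined as in that script):
#
#   report = with_timeout_retry(3, 60) do
#     Bio::Blast.remote('blastn', 'genes-nt', '-e 0.01').query(seq)
#   end
```

Retrying does not help when the search itself always needs more than the
timeout; in that case a smaller database or stricter options are needed.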

> If I include the -V option to limit the number of hits returned
> 
>    blast_report = remote_blast(seq_obj.seq, 'blastn', 'nr-nt', "-V 10")

The '-V' option in blastall means:
|  -V  Force use of the legacy BLAST engine [T/F]  Optional
|    default = F
(taken from the blastall help message)

'-v' and '-b' can be used instead. Be careful: options are case-sensitive.
|  -v  Number of database sequences to show one-line descriptions for (V) [Integer]
|    default = 500
|  -b  Number of database sequence to show alignments for (B) [Integer]
|    default = 250

In addition, it is better to use the '-e' option to limit the e-value,
because the default e-value (10.0) is generally too large.
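
As a rough illustration of combining these options, the sketch below
builds the option string from flag/value pairs. blast_options is a
hypothetical helper, not a BioRuby method, and the values 10 and 0.01
are only examples. Note also that in Ruby, '-m 8' + '-V 10' gives
"-m 8-V 10" with no space between the two options, so joining with
explicit spaces is safer than plain concatenation.

```ruby
# Hypothetical helper (not part of BioRuby): turns [flag, value] pairs
# into a blastall-style option string separated by single spaces.
def blast_options(pairs)
  pairs.map { |flag, value| "-#{flag} #{value}" }.join(' ')
end

# Flags are case-sensitive: lowercase 'v' and 'b' limit the number of
# descriptions and alignments; 'e' sets the e-value cutoff.
opts = blast_options([['m', 8], ['v', 10], ['b', 10], ['e', 0.01]])
puts opts

# Possible usage (assumes 'bio' is loaded and seq is defined):
#
#   factory = Bio::Blast.remote('blastn', 'genes-nt', opts)
#   report  = factory.query(seq)
```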

> I get no hits , or a message "RuntimeError: cannot understand  
> response" which I think is due to the "holding page" that occurs when  
> a job is running

Perhaps this is because an unrecognized option was specified.

> Does anybody know  a fix for this? Should I start to try and get a  
> ncbi remote blast working. Is there anybody who would like to help  
> with this?
> 
> Thanks
> 
> Anthony


Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org



