[BioRuby] Blast with file as a query option?
donttrustben at gmail.com
Tue Apr 7 04:30:09 UTC 2009
And there is the -a flag, for specifying you want to use multiple CPUs.
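A minimal sketch of how -a could be added to a blastall argument list. The helper name with_cpus and the idea of defaulting to Etc.nprocessors are my own assumptions for illustration, not part of BioRuby; -a itself is blastall's "number of processors to use" option.

```ruby
require 'etc'

# Hypothetical helper (not a BioRuby method): returns a new argument array
# with blastall's -a option appended. Defaults to the number of CPUs that
# Ruby reports for this machine.
def with_cpus(command, cpus = Etc.nprocessors)
  command + ['-a', cpus.to_s]
end

base = %w[blastall -p blastn -i Large_list_sequences_1.fasta
          -d Large_list_sequences_2]
command = with_cpus(base, 4)
puts command.join(' ')
```

Keeping the command as an array means it can be passed to system(*command) without going through a shell.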
2009/4/7 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>
> On Sun, 5 Apr 2009 14:13:37 -1000
> Kevin English <kenglish at gmail.com> wrote:
> > Hello,
> > I have two very large local FASTA files that I wish to BLAST against one
> > another and parse the results in BioRuby. I'm wondering if there is a way
> > to mimic the behavior of this blast command:
> > blastall -p blastn -i Large_list_sequences_1.fasta -d Large_list_sequences_2
> > where Large_list_sequences_2 is a formatted fasta db. My current
> > implementation opens Large_list_sequences_1.fasta and goes through it
> > sequence by sequence. It seems to run pretty slowly. I'm wondering if I can
> > in some way run the above blast command, loop through the results, and get
> > a performance gain.
> To gain performance, adding options to BLAST is strongly recommended.
> -e Expectation value (E) [Real]
> default = 10.0
> -v Number of database sequences to show one-line descriptions for (V)
> default = 500
> -b Number of database sequence to show alignments for (B) [Integer]
> default = 250
> Changing the above to smaller values will reduce the output report size,
> which means a performance gain.
> Executing BLAST with multiple query sequences can also improve performance.
> In addition, when you have query sequences in a local file, calling the
> blastall command directly without Bio::Blast may be a good option.
> For example,
> require 'bio'
> require 'tempfile'
> # Build the command as an argument array so that "-o" can be appended.
> command = %w( blastall -p blastn -i Large_list_sequences_1.fasta
>               -d Large_list_sequences_2 -e 0.0001 -b 20 -v 20 )
> tempfile = Tempfile.new('blastout')
> tempfile.close   # blastall writes to this path itself
> command = command + [ "-o", tempfile.path ]
> system(*command)
> # After system(), error checks will be needed but skipped.
> ff = Bio::FlatFile.open(tempfile.path)
> ff.each do |report|
>   # For example, prints query_def and target_def
>   report.each do |hit|
>     print report.query_def, "\t", hit.target_def, "\n"
>   end
> end
> ff.close
> tempfile.unlink   # remove the temporary file
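The error check that the quoted example skips after system() could look roughly like the sketch below. The helper name run_checked is hypothetical, and the Ruby interpreter is used as a stand-in command so the sketch runs without BLAST installed; in practice the blastall argument array would be passed instead.

```ruby
require 'rbconfig'

# Hypothetical helper: runs an external command given as an argument array.
# Kernel#system returns true for exit status 0, false for a non-zero exit
# status, and nil when the command could not be executed at all.
def run_checked(command)
  result = system(*command)
  if result.nil?
    raise "could not execute: #{command.first}"
  elsif !result
    raise "#{command.first} failed with exit status #{$?.exitstatus}"
  end
  true
end

# Stand-in commands: one that succeeds and one that exits with status 3.
run_checked([RbConfig.ruby, '-e', 'exit 0'])
begin
  run_checked([RbConfig.ruby, '-e', 'exit 3'])
rescue RuntimeError => e
  puts e.message
end
```

Raising before the report file is opened avoids parsing an empty or partial output file when blastall fails.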
> > For anyone curious, my code is on github:
> > http://github.com/kenglishhi/bioflexrails/tree/master
> > The file that is doing the blasts is under app/model/biodatabase.rb.
> > I'm trying to write a Rails app that uses a BioSQL db and allows this
> > biologist to organize his sequences. I'm very new to bioinformatics but
> > have a lot of experience with Ruby on Rails.
> > Thanks in advance for your help.
> In general, a BLAST search against a very large database takes
> a very long time, and using a batch queueing system might be needed.
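If the search does end up on a batch queue, one possible approach is to cut the query file into chunks and submit one blastall job per chunk. The sketch below only does the splitting, in plain Ruby; the helper name each_fasta_chunk and the chunk size are assumptions for illustration, and how each chunk is written out and submitted depends on the queueing system.

```ruby
require 'stringio'

# Hypothetical helper: reads FASTA text line by line from any IO-like object
# and yields strings that each hold at most per_chunk records. Only one
# chunk is kept in memory at a time.
def each_fasta_chunk(io, per_chunk)
  chunk = String.new
  count = 0
  io.each_line do |line|
    if line.start_with?('>')
      # A new record starts here; flush the chunk first if it is full.
      if count == per_chunk
        yield chunk
        chunk = String.new
        count = 0
      end
      count += 1
    end
    chunk << line
  end
  yield chunk unless chunk.empty?
end

fasta = ">seq1\nACGT\n>seq2\nGGCC\nTTAA\n>seq3\nAAAA\n"
chunks = []
each_fasta_chunk(StringIO.new(fasta), 2) { |c| chunks << c }
chunks.each_with_index do |c, i|
  puts "chunk #{i}: #{c.lines.count { |l| l.start_with?('>') }} record(s)"
end
```

For a real file, File.open('Large_list_sequences_1.fasta') { |f| each_fasta_chunk(f, 1000) { |c| ... } } would take the place of the StringIO.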
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> BioRuby mailing list
> BioRuby at lists.open-bio.org
FYI: My email addresses at unimelb, uq and gmail all redirect to the same