[BioRuby] basic remote query and plans for NCBI Bio:Blast hook?

Anthony Underwood email2ants at gmail.com
Thu Nov 5 16:22:12 UTC 2009


Hi Matt

I have done a bit of work to get NCBI blast working within bioruby.


See this gist on github http://gist.github.com/227160

ncbi_blast.rb defines an exec_ncbi method for the Blast class in bioruby.
The script ncbi_blast_test.rb illustrates its usage, but relies on a few
functions defined in the blast_functions.rb file.


Essentially, the following should work:

require 'rubygems'
require 'bio'
require 'ncbi_blast'
# Use this if you are working from behind a proxy, and enter the IP and
# port number as appropriate.
ENV['http_proxy'] = "http://proxy_server_ip:port_number"

sequence = "ATGAATCCAAATCAGAAAATAATAA........"

factory = Bio::Blast.remote('blastn', 'nr', '', 'ncbi')
blast_report = factory.query(sequence)


blast_report will be a Bio::Blast::Report object which can be parsed  
as described in the bioruby api
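
As a rough illustration of working with such a report, the sketch below keeps only the hits at or below an E-value cutoff. It assumes the report responds to hits and each hit to definition and evalue, as Bio::Blast::Report objects do. FakeHit, FakeReport, significant_hits, the cutoff and the sample definitions are all made up for illustration, so that the snippet runs without the bio gem or a network connection.

```ruby
# Hypothetical stand-ins for Bio::Blast::Report and its hits, so the
# sketch can run without the bio gem or a remote BLAST server.
FakeHit = Struct.new(:definition, :evalue)
FakeReport = Struct.new(:hits)

# Return the hits whose E-value is at or below max_evalue, best first.
# Works on any object that responds to #hits (duck typing).
def significant_hits(report, max_evalue = 1e-5)
  kept = report.hits.select { |hit| hit.evalue <= max_evalue }
  kept.sort_by { |hit| hit.evalue }
end

# Made-up hits; the accessions are placeholders, not real records.
report = FakeReport.new([
  FakeHit.new("gi|0000002|emb|XX000002.1|", 1e-8),
  FakeHit.new("gi|0000003|emb|XX000003.1|", 0.5),
  FakeHit.new("gi|0000001|emb|XX000001.1|", 1e-50)
])

significant_hits(report).each do |hit|
  puts "#{hit.evalue}\t#{hit.definition}"
end
```

With a real report, the same method could presumably be called on blast_report directly, since it only relies on hits, definition and evalue.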

The hit definitions are fairly uninformative, containing just the
accessions. This is why I then have to fetch the data from EMBL as
follows:

     # definition is the definition string of a hit from blast_report
     accession = definition.split("|")[3]
     accession.sub!(/\..+$/, "") # remove version number
     server = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch')
     embl_text = server.fetch('embl', accession)
     embl_object = Bio::EMBL.new(embl_text)
     puts embl_object.description
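
The accession handling above could be wrapped in a small helper that tolerates definitions lacking the expected fields. This is only a sketch: accession_from_definition is a hypothetical name, the sample definition is made up, and it assumes the gi|number|db|accession.version| layout implied by taking field 3.

```ruby
# Extract the unversioned accession from a pipe-delimited hit definition.
# Returns nil when the fourth field is missing or empty.
def accession_from_definition(definition)
  accession = definition.to_s.split("|")[3]
  return nil if accession.nil? || accession.empty?
  # Drop the version suffix (".1" and the like); sub returns a new string
  # and leaves the input untouched.
  accession.sub(/\..+\z/, "")
end

# Made-up definition line; XX000001 is a placeholder accession.
puts accession_from_definition("gi|0000001|emb|XX000001.1|") # prints XX000001
```

Checking for nil before calling Bio::Fetch avoids sending an empty query when a definition does not follow this layout.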


This is still a work in progress but it worked OK for me. Hope it is  
of some use to you.


Anthony


On 4 Nov 2009, at 18:29, Matt wrote:

> Hi all,
>
> As far as I can tell there is yet no straightforward way to use
> Bio::Blast with the NCBI portal? I've seen this on the wiki: "Add
> remote BLAST search sites", and understand the basic concept, but
> don't have time at present to work on this.  Is anyone actively
> working on this? (just FYI see
> http://github.com/kwicher/ruby-blast-at-ncbi).
>
> I ask in part because I'm struggling to get a basic remote blast working:
>
> seq = Bio::Sequence::NA.new('GTCACAAAATCATGGTTTTGCGGTTAATGCTAATGATTTGCCAGCTGATTGGGAACCATTATTTACAAATGCGAACGACAATACAAATGAAGGAATTGTACACAAAACACATCCATTCTTTAGTGTACAATTTCATCCCGAACACACAGCCGGTCCAGAAGATTTAGAAATCTTATTTGATGTCTTTCTGGATGGAGTAAAAGCATTTAAAAATAAGGAAAAGTTCAYCATGAARGATAAATTGATCGAAAAATTGACTTACACGCCGGATGTACCCGTTTGCACTGAAAAACCTAAAAAGATATTGATTTTAGGTTCAGGCGGTTTATCCATAGGYCAAGCAGGCGAATTTGATTATTCCGGATCTCAGGCTATCAAGGCTCTTAAAGAAGAAAAAATACAAACGGTGYTAATAAATCCAAATATTGCAACGGTTCARACATCAAAAGGCCTTGCGGACAAAGTTTACTTCCTACCCATTACACCGGATTACGTTGAACAGGTTATAAAAGCCGAGCGACCTGATGGTGTGCTTTTAACTTTTGGCGGACAAACAGCTTTGAATTGTGGAATTGAATTAGAAAAAACTAAAGTGTTTCAACGATTCGGTGTTAAAGTGTTGGGTACRCCGATACAATCAATTATTGAAACTGAAGATAGAAAAATATTTTCGGATCGAGTACACGAAATCGGAGAAAAAGTAGCGCCGTCTGCCGCAGTTTATTCGGTGCAAGAAGCTCTAGATGCCGCTGAAATTCTTGGTTATCCCGTTATGGCTCGAGCTGCATTTTCATTAGGTGGACTAGGTTCTGGTTTTGCAAATAATATTGATGAATTAAAACATCTTGCACAACAGGCTCTTGCGCATTCCAACCAGTTAATCATTGATAAATCGCTTAAAGGTTGGAAGGAAGTTGAATACGAGGTCGTTCGTGATGCATATGACAATTGTATTACAGTTTGTAATATGGAAAATGTAGATCCACTAGGAATTCATACAGGGGAGAGTATAGTAGTGGCACCGTCACAAACTCTCTCCAACAAGGAATATAATATGTTGCGTACTACAGCAATTAAAGTGATTCGGCATTTTGGCGTCGTCGGTGAATGTAATATACAATATGCCTTAAATCCACATTCYGAGCAATACTATATAATTGAAGTTAATGCTAGGTTATCGAGGAGTTCGGCACTAGCTAGTAAAGCGACAGGCTATCCATTAGCATACGTTGCGGCTAAACTAGCACTCGGTATCGCTTTACCTGATATTAAAAATTCGGTAACTGGAGTTACCACCGCCTGTTTTGAGCCAAGTTTAGATTACTGTGTGGTAAAAATTCCACGATGGGATTTAGCAAAATTTGTTCGCGTTTCAAAAAATATTGGAAGCTCTATGAAAAGTGTAGGTGAGGTCATGGCAATCGGCCGCCGATTTGAAGAAGCGTTCCAAAAA')
>
> blast_factory = Bio::Blast.new('blastn','nr-nt', '', 'genomenet')
> foo = blast_factory.query(seq)
>
> ... freezes; when I press Ctrl-C I get:
>
> from /Library/Ruby/Gems/1.8/gems/bio-1.3.1.5000/lib/bio/appl/blast/genomenet.rb:224:in `call'
> from /Library/Ruby/Gems/1.8/gems/bio-1.3.1.5000/lib/bio/appl/blast/genomenet.rb:224:in `sleep'
> from /Library/Ruby/Gems/1.8/gems/bio-1.3.1.5000/lib/bio/appl/blast/genomenet.rb:224:in `exec_genomenet'
> from /Library/Ruby/Gems/1.8/gems/bio-1.3.1.5000/lib/bio/appl/blast.rb:368:in `__send__'
> from /Library/Ruby/Gems/1.8/gems/bio-1.3.1.5000/lib/bio/appl/blast.rb:368:in `query'
> from (irb):25
>
> any glaring problems with this? Is it just waiting for the results of
> the remote query?   I noticed that the genomenet blasts are much
> slower than NCBI in general (I'm in the US).
>
> thanks,
> Matt
>
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby



