[Bioperl-l] Remote Blast - Blast Human Genome

vrramnar at student.cs.uwaterloo.ca vrramnar at student.cs.uwaterloo.ca
Thu Jul 20 23:07:15 UTC 2006


Hi Malcolm,

Thanks for the help. I actually figured this out today the same way you did,
through discussions with the NCBI help desk.

The help desk contact mentioned that the main site is:
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/

But specifically:
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/remote_accessible_blastdblist.html

So all you would need to do while using remoteblast is set your $db to one of 
the following:

snp/human_9606/human_9606	Human SNPs
snp/human_9606/rs_ch1	Human chr 1 SNPs
snp/human_9606/rs_ch10	Human chr 10 SNPs
snp/human_9606/rs_ch11	Human chr 11 SNPs
snp/human_9606/rs_ch12	Human chr 12 SNPs
snp/human_9606/rs_ch13	Human chr 13 SNPs
snp/human_9606/rs_ch14	Human chr 14 SNPs
snp/human_9606/rs_ch15	Human chr 15 SNPs
snp/human_9606/rs_ch16	Human chr 16 SNPs
snp/human_9606/rs_ch17	Human chr 17 SNPs
snp/human_9606/rs_ch18	Human chr 18 SNPs
snp/human_9606/rs_ch19	Human chr 19 SNPs
snp/human_9606/rs_ch2	Human chr 2 SNPs
snp/human_9606/rs_ch20	Human chr 20 SNPs
snp/human_9606/rs_ch21	Human chr 21 SNPs
snp/human_9606/rs_ch22	Human chr 22 SNPs
snp/human_9606/rs_ch3	Human chr 3 SNPs
snp/human_9606/rs_ch4	Human chr 4 SNPs
snp/human_9606/rs_ch5	Human chr 5 SNPs
snp/human_9606/rs_ch6	Human chr 6 SNPs
snp/human_9606/rs_ch7	Human chr 7 SNPs
snp/human_9606/rs_ch8	Human chr 8 SNPs
snp/human_9606/rs_ch9	Human chr 9 SNPs
snp/human_9606/rs_chMT	Human chr Mitochondrial SNPs
snp/human_9606/rs_chMulti	Human SNPs mapped to multiple locations
snp/human_9606/rs_chNotOn	Human SNPs not mapped
snp/human_9606/rs_chUn	Human SNPs mapped to unplaced contigs
snp/human_9606/rs_chX	Human chr x SNPs
snp/human_9606/rs_chY	Human chr y SNPs

The web page above has a more complete list of the other databases available
through the RemoteBlast module.
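
As a rough sketch of what that looks like in a script: the two helper subs below (snp_db_name and snp_blast_params) are made up for illustration and are not part of BioPerl, and I am assuming the database names listed above are still accepted by NCBI. The -prog / -data / -expect keys are the same ones used in my original script quoted further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): build a dbSNP database name
# following the naming pattern in the list above.
sub snp_db_name {
    my ($chr) = @_;

    # No chromosome given: use the database that covers all human SNPs.
    return 'snp/human_9606/human_9606' unless defined $chr;

    # Suffixes taken from the list above.
    my %valid = map { $_ => 1 } ( 1 .. 22, qw(X Y MT Multi NotOn Un) );
    die "Unknown chromosome label '$chr'\n" unless $valid{$chr};

    return "snp/human_9606/rs_ch$chr";
}

# Hypothetical helper: parameter list for RemoteBlast->new, using the same
# -prog / -data / -expect keys as the original script.
sub snp_blast_params {
    my ( $chr, $e_val ) = @_;
    return (
        '-prog'   => 'blastn',
        '-data'   => snp_db_name($chr),
        '-expect' => $e_val,
    );
}

# Only contact NCBI when a query file is given on the command line.
if (@ARGV) {
    require Bio::Tools::Run::RemoteBlast;
    my $factory =
      Bio::Tools::Run::RemoteBlast->new( snp_blast_params( 1, 0.01 ) );
    $factory->submit_blast( $ARGV[0] );
    # ...then poll each_rid / retrieve_blast as in the original script.
}
```

Run it with a FASTA file as the only argument (for example blast.in); without an argument it only defines the helpers and exits.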

Rohan

Quoting "Cook, Malcolm" <MEC at stowers-institute.org>:

> Rohan,
> 
> 'snp/human/human_snp' is the database name you need to use to blast into
> human snp database at NCBI
> 
> See the following document for the full list (the link was provided to
> me via personal correspondence with the NCBI helpdesk).  Very useful...
> 
> Hmm, looking again, there now appear to be two versions:
> 
> http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/blastdblist.html (last
> updated 2/7/2006)
> http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/remote_accessible_blastdblist.html
> (last updated 5/29/2006)
> 
> Neither is linked to by any other document on the internet (google sez)
> including anywhere else at NCBI.  Go figure.  They should be, IMHO, since
> this info is collected nowhere else.
> 
> Of course it may be out of date, but it always has got me through.
> 
> Good luck
> 
> Malcolm Cook - mec at stowers-institute.org - 816-926-4449
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, MO  USA 
> 
> 
> 
> >-----Original Message-----
> >From: bioperl-l-bounces at lists.open-bio.org 
> >[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
> >Sent: Monday, July 17, 2006 4:26 PM
> >To: vrramnar at student.cs.uwaterloo.ca; bioperl-l at lists.open-bio.org
> >Subject: Re: [Bioperl-l] Remote Blast - Blast Human Genome
> >
> >Okay, I think I may know what's going on a little more now with NCBI's
> >BLAST interface.  Looks like any NCBI BLAST query must use the default URL
> >and so must set up the proper GET/PUT commands to retrieve everything
> >correctly.
> >
> >Here's the API description for it all:
> >
> >http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
> >
> >You could try setting the database to 'snp' or something along those lines
> >instead of 'nr'; or you could see what the name of the database is when you
> >use the web form and try setting it to that.  According to this page, this
> >should be possible:
> >
> >http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq.section.SearchdbSNP_test._Search_dbSNP_Using_B
> >
> >The Entrez Query limit was a recommendation for limiting your search to a
> >set of sequences for human, for instance.
> >
> >I'll try looking into it a bit more but I'm pretty busy.  If you find
> >anything out you should probably post it here.
> >
> >Chris
> >
> >> Hi Chris,
> >> 
> >> 1. I have tried changing the database to snp or dbSNP but neither works.
> >> It seems that depending on which type of blast you use (i.e., Genome
> >> Blast, Blast SNP, normal blast such as blastn, etc.) you see a different
> >> listing of databases available for queries. Since you mention that the
> >> Blast page I see was generated by Genome, where could I go to see a
> >> complete listing of databases I can query? Or do you know offhand which
> >> database to search if I only want dbSNP hits?
> >> 
> >> 2. You also mention that I can limit the search by using Entrez terms. Do
> >> you mean like:
> >> $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'abc';
> >> where 'abc' is the term you would like to restrict the results to? For
> >> example, if you set it to 'Homo sapiens[Organism]' then only human
> >> sequences would be in the hit lists.
> >> If this is what you mean, what would I change it to, to see only hits
> >> from dbSNP?
> >> 
> >> Thanks for the ongoing help,
> >> 
> >> Rohan
> >> 
> >> Quoting Chris Fields <cjfields at uiuc.edu>:
> >> 
> >> > I added a method to RemoteBlast in bioperl-live (CVS) if you want to
> >> > play with changing the URL.  I have been thinking about doing this for
> >> > a bit now but I already see problems.
> >> >
> >> > Here's the issue: the BLAST page you see is NOT the NCBI BLAST page
> >> > (note the differences in the URL) but a user-friendly request page,
> >> > generated on the fly by Genome, to submit BLAST requests for the
> >> > relevant database.  So changing the URL will not work (even by adding
> >> > extra parameters); you only get the original HTML web page.
> >> >
> >> > You could try changing the database or limiting the search using an
> >> > Entrez term (which you should be able to include in the request,
> >> > probably by adding it to the HEADER).
> >> >
> >> > Chris
> >> >
> >> > > -----Original Message-----
> >> > > From: bioperl-l-bounces at lists.open-bio.org
> >> > > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> >> > > vrramnar at student.cs.uwaterloo.ca
> >> > > Sent: Thursday, July 13, 2006 5:39 PM
> >> > > To: bioperl-l at lists.open-bio.org
> >> > > Subject: [Bioperl-l] Remote Blast - Blast Human Genome
> >> > >
> >> > >
> >> > > Hello Again,
> >> > >
> >> > > I have another question regarding Remote blast but this time using
> >> > > Genome Blast.
> >> > >
> >> > > Here is the link:
> >> > >
> >> > > http://www.ncbi.nlm.nih.gov/genome/seq/BlastGen/BlastGen.cgi?taxid=9606
> >> > >
> >> > > which again uses the main Blast web site:
> >> > >
> >> > > http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi
> >> > >
> >> > > Again I am not sure what to add or what HEADER information to change
> >> > > within my script.
> >> > >
> >> > > Here is my program, which was the same as the last email:
> >> > >
> >> > > #!/usr/bin/perl -w
> >> > >
> >> > > use Bio::Perl;
> >> > > use Bio::Tools::Run::RemoteBlast;
> >> > >
> >> > > my $prog = "blastn";
> >> > > my $db = "refseq_genomic";
> >> > > my $e_val = 0.01;
> >> > >
> >> > > my @params = (	'-prog' => $prog,
> >> > > 		'-data' => $db,
> >> > > 		'-expect' => $e_val);
> >> > >
> >> > > my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> >> > > # <--- what do I put here?
> >> > > $Bio::Tools::Run::RemoteBlast::HEADER{'WWW_BLAST_TYPE'} = '????';
> >> > > # <--- Do I need to add any other values to the form inputs?
> >> > > #$Bio::Tools::Run::RemoteBlast::HEADER{'?????'} = '????';
> >> > >
> >> > > $factory->submit_blast("blast.in");
> >> > > my $v = 1;  # verbosity: print progress dots while polling
> >> > >
> >> > > while (my @rids = $factory->each_rid)
> >> > > {  foreach my $rid ( @rids )
> >> > >    {  my $rc = $factory->retrieve_blast($rid);
> >> > >       if( !ref($rc) )
> >> > >       {  if( $rc < 0 )
> >> > >          {  $factory->remove_rid($rid);
> >> > >          }
> >> > >          print STDERR "." if ( $v > 0 );
> >> > >          sleep 5;
> >> > >       }
> >> > >       else
> >> > >       {  my $result = $rc->next_result();
> >> > >          my $filename = $result->query_name()."\.out";
> >> > >          $factory->save_output($filename);
> >> > >          $factory->remove_rid($rid);
> >> > >          print "\nQuery Name: ", $result->query_name(), "\n";
> >> > >       }
> >> > >    }
> >> > > }
> >> > >
> >> > >
> >> > > Both of my questions are very similar, in that I know how to use
> >> > > remote blast but am not sure what to change to access the specific
> >> > > blast I want.
> >> > >
> >> > > Again, any help would be very appreciated!!
> >> > >
> >> > > Rohan
> >> > >
> >> > >
> >> > >
> >> > > ----------------------------------------
> >> > > This mail sent through www.mywaterloo.ca
> >> > > _______________________________________________
> >> > > Bioperl-l mailing list
> >> > > Bioperl-l at lists.open-bio.org
> >> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >> >
> >
> 



