[Bioperl-l] Get GIs from Taxonomy ID

Smithies, Russell Russell.Smithies at agresearch.co.nz
Thu Jun 10 21:43:55 UTC 2010


That's the way I usually do it, as NCBI eUtils can be a bit flaky.
Not BioPerl's fault, of course ;-)

    zgrep -w 9940 gi_taxid_nucl.dmp.gz | awk '$2 == 9940 {print $1}'
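
If you'd rather have that as a reusable step, here's a rough sketch that compares only the second column, assuming the dump is the usual two-column "GI <tab> TaxID" layout. That way a GI that happens to equal 9940 isn't picked up. The function name gis_for_taxid and the output file name are placeholders I made up, not anything from NCBI or BioPerl.

```shell
# gis_for_taxid TAXID FILE
# Print the GIs (column 1) whose taxid (column 2) equals TAXID.
# FILE is assumed to be a gzip-compressed, whitespace-separated
# two-column dump such as gi_taxid_nucl.dmp.gz.
gis_for_taxid() {
    gzip -dc "$2" | awk -v taxid="$1" '$2 == taxid { print $1 }'
}

# Example call (output file name is hypothetical):
# gis_for_taxid 9940 gi_taxid_nucl.dmp.gz > sheep_gis.txt
```

The resulting file should be usable as the GI list for BLAST (-l in the legacy tools, -gilist in BLAST+).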



--Russell

> -----Original Message-----
> From: Chris Fields [mailto:cjfields at illinois.edu]
> Sent: Friday, 11 June 2010 9:36 a.m.
> To: Smithies, Russell
> Cc: 'Dave Messina'; 'armendarez77 at hotmail.com'; 'bioperl-l at lists.open-bio.org'
> Subject: Re: [Bioperl-l] Get GIs from Taxonomy ID
> 
> You can get up-to-date files mapping GI to TaxID here (nr and nt):
> 
> ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/
> 
> chris
> 
> On Jun 10, 2010, at 4:11 PM, Smithies, Russell wrote:
> 
> > Eutils will do it with the right query:
> >
> >
> >
> >
> > use Bio::DB::EUtilities;
> >
> > my $factory = Bio::DB::EUtilities->new(-eutil => 'esearch',
> >                       -db => 'nucleotide',
> >                       -term => 'txid9940[Organism:noexp]',
> >                       -email => 'mymail at foo.bar',
> >                       -retmax => 1000000);
> >
> > # query hits
> > print "Count = ",$factory->get_count,"\n";
> > # UIDs
> > my @ids = $factory->get_ids;
> >
> >
> > --Russell
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Dave Messina
> >> Sent: Friday, 11 June 2010 8:01 a.m.
> >> To: armendarez77 at hotmail.com
> >> Cc: bioperl-l at lists.open-bio.org
> >> Subject: Re: [Bioperl-l] Get GIs from Taxonomy ID
> >>
> >> Hi Veronica,
> >>
> >> These days when you run BLAST at the NCBI server, you're running BLAST+,
> >> which is their complete rewrite of (and replacement for) BLAST.
> >>
> >> You can also download BLAST+ and do pretty much everything on your local
> >> machine that you can do on their server, including limiting by taxonomy.
> >>
> >> I think this is the right parameter:
> >>
> >> 	http://www.ncbi.nlm.nih.gov/blast/html/blastcgihelp.html#entrez_query
> >>
> >> Incidentally, BLAST+ has this awesome feature whereby you can run searches
> >> remotely on their server, against their databases, from the command line,
> >> just by adding the -remote flag.
> >>
> >>
> >> (You can run BLAST+ via the BioPerl wrapper module
> >> Bio::Tools::Run::StandAloneBlastPlus, by the way.)
> >>
> >> Dave
> >>
> >>
> >>
> >> On Jun 10, 2010, at 4:54 PM, <armendarez77 at hotmail.com> wrote:
> >>
> >>>
> >>> Hello,
> >>>
> >>> Is there a BioPerl method that will give a list of GIs for a specified
> >>> NCBI taxonomy ID?
> >>>
> >>> I've previously tried using URLAPI to BLAST primers against the nr
> >>> database on the NCBI server, but recently I keep getting a
> >>> 'Bad Gateway' error.  While my system admin is looking into this, I've
> >>> decided to go another route.  Therefore, I've downloaded the NCBI nr
> >>> database.
> >>>
> >>> The problem I've run into is restricting the BLAST against the nr
> >>> database to a subset of sequences.  The NCBI BLAST tools have an option
> >>> (-l) that does this, but it requires a list of GIs.
> >>>
> >>> When I was using URLAPI, I restricted sequences using Taxonomy IDs
> >>> (Entrez Query).  Therefore, is there a way to get all GIs within a
> >>> Taxonomy ID?  I've seen that with Bio::Taxonomy I can give a GI and get a
> >>> Tax ID, but not the reverse.
> >>>
> >>>
> >>> Thank you,
> >>>
> >>> Veronica
> >>>
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >>
> >




