[Bioperl-l] taxonomy ID

Smithies, Russell Russell.Smithies at agresearch.co.nz
Tue Mar 31 20:06:35 UTC 2009


The taxonomy information isn't in the blast output unless you created custom fasta headers for your blast database.
The easiest way to get the tax_id for your hits would be to download the gi->tax_id list from ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid_nucl.dmp.gz.
If you load that file into a hash, parse the GI numbers out of the blast hit IDs (the file is keyed by GI rather than by accession), then look up each tax_id in that hash, I think it should be fairly fast.
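
A rough sketch of that lookup, in Python only for brevity (the same logic maps straight onto a Perl hash). It assumes the hit IDs look like gi|12345|ref|NP_000000.1| and that the dump file has two whitespace-separated columns, GI then tax_id. The function names and the idea of keeping only the GIs you actually need (the full file is large) are my own choices, not anything from BioPerl.

```python
import gzip
import re

# Assumption: hit IDs carry the GI as "gi|<digits>|", as in NCBI-formatted databases.
GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def gi_from_hit_id(hit_id):
    """Return the GI number found in a hit ID, or None if there is none."""
    match = GI_PATTERN.search(hit_id)
    return int(match.group(1)) if match else None


def load_gi_to_taxid(lines, wanted_gis):
    """Build a gi -> tax_id dict from dump lines, keeping only the wanted GIs."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        gi, tax_id = int(fields[0]), int(fields[1])
        if gi in wanted_gis:
            mapping[gi] = tax_id
    return mapping


def taxids_for_hits(hit_ids, dump_path):
    """Look up tax_ids for a list of hit IDs using a gzipped gi_taxid dump."""
    gis = {gi for gi in map(gi_from_hit_id, hit_ids) if gi is not None}
    with gzip.open(dump_path, "rt") as handle:
        return load_gi_to_taxid(handle, gis)
```

Collecting the GIs from the blast report first and then streaming the dump once keeps memory use down to the hits you care about, at the cost of one pass over the file per run.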

Checking which are prokaryotes and which are eukaryotes based on tax_id is a separate problem  :-)
If you grab the taxdump.tar.gz file from the same site, the nodes.dmp file inside it records which division each tax_id belongs to (Bacteria, Invertebrates, Mammals, Phages, Plants, etc.), stored as a division id whose name is listed in division.dmp, so you can probably work it out from that.
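
Something along these lines might do for the division step, again sketched in Python. It assumes the taxdump column layout as I understand it: fields delimited by "|", the division id in the fifth column of nodes.dmp, and the three-letter division code in the second column of division.dmp. The grouping of codes into prokaryote and eukaryote below is my own guess (BCT covers bacteria and archaea; phages, viruses, synthetic, unassigned and environmental samples are treated as neither), so check it against your copy of division.dmp.

```python
# Assumption: these code sets are a guess at a sensible grouping, not an NCBI rule.
PROKARYOTE_CODES = {"BCT"}
EUKARYOTE_CODES = {"INV", "MAM", "PLN", "PRI", "ROD", "VRT"}


def parse_dmp_line(line):
    """Split one taxdump line into stripped fields (columns are '|'-delimited)."""
    return [field.strip() for field in line.split("|")]


def load_division_codes(division_lines):
    """Map division id -> three-letter code from division.dmp lines."""
    codes = {}
    for line in division_lines:
        fields = parse_dmp_line(line)
        if len(fields) >= 2 and fields[0].isdigit():
            codes[int(fields[0])] = fields[1]
    return codes


def load_taxid_divisions(node_lines, wanted_taxids):
    """Map tax_id -> division id from nodes.dmp lines, for the wanted tax_ids only."""
    divisions = {}
    for line in node_lines:
        fields = parse_dmp_line(line)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip blank or malformed lines
        tax_id = int(fields[0])
        if tax_id in wanted_taxids:
            divisions[tax_id] = int(fields[4])
    return divisions


def classify(tax_id, taxid_divisions, division_codes):
    """Return 'prokaryote', 'eukaryote', or 'other' for a tax_id."""
    code = division_codes.get(taxid_divisions.get(tax_id))
    if code in PROKARYOTE_CODES:
        return "prokaryote"
    if code in EUKARYOTE_CODES:
        return "eukaryote"
    return "other"  # viruses, phages, unknown tax_ids, and so on
```

Counting the two classes is then just a matter of calling classify on each tax_id from the blast hits and tallying the results.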

It's not a very BioPerly solution but sometimes just looking up the answer from a file/table/hash is the simplest way. 

Hope this helps,

Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay  Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809   
F  +64 3 489 9174  
www.agresearch.co.nz 





> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of shalabh sharma
> Sent: Wednesday, 1 April 2009 7:43 a.m.
> To: bioperl-l
> Subject: [Bioperl-l] taxonomy ID
> 
> Hi All,
> I am writing a script, and for one part of it I have to parse a blast
> report (refseq blast) and check how many organisms are eukaryotes and how
> many of them are prokaryotes.
> I am using the Bio::DB::Taxonomy module:
> http://www.bioperl.org/wiki/Module:Bio::DB::Taxonomy
> 
> But for this I need a taxonomy ID (like '33090'), as given in the example.
> So is it possible to get a taxonomy ID from a refseq blast report?
> If not, how can I deal with this problem?
> 
> I would really appreciate it if anyone can help me out.
> 
> Thanks
> Shalabh