[Biopython-dev] taxonomic labels

Peter biopython at maubp.freeserve.co.uk
Mon Oct 6 17:10:52 UTC 2008


On Mon, Oct 6, 2008 at 5:35 PM, Zac Brown <zac at zacbrown.org> wrote:
> Hi all,
>
> Just a quick question with regard to using the Entrez module. I am looking
> for a way to get a dictionary for an organism's taxonomy, that is something
> like:
>
> blah = {'domain':'xyz','family':'xyz','class':'xyz'...} and so on. Is there
> some uniform way to generate this type of information?
>
> Thanks,
>
> Zac

This isn't really a question for the dev mailing list; the general
discussion list would be a better place for it.  Anyway, have you
looked at the lineage entries in the NCBI taxonomy records?

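from Bio import Entrez
ncbi_taxon_id = "9606"  # NCBI taxonomy ID for Homo sapiens
handle = Entrez.efetch(db="taxonomy", id=ncbi_taxon_id, retmode="XML")
records = Entrez.read(handle)
handle.close()
assert len(records) == 1
# "LineageEx" is a list of entries, each with a rank and a scientific name
lineage = records[0]["LineageEx"]
print(lineage)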

This should contain the information you want, although a number of the
entries have the rank "no rank".  To turn it into a dictionary as
requested, skipping those entries, try something like the following
(on Python 2.4 or later):

answer = dict((x["Rank"], x["ScientificName"]) for x in lineage
              if x["Rank"] != "no rank")
print(answer)
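
If you want to try the dictionary step without a network call, here is
a minimal sketch.  The helper name lineage_to_dict and the hand-written
sample list are illustrative assumptions, not part of Biopython; the
sample only mimics the shape of the "LineageEx" entries (dictionaries
with "Rank" and "ScientificName" keys) and is abbreviated.  Note that
if a rank appeared more than once, the later entry would overwrite the
earlier one in this sketch.

```python
def lineage_to_dict(lineage, skip=("no rank",)):
    """Map each taxonomic rank to its scientific name (hypothetical helper)."""
    ranks = {}
    for entry in lineage:
        rank = entry["Rank"]
        if rank in skip:
            continue  # unranked nodes give no useful dictionary key
        ranks[rank] = entry["ScientificName"]  # a repeated rank overwrites
    return ranks


# Hand-written, abbreviated sample in the shape of records[0]["LineageEx"];
# this is not actual NCBI output.
sample_lineage = [
    {"Rank": "no rank", "ScientificName": "cellular organisms"},
    {"Rank": "kingdom", "ScientificName": "Metazoa"},
    {"Rank": "phylum", "ScientificName": "Chordata"},
    {"Rank": "class", "ScientificName": "Mammalia"},
    {"Rank": "order", "ScientificName": "Primates"},
    {"Rank": "family", "ScientificName": "Hominidae"},
    {"Rank": "genus", "ScientificName": "Homo"},
]

ranks = lineage_to_dict(sample_lineage)
print(ranks["family"])
```

Writing the loop out, rather than using the one-line dict(...) form,
makes it easy to change how repeated or unranked entries are handled.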

Peter


