[Bioperl-l] Bio::*Taxonomy* changes

Sendu Bala bix at sendu.me.uk
Mon Jul 17 21:33:26 UTC 2006


Chris Fields wrote:
> There was some interest in getting Bio::Species to delegate to
> Bio::Taxonomy::Node, so having scientific_name() would help quite a bit
> since the name used on the ORGANISM line is the scientific name (well, is
> supposed to be; famous last words).

Can you clarify exactly what you mean here, preferably with an example?
The ORGANISM line of which file format?
The reason I ask is that I still feel we need to parse the names at
species rank and lower:

# The 'scientific name' for humans could be considered to be 'Homo sapiens'.
# Taxid 9606 in the NCBI taxonomy database has rank 'species' and
  ScientificName 'Homo sapiens'.
# For sanity, Bio::*Taxonomy* likes to interpret this ScientificName as
  'sapiens' so that the genus is not held redundantly. It provides a
  binomial() method to give you 'Homo sapiens' again if you want it.
# I plan on maintaining this; scientific_name() would give you the
  non-redundant sibling-unique name 'sapiens'. binomial() on a species
  rank and lower would give you 'Homo sapiens' (presumably grabbing the
  'Homo' from the parent node with rank 'genus', or similar).
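
To make the proposal above concrete, here is a toy model of the intended
behaviour. It is written in Python purely as an illustration and is not
the Bio::Taxonomy::Node API: the Node class, its constructor and the
ancestor_with_rank() helper are hypothetical names, and raising an error
when no genus ancestor exists is an assumption, not something decided here.

```python
class Node:
    """Toy taxonomy node (hypothetical; not Bio::Taxonomy::Node)."""

    def __init__(self, name, rank, parent=None):
        self.name = name      # non-redundant, sibling-unique name
        self.rank = rank      # e.g. 'genus', 'species', 'subspecies'
        self.parent = parent  # parent Node, or None at the root

    def scientific_name(self):
        # Returns only this node's own name, e.g. 'sapiens' rather
        # than the NCBI ScientificName 'Homo sapiens'.
        return self.name

    def ancestor_with_rank(self, rank):
        # Walk up the lineage until a node of the requested rank is found.
        node = self.parent
        while node is not None:
            if node.rank == rank:
                return node
            node = node.parent
        return None

    def binomial(self):
        # For species rank and lower: find the species node (this node
        # or an ancestor), then take the genus name from its lineage.
        if self.rank == "species":
            species = self
        else:
            species = self.ancestor_with_rank("species")
        if species is None:
            raise ValueError("binomial() needs a node at species rank or lower")
        genus = species.ancestor_with_rank("genus")
        if genus is None:
            # Assumption: fail loudly instead of guessing a genus.
            raise ValueError("no ancestor with rank 'genus'")
        return f"{genus.name} {species.name}"


homo = Node("Homo", "genus")
sapiens = Node("sapiens", "species", parent=homo)

print(sapiens.scientific_name())  # sapiens
print(sapiens.binomial())         # Homo sapiens
```

The point of the sketch is only that the genus is stored once, on the
genus node, and is recovered by walking up the lineage when the full
binomial is wanted.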

Good, bad or ugly? I would prefer that it work like this and that we
agree to differ with NCBI on what the 'scientific name' of a species node
should be. Bio::Species can still delegate to Bio::Taxonomy::Node by
calling binomial() (which I propose will actually give the correct
answer, even for bacteria and viruses).

Perhaps the short-hand (and the classifier used in name()) shouldn't
mention the word 'scientific', to avoid confusion? But a) what else would
we call it, and b) for all ranks above species it /is/ the scientific name.


