[Bioperl-l] Bio::*Taxonomy* changes
Sendu Bala
bix at sendu.me.uk
Tue Jul 18 07:27:49 UTC 2006
Hilmar Lapp wrote:
> I don't think we should differ from NCBI in places where the
> connection between a method name and the NCBI data file is obvious;
> otherwise we will confuse people and lead them into traps.
>
> $node->scientific_name() should simply report what NCBI reports. For
> simple species this will be identical to what $node->binomial()
> returns, but for others it may not, e.g., strains, varieties, etc., or
> the weird world of viruses and bacteria.
OK, this certainly seems to be the consensus, so I'll abide by it.
> This will also absolve us from retaining the business logic for how
> to construct the scientific name from genus, species, and possibly
> strain or whatever.
What about the existing genus(), species(), sub_species() and variant()
methods? There would be no need for any logic to join things together,
but I would still like to be able to get just 'sapiens' from somewhere.
Can I use species() for that purpose (even though, strictly speaking,
the species name is 'Homo sapiens' and 'sapiens' is only the specific
epithet)? Likewise, sub_species() and variant() could hold the
remaining non-redundant names. Or should all of these be deprecated
because they don't really have a place in a generic Node class?
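
To make the 'sapiens' case concrete, here is a minimal sketch of how the
epithet could be derived from the stored scientific name instead of being
kept separately. The function name specific_epithet() is hypothetical and
not part of BioPerl, and the sketch assumes a species-rank name is a plain
two-word binomial.

```perl
use strict;
use warnings;

# Hypothetical helper, not an existing BioPerl method: return the specific
# epithet (e.g. 'sapiens') for a species-rank scientific name. Returns
# nothing (undef in scalar context) for other ranks or for names that are
# not a plain two-word binomial.
sub specific_epithet {
    my ($scientific_name, $rank) = @_;
    return unless defined $rank && $rank eq 'species';

    my @words = split ' ', $scientific_name;
    return unless @words == 2;
    return $words[1];
}

print specific_epithet('Homo sapiens', 'species'), "\n";   # prints "sapiens"
```

Names that do not fit the two-word pattern simply yield nothing here, so
the caller falls back to scientific_name().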
What about node_name()? Should it be yet another synonym of
scientific_name()? (Right now it returns the common name(s).) Ugh.
What should I do with the classification array? Should it hold the raw
ScientificName, like:
    join(', ', $node->classification) eq 'Homo sapiens, Homo, Homo/Pan/Gorilla group [...]'
Or should it be like:
    join(', ', $node->classification) eq 'sapiens, Homo, Homo/Pan/Gorilla group [...]'
The latter is how it currently works (when it works correctly); I would
rather fix it than lose the logic completely, but if we're staying true
to proper classification (vs. what a programmer might expect), I guess I
must use the raw ScientificName?
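
The two options can be put side by side in a small sketch. The lineage
data, its rank labels and both function names are illustrative
assumptions, not the current Bio::Taxonomy code; only the three names
come from the example above.

```perl
use strict;
use warnings;

# Illustrative lineage, ordered from the species upwards. The rank labels
# are assumptions made for this sketch.
my @lineage = (
    { rank => 'species', name => 'Homo sapiens' },
    { rank => 'genus',   name => 'Homo' },
    { rank => 'no rank', name => 'Homo/Pan/Gorilla group' },
);

# Option 1: report each node's ScientificName unchanged.
sub raw_classification {
    my @nodes = @_;
    return map { $_->{name} } @nodes;
}

# Option 2: reduce a two-word species-rank name to its epithet and leave
# every other name unchanged.
sub short_classification {
    my @nodes = @_;
    return map {
        my @words = split ' ', $_->{name};
        ($_->{rank} eq 'species' && @words == 2) ? $words[1] : $_->{name};
    } @nodes;
}

print join(', ', raw_classification(@lineage)), "\n";
# prints "Homo sapiens, Homo, Homo/Pan/Gorilla group"
print join(', ', short_classification(@lineage)), "\n";
# prints "sapiens, Homo, Homo/Pan/Gorilla group"
```

Option 2 needs the rank of each node to decide what to shorten, which is
exactly the logic that option 1 would let us drop.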
> binomial() isn't part of the NCBI taxonomy definition, so you have
> freedom there to report what suits you.
I don't think binomial() would serve any useful purpose now, however. I
can either deprecate it, make it a synonym of scientific_name(), or
both. Alternatively, binomial() could be a version of scientific_name()
that complains if you use it on a rank higher or lower than species. As
for species() et al., they may have no place in a generic Node class.
Thoughts?
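
For the "complaining" variant of binomial(), something along these lines
is what I have in mind. Sketch::Node is a hypothetical stand-in class
written only for this example, not the real Node implementation, and I am
assuming that a warning (rather than an exception) is the right way to
complain.

```perl
package Sketch::Node;   # hypothetical class, for illustration only
use strict;
use warnings;
use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    my $self = {
        rank            => $args{rank},
        scientific_name => $args{scientific_name},
    };
    return bless $self, $class;
}

sub rank            { return $_[0]->{rank} }
sub scientific_name { return $_[0]->{scientific_name} }

# Same value as scientific_name(), but warn when the node is not a species.
sub binomial {
    my ($self) = @_;
    carp "binomial() called on a node of rank '" . $self->rank . "', not 'species'"
        unless $self->rank eq 'species';
    return $self->scientific_name;
}

package main;

my $species = Sketch::Node->new(rank => 'species', scientific_name => 'Homo sapiens');
my $genus   = Sketch::Node->new(rank => 'genus',   scientific_name => 'Homo');

print $species->binomial(), "\n";   # prints "Homo sapiens" with no warning
print $genus->binomial(), "\n";     # prints "Homo" after a warning on STDERR
```

Using carp reports the warning from the caller's point of view, so the
message points at the code that asked for a binomial on the wrong rank.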