[Bioperl-l] Bio::Taxonomy confusion
Sendu Bala
sb at mrc-dunn.cam.ac.uk
Wed May 10 09:30:59 UTC 2006
Hi,
I'm a little confused as to how names are supposed to work in
Bio::Taxonomy::Node.
In the bioperl versions that I've looked at a Node doesn't seem to store
the most important information about itself - its scientific name - in
an obvious place. bioperl 1.5.1 puts it at the start of the
classification list. I'd have thought sticking it in -name would make
more sense, but this is used only for the GenBank common name.
The Bio::Taxonomy docs still suggest:
    my $node_species_sapiens = Bio::Taxonomy::Node->new(
        -object_id => 9606,  # or -ncbi_taxid. Required tag
        -names     => {
            'scientific'  => ['sapiens'],
            'common_name' => ['human']
        },
        -rank      => 'species'  # Required tag
    );
and whilst Bio::Taxonomy::Node does not accept -names, it does have a
'name' method which claims to work like:
$obj->name('scientific', 'sapiens');
This kind of thing would be really nice, but afaics
Bio::Taxonomy::Node->new takes the -name value and makes a common name
out of it, whilst the name() method passes any 'scientific' name to the
scientific_name() method which is unable to set any value (and warns
about this), only get.
It seems like the need to have this classification array work the same
way as Bio::Species is causing some unnecessary restrictions. Can't the
more sensible idea of having a dedicated storage spot for the
ScientificName and other parameters be used, with the classification
array either being generated just-in-time from the hash-stored data, or
indeed being generated from the Lineage field?
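To make the suggestion concrete, here is a minimal sketch of a node with a dedicated, hash-based store for its names, where name() can both set and get. The package Sketch::NamedNode is hypothetical and is not the real Bio::Taxonomy::Node. It only assumes the -names and -rank arguments shown in the docs excerpt above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a taxonomy node; NOT the real Bio::Taxonomy::Node.
package Sketch::NamedNode;

sub new {
    my ($class, %args) = @_;
    my $self = bless { names => {}, rank => $args{-rank} }, $class;
    # Accept -names => { type => [values] }, as in the docs excerpt.
    for my $type (keys %{ $args{-names} || {} }) {
        $self->name($type, @{ $args{-names}{$type} });
    }
    return $self;
}

# name($type) gets the stored values; name($type, @values) replaces them.
sub name {
    my ($self, $type, @values) = @_;
    $self->{names}{$type} = [@values] if @values;
    return @{ $self->{names}{$type} || [] };
}

# Convenience accessor: the first 'scientific' name, settable as well as gettable.
sub scientific_name {
    my $self = shift;
    my ($name) = $self->name('scientific', @_);
    return $name;
}

package main;

my $node = Sketch::NamedNode->new(
    -names => { scientific => ['sapiens'], common_name => ['human'] },
    -rank  => 'species',
);
$node->scientific_name('Homo sapiens');   # setting works, no warning
print $node->scientific_name, "\n";
```

With the names held in one place, a classification array would not need to double as name storage and could be derived on request.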
Also, why does a node store the complete hierarchy on itself in the
classification array? If we're going that far, why don't the
Bio::DB::Taxonomy modules like Bio::DB::Taxonomy::entrez just have a
get_taxonomy() method instead of a get_Taxonomy_Node() method?
get_taxonomy() could, from a single efetch.fcgi lookup, create a
complete Bio::Taxonomy with all the nodes. Whilst most nodes would only
have a minimum of information, if you could simply ask a node what its
rank and scientific name were, you could easily build a classification
array, or ask what Kingdom your species was in etc.
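As a rough sketch of that idea, assume a taxonomy where each node stores only its rank, its scientific name and a link to its parent. The %nodes hash and both subs below are hypothetical and use no BioPerl code. The lineage is a hand-picked, abbreviated subset used purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical minimal taxonomy, keyed by scientific name. Each node knows
# only its rank and its parent. Intermediate ranks are deliberately omitted.
my %nodes = (
    'Homo sapiens' => { rank => 'species', parent => 'Homo' },
    'Homo'         => { rank => 'genus',   parent => 'Hominidae' },
    'Hominidae'    => { rank => 'family',  parent => 'Mammalia' },
    'Mammalia'     => { rank => 'class',   parent => 'Metazoa' },
    'Metazoa'      => { rank => 'kingdom', parent => undef },
);

# Build the classification just in time by walking parent links,
# most specific name first.
sub classification {
    my ($name) = @_;
    my @lineage;
    while (defined $name && exists $nodes{$name}) {
        push @lineage, $name;
        $name = $nodes{$name}{parent};
    }
    return @lineage;
}

# Return the name of the first ancestor (or the node itself) with the
# given rank, or nothing if there is no such node in the lineage.
sub ancestor_of_rank {
    my ($name, $rank) = @_;
    for my $n (classification($name)) {
        return $n if $nodes{$n}{rank} eq $rank;
    }
    return;
}

print join(', ', classification('Homo sapiens')), "\n";
print ancestor_of_rank('Homo sapiens', 'kingdom'), "\n";
```

Nothing is stored twice here: the classification array and the "what Kingdom?" answer both come from the same parent links.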
Are there good reasons for Taxonomy working the way it does in 1.5.1, or
would I not be wasting my time re-writing things to make more sense (to me)?
Cheers,
Sendu.