[Bioperl-l] Bio::*Taxonomy* changes

Sendu Bala bix at sendu.me.uk
Wed Jul 26 12:49:05 UTC 2006


Chris Fields wrote:
> We're giving you the freedom to do what you want to Bio::Taxonomy.

I don't want to do anything with Bio::Taxonomy any more. I've already 
shown that it isn't suitable for the job. Regardless of how it is 
implemented, the entire idea of a class that contains Nodes isn't 
appropriate, for reasons already stated.


> Realize that the only contentious issue here is 
> that horrible lineage line in the GenBank file.  We should have a way to 
> rebuild it as it was from the original file (i.e. not rebuild it from 
> scratch with DB lookups by default).  However, you should also have the 
> option to rebuild it from lookups (i.e. correctly), which you could do 
> with a Taxonomy.  

And I've already shown how rebuilding with a Taxonomy is very far from 
ideal, while switching db_handle on a Node would be perfect. Why are you 
now advocating Taxonomy when there is no reason to?
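
To make "switching db_handle on a Node" concrete, here is a rough 
sketch of the idea. It is written in Python purely for illustration and 
is not BioPerl code: the class names, get_parent_name() and the toy 
lineages are all assumptions made up for this example.

```python
class DictTaxonomyDB:
    """Toy database: maps a taxon name to its parent's name (root has no entry)."""

    def __init__(self, parent_of):
        self.parent_of = parent_of

    def get_parent_name(self, name):
        # Hypothetical lookup method; returns None once the root is reached.
        return self.parent_of.get(name)


class Node:
    """A taxon that resolves its ancestors through whatever db_handle it holds."""

    def __init__(self, name, db_handle):
        self.name = name
        self.db_handle = db_handle

    def lineage(self):
        """Return names from the root down to this node, using the current db_handle."""
        names = [self.name]
        parent = self.db_handle.get_parent_name(self.name)
        while parent is not None:
            names.append(parent)
            parent = self.db_handle.get_parent_name(parent)
        return list(reversed(names))


# Lineage exactly as written in a (made-up) flat-file record.
as_in_file = DictTaxonomyDB({"Homo sapiens": "Homo", "Homo": "Eukaryota"})

# Fuller lineage from a (made-up) curated taxonomy database.
curated = DictTaxonomyDB({
    "Homo sapiens": "Homo",
    "Homo": "Hominidae",
    "Hominidae": "Primates",
    "Primates": "Eukaryota",
})

node = Node("Homo sapiens", as_in_file)
file_lineage = node.lineage()   # reproduces the lineage as the file had it
node.db_handle = curated        # switch the source; the node itself is unchanged
db_lineage = node.lineage()     # the same call is now answered by lookups
```

The same object gives the as-parsed lineage by default and the looked-up 
one after a single assignment, with no separate container class involved.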


> Note this Bio::Taxonomy method:
> 
>        classify
> 
>         Title   : classify
>         Usage   : @obj[][0-1] = taxonomy->classify($species);
>         Function: return a ranked classification
>         Returns : @obj of taxa and ranks as word pairs separated by "@"
>         Args    : Bio::Species object

Note that all this method does is combine a list of rank names with the 
classification array in a Bio::Species, spitting out an odd data 
structure. It is only of interest to Bio::Taxonomy::Tree.

We're in a situation where we don't know the rank names corresponding 
to the classification array in a Bio::Species generated by GenBank et 
al. So classify() is of zero value here.
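
For illustration, the pairing described above amounts to roughly the 
following. This is a Python sketch going by the method's documentation, 
not the actual Bio::Taxonomy implementation; the function signature, 
the order within each pair and the example data are assumptions.

```python
def classify(rank_names, classification):
    """Pair each taxon name with a rank name, joined by "@"."""
    if rank_names is None:
        # The situation with parsed records: there are no rank names to pair with.
        raise ValueError("rank names are unknown, so nothing can be paired")
    if len(rank_names) != len(classification):
        raise ValueError("rank list and classification differ in length")
    return [f"{taxon}@{rank}" for taxon, rank in zip(classification, rank_names)]


ranks = ["species", "genus", "family"]              # supplied by the caller
classification = ["sapiens", "Homo", "Hominidae"]   # made-up example data
pairs = classify(ranks, classification)
```

Without the list of rank names there is nothing for the method to 
combine, which is the case for a classification array that came from a 
parsed record.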


> As Bio::Species will be deprecated, you can use that method in a dual, 
> sneaky way: 1) directly store the lineage information,

No. Lineage information must be in the form of Nodes or you can't answer 
lineage-related taxonomic questions.
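
As an example of the kind of question I mean, here is a minimal Python 
sketch of nodes linked to their parents. It is not BioPerl code; the 
Node class, its fields and the example taxa are hypothetical.

```python
class Node:
    """A taxon that knows its rank and its parent node."""

    def __init__(self, name, rank=None, parent=None):
        self.name = name
        self.rank = rank
        self.parent = parent

    def ancestors(self):
        """Return this node followed by each ancestor up to the root."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def lowest_common_ancestor(a, b):
    """Return the deepest node shared by both lineages, or None if there is none."""
    seen = {id(node) for node in a.ancestors()}
    for node in b.ancestors():
        if id(node) in seen:
            return node
    return None


# Made-up example tree.
primates = Node("Primates", rank="order")
hominidae = Node("Hominidae", rank="family", parent=primates)
homo = Node("Homo", rank="genus", parent=hominidae)
pan = Node("Pan", rank="genus", parent=hominidae)

lca = lowest_common_ancestor(homo, pan)
```

Here lca is the Hominidae node itself, so its rank and its own parent 
are available for follow-up questions, which a stored list of names 
would not give you.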


> 2) return the real one (DB lookups) if needed

Messy. Doing it with Node would be far superior.


Again, Node works all the time, while Taxonomy would work badly or not 
at all some of the time. Rather than suggest ways of using Taxonomy, 
tell me what is wrong with my current Node plan.


