[Bioperl-l] Bio::*Taxonomy* changes
Sendu Bala
bix at sendu.me.uk
Mon Jul 24 19:45:09 UTC 2006
Chris Fields wrote:
>> Hilmar Lapp wrote:
>>> Sounds good to me, except there is no Bio::TaxonomyI yet,
>> Indeed, I propose making one.
>
> So, Node would implement this, correct? Naming it Bio::TaxonomyI makes me
> think that Bio::Taxonomy implements TaxonomyI, not that Bio::Taxonomy::Node
> implements it.
No no, I guess the whole rest of your reply was confused by this one
point. Bio::TaxonomyI would be the interface for Bio::Taxonomy.
Definitely not a Node.
>> Yes, which is why Bio::Taxonomy is appropriate here. Assuming that
>> Bio::Species isa Bio::TaxonomyI:
>>
>> ...
>> SOURCE      Saccharomyces cerevisiae (baker's yeast)
>>   ORGANISM  Saccharomyces cerevisiae
>>             Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
>>             Saccharomycetales; Saccharomycetaceae; Saccharomyces.
>>
>> ...
>>
>> ## the fully-manual way
>> my $species = new Bio::Species;
>> my $node = new Bio::Taxonomy::Node(-name => 'Saccharomyces cerevisiae',
>>                                    -rank => 'species', -object_id => 1,
>>                                    -parent_id => 2);
>> my $n2 = new Bio::Taxonomy::Node(-name => 'Saccharomyces',
>>                                  -object_id => 2, -parent_id => 3);
>> # (no assumption that 'Saccharomyces' is the genus, so rank() undefined)
>> my $n3 = [etc]
>> $species->add_node($node);
>> $species->add_node($n2);
>> [etc]
>
>
> Hrmm... why would you add multiple nodes to a species object? A Species
> is-a Node, not a full Bio::Taxonomy.
In my proposal, a Bio::Species certainly is a full Bio::Taxonomy.
>> Bio::Species differs from Bio::Taxonomy only in that it contains all the
>> legacy method names that Bio::Species currently has, for backward
>> compatibility. Setting $species->classification() would delete all nodes
>> of self, use a GenbankFactory to make a new Bio::Species, then pull out
>> all its Nodes and add them to self.
>
> The idea is to replace Bio::Species with something that works well, so
> having it implement a Node-like interface works since it is-a Node. Having
> it implement a Taxonomy-like interface, though, doesn't make a lot of sense
> as a species is-not-a Taxonomy.
Right. So this is why we've been 'butting heads'. Up till now I had no
idea why you were so adamant about keeping things the old
Bio::Taxonomy::Node way.
Bio::Species very definitely has never been, nor do we want it to
become, a single node of a taxonomy. It has always been a complete
taxonomy. You can tell that by the fact it has a classification, and you
could ask what its genus is.
This is why I'm proposing that Bio::Species become a Bio::Taxonomy.
Because that's the correct object model for the kinds of things
Bio::Species wants to do.
> Using a factory in Bio::DB::Taxonomy should solve any issues about what
> object type is returned, since that could simply be made based on the rank
> itself (species rank or below == Bio::Taxonomy::Species, genus and above ==
> Bio::Taxonomy::Node).
Frankly, that idea makes me ill. A Node, at the fundamental level, is
just a very simple object that needs to associate a taxonomic rank with
a scientific name. If you start making different objects for different
ranks, you've departed from any semblance of meaning in the object model.
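To make the point concrete, here is a minimal sketch of a node whose rank is
plain data rather than something encoded in its class. Sketch::RankedName is
a hypothetical stand-in invented for this illustration; it is not
Bio::Taxonomy::Node and makes no claim about that module's actual interface.

```perl
use strict;
use warnings;

# Hypothetical illustration only; not Bio::Taxonomy::Node.
package Sketch::RankedName;

sub new {
    my ($class, %args) = @_;
    # A node only associates a scientific name with an optional rank.
    my $self = { name => $args{name}, rank => $args{rank} };
    return bless $self, $class;
}
sub name { return $_[0]{name} }
sub rank { return $_[0]{rank} }    # undef when the rank is unknown

package main;

my @nodes = (
    Sketch::RankedName->new(name => 'Saccharomyces cerevisiae', rank => 'species'),
    Sketch::RankedName->new(name => 'Saccharomyces'),    # rank not assumed
);

# Rank is queried as data; no per-rank subclass is needed.
my @species_nodes = grep { defined $_->rank && $_->rank eq 'species' } @nodes;

# Count the distinct classes in use across all nodes.
my %classes;
$classes{ ref $_ }++ for @nodes;

print scalar(@species_nodes), " species node(s), ",
      scalar(keys %classes), " class(es)\n";
```

Every node here has the same class whatever its rank, so code that handles
nodes never has to branch on object type.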
> Nope. Don't agree. Sorry. I can't see why you would force a Species to be
> a Taxonomy when it isn't. The object hierarchy doesn't make sense to me.
Does it make sense now?
> I'll repeat: a Node and a Species is-not-a Taxonomy.
I'll repeat: A Node is a Node and a Bio::Species is a Taxonomy ;)
> A Taxonomy object has-a Node or Species or combinations thereof ;
No, a Taxonomy contains Nodes. One of those Nodes might have a rank() of
'species'.
A Bio::Species contains Nodes. One of those Nodes definitely has a
rank() of 'species'. It /must/ have other nodes, because the job of
Bio::Species has been, and will continue to be, to store all the
other taxonomic levels in a Genbank file. For the same reason
Bio::Species can't be a Node itself, because you can't store other Nodes
inside a Node.
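A rough sketch of the containment model argued for above, under stated
assumptions: the Sketch::* packages, lineage_from() and node_with_rank() are
hypothetical names made up for this illustration, not the existing or
proposed BioPerl API. It only shows a container that holds nodes linked by
parent ids, and a species class that is that container plus a legacy-style
classification() method.

```perl
use strict;
use warnings;

# Hypothetical stand-ins, not the BioPerl classes under discussion.
package Sketch::Node;
sub new {
    my ($class, %args) = @_;
    # A node knows only its own name, optional rank, id and parent id.
    my $self = { %args };
    return bless $self, $class;
}
sub name      { return $_[0]{name} }
sub rank      { return $_[0]{rank} }
sub id        { return $_[0]{id} }
sub parent_id { return $_[0]{parent_id} }

package Sketch::Taxonomy;
sub new {
    my ($class) = @_;
    my $self = { nodes => {} };    # nodes are stored by id
    return bless $self, $class;
}
sub add_node {
    my ($self, $node) = @_;
    $self->{nodes}{ $node->id } = $node;
    return $self;
}
sub node_with_rank {
    my ($self, $rank) = @_;
    for my $node (values %{ $self->{nodes} }) {
        return $node if defined $node->rank && $node->rank eq $rank;
    }
    return;
}
# Walk parent ids upward, starting at the node with the given rank.
sub lineage_from {
    my ($self, $rank) = @_;
    my $node = $self->node_with_rank($rank) or return;
    my @names;
    while ($node) {
        push @names, $node->name;
        my $pid = $node->parent_id;
        $node = defined $pid ? $self->{nodes}{$pid} : undef;
    }
    return @names;
}

package Sketch::Species;
our @ISA = ('Sketch::Taxonomy');
# Legacy-style accessor: most specific name first.
sub classification { return $_[0]->lineage_from('species') }

package main;

my $species = Sketch::Species->new;
$species->add_node(Sketch::Node->new(
    name => 'Saccharomyces cerevisiae', rank => 'species', id => 1, parent_id => 2));
$species->add_node(Sketch::Node->new(
    name => 'Saccharomyces', id => 2, parent_id => 3));
$species->add_node(Sketch::Node->new(
    name => 'Saccharomycetaceae', id => 3));

my @classification = $species->classification;
print join('; ', @classification), "\n";
```

The nodes never contain other nodes; only the container does, which is the
distinction between the two roles.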