[Bioperl-l] Bio::*Taxonomy* changes

Chris Fields cjfields at uiuc.edu
Tue Jul 25 12:28:01 UTC 2006


Look, you explaining this to me, as you see it, does not convince me
that it's the correct or right way to do it.  Okay?  Can we agree on
that?  I do not think that Species and Taxonomy are the same thing.
A species should not hold more than one node.  A species, by
definition, is a rank in Taxonomy, and is a node, not a full
Taxonomy, so Bio::Species should be a Node, not a Taxonomy.  I don't
see how I can be any clearer...

The fact that it may work is beside the point.  That's like putting
duct tape on a leak to me.  Why not just simplify Bio::Species into a
Node?  Or make it into a Node and get rid of it altogether.
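
For illustration only, a minimal sketch of what "Species as a Node" could
look like.  The My::* package names are made up for this example and are
not BioPerl classes; only the constructor argument names (-name, -rank,
-object_id, -parent_id) are taken from the code quoted below.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a taxonomy node; NOT the real Bio::Taxonomy::Node.
package My::Taxonomy::Node;

sub new {
    my ($class, %args) = @_;
    my $self = {
        name      => $args{-name},
        rank      => $args{-rank},
        object_id => $args{-object_id},
        parent_id => $args{-parent_id},
    };
    return bless $self, $class;
}

# Simple read-only accessors
sub name      { $_[0]{name} }
sub rank      { $_[0]{rank} }
sub object_id { $_[0]{object_id} }
sub parent_id { $_[0]{parent_id} }

# Hypothetical species class: it IS a single node whose rank is fixed to
# 'species'.  It holds no other nodes; the rest of the lineage is reached
# through parent_id.
package My::Species;
our @ISA = ('My::Taxonomy::Node');

sub new {
    my ($class, %args) = @_;
    # The trailing -rank pair overrides any -rank passed by the caller.
    return $class->SUPER::new(%args, -rank => 'species');
}

package main;

my $sp = My::Species->new(-name      => 'Saccharomyces cerevisiae',
                          -object_id => 1,
                          -parent_id => 2);
print $sp->name, ' (', $sp->rank, ")\n";
```

In this sketch the species object never stores the genus or family nodes;
whatever holds the full lineage would be a separate Taxonomy object.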

You are going to do what you want to do, regardless of what I say.   
Seems to be par for the course here.  I'm REALLY tired of arguing the  
point.  Okay?  Just drop it.  I have other priorities in life besides  
goddamned bioperl right now...

Chris

On Jul 25, 2006, at 2:05 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> There is one thing I will make perfectly clear here: there should
>> never, ever be enforced lookups for SeqIO (even using caches), though
>> I have no problem having optional ones.  This is something I have
>> stated before and what you propose below steers dangerously in that
>> direction.  Where, for instance, do you store the lineage from a
>> GenBank file?  Do you want to do a series of Tax lookups to restore
>> that data?  I think that the number one complaint for sequence
>> parsing is speed, which would only get slower with lookups (even
>> cached).
>
> I already gave a code example of exactly how Bio::Taxonomy is perfect
> for storing the lineage data in a GenBank file with or without a
> database lookup. I think perhaps at the time you first read this you
> basically ignored it because you had trouble with the idea of adding
> nodes to a species. If you have been glossing over my argument, it may
> be instructive to go over what I've been saying with a clear eye.
> Anyway, here it is again, and remember in this example, Bio::Species
> isa Bio::Taxonomy:
>
>
> ## the fully-manual way
> my $species = Bio::Species->new();
> my $node = Bio::Taxonomy::Node->new(-name      => 'Saccharomyces cerevisiae',
>                                     -rank      => 'species',
>                                     -object_id => 1,
>                                     -parent_id => 2);
> my $n2 = Bio::Taxonomy::Node->new(-name      => 'Saccharomyces',
>                                   -object_id => 2,
>                                   -parent_id => 3);
> # (no assumption that 'Saccharomyces' is the genus, so rank() is undefined)
> my $n3 = [etc]
> $species->add_node($node);
> $species->add_node($n2);
> [etc]
>
> ## Using a factory without db access
> # assume that Bio::Taxonomy::GenbankFactory implements
> # some modified Bio::Taxonomy::FactoryI
> my $factory = Bio::Taxonomy::GenbankFactory->new();
> my $species = $factory->generate(-classification => [
>                   'Saccharomyces cerevisiae', 'Saccharomyces',
>                   'Saccharomycetaceae' ...]);
> # the generate() method above just does the fully-manual way for you
>
> ## Using a factory with db access
> # assume that Bio::Taxonomy::EntrezFactory implements some
> # modified Bio::Taxonomy::FactoryI and uses Bio::DB::Taxonomy::entrez
> # to get the nodes
> my $factory = Bio::Taxonomy::EntrezFactory->new();
> my $species = $factory->fetch(-scientific_name => 'Saccharomyces cerevisiae');
>
>
> So now do you see how we're able to do the Genbank no-db way and the
> db-using way with the same object model? We're able to do it the same,
> sane way because a Node is just a node; you can make them yourself
> manually, or retrieve them from a database. Once you stick them in a
> Taxonomy you can then (potentially) ask all the questions of the data
> that you can with existing Bio::Species. No cruft is required anywhere
> at all. All the Taxonomy classes can be 'pure', while only Bio::Species
> has to have backward-compatibility methods.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
