[Bioperl-l] getting/setting species names with Bio::Species

Chris Fields cjfields at illinois.edu
Fri Jan 15 16:00:21 UTC 2010


> FWIW, I'd prefer "binomial" = "genus" . "species"


That's the way Bio::Species is supposed to work, at least since it was refactored by Sendu.  But just a note: Bio::Species was considered deprecated (scheduled for the 1.7 release IIRC) in favor of Bio::Taxon, for many very good reasons.  First and foremost among these is the fact that we cannot consistently parse out the genus/species/strain/variant/etc for every organism in GenBank w/o knowing its full lineage, which means including some taxonomic information.  And even then it's highly problematic.
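
To make the parsing problem concrete, here is a minimal sketch in plain Perl (no BioPerl involved). The helper naive_split() is hypothetical and written only for illustration; it is not how Bio::Species parses names. It assumes the first whitespace-separated token is the genus and the second is the species, which holds for "Homo sapiens" but not for many other organism names:

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only: assume the first token of an
# organism name is the genus and the second is the species epithet.
sub naive_split {
    my ($name) = @_;
    my ($genus, $species) = split ' ', $name;
    return ($genus, $species);
}

# The first name follows the binomial convention; the others do not.
my @names = (
    'Homo sapiens',
    'Influenza A virus',
    'uncultured bacterium',
    'Escherichia coli str. K-12 substr. MG1655',
);

for my $name (@names) {
    my ($genus, $species) = naive_split($name);
    printf "%-45s genus=%s species=%s\n", $name, $genus, $species;
}
```

With this guess, "Influenza A virus" ends up with genus "Influenza" and species "A", "uncultured bacterium" gets a genus of "uncultured", and the strain information for E. coli is silently dropped.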

We've had several heated discussions on list about how to handle this in a somewhat backwards-compatible way, and the main solution was to forgo compatibility altogether and eventually deprecate Bio::Species in favor of Bio::Taxon, a class that doesn't make the same assumptions.  Bio::Species, in the interim, is-a Bio::Taxon.  You'll note that a minimal Bio::DB::Taxonomy instance is constructed from the classification scheme in some instances, but with a proper DB link one could connect to Entrez Taxonomy or a local indexed flat-file DB and grab the info.  Bio::Taxon (correct me if I'm wrong on this Sendu, if you're out there) eschews various methods (species, etc) for simpler, consistent ones based on Taxonomy, and doesn't force us to handle every exception to getting the genus/species out of a name.  That is left up to the user, at their peril.

For either one, if you are reproducing the fully qualified name, you probably should use something like node_name() for consistency.  Bio::Species also has scientific_name().  With a true Bio::Taxon one would need to check that this is performed on the species node.
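
As a rough sketch of that suggestion, reusing the 'hoxa2.gb' file from Dave's example below (it assumes BioPerl is installed and that the file is in the current directory; what gets printed depends on the record, so no output is shown):

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Read the first record from the GenBank file used in Dave's example.
my $io = Bio::SeqIO->new(-format => 'genbank',
                         -file   => 'hoxa2.gb');
my $seq_obj = $io->next_seq;

# The species accessor on the sequence returns a Bio::Species object,
# which is-a Bio::Taxon and therefore also has node_name().
my $species_obj = $seq_obj->species;

# Ask for the full name rather than the 'species' element alone.
print $species_obj->node_name, "\n";
print $species_obj->scientific_name, "\n";
```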

chris

On Jan 15, 2010, at 9:31 AM, Mark A. Jensen wrote:

> I'm not that familiar with Bio::Species either, but this looks
> like conflicting semantics between Bio::Species and Bio::SeqIO.
> Bio::SeqIO sets the species accessor to the 'species' element of
> the lineage array, I believe.
> FWIW, I'd prefer "binomial" = "genus" . "species"
> MAJ
> ----- Original Message ----- From: "Dave Messina" <David.Messina at sbc.su.se>
> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Friday, January 15, 2010 10:17 AM
> Subject: [Bioperl-l] getting/setting species names with Bio::Species
> 
> 
>> Hi everybody,
>> 
>> I'm having a little trouble with names in Bio::Species objects.
>> 
>> According to the Bio::Species documentation, if I have a species name as a string, like "Homo sapiens", I can get and set that using the species method:
>> 
>> my $my_species_obj = Bio::Species->new();
>> $my_species_obj->species('Homo sapiens');
>> 
>> print $my_species_obj->species;     # 'Homo sapiens'
>> 
>> 
>> That works fine if I create the Bio::Species object myself.
>> 
>> But if I try to get that string back out from a Bio::Species object created by SeqIO from a genbank file, I get just 'sapiens' back:
>> 
>> my $io = Bio::SeqIO->new('-format' => 'genbank',
>>                        '-file'   => 'hoxa2.gb');
>> my $seq_obj = $io->next_seq;
>> my $io_species_obj = $seq_obj->species;
>> 
>> print $io_species_obj->species;     # 'sapiens'
>> 
>> 
>> I think that happens because genbank records have more taxonomic info about the species name, like the genus (and in fact the whole taxonomic categorization: kingdom, phylum, order, etc). So the genus is stored separately.
>> 
>> Poking around a bit more in Bio::Species, I turned up the method 'binomial', which appears to do the right thing, returning genus and species in both cases. Except, as you can see, the space is stripped out for my species-name-is-just-a-string object:
>> 
>> print $my_species_obj->binomial;    # 'Homosapiens'
>> print $io_species_obj->binomial;    # 'Homo sapiens'
>> 
>> 
>> I'm not very familiar with Bio::Species (and its parent Bio::Taxon); am I using it correctly above, or is there a better way?
>> 
>> If not, this kinda looks like a bug to me. I've got a patch which works and passes the BioPerl test suite.
>> 
>> 
>> Thanks,
>> Dave
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l