[Bioperl-l] Bio::DB::Taxonomy:: mishandles species, subspecies/variant names
Nadeem Faruque
faruque at ebi.ac.uk
Mon May 15 19:47:27 UTC 2006
>> My personal view is that having it as an annotation would serve no real
>> purpose. For me the whole point of any kind of species representation in
>> bioperl is to allow you to compare species in a biologically meaningful
>> way. If it's just some annotation then that means it's basically
I understand the need to find the species name of entries, especially
now that so many complete genomes have been given their own
strain-specific tax nodes. I also think it is a shame that the NCBI tax
dump does not give a rank to entries such as these: they cannot easily
be distinguished from unofficial ranks higher in the tree without
ascending the tree.
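As a rough illustration of what "ascending the tree" involves, here is a minimal Python sketch. It assumes input lines shaped like the NCBI taxdump nodes.dmp file (tax ID, parent tax ID and rank as the first three pipe-delimited fields). The tax IDs in the sample are invented for illustration, and the function names are hypothetical, not part of BioPerl or any NCBI tool.

```python
# Sketch only: climb a taxonomy until a node ranked "species" is found.
# SAMPLE_NODES uses made-up tax IDs; real data would come from nodes.dmp.

def parse_nodes(lines):
    """Parse nodes.dmp-style lines into {tax_id: (parent_id, rank)}."""
    nodes = {}
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").rstrip("|").split("|")]
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed lines
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def species_ancestor(tax_id, nodes):
    """Return the nearest node (self included) ranked 'species', or None."""
    seen = set()  # guards against the root, which is its own parent
    while tax_id in nodes and tax_id not in seen:
        seen.add(tax_id)
        parent_id, rank = nodes[tax_id]
        if rank == "species":
            return tax_id
        tax_id = parent_id
    return None


SAMPLE_NODES = [
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tgenus\t|",
    "100\t|\t10\t|\tspecies\t|",
    "1000\t|\t100\t|\tno rank\t|",  # strain-level node without a rank
]

nodes = parse_nodes(SAMPLE_NODES)
strain_species = species_ancestor(1000, nodes)
print(strain_species)
```

The point of the sketch is that a "no rank" node on its own says nothing about whether it sits below or above species level; only the walk towards the root settles it.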
Would it be useful for the species name to be included within EMBL
file headers, e.g. in a line called OB? (OB is a terrible suggestion,
based on 'Organism Binomial', since OS is already in use.)
For example, here are two entries for the species 'Apple stem grooving
virus'. Without an OB line, the second one would appear to be a
different species unless you delve into the tax tree.
AC D14995; S47260;
DE Apple stem grooving virus genome, complete sequence.
OS Apple stem grooving virus
OB Apple stem grooving virus
OC Viruses; ssRNA positive-strand viruses, no DNA stage; Flexiviridae;
OC Capillovirus.
AC AY646511;
DE Citrus tatter leaf virus strain Kumquat 1, complete genome.
OS Citrus tatter leaf virus
OB Apple stem grooving virus
OC Viruses; ssRNA positive-strand viruses, no DNA stage; Flexiviridae;
OC Capillovirus.
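To make the difference concrete, the following Python sketch groups the two example entries first by OS and then by the proposed OB line. OB is only the suggestion made in this message, not an existing EMBL line type. The sketch also assumes that each entry starts at its AC line, and the helper names are hypothetical.

```python
from collections import defaultdict

# The two example headers from this message (OC lines omitted for brevity).
HEADER = """\
AC D14995; S47260;
DE Apple stem grooving virus genome, complete sequence.
OS Apple stem grooving virus
OB Apple stem grooving virus
AC AY646511;
DE Citrus tatter leaf virus strain Kumquat 1, complete genome.
OS Citrus tatter leaf virus
OB Apple stem grooving virus
"""


def parse_entries(text):
    """Return one {line_code: [values]} dict per entry (assumption: AC starts an entry)."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # two-letter code, then the rest of the line
        if len(parts) != 2:
            continue
        code, value = parts
        if code == "AC":
            entries.append(defaultdict(list))
        if entries:
            entries[-1][code].append(value.strip())
    return entries


def group_by(entries, code):
    """Group primary accession numbers by the value of the given line code."""
    groups = defaultdict(list)
    for entry in entries:
        key = " ".join(entry.get(code, []))
        primary_accession = entry["AC"][0].split(";")[0]
        groups[key].append(primary_accession)
    return dict(groups)


entries = parse_entries(HEADER)
by_os = group_by(entries, "OS")  # two groups: the names differ
by_ob = group_by(entries, "OB")  # one group: same species
print(by_os)
print(by_ob)
```

Grouping on OS splits the entries into two apparent species, while grouping on the hypothetical OB line puts both accessions under 'Apple stem grooving virus' without any taxonomy lookup.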
> My point is, a large number of users do NOT use, nor care about, taxonomic
> information to the degree they need to know the entire classification of the
> organism; many are just as happy about getting the scientific name only,
> which is in the GenBank/EMBL file itself. To take one extreme, it is not
> productive to force every user to download the NCBI tax database and use
> lookups just to convert sequences from EMBL format to GenBank format. It's
> not productive to allow users to spam the NCBI tax database remotely either,
> so hardcoding lookups is, IMHO, a big mistake.
I don't think you need to add any information to turn an EMBL-format
file into a GenBank flat file, but maybe I'm missing something obvious.
Nadeem
--
Dr S.M. Nadeem N. Faruque
9 Barley Court
Saffron Walden
Essex CB11 3HG
01799 500 120