[Open-bio-l] Fwd: [BioSQL-l] Taxa

Matthew Pocock matthew_pocock@yahoo.co.uk
Fri, 30 Aug 2002 20:31:18 +0100


Hi Hilmar,

Unique key constraints are cool. I'm all for that. The current taxa 
schema seems a shame to me. I may be out of sync with the way things 
are done now, but in Feb/March there was no taxa tree, so there was no 
way to walk up and down the organism classification hierarchy. That made 
it difficult to ask questions like "find accessions for all species that 
are cousins of rat".

I see that in some cases all you want is a (normalized?) table of full 
organism strings. In other cases you want the full tree (one entity per 
node). Is this a case where BioSQL should just put its hands up and say 
that there is a sequence/feature schema & adapter, a range of possible 
taxa schemas & adapters, and a service (or adapter) that bridges the 
two - getTaxaForSeqID and getSeqIDForTaxa?
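
As a rough illustration of such a bridge (a sketch in Python, not 
existing BioSQL or BioJava code): only the two method names come from 
the suggestion above. The class name TaxonSeqBridge, the link() helper 
and the in-memory storage are assumptions made for the example.

```python
from collections import defaultdict


class TaxonSeqBridge:
    """Hypothetical bridge between a sequence store and a taxon store.

    It only records which sequence IDs go with which taxon IDs, so it
    does not care how either side lays out its own tables.
    """

    def __init__(self):
        self._taxa_by_seq = defaultdict(set)
        self._seqs_by_taxon = defaultdict(set)

    def link(self, seq_id, taxon_id):
        """Register an association (assumed helper, not in the proposal)."""
        self._taxa_by_seq[seq_id].add(taxon_id)
        self._seqs_by_taxon[taxon_id].add(seq_id)

    def get_taxa_for_seq_id(self, seq_id):
        """Counterpart of getTaxaForSeqID: taxon IDs for one sequence."""
        return sorted(self._taxa_by_seq.get(seq_id, ()))

    def get_seq_ids_for_taxa(self, taxon_ids):
        """Counterpart of getSeqIDForTaxa: sequence IDs for some taxa."""
        found = set()
        for taxon_id in taxon_ids:
            found |= self._seqs_by_taxon.get(taxon_id, set())
        return sorted(found)
```

The point of the design is that the taxon side can be a flat table of 
organism strings or a full tree; the bridge only ever sees IDs.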

Matthew

ps Taxa, Taxon - yeah, I got this wrong for the BioJava code. +1 for 
changing plural to singular

pps I have MySQL schemas for taxon support if you are interested but 
it's not rocket science - one table for taxon instance, properties & 
parent pointer, and one table for taxon_bioentry associations
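
Roughly, that two-table layout could look like the sketch below. It 
uses Python with SQLite rather than MySQL so that it runs standalone. 
The table and column names, the heavily simplified toy tree and the 
made-up accessions are all assumptions, not the actual schema. "Cousin" 
is read here as "shares a grandparent node but not a parent node".

```python
import sqlite3

# One table for the taxon node (with parent pointer), one for associations.
SCHEMA = """
CREATE TABLE taxon (
    taxon_id  INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES taxon(taxon_id)  -- NULL for the root
);
CREATE TABLE taxon_bioentry (
    taxon_id  INTEGER NOT NULL REFERENCES taxon(taxon_id),
    accession TEXT NOT NULL,
    PRIMARY KEY (taxon_id, accession)
);
"""

# Walk up two levels from the named taxon, then down two levels through
# the parent's siblings, and pick up the accessions attached there.
COUSIN_ACCESSIONS = """
SELECT cousin.name, tb.accession
FROM taxon AS me
JOIN taxon AS my_parent   ON my_parent.taxon_id = me.parent_id
JOIN taxon AS aunt        ON aunt.parent_id = my_parent.parent_id
                         AND aunt.taxon_id <> my_parent.taxon_id
JOIN taxon AS cousin      ON cousin.parent_id = aunt.taxon_id
JOIN taxon_bioentry AS tb ON tb.taxon_id = cousin.taxon_id
WHERE me.name = ?
ORDER BY cousin.name, tb.accession
"""


def build_demo_db():
    """Create an in-memory database holding a tiny illustrative tree."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO taxon VALUES (?, ?, ?)",
        [
            (1, "Murinae", None),
            (2, "Rattus", 1),
            (3, "Mus", 1),
            (4, "Rattus norvegicus", 2),
            (5, "Rattus rattus", 2),
            (6, "Mus musculus", 3),
        ],
    )
    conn.executemany(
        "INSERT INTO taxon_bioentry VALUES (?, ?)",
        [(4, "ACC_RAT_1"), (5, "ACC_BLACKRAT_1"), (6, "ACC_MOUSE_1")],
    )
    return conn


def cousin_accessions(conn, name):
    """Return (cousin name, accession) rows for the named taxon."""
    return conn.execute(COUSIN_ACCESSIONS, (name,)).fetchall()
```

With the toy data, cousin_accessions(build_demo_db(), "Rattus norvegicus") 
finds the Mus musculus entry and skips Rattus rattus, which is a sibling 
rather than a cousin.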

Hilmar Lapp wrote:
> Hmm. I could take zero response as consent to all changes proposed -- 
> would be great actually. To make double sure those who might have wanted 
> to see this proposal did see it I'm resending and cross-posting it ... 
> Shout now if you're concerned or don't complain later.
> 
> Also, if this is the first time you see this and it is important to you, 
> you're probably not on the biosql mailing list although you should be. 
> Please subscribe (see footer for list info) since this is the last time 
> I'm going to cross-post.
> 
>     -hilmar
> 
> Begin forwarded message:
> 
>> From: "Hilmar Lapp" <hlapp@gnf.org>
>> Date: Wed Aug 28, 2002  02:23:41  PM US/Pacific
>> To: "Biosql" <biosql-l@open-bio.org>
>> Subject: [BioSQL-l] Taxa
>>
>> I propose to make the following changes to tables Taxa and
>> Bioentry_Taxa, ranked by decreasing priority (still, I'd like to
>> make all of these changes).
>>
>> 1) Introduce UK constraints. This makes it much safer (and
>> potentially faster due to amenability to caching) to find records.
>> My suggestions for UK candidates are common_name, full_binomial (see
>> 2.), and ncbi_taxa_id. (There are no UKs yet.)
>>
>> 2) Introduce an additional column full_binomial ('full' in the
>> bioperl Bio::Species sense: with ssp if applicable). I think this is
>> going to be searched against quite frequently, and I just don't
>> think it's very practical to always extract this out of full_lineage.
>>
>> 3) Collapse Taxa and Bioentry_Taxa into one and add Taxa_Id as a
>> nullable FK to Bioentry. Having this additional association table
>> doesn't add any functionality we'd like to use, but instead can only
>> decrease performance and enforceability. Also, it necessitates
>> otherwise pointless code in the adaptors.
>>
>> 4) For consistency, rename Taxa and all Taxa related columns to
>> Taxon (Taxa is plural, all other table names are singular).
>>
>> What do people think? Any outcries? If I make these changes,
>> adaptors in different projects will have to be changed (I'll change
>> the ones involved in Bioperl-db).
>>
>>     -hilmar
>> -- 
>> -------------------------------------------------------------
>> Hilmar Lapp                            email: lapp at gnf.org
>> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>> -------------------------------------------------------------
>>
>>
>> _______________________________________________
>> BioSQL-l mailing list
>> BioSQL-l@open-bio.org
>> http://open-bio.org/mailman/listinfo/biosql-l
>>
> 
> 


-- 
BioJava Consulting LTD - Support and training for BioJava
http://www.biojava.co.uk
