[Bioperl-l] Re: Comparative genomics

Fri, 28 Sep 2001 12:54:22 +0100

> We'll write the db schema, object layer, etc. to store those
> phylogenetic trees in the db.

Is this one tree per gene? How many species (nodes) per tree? If the trees are
biggish, are always loaded from complete flatfiles, and query speed is
essential, do not use the common NODE(ID, PARENT_ID, NAME) type of schema: it
is only efficient for getting the direct parent or children of a node. If
that's not enough (e.g. you want complete subtrees or lineages), use Celko's
nested-set schema, NODE(LFT, RGT, ID, NAME), where every node's descendants
have LFT values between its own LFT and RGT. Or use a hybrid of the two.
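
To make the difference concrete, here is a rough sketch of the hybrid form in
Python with an in-memory SQLite table. Everything in it is an assumption made
for illustration: the toy tree, the lower-case column names, and the helper
names number_nodes, build_db, subtree and lineage are all hypothetical. It
numbers the nodes in one depth-first pass (which is why loading from complete
flatfiles suits this schema), then fetches a subtree and a lineage with a
single range query each.

```python
import sqlite3

# Hypothetical toy tree as (id, parent_id, name) rows, i.e. the adjacency form.
ROWS = [
    (1, None, "root"),
    (2, 1, "primates"),
    (3, 2, "human"),
    (4, 2, "chimp"),
    (5, 1, "rodents"),
    (6, 5, "mouse"),
]


def number_nodes(rows):
    """Assign LFT/RGT in one depth-first walk; returns {id: (lft, rgt)}."""
    children = {}
    root = None
    for node_id, parent_id, _name in rows:
        if parent_id is None:
            root = node_id
        else:
            children.setdefault(parent_id, []).append(node_id)

    bounds = {}
    counter = 0

    def visit(node_id):
        nonlocal counter
        counter += 1
        lft = counter                      # number on the way down
        for child in children.get(node_id, []):
            visit(child)
        counter += 1
        bounds[node_id] = (lft, counter)   # number again on the way up

    visit(root)  # recursive, so only suitable for trees of modest depth
    return bounds


def build_db(rows):
    """Load the tree into a hybrid table: LFT/RGT plus PARENT_ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (lft INTEGER, rgt INTEGER, "
        "id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    bounds = number_nodes(rows)
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
        [(bounds[i][0], bounds[i][1], i, p, n) for i, p, n in rows],
    )
    return conn


def subtree(conn, node_id):
    """Names of the node and all its descendants, in depth-first order."""
    sql = (
        "SELECT c.name FROM node AS c JOIN node AS a "
        "ON c.lft BETWEEN a.lft AND a.rgt WHERE a.id = ? ORDER BY c.lft"
    )
    return [row[0] for row in conn.execute(sql, (node_id,))]


def lineage(conn, node_id):
    """Names from the root down to the node itself."""
    sql = (
        "SELECT a.name FROM node AS a JOIN node AS c "
        "ON c.lft BETWEEN a.lft AND a.rgt WHERE c.id = ? ORDER BY a.lft"
    )
    return [row[0] for row in conn.execute(sql, (node_id,))]
```

With conn = build_db(ROWS), subtree(conn, 2) gives the primates clade and
lineage(conn, 3) gives the path from the root to human, each without any
recursion in the query. The price is that inserting a node means renumbering
LFT/RGT, which is why this fits trees that are always reloaded in full.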

> Please shout if similar things exist already, I am thinking of adding them
> to bioperl.

Perhaps the NCBI taxonomy?

> Once the orthologues are identified we could also have a
> GeneClusterFinder, which looks at a gene and its orthologues, and walks on
> the sides of them to look for conserved gene clusters.

Sounds like a fun project. 

                                                                      Philip
-- 
The mail transport agent is not liable for any coffee stains in this message
-----------------------------------------------------------------------------
Philip Lijnzaad, lijnzaad@ebi.ac.uk \ European Bioinformatics Institute,rm A2-08
+44 (0)1223 49 4639                 / Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax)           \ Cambridgeshire CB10 1SD,  GREAT BRITAIN