[Open-bio-l] Schema for genes & features & mappings to assemblies
Ewan Birney
birney@ebi.ac.uk
Tue, 23 Apr 2002 11:47:01 +0100 (BST)
On Tue, 23 Apr 2002, Thomas Down wrote:
> On Tue, Apr 23, 2002 at 10:09:09AM +0100, Ewan Birney wrote:
> >
> > We do need to discuss assemblies. I vote for "flat" one-level assemblies
> > (a set of contigs forms a chromosome), à la Ensembl, as I believe that the
> > assumed hierarchical nature of assemblies (a) is mainly a consequence of
> > how they are put together, and the intermediates in the hierarchies between
> > contigs of DNA and chromosomes are nearly never stable, and (b) means you
> > always have to use software to do conversions and can never do it easily
> > with SQL (PL/SQL probably can...).
>
> I think that's actually the crux of the assembly debate. If you
> pick a schema which supports multi-level assemblies, nobody is
> actually forcing you to /use/ that capability. If you have a
> naturally one-level assembly, you can stick to that.
But you can't assume people will make the same assumptions about this -
i.e., to allow generic binding to *any* bioSQL database you have to go
multilevel.
>
> However, if you're keen to put as much of your assembly-munging
> as possible in SQL, that really forces a `fixed-number-of-levels'
> assembly, like Ensembl's denormalized two-level system. My thinking
> on this is coloured by the fact that I've personally always worked
> with code which does assembly in memory (in BioJava, we're
> keen to keep the feature projection code quite separate
> from any specific database technology -- we hate fixing off-by-one
> errors :-).
>
> Does Ensembl get any big performance boosts from using in-database
> assembly?
>
We perceived that we did, but didn't test it. It gives people options
about how to handle the coordinate mapping.
I still prefer 1 level, as I think n levels is just asking for
obfuscation and prevents people from easily treating the database as "just
the data" without having any code dependencies.
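
To make the "easily with SQL" point concrete, here is a minimal sketch of a one-level assembly in which a single join projects feature coordinates from contigs onto a chromosome. The table names, column names and sample rows are all hypothetical, chosen for illustration; this is not the Ensembl or BioSQL schema. It assumes 1-based inclusive coordinates and uses Python's built-in sqlite3 module only to run the SQL.

```python
import sqlite3

# Hypothetical flat (one-level) assembly: every contig is placed directly on a
# chromosome. All names and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assembly (
    contig_id   TEXT PRIMARY KEY,
    chromosome  TEXT NOT NULL,
    chr_start   INTEGER NOT NULL,  -- lowest chromosome base covered (1-based)
    contig_len  INTEGER NOT NULL,
    orientation INTEGER NOT NULL   -- +1 forward, -1 reverse-complemented
);
CREATE TABLE feature (
    feature_id   TEXT PRIMARY KEY,
    contig_id    TEXT NOT NULL REFERENCES assembly(contig_id),
    contig_start INTEGER NOT NULL, -- 1-based, inclusive
    contig_end   INTEGER NOT NULL
);
INSERT INTO assembly VALUES ('ctgA', 'chr1', 1,    1000,  1);
INSERT INTO assembly VALUES ('ctgB', 'chr1', 1001, 500,  -1);
INSERT INTO feature  VALUES ('f1', 'ctgA', 10, 20);
INSERT INTO feature  VALUES ('f2', 'ctgB', 1, 100);
""")

# With one level, the contig-to-chromosome projection is a single join plus
# arithmetic; no application code has to walk a hierarchy.
PROJECT_SQL = """
SELECT f.feature_id,
       a.chromosome,
       CASE WHEN a.orientation = 1
            THEN a.chr_start + f.contig_start - 1
            ELSE a.chr_start + a.contig_len - f.contig_end
       END AS feat_chr_start,
       CASE WHEN a.orientation = 1
            THEN a.chr_start + f.contig_end - 1
            ELSE a.chr_start + a.contig_len - f.contig_start
       END AS feat_chr_end
FROM feature AS f
JOIN assembly AS a ON a.contig_id = f.contig_id
ORDER BY f.feature_id
"""

rows = conn.execute(PROJECT_SQL).fetchall()
for row in rows:
    print(row)
```

With n levels the same projection would need one join per level, or a recursive walk in application code, which is the trade-off being argued over here.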
> Thomas.
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------