[GMOD-devel] Re: [Open-bio-l] Schema for genes & features & mappings to assemblies
Lincoln Stein
lstein@cshl.org
Tue, 23 Apr 2002 11:20:35 -0400
On Tuesday 23 April 2002 07:07, Elia Stupka wrote:
> > Do you really want to special-case gene structures? I thought
>
> Hmm... I agree with you, I like that, I guess then what we need to work on
> is the clever code that would drive it. Coincidentally we are just
> discussing super-non-hierarchical features for our comparative analysis
> db, so we might end up coding this, if we want it all to run outside
> ensembl on the bioperl-pipeline.
>
> Elia
The approach I took with Bio::DB::GFF is the following:
- all features are stored as tag/values in a single table (normalized for
tag names)
- a series of "aggregator" classes is responsible for taking certain
sets of tags and constructing rich objects from them. For example, the
Bio::DB::GFF::Aggregator::transcript class looks for tags named
"exon", "cds", "polyA-site" and so forth and uses them to construct a
transcript object.
- you can create your own aggregators on the fly using an aggregatorFactory,
or use "static" aggregators stored in .pm files.
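
To make the idea concrete, here is a rough, language-neutral sketch in Python (not the actual Bio::DB::GFF code, which is Perl). The Feature, Transcript and Aggregator classes and their fields are hypothetical names chosen for illustration; the flat list stands in for the single feature table.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One row of the (hypothetical) flat feature table."""
    group: str   # name shared by all parts of one transcript
    tag: str     # e.g. "exon", "cds", "polyA-site"
    start: int
    end: int


@dataclass
class Transcript:
    """Rich object assembled from flat rows that share a group name."""
    name: str
    parts: dict = field(default_factory=dict)  # tag -> list of Feature

    def add(self, feature):
        self.parts.setdefault(feature.tag, []).append(feature)

    @property
    def start(self):
        return min(f.start for fs in self.parts.values() for f in fs)

    @property
    def end(self):
        return max(f.end for fs in self.parts.values() for f in fs)


class Aggregator:
    """Picks out rows with the tags it knows and groups them into objects."""

    def __init__(self, tags):
        self.tags = set(tags)

    def aggregate(self, features):
        by_group = {}
        leftover = []  # rows this aggregator does not understand
        for f in features:
            if f.tag not in self.tags:
                leftover.append(f)
                continue
            if f.group not in by_group:
                by_group[f.group] = Transcript(f.group)
            by_group[f.group].add(f)
        return list(by_group.values()), leftover


# Everything lives in one flat collection, whatever kind of feature it is.
features = [
    Feature("tx1", "exon", 100, 200),
    Feature("tx1", "cds", 120, 180),
    Feature("tx1", "exon", 300, 400),
    Feature("tx2", "exon", 1000, 1100),
    Feature("tx1", "repeat", 50, 60),   # not a transcript part
]

# An aggregator built "on the fly": just a list of tags to collect.
transcript_aggregator = Aggregator(["exon", "cds", "polyA-site"])
transcripts, leftover = transcript_aggregator.aggregate(features)
```

Adding a new component to the gene model then means adding a tag name to the aggregator's list; the storage layer does not change.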
I think this is similar to Jason's recent Builder interface. The strategy
has pluses and minuses. The plus is that you don't have to futz with the
schema every time you want to add a new component to your gene. The minus is
that it's easy for the database to drift, because nothing enforces
referential integrity.
There's also a whiff of the AceDB "magic tag" syndrome here.
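
The drift problem can be shown with a small sketch using Python's sqlite3 module. The table and column names here are made up for the example and are not the real Bio::DB::GFF schema; the point is only that a tag table normalized for names accepts any string, so a misspelled tag is stored without complaint and simply never reaches the aggregator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one feature table, with tag names normalized
# into a lookup table. Nothing restricts which tag names may exist.
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE feature ("
    " id INTEGER PRIMARY KEY, grp TEXT, tag_id INTEGER,"
    " start INTEGER, stop INTEGER)"
)


def add_feature(grp, tag, start, stop):
    """Store a feature, creating the tag name if it is not yet known."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag,))
    (tag_id,) = conn.execute(
        "SELECT id FROM tag WHERE name = ?", (tag,)
    ).fetchone()
    conn.execute(
        "INSERT INTO feature (grp, tag_id, start, stop) VALUES (?, ?, ?, ?)",
        (grp, tag_id, start, stop),
    )


add_feature("tx1", "exon", 100, 200)
add_feature("tx1", "exno", 300, 400)   # typo: accepted silently
add_feature("tx1", "cds", 120, 180)

# Tags the (hypothetical) transcript aggregator knows about.
known_tags = {"exon", "cds", "polyA-site"}

# Tag names in the database that no aggregator will ever pick up.
stray_tags = [
    name
    for (name,) in conn.execute("SELECT name FROM tag ORDER BY name")
    if name not in known_tags
]

# Number of exons an aggregator would actually see for tx1.
(exon_count,) = conn.execute(
    "SELECT COUNT(*) FROM feature f JOIN tag t ON t.id = f.tag_id"
    " WHERE f.grp = 'tx1' AND t.name = 'exon'"
).fetchone()
```

Here stray_tags comes out as ["exno"] and exon_count as 1, even though two exons were intended. A periodic check like the stray_tags query is one way to catch this, since the schema itself will not.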
Lincoln