[DAS] RFC for feature data model
David Block
dblock@gnf.org
Fri, 23 Aug 2002 10:13:47 -0700
On Friday, August 23, 2002, at 12:54 AM, Chris Mungall wrote:
>
> I like the decoupling, but I think we have to be careful about cases when
> data should be attached to the floating Gene entity, and when it should be
> attached to the Gene instance-on-a-sequence.
We have full-blown location objects. This means that we can attach
annotation anywhere we wish. In fact, we are attaching locations only
to transcripts, and I expect most annotation to go there, but some gene
expression data, etc., will not be transcript specific depending on the
probes used, etc., so some annotation will go on the gene. It will be
the job of the middleware to traverse the hierarchy of entities and give
all the relevant annotation to the user, which will be fun!
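
To make the traversal idea concrete, here is a minimal Python sketch. It is
an illustration only, not our middleware: the Entity class, its parent link,
and the relevant_annotations function are hypothetical names, and it assumes
that "relevant" means an entity's own annotation plus that of every entity
above it in the hierarchy (a transcript also reports its gene's annotation).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """Hypothetical node in the hierarchy (a gene, a transcript, ...)."""
    name: str
    annotations: List[str] = field(default_factory=list)
    parent: Optional["Entity"] = None  # e.g. a transcript's gene


def relevant_annotations(entity: Entity) -> List[str]:
    """Collect the entity's own annotation, then walk up through its parents."""
    found: List[str] = []
    node: Optional[Entity] = entity
    while node is not None:
        found.extend(f"{node.name}: {note}" for note in node.annotations)
        node = node.parent
    return found


# Made-up example data: one gene-level note, one transcript-level note.
gene = Entity("gene-A", annotations=["expression data (probe not transcript specific)"])
transcript = Entity("transcript-1", annotations=["transcript-level note"], parent=gene)

report = relevant_annotations(transcript)
```

Asking for the transcript returns its own note first and the gene-level note
second, so the user sees both without knowing where each was attached.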
>
> For instance, it's always useful to have a gene-level summary of
> information such as function and cellular localisation that applies to all
> spliceforms / wild type forms, but often you want to attach this information
> at the instance-on-a-sequence level. For example, different products have
> different functions.
>
> The way we're thinking about this with the new FlyBase schema (correct me
> if I'm wrong, Dave) is like this:
>
> Gene   SET
>  |
>  +--- GeneStructure aka allele   INSTANCE
>        |
>        +---- Transcript   INSTANCE
>               |
>               +--------- Exon, Translation etc   INSTANCE
>
Okay, we will make Gene and Transcript "Floating Entities" with zero or
more Location objects, and then Exons and other sub-gene pieces will be
simple SeqFeatures. I think that's more flexible for us, since we're
dealing with multiple assemblies.
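
As a rough sketch of that shape in Python (the class and field names below
are hypothetical, chosen for illustration, and are not taken from our code
or from any existing library): Gene and Transcript each carry a list of
zero or more Location objects, while a sub-gene SeqFeature is tied to
exactly one location.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    """A placement on one sequence of one assembly (fields are assumptions)."""
    assembly: str
    seq_id: str
    start: int
    end: int
    strand: int


@dataclass
class SeqFeature:
    """Simple sub-gene piece (exon, etc.) fixed to a single location."""
    feature_type: str
    location: Location


@dataclass
class Transcript:
    """Floating entity: zero or more locations, one per placement."""
    name: str
    locations: List[Location] = field(default_factory=list)
    features: List[SeqFeature] = field(default_factory=list)


@dataclass
class Gene:
    """Floating entity: may have no locations of its own."""
    name: str
    locations: List[Location] = field(default_factory=list)
    transcripts: List[Transcript] = field(default_factory=list)


# Made-up example: one transcript placed on two assemblies.
tx = Transcript("transcript-1", locations=[
    Location("assembly-1", "chr1", 1000, 5000, 1),
    Location("assembly-2", "chr1", 1200, 5200, 1),
])
tx.features.append(SeqFeature("exon", Location("assembly-1", "chr1", 1000, 1300, 1)))
gene = Gene("gene-A", transcripts=[tx])

assemblies = sorted({loc.assembly for loc in tx.locations})
```

The gene here has no Location at all and is reached only through its
transcript, while the same transcript is placed on both assemblies.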
<snip/>
> DAS itself doesn't care, the client just fires off different content
> handlers / xslt for different namespaces.
>
> I think this is roughly equivalent to what Matt was suggesting; I just see
> it as less of a decoupling as often the SeqFeatures (instances) are the
> biological objects themselves; they can't always be viewed independently
> of their location/sequence.
>
Our DAS services are likely to be at the level of transcripts and
smaller; genes will only be accessible from transcript annotation.
--
David Block dblock@gnf.org
GNF - San Diego, CA http://www.gnf.org
Genome Informatics / Enterprise Programming
Weblog: http://radio.weblogs.com/0104507/