[Open-bio-l] Best practice for modelling data in GFF

Dan Bolser dan.bolser at gmail.com
Tue Jul 6 10:10:51 UTC 2010


When you don't get a reply, you never know if your question was too
dumb, too smart, or totally off topic.

Any hints?

Cheers,
Dan.

On 1 July 2010 11:12, Dan Bolser <dan.bolser at gmail.com> wrote:
> On 29 May 2010 00:08, Dan Bolser <dan.bolser at gmail.com> wrote:
>> Thanks all for replies.
>
> <snip>
>
>> There is a canonical way to model a gene, so I was wondering if it
>> makes sense to describe similar 'biology' (or in this case molecular
>> biology) in standard ways (when the feature isn't simply described by
>> a single line of GFF)?
>>
>> Perhaps I've not understood SO properly, but I'm not sure how its
>> structure is translated into GFF structure ... is there a 1 to 1
>> mapping?
>
> Lack of replies leads me to believe that, indeed, the GFF Parent
> attribute should reflect (or be strictly determined by) the SO
> 'relationships' (are they all 'part_of' relationships?)
>
> However, I was trying to get some concepts clear in my head, and I
> ended up creating a figure of a 'canonical gene' in SO [1], based on
> the one in the GFF docs [2].
>
> [1] http://imagebin.ca/view/Ni9BFbK.html
> [2] http://www.sequenceontology.org/gff3.shtml
>
>
> There is a transitive part_of relationship between 'mRNA' and 'gene',
> which explains lines 4 to 6 of the canonical gene GFF [2].
>
> However, the figure shows that 'exon' is part_of 'transcript', and not
> part_of 'mRNA'. If I got the figure right, and if I understand
> correctly, there is no way to transitively infer that exon is part_of
> mRNA (lines 7 to 11 of the GFF [2]).
>
> This implies that the 'structure' in GFF isn't strictly determined by SO.
>
> Or is it a mistake in SO?
>
>
> Sorry if this is a 'gotcha' that has been discussed before. Any links
> to help me understand would be great.
>
> Dan.
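
To make the question above concrete, here is a minimal sketch of the
check being described: does each GFF Parent link follow from the
ontology's part_of relationships? The edge list is a toy approximation
of the figure, not a copy of SO. In particular, the is_a edge from
'mRNA' to 'transcript' is an assumption used to explain the inferred
mRNA part_of gene link, and the names EDGES, GFF_PARENTS and
part_of_ancestors are hypothetical.

```python
# Toy ontology edges as (child, relation, parent).
# ASSUMPTION: a hand-written simplification of the figure, not the real SO.
EDGES = [
    ("mRNA", "is_a", "transcript"),      # assumed, not stated in the thread
    ("transcript", "part_of", "gene"),
    ("exon", "part_of", "transcript"),
]

# Parent links as used in the canonical gene example: (feature type, Parent type).
GFF_PARENTS = [
    ("mRNA", "gene"),
    ("exon", "mRNA"),
]


def part_of_ancestors(term, edges):
    """Return every term that `term` can be inferred to be part_of.

    Edges are only followed upward (child to parent). A path counts as
    part_of if it contains at least one part_of edge; is_a edges may
    appear before or after it.
    """
    result = set()
    visited = set()
    stack = [(term, False)]  # (current node, has a part_of edge been crossed?)
    while stack:
        node, has_part = stack.pop()
        if (node, has_part) in visited:
            continue
        visited.add((node, has_part))
        for child, relation, parent in edges:
            if child != node:
                continue
            now_part = has_part or relation == "part_of"
            if now_part:
                result.add(parent)
            stack.append((parent, now_part))
    return result


def explained_by_ontology(child, parent, edges=EDGES):
    """True if the GFF Parent link child -> parent follows from the toy edges."""
    return parent in part_of_ancestors(child, edges)


if __name__ == "__main__":
    for child, parent in GFF_PARENTS:
        status = "inferable" if explained_by_ontology(child, parent) else "NOT inferable"
        print(f"{child} Parent={parent}: {status}")
```

Under these assumed edges, the mRNA to gene link is inferable but the
exon to mRNA link is not, because part_of is only propagated upward
from 'transcript' and never down to its subtypes. That matches the
mismatch described above, although whether the real SO behaves this
way depends on its actual edges.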


