[Bioperl-l] Re: [SO-devel] Re: GFF3 preliminary

Richard Durbin rd at sanger.ac.uk
Fri Feb 21 09:55:24 EST 2003



Chris Mungall wrote:
> I agree with Richard that there should be a one-to-one relationship
> between features and lines.

Thanks!

> Also, we discovered at the meeting that we can't represent the full gene
> model by attaching coding_start/coding_end directly to mRNA, as this makes
> representation of dicistronics problematic.
> 
> What about this:
> CDS/ORF as children of mRNA with n:n cardinality between CDS and mRNA. The
> CDS feature indicates the coding_start, coding_end (actual coding exons
> are implicit from exons + mRNA) so coding_start, coding_end become
> redundant and are thus removed from the 'exchange-compliance' part of
> SOFA.
> 
> For the richer part of SOFA, we allow coding_start, coding_end but these
> are restricted to being 3bp long (representing Met and stop)

Absolutely not.  coding_start and coding_end should be zero-length
junctions, placed just before the start of the ATG and just after the
stop codon.  There are SO terms for start_codon and stop_codon, but you
really don't want to use them to define the start and end: consider
Michael's 4-base start codons, and the many start and stop codons that
are broken by introns.  There is no reason not to use junctions.
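
To make the junction idea concrete, here is a minimal Python sketch. It
assumes interbase (0-based, half-open) coordinates and a forward-strand
transcript. The function names and the example coordinates are invented
for illustration and are not part of SO, SOFA, or GFF3. It shows that two
zero-length junctions plus the exon list are enough to recover the coding
segments, even when the start codon is broken by an intron.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # interbase coordinates: (start, end), half-open


def coding_segments(exons: List[Interval],
                    coding_start: int,
                    coding_end: int) -> List[Interval]:
    """Clip each exon to the span between the two zero-length junctions.

    Assumes exons are sorted, non-overlapping, and on the forward strand.
    """
    segments = []
    for start, end in exons:
        lo = max(start, coding_start)
        hi = min(end, coding_end)
        if lo < hi:  # keep only exons that overlap the coding span
            segments.append((lo, hi))
    return segments


def take_bases(segments: List[Interval], n: int) -> List[Interval]:
    """Return the genomic pieces covering the first n coding bases."""
    pieces = []
    for start, end in segments:
        if n <= 0:
            break
        length = min(n, end - start)
        pieces.append((start, start + length))
        n -= length
    return pieces


# Hypothetical transcript: the first exon ends two bases after the
# coding_start junction, so the start codon is split by the first intron.
exons = [(100, 202), (300, 400), (500, 600)]
segments = coding_segments(exons, coding_start=200, coding_end=551)
start_codon_pieces = take_bases(segments, 3)

print(segments)            # [(200, 202), (300, 400), (500, 551)]
print(start_codon_pieces)  # [(200, 202), (300, 301)]
```

In this example the start codon occupies two separate genomic pieces, so
a single 3bp feature could not describe it, while the junction at 200
still marks the coding start unambiguously.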

Richard




