[Bioperl-l] Proposal: SemanticMapping and call for info on Gene Objects
Ewan Birney
birney@ebi.ac.uk
Sun, 12 May 2002 18:57:12 +0100 (BST)
I think this is a great idea, and needed.
Some quick thoughts -
(a) I think a SeqIO object can only have one semantic mapper (right?)
(b) Ensembl's model for translation/genes is as follows:

  Gene has an (unordered) set of Transcripts

  Transcript - has-a ordered list of Exons
             - has-a translation object which
                 has-a start-exon (one of the above list)
                 has-a start-codon-position (relative to the exon)
                 has-a end-exon
                 has-a end-codon-position (relative to the exon)

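A minimal sketch of the model above, in Python rather than Perl. All class and
field names here are hypothetical illustrations, not the actual Ensembl or
Bioperl API, and the sketch assumes forward-strand exons with 1-based,
inclusive coordinates:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exon:
    start: int  # genomic start, 1-based inclusive (assumed convention)
    end: int    # genomic end, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


@dataclass
class Translation:
    # Start/end belong to the translation, not to the exons themselves.
    start_exon: Exon
    start_pos: int  # first coding base, 1-based, relative to start_exon
    end_exon: Exon
    end_pos: int    # last coding base, 1-based, relative to end_exon


@dataclass
class Transcript:
    exons: List[Exon]                         # ordered 5' to 3'
    translation: Optional[Translation] = None

    def coding_length(self) -> int:
        """Number of coding bases between translation start and end."""
        if self.translation is None:
            return 0
        t = self.translation
        i = self.exons.index(t.start_exon)
        j = self.exons.index(t.end_exon)
        if i == j:
            return t.end_pos - t.start_pos + 1
        first = self.exons[i].length - t.start_pos + 1
        middle = sum(e.length for e in self.exons[i + 1:j])
        return first + middle + t.end_pos


@dataclass
class Gene:
    # Stored as a list for simplicity; the order carries no meaning.
    transcripts: List[Transcript] = field(default_factory=list)


exons = [Exon(1, 100), Exon(201, 300), Exon(401, 500)]
transcript = Transcript(exons, Translation(exons[0], 51, exons[2], 30))
gene = Gene([transcript])
print(transcript.coding_length())
```

Because the exons carry no coding information of their own, the same Exon
objects can be reused by a transcript with a different translation, or with
none at all.
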
The important thing to note here is that the start/end points are
properties of the transcript/translation, and not of an exon, mainly
because an exon could be either a fully UTR exon or a coding/UTR exon.
The drawback is that Ensembl cannot currently deal with start/end codons
that are split across introns, which is bad (it could do so with a little
tweaking of the conventions, i.e. the start codon position means the first
base of the codon, which always has to lie in one or other of the exons).
I am tempted to advocate a more standard case where start/end is
relative to the transcript's virtual cDNA. The drawback of this is that it
ends up being more complex to, for example, figure out where the start
exon is, and in cases where you are *building* genes computationally it
produces some nasty calculation overheads.
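
To illustrate the extra work the cDNA-relative convention implies, here is a
rough Python sketch of converting between a position on the virtual cDNA and
an (exon, position-in-exon) pair. The function names are hypothetical, and it
assumes exon lengths are given in transcript order with 1-based positions:

```python
from typing import List, Tuple


def cdna_to_exon(exon_lengths: List[int], cdna_pos: int) -> Tuple[int, int]:
    """Map a 1-based cDNA position to (exon index, 1-based position in exon)."""
    if cdna_pos < 1:
        raise ValueError("cDNA position must be >= 1")
    offset = 0  # number of cDNA bases contributed by earlier exons
    for index, length in enumerate(exon_lengths):
        if cdna_pos <= offset + length:
            return index, cdna_pos - offset
        offset += length
    raise ValueError("cDNA position lies beyond the last exon")


def exon_to_cdna(exon_lengths: List[int], exon_index: int, pos: int) -> int:
    """Map (exon index, 1-based position in exon) back to a cDNA position."""
    if not 1 <= pos <= exon_lengths[exon_index]:
        raise ValueError("position lies outside the exon")
    return sum(exon_lengths[:exon_index]) + pos


lengths = [100, 100, 100]
print(cdna_to_exon(lengths, 151))
print(exon_to_cdna(lengths, 1, 51))
```

Finding the start exon means walking the exon list every time, and every
cDNA-relative coordinate has to be recomputed whenever an upstream exon is
added, removed, or resized during gene building.
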
ewan
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------