[Bioperl-l] Re: Gene Structure / GenScan
hilmar.lapp@pharma.Novartis.com
Tue, 1 Aug 2000 16:09:10 +0100
> The Ensembl Genscan parser Ewan sent yesterday seems to be a good starting
> point. However, I'd prefer to have a gene structure represented optionally
> independent of the/an underlying sequence (object), that is, as a feature
> which may or may not have a sequence attached. In addition, a parser should
> not need to rely on being provided with the source sequence, and the
> resulting gene structure representation can be attached to the pertaining
> source sequence by the client.
>
> I'd propose the following:
> Bio::SeqFeature::GeneStructure is-a Bio::SeqFeature::Generic (or just a
> Bio::SeqFeatureI ?)
> and offers specific support for gene structure related things, like
[...]
Aha. Now you want the appropriate Ensembl gene objects, not the genscan
parser. Look at
Bio::EnsEMBL::Gene
::Transcript
::Translation
Look at
http://www.ensembl.org/Docs/Pdoc/ensembl/modules/Bio/EnsEMBL/modules.html
Again, I would be happy if these moved "across" to bioperl.
You will want to add additional stuff to the Gene object to handle
promoters (or perhaps the transcript object). Don't forget about
alternative splicing.
Well, that's not really what I was aiming at. I was thinking of a
representation of the _data_ that make up a gene structure as, e.g.,
people find it or programs predict it. IMHO all the _interpretation_
of the data (features in this case) belongs in separate classes,
either derived ones or ones in another hierarchy (you could think of a
GeneTranscriber that knows about alternative splicing). So the modules
I proposed shouldn't do much with actual sequences apart from maybe
very basic things. They're just features, which is all you need in the
first place to represent, e.g., GenScan results. And they should be
rich enough to allow other modules to make real stuff like protein
sequences out of them. So, lightweight, but heavy enough.
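To make the split concrete, here is a rough sketch of the idea. It is in
Python rather than Perl purely for brevity, and every name in it (Exon,
GeneStructure, attach_seq, spliced_sequence) is hypothetical, not an
existing bioperl or Ensembl API. It only handles the forward strand and
assumes 1-based inclusive coordinates.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Exon:
    """A plain feature: 1-based inclusive coordinates, no sequence needed."""
    start: int
    end: int


@dataclass
class GeneStructure:
    """Hypothetical stand-in for the proposed gene structure feature."""
    exons: List[Exon] = field(default_factory=list)
    seq: Optional[str] = None  # attached later by the client, if at all

    def attach_seq(self, seq: str) -> None:
        """Attach the source sequence the coordinates refer to."""
        self.seq = seq


def spliced_sequence(gene: GeneStructure) -> str:
    """Interpretation lives outside the feature class (forward strand only)."""
    if gene.seq is None:
        raise ValueError("no sequence attached to this gene structure")
    exons = sorted(gene.exons, key=lambda e: e.start)
    # Convert 1-based inclusive coordinates to Python slices.
    return "".join(gene.seq[e.start - 1:e.end] for e in exons)


# A parser could build this from prediction output alone.
gene = GeneStructure(exons=[Exon(3, 5), Exon(9, 11)])
# The client attaches the source sequence afterwards.
gene.attach_seq("ccATGccgTAAcc")
print(spliced_sequence(gene))  # prints "ATGTAA"
```

The feature objects stay usable without any sequence; only the separate
interpreting function insists on one.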
Am I missing something?
Hilmar