[Bioperl-l] Bio::SeqFeature::GeneStructure and Prediction

Hilmar Lapp hlapp@gmx.net
Wed, 02 Aug 2000 12:35:19 +0200


Following up the discussion from yesterday, I've added a
Bio::SeqFeature::GeneStructure module, offering specific support for
representing the structure or structural elements of a gene. I tried to keep
it rather generic, so there is nothing really fancy in it. Most of the
methods deal with managing data elements in a convenient way. There are some
basic 'computations' though, like an introns() method returning the regions
between the exons as features, and a cds() method concatenating the
respective sequences of the exons.
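
To illustrate the two 'computations' just mentioned, here is a minimal sketch
in Python. It is not the module's code and not the BioPerl API: the Region
class and the function bodies are hypothetical stand-ins, and the sketch
assumes 1-based inclusive coordinates and exons on the forward strand only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """A 1-based, inclusive interval (hypothetical stand-in for a feature)."""
    start: int
    end: int


def introns(exons: List[Region]) -> List[Region]:
    """Return the gaps between consecutive exons, ordered by start position."""
    ordered = sorted(exons, key=lambda e: e.start)
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        # Adjacent or overlapping exons leave no intron between them.
        if right.start > left.end + 1:
            gaps.append(Region(left.end + 1, right.start - 1))
    return gaps


def cds(seq: str, exons: List[Region]) -> str:
    """Concatenate the exon subsequences in order of start position."""
    ordered = sorted(exons, key=lambda e: e.start)
    # Convert 1-based inclusive coordinates to a Python slice.
    return "".join(seq[e.start - 1:e.end] for e in ordered)
```

For example, with exons at 1..3, 7..9 and 13..15, introns() would yield the
regions 4..6 and 10..12, and cds() would join the three exon subsequences.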

Note that this module doesn't know about prediction, and doesn't care about
the actual type of the exon objects or other structural elements being added,
as long as they implement Bio::SeqFeatureI.

Bio::SeqFeature::GeneStructure is-a Bio::SeqFeature::Generic.

Regarding prediction (my present goal is representing gene structure
prediction results), I created a directory Bio/Tools/Prediction (make sure you
say 'cvs update -d'), meant to hold any modules specific to prediction
methods. There I've created two modules, Bio::Tools::Prediction::Gene and
Bio::Tools::Prediction::Exon.

Bio::Tools::Prediction::Gene is-a Bio::SeqFeature::GeneStructure with support
for being a predicted gene structure: there are methods to store a predicted
CDS and a predicted peptide sequence. Its exons are not required to be
objects of Bio::Tools::Prediction::Exon, but usually will be.

Bio::Tools::Prediction::Exon is-a Bio::SeqFeature::Generic with support for a
predicted exon. That is, there are several methods for various scores
(start-of-exon score, which may be an acceptor splice site score, end-of-exon
score which may be a donor splice site score, etc).
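
The is-a relationships described above can be sketched as follows. This is
only an illustration in Python, not the Perl modules themselves: all class,
method and attribute names here (SeqFeature, add_exon, set_score, and so on)
are hypothetical and chosen for the example.

```python
from typing import Dict, List, Optional


class SeqFeature:
    """Hypothetical stand-in for a generic sequence feature."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end


class GeneStructure(SeqFeature):
    """A feature that manages exons; any SeqFeature is accepted as an exon."""

    def __init__(self, start: int, end: int) -> None:
        super().__init__(start, end)
        self.exons: List[SeqFeature] = []

    def add_exon(self, exon: SeqFeature) -> None:
        # Only the feature interface matters, not the concrete exon type.
        if not isinstance(exon, SeqFeature):
            raise TypeError("exon must be a SeqFeature")
        self.exons.append(exon)


class PredictedExon(SeqFeature):
    """A feature carrying named prediction scores (e.g. start, end)."""

    def __init__(self, start: int, end: int) -> None:
        super().__init__(start, end)
        self.scores: Dict[str, float] = {}

    def set_score(self, name: str, value: float) -> None:
        self.scores[name] = value

    def score(self, name: str) -> Optional[float]:
        # Returns None when the predictor did not report this score.
        return self.scores.get(name)


class PredictedGene(GeneStructure):
    """A gene structure that can also store a predicted CDS and peptide."""

    def __init__(self, start: int, end: int) -> None:
        super().__init__(start, end)
        self.predicted_cds: Optional[str] = None
        self.predicted_protein: Optional[str] = None
```

In this sketch a PredictedGene will usually hold PredictedExon objects, but
add_exon() takes any SeqFeature, mirroring the point that the gene structure
does not care about the actual type of its exons.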

A Bio::Tools::Genscan module utilizing the above modules will follow, as will
an MZEF parser. ESTScan in principle predicts gene structural elements, too,
namely coding and non-coding regions, and predicts peptide sequences, so I hope
I can use this model for that, too.

Have a look if you're interested, and I'd appreciate feedback.

	Hilmar
-- 
-----------------------------------------------------------------------
Hilmar Lapp                                      email: hlapp@gmx.net
NFI Vienna, IFD/Bioinformatics                   phone: +43 1 86634 631
A-1235 Vienna                                      fax: +43 1 86634 727
-----------------------------------------------------------------------