[Bioperl-l] SeqFeature proposal

Ewan Birney birney@ebi.ac.uk
Sun, 19 Nov 2000 22:28:04 +0000 (GMT)


In a road-to-Damascus-like experience (on the M25, in driving rain, if
anyone is interested) I saw a new layout of feature and location objects.

The layout would be:

   - simple for simple cases

   - complex for complex cases

   - able to cope with the dreaded fuzziness

   - strongly typed

Here goes:

At the interface level we have:

   RangeI            -> start/end/strand
   SeqFeatureI       -> implements RangeI, has-a ParentLocationI
   ParentLocationI   -> implements RangeI, has an array of SubLocationI
                     ParentLocation could be Fuzzy.
   SubLocationI      -> implements RangeI, nothing else (could be removed)


For simple start/end/strand features, e.g. those implemented with
SeqFeature::Simple, that one class would implement both:

   SeqFeatureI, returning $self to a ParentLocationI call
   ParentLocationI - returning nothing for SubLocation
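
As a rough sketch of the simple case: one object answers both as the
feature and as its own location. The package name
Sketch::SeqFeature::Simple and the method names location() and
sub_locations() are made up for illustration; the proposal does not fix
any method names, and none of this is existing Bioperl API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed SeqFeature::Simple.
package Sketch::SeqFeature::Simple;

sub new {
    my ($class, %args) = @_;
    my $self = {
        start  => $args{start},
        end    => $args{end},
        strand => $args{strand},
    };
    return bless $self, $class;
}

# RangeI: start/end/strand
sub start  { return $_[0]{start} }
sub end    { return $_[0]{end} }
sub strand { return $_[0]{strand} }

# SeqFeatureI: asked for its location, the feature returns itself
# (method name is an assumption).
sub location { my ($self) = @_; return $self; }

# ParentLocationI: a simple feature has no sublocations
# (method name is an assumption).
sub sub_locations { return (); }

package main;

my $feat = Sketch::SeqFeature::Simple->new(start => 10, end => 20, strand => 1);
my $loc  = $feat->location;
my @subs = $loc->sub_locations;

printf "%d..%d strand %d, %d sublocations\n",
    $loc->start, $loc->end, $loc->strand, scalar @subs;
```

The caller never needs to know that the location and the feature are
the same object; it only relies on the two interfaces.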

For more complex cases, parsed from GenBank/EMBL, a full-blown location
object holding an array of sublocations could be made.
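
A minimal sketch of such an array'd location, assuming a plain
join(...) of exact ranges on the forward strand (no complement, no
fuzziness). Sketch::SubLocation, Sketch::ParentLocation and
sub_locations() are hypothetical names, and taking the parent's
start/end as the span of its sublocations is just one possible
convention.

```perl
use strict;
use warnings;

# Hypothetical bare SubLocationI: start/end/strand and nothing else.
package Sketch::SubLocation;

sub new {
    my ($class, $start, $end, $strand) = @_;
    return bless { start => $start, end => $end, strand => $strand }, $class;
}
sub start  { return $_[0]{start} }
sub end    { return $_[0]{end} }
sub strand { return $_[0]{strand} }

# Hypothetical ParentLocationI holding an array of sublocations.
package Sketch::ParentLocation;

sub new {
    my ($class, @subs) = @_;
    return bless { subs => [@subs] }, $class;
}
sub sub_locations { return @{ $_[0]{subs} } }

# RangeI on the parent: span from the lowest start to the highest end.
sub start {
    my ($self) = @_;
    my ($min) = sort { $a <=> $b } map { $_->start } $self->sub_locations;
    return $min;
}
sub end {
    my ($self) = @_;
    my ($max) = sort { $b <=> $a } map { $_->end } $self->sub_locations;
    return $max;
}
sub strand {
    my ($self) = @_;
    my ($first) = $self->sub_locations;   # assumes all parts share a strand
    return $first->strand;
}

package main;

# Simplified GenBank-style location string; only NNN..NNN ranges are read.
my $genbank = 'join(10..20,30..40)';
my @parts;
while ($genbank =~ /(\d+)\.\.(\d+)/g) {
    push @parts, Sketch::SubLocation->new($1, $2, 1);
}

my $loc = Sketch::ParentLocation->new(@parts);
my @subs = $loc->sub_locations;
printf "%d..%d with %d sublocations\n", $loc->start, $loc->end, scalar @subs;
```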


For complex cases with complex sub-SeqFeature hierarchies, there would
be nothing stopping, say, the Exon objects from implementing the
SubLocationI interface. Although the interface hierarchy would then offer
two routes to the same information, both routes would lead to the same
object.
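
To make the "two routes, one object" point concrete, here is a sketch
in which a transcript hands out the very same Exon objects whether you
walk the feature hierarchy or the location hierarchy. All names
(Sketch::Exon, Sketch::Transcript, sub_features, location,
sub_locations) are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical Exon: a feature that also satisfies SubLocationI,
# since all SubLocationI asks for is start/end/strand.
package Sketch::Exon;

sub new { my ($class, %args) = @_; return bless { %args }, $class; }
sub start  { return $_[0]{start} }
sub end    { return $_[0]{end} }
sub strand { return $_[0]{strand} }

# Hypothetical Transcript: SeqFeatureI and ParentLocationI in one object.
package Sketch::Transcript;

sub new {
    my ($class, @exons) = @_;
    return bless { exons => [@exons] }, $class;
}
sub sub_features  { return @{ $_[0]{exons} } }   # route 1: feature hierarchy
sub location      { return $_[0] }               # the feature is its own location
sub sub_locations { return @{ $_[0]{exons} } }   # route 2: location hierarchy

package main;

my @exons = (
    Sketch::Exon->new(start => 100, end => 200, strand => 1),
    Sketch::Exon->new(start => 300, end => 450, strand => 1),
);
my $transcript = Sketch::Transcript->new(@exons);

my @by_feature  = $transcript->sub_features;
my @by_location = $transcript->location->sub_locations;

for my $i (0 .. $#by_feature) {
    # Comparing references checks identity, not just equal coordinates.
    my $same = $by_feature[$i] == $by_location[$i] ? 'same' : 'different';
    printf "exon %d (%d..%d): %s object via both routes\n",
        $i, $by_feature[$i]->start, $by_feature[$i]->end, $same;
}
```

Nothing in the interfaces forces the two lists to agree; the agreement
here comes purely from the implementation storing one array of exons.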


In other words, I am expanding the interface definitions, but sanctioning
(indeed encouraging) implementations to support multiple interfaces and
thereby provide a consistency which is not guaranteed by the interface
definitions themselves.


What do people think?


-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------