[Biojava-l] FeatureHolder.containsFeature()

Matthew Pocock matthew_pocock@yahoo.co.uk
Tue, 09 Apr 2002 17:57:03 +0100


Hi David,

This is getting into a discussion about what we mean by a feature. There
seems to be a more abstract and platonic entity floating around that is
in some way the 'ideal feature'. Then, there are projections of this
into our mortal and imperfect world of sequences, locations, feature
objects and annotations. Perhaps this could be fixed (if we could rip
everything up and start again) by having one more type of entity that
decouples hierarchy from annotation and which is in a 1-1 relationship
with those pesky feature IDs. The new data-structures (in hokey syntax)
would look like:


seq
   has:
     Set<FeatureImage> features
     SymbolList symbols
     any number of properties

FeatureImage
   hierarchy API:
     (FeatureImage or Sequence) parent
     Set<FeatureImage> features
     Location location
     any number of properties - specific to the image
   has:
     Feature

Feature
   has:
     any number of properties - defining or describing this feature

where "any number of properties" could be a mixture of get/set pairs and 
annotation bundles.
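
A minimal Java sketch of the structures above may make the split easier to see. It is only an illustration under stated assumptions: every type name here (FeatureModelSketch, ImageHolder, Seq) is hypothetical and not an existing BioJava class, a String stands in for SymbolList, a start/end pair stands in for Location, and "any number of properties" is reduced to a plain map.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: none of these types are existing BioJava classes.
final class FeatureModelSketch {

    /** The 'ideal' feature: carries only defining or describing properties. */
    static final class Feature {
        final Map<String, Object> properties = new HashMap<>();
    }

    /** Common parent type, so an image can hang off a sequence or another image. */
    interface ImageHolder {
        Set<FeatureImage> features();
    }

    /** One projection of a Feature onto a particular parent, with its own location. */
    static final class FeatureImage implements ImageHolder {
        final ImageHolder parent;
        final Feature feature;
        final int start; // start/end stand in for a Location
        final int end;
        final Map<String, Object> properties = new HashMap<>(); // image-specific
        private final Set<FeatureImage> children = new LinkedHashSet<>();

        FeatureImage(ImageHolder parent, Feature feature, int start, int end) {
            this.parent = parent;
            this.feature = feature;
            this.start = start;
            this.end = end;
            parent.features().add(this); // register this image with its parent
        }

        @Override
        public Set<FeatureImage> features() {
            return children;
        }
    }

    /** A sequence: symbols plus its top-level feature images. */
    static final class Seq implements ImageHolder {
        final String symbols; // stands in for a SymbolList
        final Map<String, Object> properties = new HashMap<>();
        private final Set<FeatureImage> images = new LinkedHashSet<>();

        Seq(String symbols) {
            this.symbols = symbols;
        }

        @Override
        public Set<FeatureImage> features() {
            return images;
        }
    }

    public static void main(String[] args) {
        Seq seq = new Seq("ACGTACGT");
        Feature gene = new Feature();
        gene.properties.put("id", "gene-1");
        FeatureImage image = new FeatureImage(seq, gene, 2, 6);
        System.out.println(seq.features().contains(image));
    }
}
```

The point of the shape is that hierarchy and location live on the image, while the Feature itself knows nothing about where it has been projected.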

For the cases we are describing, it is the Feature objects that get 
compared for equality (e.g. forward/reverse strand projections of 
features), probably using something similar/identical to the current 
rules. Things like feature strand would go on the FeatureImage, whereas
things like blast scores would go on the Feature.
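
As a rough illustration of comparing Feature objects rather than their projections, here is a small Java sketch. It rests on assumptions that are not in the proposal itself: Feature equality is taken to be equality of an ID string (the real rules may well differ), coordinates are 1-based and inclusive, and all names (FeatureEqualitySketch, Strand, reverseComplement) are hypothetical rather than BioJava API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: not BioJava API. Equality by ID is an assumption.
final class FeatureEqualitySketch {

    enum Strand { FORWARD, REVERSE }

    /** The ideal feature: ID and describing data such as a blast score. */
    static final class Feature {
        final String id;
        final double blastScore;

        Feature(String id, double blastScore) {
            this.id = id;
            this.blastScore = blastScore;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Feature && ((Feature) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    /** A projection: strand and coordinates belong here, not on the Feature. */
    static final class FeatureImage {
        final Feature feature;
        final Strand strand;
        final int start;
        final int end;

        FeatureImage(Feature feature, Strand strand, int start, int end) {
            this.feature = feature;
            this.strand = strand;
            this.start = start;
            this.end = end;
        }
    }

    static final class Seq {
        final int length;
        final List<FeatureImage> images = new ArrayList<>();

        Seq(int length) {
            this.length = length;
        }

        /** True if any image on this sequence projects an equal Feature. */
        boolean containsFeature(Feature f) {
            for (FeatureImage image : images) {
                if (image.feature.equals(f)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Builds a reverse-strand view whose images share the original Feature objects. */
    static Seq reverseComplement(Seq orig) {
        Seq rev = new Seq(orig.length);
        for (FeatureImage image : orig.images) {
            Strand flipped =
                image.strand == Strand.FORWARD ? Strand.REVERSE : Strand.FORWARD;
            // Mirror 1-based inclusive coordinates around the sequence length.
            rev.images.add(new FeatureImage(image.feature, flipped,
                orig.length - image.end + 1, orig.length - image.start + 1));
        }
        return rev;
    }

    public static void main(String[] args) {
        Feature hit = new Feature("hit-1", 42.0);
        Seq orig = new Seq(100);
        orig.images.add(new FeatureImage(hit, Strand.FORWARD, 10, 20));
        Seq rev = reverseComplement(orig);
        System.out.println(orig.containsFeature(rev.images.get(0).feature));
        System.out.println(rev.containsFeature(hit));
    }
}
```

Because both sequences hold images of the same Feature, the containment check gives the same answer in either direction, even though the images differ in strand and coordinates.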

Ah - the benefit of hindsight. If only data structures could be 
simultaneously fluid and compile-time checked.

Matthew (wishing that all programming was more ontology driven)

> What about the case where you have 2 sequences, one of which is a
> sub-sequence of the other, such as a chromosome and a BAC clone? Should the
> chromosome 'contain' all the features of the BAC? Is this another case for a
> different containsFeature() method?
> 
> At present RevCompSeq.containsFeature() returns true for all features that
> are contained by it, and by the underlying non-Revcomp sequence. That is to
> say for all features in origSeq, revSeq.containsFeature(origFeature) returns
> true. But the reverse is not true, because the original sequence of course
> knows nothing about its RevComp brother. So although both sequences actually
> have the same features (one is just a projection of the other) the original
> sequence does not know to look for features that are really the same just
> backwards. I guess this is another case for some clarity about what we mean
> when two features are equivalent.
> 
> David