[Biojava-l] Last change to sequence package

Matthew Pocock mrp@sanger.ac.uk
Tue, 18 Jul 2000 18:44:53 +0100


Ewan Birney wrote:

> On Tue, 18 Jul 2000, Matthew Pocock wrote:
>
> > Oops - forgot to mention that this will change the GFF API slightly. The gff
> > strand constants (as ints) would be dropped, and replaced by the
> > StrandedFeature constants. This way there is only one instance of the strand
> > concept in BioJava.
> >
> > Scream now or I will change it Thursday morning.
>
> Matt - In bioperl/Ensembl we have the concept of strand being one of
> "-1,1 or 0" with 0 for strand agnostic things, eg, low complexity regions.
> I am not sure that your current scheme supports this.
>
> It is a useful concept - otherwise you have some nasty code that has to
> look at the "type" of sequence feature to realise that you should be
> strand agnostic.

The feature interface hierarchy already helps with this:

Feature
  (location, type, source, annotation, parent, children, sequence)
   |
   \ StrandedFeature
       (Strand { POSITIVE, NEGATIVE, UNKNOWN })

We do need to make sure that a StrandedFeature can have an UNKNOWN strand, to
avoid some hoop-jumping, but in general things like low complexity regions should
be modeled directly as a plain Feature instance.

StrandedFeature.Strand would have the methods
 toString() -- POSITIVE, NEGATIVE or UNKNOWN
 getValue() -- +1, -1 or 0
 getToken() -- +, - or .
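
As a rough sketch of the proposal above (not the actual BioJava source), Strand
could be written as a typesafe-enum style class nested in StrandedFeature. The
one-method Feature stand-in and the fromValue() helper are assumptions added
purely for illustration; only toString(), getValue() and getToken() come from
the list above.

```java
// Sketch only: names follow the proposal; this is not the actual BioJava code.

// Minimal stand-in for Feature. The real interface carries more
// (location, source, annotation, parent, children, sequence).
interface Feature {
    String getType();
}

// A feature that additionally knows which strand it lies on.
interface StrandedFeature extends Feature {
    Strand getStrand();

    // Typesafe-enum style class: the three constants are the only instances.
    final class Strand {
        public static final Strand POSITIVE = new Strand("POSITIVE", +1, '+');
        public static final Strand NEGATIVE = new Strand("NEGATIVE", -1, '-');
        public static final Strand UNKNOWN = new Strand("UNKNOWN", 0, '.');

        private final String name;
        private final int value;
        private final char token;

        // Private constructor, so no other Strand instances can be created.
        private Strand(String name, int value, char token) {
            this.name = name;
            this.value = value;
            this.token = token;
        }

        @Override
        public String toString() {
            return name;
        }

        public int getValue() {
            return value;
        }

        public char getToken() {
            return token;
        }

        // Hypothetical helper (not part of the proposal): maps the
        // -1 / 0 / +1 integer convention onto the shared constants.
        public static Strand fromValue(int value) {
            switch (value) {
                case 1:
                    return POSITIVE;
                case -1:
                    return NEGATIVE;
                case 0:
                    return UNKNOWN;
                default:
                    throw new IllegalArgumentException(
                        "Not a strand value: " + value);
            }
        }
    }
}
```

With something like this, GFF code could write getToken() into the strand
column and compare strands by identity, while code coming from the
bioperl/Ensembl integer convention could use getValue().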

What do you think?