[Biojava-l] circular sequences

Thomas Down td2@sanger.ac.uk
Sun, 28 Jan 2001 12:19:02 +0000


On Sun, Jan 28, 2001 at 09:40:54PM +1300, Mark Schreiber wrote:
> On Fri, 26 Jan 2001, Thomas Down wrote:
> 
> > 
> > Well, there's no problem doing a CircularSymbolList which overrides
> > subList and subStr (I'd be tempted to write a class which gives
> > a circularized view onto any underlying SymbolList, rather than
> > subclassing a specific implementation).  Point to debate: should
> > symbolAt(1001) for a 1000-symbol circular sequence return the
> > value of symbolAt(1), or is this an error?
> >
> 
> It depends. In some ways it would be nice if iterators etc could just
> carry on around the sequence although at some point it would get a bit
> stupid unless a signal is given to signify the end. Might just be best to
> throw an exception and let the implementing program decide what to do
> about it. On the other hand you could probably use the standard symbol
> list in this way. 6 of one ....
> 
> My gut feeling is that we should allow indexing of residues greater than
> the length of the sequence and if need be less than one. Zero in this
> instance should be an invalid argument.

Yeah, that sounds about right.  /me still wishes sequences
were indexed from zero though (the one thing I still miss from
my pre-biojava sequence library).
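
For what it's worth, the wrap-around rule could be sketched roughly as below. This is only an illustration, not existing BioJava code: the class name CircularIndex and the method normalize are made up, and treating -1 as the last symbol of the sequence is an assumption (nobody in this thread has pinned down what negative indices should mean).

```java
// Hypothetical helper, not part of BioJava: maps any non-zero index
// onto the 1..length range of a circular, 1-based sequence.
final class CircularIndex {
    private CircularIndex() {}

    static int normalize(int index, int length) {
        if (length < 1) {
            throw new IllegalArgumentException("length must be positive: " + length);
        }
        if (index == 0) {
            // Zero has no meaning in a 1-based coordinate system.
            throw new IllegalArgumentException("0 is not a valid index");
        }
        // Assumption: negative indices count back from the end, so -1 is
        // the last symbol. Shift to a 0-based offset before wrapping.
        int zeroBased = index > 0 ? index - 1 : index;
        // floorMod always returns a value in 0..length-1, even for negatives.
        return Math.floorMod(zeroBased, length) + 1;
    }
}
```

With this rule, normalize(1001, 1000) gives 1 and normalize(-1, 1000) gives 1000, so a circularized view could pass every index through it before delegating to the underlying SymbolList.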

> > Circular Sequence objects are slightly more of a pain, since I
> > might want a Feature running from, say, 900 - 100.  Not sure
> > what the best way to handle this is -- it's not a case recognized
> > by our current Location objects.
> 
> Maybe make a subclass of StrandedFeature, since only DNA can be circular.
> Can anyone see a reason why not?

Circularity is an issue that's orthogonal to the current system
of feature types.  Certainly, there's nothing to say that all
features on DNA are StrandedFeatures (a CpG island, for instance,
is fairly clearly not stranded, at least in an idealized world).  Also,
we'll probably want to use all the other feature types on circular
sequences (Exon, Transcript, whatever).

My vote goes for Matthew's solution of having CircularLocations, and
keeping the circularity issue out of the feature system itself.  It's
not a 100% clean solution, but I think it should work out okay.
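
To make the 900 - 100 case concrete, here is a minimal sketch of what a wrapping location might look like. It is not Matthew's design and does not implement the real Location interface: the class CircularSpan and its methods are hypothetical, and it assumes 1-based inclusive coordinates where start > end means the span crosses the origin.

```java
// Hypothetical stand-in for a circular location, not BioJava API.
// Coordinates are 1-based and inclusive; start > end means the span
// runs off the end of the sequence and continues from position 1.
final class CircularSpan {
    private final int start;
    private final int end;
    private final int seqLength;

    CircularSpan(int start, int end, int seqLength) {
        if (seqLength < 1 || start < 1 || start > seqLength
                || end < 1 || end > seqLength) {
            throw new IllegalArgumentException("coordinates must lie in 1.." + seqLength);
        }
        this.start = start;
        this.end = end;
        this.seqLength = seqLength;
    }

    boolean wrapsOrigin() {
        return start > end;
    }

    // Number of positions covered by the span.
    int length() {
        return wrapsOrigin()
                ? (seqLength - start + 1) + end
                : end - start + 1;
    }

    boolean contains(int pos) {
        if (pos < 1 || pos > seqLength) {
            return false;
        }
        return wrapsOrigin()
                ? (pos >= start || pos <= end)
                : (pos >= start && pos <= end);
    }
}
```

On a 1000-symbol sequence, new CircularSpan(900, 100, 1000) covers 201 positions and contains 950 and 50 but not 500. The feature types never need to know about any of this, which is the point of keeping circularity in the location.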

  Thomas.