CircularLocations (was Re: [Biojava-l] Re: [Biojava-dev] Feature at position 0)

Bradford Powell bcpowell at email.unc.edu
Fri May 14 09:42:29 EDT 2004


On Fri, 14 May 2004, Thomas Down wrote:

> 
> On 14 May 2004, at 11:21, Matthew Pocock wrote:
> 
> > I think for the sake of useability, we could either relax the location 
> > constraint to allow point locations at 0, or in the sp parser, 
> > re-write these as <1 - what do people think?
> 
> Ugh.
> 
> I'm not sure "<1" will appeal to people who are into round-tripping.
> 
> I think my (slightly) favoured option would be to remove the location 
> constraints on Features completely.  I know this is pretty horrible, 
> but off-sequence locations do seem to be things people use.
> 

This brings up another issue that I have been thinking about recently. I'm
not really comfortable with how circular sequences and circular
locations are handled in biojava. For those who aren't familiar,
CircularLocations are mapped as CompoundLocations.

I would prefer that features on a circular sequence (a CircularView of a
sequence) be allowed to have coordinates outside the range 1..length. The
convention I usually use is that the larger coordinate lies within
1..length. This makes it easy to check for features that overlap the
origin (their min values are <= 0).
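
To make that convention concrete, here is a rough sketch in plain Java.
It does not use any BioJava classes; CircularCoords, normalize() and
overlapsOrigin() are names made up for this illustration, and locations
are reduced to a bare (min, max) pair of 1-based coordinates.

```java
public final class CircularCoords {
    private CircularCoords() {}

    /**
     * Shift (min, max) by a whole number of sequence lengths so that
     * max lands in 1..length. min moves by the same amount and may
     * end up at or below 0.
     */
    public static int[] normalize(int min, int max, int length) {
        if (length <= 0 || min > max) {
            throw new IllegalArgumentException("need length > 0 and min <= max");
        }
        // floorMod keeps the result non-negative even for negative input
        int normMax = Math.floorMod(max - 1, length) + 1;
        int shift = normMax - max;
        return new int[] { min + shift, normMax };
    }

    /** Under this convention a feature spans the origin iff its min is <= 0. */
    public static boolean overlapsOrigin(int min, int max, int length) {
        return normalize(min, max, length)[0] <= 0;
    }
}
```

For a sequence of length 10, a feature written as 9..12 normalizes to
-1..2 and is reported as overlapping the origin, while 3..7 is left
unchanged.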

While I'm thinking of it, there are a couple of bugs I've seen in
CircularView:

1--
If subList() or subStr() is called with start and end values that
should produce a list longer than the original sequence, the result is
not what I would expect. Suppose you have a sequence 'seq':
	CircularView cv = new CircularView(seq);
	SymbolList subL = cv.subList(1, seq.length() + 3);
subL.length() has the value 3 instead of seq.length()+3 (i.e. it holds
just the first three symbols of seq), because the start and stop
coordinates are immediately translated to be within 1..length() upon
entry to subList().

I can see two ways to resolve this-- one would be to check whether
stop - start + 1 > length() and, if so, add the appropriate number of
copies of the source SymbolList to the sublist. The other way would be to
use a WrappedSymbolListView (this is what I have done for my purposes,
code available if people think this would be a good idea).
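
The behaviour I would expect can be sketched in plain Java, with a
String standing in for the SymbolList. This is not the CircularView
code and not my WrappedSymbolListView; CircularSubstring and its
subStr() method are hypothetical names used only for this example, and
treating end < start as an error is an assumption of the sketch.

```java
public final class CircularSubstring {
    private CircularSubstring() {}

    /**
     * Return the symbols at 1-based circular positions start..end.
     * Each position is wrapped individually, so the result has
     * end - start + 1 symbols even when that exceeds seq.length().
     */
    public static String subStr(String seq, int start, int end) {
        int len = seq.length();
        if (len == 0) {
            throw new IllegalArgumentException("empty sequence");
        }
        if (end < start) {
            throw new IllegalArgumentException("end must be >= start");
        }
        StringBuilder sb = new StringBuilder(end - start + 1);
        for (int pos = start; pos <= end; pos++) {
            // map the 1-based circular position to a 0-based index
            int idx = Math.floorMod(pos - 1, len);
            sb.append(seq.charAt(idx));
        }
        return sb.toString();
    }
}
```

With seq = "ACGT", subStr(seq, 1, seq.length() + 3) gives "ACGTACG"
(seven symbols rather than three), and subStr(seq, -1, 2) gives "GTAC".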

2--
I almost hesitate to report this "bug" because I like to use negative
coordinates in locations on circular sequences-- createFeature() in
CircularView throws an exception if getMax() > length() but not if
getMin() <= 0. I guess it is a bug one way or the other because it is
inconsistent. Since the topic of removing location constraints for
features came up, I would say that it would at least make sense to remove
the restrictions for circular sequences.
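
As an illustration of what a consistent check might look like, here is
a small plain-Java sketch. It is not the actual createFeature() code;
LocationBounds, Policy and isAllowed() are invented for the example.
The point is only that both ends of the location are treated the same
way under either policy.

```java
public final class LocationBounds {
    private LocationBounds() {}

    /** STRICT: location must lie within 1..length. CIRCULAR: no bounds. */
    public enum Policy { STRICT, CIRCULAR }

    public static boolean isAllowed(int min, int max, int length, Policy policy) {
        if (min > max) {
            return false; // malformed location under either policy
        }
        switch (policy) {
            case STRICT:
                // reject off-sequence coordinates at BOTH ends
                return min >= 1 && max <= length;
            case CIRCULAR:
                // accept off-sequence coordinates at both ends
                return true;
            default:
                throw new AssertionError(policy);
        }
    }
}
```

Under STRICT, both -2..5 and 3..12 are rejected on a sequence of length
10; under CIRCULAR, both are accepted.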

Whew, that message was longer than I thought it would be.

-- Bradford Powell



