[Biojava-l] GappedSymbolList behaviour is weird, bug?

Matthew Pocock mrp@sanger.ac.uk
Wed, 12 Dec 2001 14:21:35 +0000


Hi Kalle,

This is a bug. I'll take a look at it today. It's probably due to issues 
with the bespoke binary search code in the guts of GappedSymbolList.

Matthew

Kalle Näslund wrote:

> Hi!
> 
> I am writing a small app that uses GappedSymbolList, and I am seeing some
> weird behaviour.
> 
> The first "problem" occurs when I have a GappedSymbolList and insert a gap
> into the view sequence (the one that shows gaps). As long as I insert a
> gap or gaps at a position where there is no gap yet, all is fine. If I
> insert a gap at a position that already holds a gap, however, the new gap
> is inserted into the NEXT block of gaps, and if there is no next block of
> gaps, it is appended at the end of the sequence. A simple text example
> describes this much better. The example just inserts a gap at position 3
> in the view a couple of times and prints the result after each insertion.
> The output looks like this:
> 
> aattggcc        Initial sequence
> aa-ttggcc       1 gap inserted at position 3
> aa-ttggcc-      1 additional gap inserted at position 3
> aa-ttggcc--     1 additional gap inserted at position 3
> aa-ttggcc---    1 additional gap inserted at position 3
> 
> This is not how I think anyone would expect it to work. I think most
> people would expect gap insertion to work the same way irrespective of
> which symbol is at the position where the gap is inserted, and that the
> end result should look like this:
> 
> aa----ttggcc
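
To make the expected behaviour concrete, here is a minimal sketch that
reproduces it on a plain String. It does not use the BioJava API at all:
the class GapInsertSketch and the method insertGap are hypothetical names
made up for illustration, and positions are assumed to be 1-based as in
the example above.

```java
// Hypothetical illustration only; not BioJava code.
class GapInsertSketch {

    // Insert `count` gap characters so that the first new gap lands at
    // 1-based position `pos` of the view, whatever symbol is there now.
    static String insertGap(String view, int pos, int count) {
        if (pos < 1 || pos > view.length() + 1) {
            throw new IndexOutOfBoundsException("pos out of range: " + pos);
        }
        StringBuilder sb = new StringBuilder(view);
        for (int i = 0; i < count; i++) {
            sb.insert(pos - 1, '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String view = "aattggcc";
        System.out.println(view);
        // Four single-gap insertions, all at view position 3.
        for (int i = 0; i < 4; i++) {
            view = insertGap(view, 3, 1);
            System.out.println(view);
        }
        // The last line printed is aa----ttggcc
    }
}
```

Under these assumptions every insertion at position 3 extends the same
block of gaps, regardless of whether position 3 currently holds a symbol
or a gap.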
> 
> 
> 
> The second thing I have some thoughts about is the viewToSource method.
> If you try to convert from view to source coordinates and the view
> coordinate contains a gap, you get a return value of -1. The JavaDoc does
> not mention what happens in this case, but it returns -1 and that is OK,
> I guess. It does give me lots of problems, though: I have users
> graphically specify an interval on the GappedSymbolList, and I then want
> the source coordinates. If the user chooses an endpoint that is a gap, I
> have to scan the view coordinates symbol by symbol and use the first
> non-gap symbol.
> 
> So would it be wrong to change the viewToSource method so that it does
> not return -1, but instead returns the source position where the gap is
> inserted, multiplied by -1? This would most likely not break any code
> that checks the return value of viewToSource, as most people will have
> written if( x < 0 ) rather than if( x == -1 ). You would then get a
> meaningful conversion from view to source, and if you don't care, you
> can simply check whether the return value is negative.
> 
> To clarify what I mean, here is a short example as well.
> 
> aa---ttggcc
> 
> As it is now, viewToSource( 4 ) returns -1. I propose that it should
> return -3 instead, because position three in the source sequence is where
> the gaps are inserted. The value should be negative to indicate that
> there is no direct link between the view position and the source, as the
> view position is a gap.
> 
> I do understand that this little proposal might have unwanted effects on
> other parts of the code, so it should only be seen as a small question /
> proposal and nothing more. If there is a reason to return only -1 and
> nothing else, I will just use the dirty solution of walking along the
> view sequence until I find a non-gap symbol.
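
The proposed return convention can be sketched on a plain String as
follows. This is only an illustration of the idea and not the
GappedSymbolList implementation: the class ViewToSourceSketch is a
hypothetical name, coordinates are assumed to be 1-based, and a gap is
assumed to map to minus the source position of the next non-gap symbol
(or minus source length plus one for trailing gaps).

```java
// Hypothetical illustration only; not BioJava code.
class ViewToSourceSketch {

    // Returns the 1-based source position for a 1-based view position.
    // For a gap, returns the negated source position in front of which
    // the gap block sits, so callers can still test for (result < 0).
    static int viewToSource(String view, int viewPos) {
        if (viewPos < 1 || viewPos > view.length()) {
            throw new IndexOutOfBoundsException("viewPos out of range: " + viewPos);
        }
        int source = 0;
        // Count the non-gap symbols up to and including viewPos.
        for (int i = 0; i < viewPos; i++) {
            if (view.charAt(i) != '-') {
                source++;
            }
        }
        if (view.charAt(viewPos - 1) != '-') {
            return source;
        }
        return -(source + 1);
    }

    public static void main(String[] args) {
        String view = "aa---ttggcc";
        System.out.println(viewToSource(view, 4)); // gap, prints -3
        System.out.println(viewToSource(view, 6)); // first 't', prints 3
    }
}
```

Code that only tests for a negative result keeps working with this
convention, while code that compares against exactly -1 would not.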
> 
> Anyway, I have tested this on Linux (JDK 1.3.1 from Sun) and Windows
> (JDK 1.4.0b3), using both the binary biojava-20010920.jar release and
> one of the older releases, and the behaviour is the same in all
> combinations.
> 
> 
> To finish this off, I would just like to say thanks to all who have
> contributed to BioJava, as it simplifies many nasty tasks a lot.
> 
> Sincerely, Kalle Näslund