[Biojava-l] removeGap problem with SimpleGappedSequence
Matthew Pocock
matthew_pocock at yahoo.co.uk
Thu Feb 12 06:02:39 EST 2004
Hi,
Seems like we have a bit of a gap between 'expected behavior' and
'implemented behavior'. If we decide to modify the GappedSymbolList constructor
to find all gaps in the original sequence, I think we should add it as
an option:
new GappedSymbolList(origSyms, mergeOriginalGaps)
and make the current constructor equivalent to this(syms, false).
Finding all these gaps, making an ungapped underlying symbol list, and
building the gap insertion data structures is a potentially expensive
operation (imagine gapping a genome! you would pull the whole thing into
memory and do a linear scan), so we should be careful not to force it
upon the world.
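
To make the cost concrete, here is a rough sketch of what the two-argument constructor might look like. It is not BioJava code: GappedSketch is a hypothetical stand-in that uses a plain String with '-' as the gap symbol, and the gap bookkeeping is a simple list of blocks rather than the real data structures. The point is only that mergeOriginalGaps=true forces a full linear scan and a copy, while false leaves the input untouched.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for GappedSymbolList; chars instead of Symbols, '-' as the gap. */
final class GappedSketch {
    static final char GAP = '-';

    private final String source;          // underlying symbols, ungapped only if merged
    private final List<int[]> gapBlocks;  // {start in gapped coordinates, length} per gap run

    /** Mirrors the current constructor: no scan, input is taken as-is. */
    GappedSketch(String syms) {
        this(syms, false);
    }

    GappedSketch(String syms, boolean mergeOriginalGaps) {
        List<int[]> blocks = new ArrayList<>();
        if (!mergeOriginalGaps) {
            this.source = syms;
        } else {
            // The expensive path: one linear pass over everything, plus a copy.
            StringBuilder ungapped = new StringBuilder(syms.length());
            int i = 0;
            while (i < syms.length()) {
                if (syms.charAt(i) == GAP) {
                    int start = i;
                    while (i < syms.length() && syms.charAt(i) == GAP) {
                        i++;
                    }
                    blocks.add(new int[] {start, i - start});
                } else {
                    ungapped.append(syms.charAt(i));
                    i++;
                }
            }
            this.source = ungapped.toString();
        }
        this.gapBlocks = blocks;
    }

    String getSource() {
        return source;
    }

    int gapBlockCount() {
        return gapBlocks.size();
    }

    /** Rebuilds the gapped view by re-inserting each recorded gap block. */
    String toGappedString() {
        StringBuilder out = new StringBuilder();
        int src = 0;
        for (int[] block : gapBlocks) {
            while (out.length() < block[0]) {
                out.append(source.charAt(src++));
            }
            for (int k = 0; k < block[1]; k++) {
                out.append(GAP);
            }
        }
        out.append(source.substring(src));
        return out.toString();
    }
}
```

With this sketch, new GappedSketch("AC--GT-", true) ends up with an ungapped source and two recorded gap blocks, whereas new GappedSketch("AC--GT-") keeps the gap characters in the source and records nothing.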
This would also change the contract of getSourceSymbolList(), and also
what happens if that source is modified: whether changes to it are tracked.
This could be worked around by implementing an "UnGappedView" class that
does the opposite mapping of GappedSymbolList - removes all gaps in the
source - then we could gap this view, putting them all back and making it
editable. I don't want to be the one to write it though - writing
GappedSymbolList made my brain hurt.
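
For what it's worth, the read-only half of such a view is not too bad. The sketch below is purely illustrative and makes several assumptions: UnGappedViewSketch is a made-up name, the source is a StringBuilder with '-' as the gap, and indices are 0-based (unlike BioJava's 1-based coordinates). It keeps no copy and no index, so every lookup is a linear scan, but changes to the source are seen automatically. Editing through the view is the hard part and is left out.

```java
/** Hypothetical read-only view that hides gap characters in a mutable source. */
final class UnGappedViewSketch {
    static final char GAP = '-';

    private final StringBuilder source;  // shared with the caller, not copied

    UnGappedViewSketch(StringBuilder source) {
        this.source = source;
    }

    /** Number of non-gap symbols currently in the source. */
    int length() {
        int n = 0;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) != GAP) {
                n++;
            }
        }
        return n;
    }

    /** Maps a 0-based view index to the matching 0-based source index. */
    int sourceIndexOf(int viewIndex) {
        if (viewIndex < 0) {
            throw new IndexOutOfBoundsException("index " + viewIndex);
        }
        int seen = 0;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) != GAP) {
                if (seen == viewIndex) {
                    return i;
                }
                seen++;
            }
        }
        throw new IndexOutOfBoundsException("index " + viewIndex);
    }

    /** Symbol at a 0-based view index, skipping all gaps in the source. */
    char symbolAt(int viewIndex) {
        return source.charAt(sourceIndexOf(viewIndex));
    }
}
```

Because the view recomputes the mapping on each call, inserting into the underlying StringBuilder shifts the view's contents without any notification machinery. A real implementation would presumably cache the mapping and listen for change events instead.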
Matthew
mark.schreiber at group.novartis.com wrote:
>Sounds like a pretty sensible suggestion. Can anyone think of why this
>might not be a 'good idea'?
>
>If not, I'll add it to the list of things to fix :)
>
>- Mark
>
>