[Biojava-l] "packed" SymbolLists?

Thomas Down td2@sanger.ac.uk
Thu, 24 Oct 2002 09:55:25 +0100


On Thu, Oct 24, 2002 at 11:34:40AM +1300, Schreiber, Mark wrote:
> Hi -
> 
> There is a PackedSymbolList and a PackedDNASymbolList in
> org.biojava.bio.symbol. Can't say I've used them but I think they
> operate pretty much the same as normal SymbolLists.

I've used PackedSymbolList on a few occasions.  It works
pretty well, but its main defect at the moment is that you
need to know in advance how long the sequence is.  This either
means writing custom two-step sequence loading code (where the
first step just counts up the length...), or loading the sequence
as a normal (full-size) SymbolList and then constructing a
PackedSymbolList from it.
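
For illustration, here is a rough sketch of the two-step loading idea:
one pass to count the symbols, a second pass to pack them into storage
of exactly that size. It is plain Java and deliberately does not use
the BioJava API. The class FixedPackedDna, its methods, and the 2-bit
a/c/g/t encoding are assumptions made up for this example, and
characters outside a/c/g/t (including line breaks) are simply skipped.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.IntConsumer;
import java.util.function.Supplier;

/** Hypothetical fixed-length 2-bit DNA store; NOT BioJava's PackedSymbolList. */
final class FixedPackedDna {
    private static final String ALPHABET = "acgt"; // index in this string = 2-bit code
    private final long[] words;                    // 32 bases per 64-bit word
    private final int length;

    private FixedPackedDna(int length) {
        this.length = length;                      // must be known before any symbol is stored
        this.words = new long[(length + 31) / 32];
    }

    int length() { return length; }

    char charAt(int i) {
        if (i < 0 || i >= length) throw new IndexOutOfBoundsException("index " + i);
        int code = (int) ((words[i >>> 5] >>> ((i & 31) << 1)) & 3L);
        return ALPHABET.charAt(code);
    }

    /** The source is opened twice: once to count, once to pack. */
    static FixedPackedDna load(Supplier<Reader> source) {
        int[] count = {0};
        forEachBase(source, code -> count[0]++);            // step 1: count only
        FixedPackedDna packed = new FixedPackedDna(count[0]);
        int[] next = {0};
        forEachBase(source, code -> {                       // step 2: pack
            int i = next[0]++;
            packed.words[i >>> 5] |= ((long) code) << ((i & 31) << 1);
        });
        return packed;
    }

    // Streams the 2-bit codes of all a/c/g/t characters; anything else is skipped.
    private static void forEachBase(Supplier<Reader> source, IntConsumer sink) {
        try (Reader r = source.get()) {
            int c;
            while ((c = r.read()) != -1) {
                int code = ALPHABET.indexOf(Character.toLowerCase((char) c));
                if (code >= 0) sink.accept(code);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost of this approach is that the input has to be readable twice,
which is why load takes a Supplier rather than a single Reader.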

There's a fairly clear solution to this -- refactor PackedSymbolList
to use a memory allocation scheme similar to ChunkedSymbolList
(currently the default implementation for large sequences), so
that new chunks can be allocated on the fly as the sequence grows.
Anyone feel like taking this on?
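
As a starting point, a minimal sketch of what chunk-wise allocation
for packed storage might look like. It is plain Java, not a patch to
PackedSymbolList and not a description of how ChunkedSymbolList is
actually implemented. The class name GrowablePackedDna, the chunk
size, and the 2-bit a/c/g/t encoding are all assumptions chosen for
the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical growable 2-bit DNA store; NOT part of BioJava. */
final class GrowablePackedDna {
    private static final String ALPHABET = "acgt"; // index in this string = 2-bit code
    private static final int BASES_PER_WORD = 32;  // 64 bits / 2 bits per base
    private static final int WORDS_PER_CHUNK = 1024;
    private static final int BASES_PER_CHUNK = BASES_PER_WORD * WORDS_PER_CHUNK;

    private final List<long[]> chunks = new ArrayList<>();
    private long length = 0;

    /** Appends one base, allocating a fresh chunk when the last one is full. */
    void append(char base) {
        int code = ALPHABET.indexOf(Character.toLowerCase(base));
        if (code < 0) throw new IllegalArgumentException("not a/c/g/t: " + base);
        int offset = (int) (length % BASES_PER_CHUNK);
        if (offset == 0) {
            // First symbol overall, or the previous chunk has just filled up.
            chunks.add(new long[WORDS_PER_CHUNK]);
        }
        long[] chunk = chunks.get(chunks.size() - 1);
        chunk[offset / BASES_PER_WORD] |= ((long) code) << ((offset % BASES_PER_WORD) * 2);
        length++;
    }

    char charAt(long i) {
        if (i < 0 || i >= length) throw new IndexOutOfBoundsException("index " + i);
        long[] chunk = chunks.get((int) (i / BASES_PER_CHUNK));
        int offset = (int) (i % BASES_PER_CHUNK);
        int code = (int) ((chunk[offset / BASES_PER_WORD] >>> ((offset % BASES_PER_WORD) * 2)) & 3L);
        return ALPHABET.charAt(code);
    }

    long length() { return length; }

    int chunkCount() { return chunks.size(); }
}
```

Because existing chunks are never copied or resized, the sequence
length does not need to be known up front, and the worst-case waste
is one partly filled chunk.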

     Thomas.