[Biojava-l] Sequence Indexers

Matthew Pocock mrp@sanger.ac.uk
Fri, 19 Jan 2001 15:45:39 +0000


Hi.

Are many people using the IndexedSequenceDB class? Thomas has made
some changes to it that give significant performance improvements, and
I was about to add the ability to plug in custom index-storage back
ends (such as a tab-delimited file, Berkeley DB or SQL). Both of these
will make current index files obsolete. The upside is that building
index files should become only slightly more expensive than counting
the entries per file (close to the cost of grepping them).

The current scheme serializes the entire hash of offsets to disk. This
is very time-consuming and not a good solution for single-sequence
fetches, since the whole hash has to be read back in just to find one
offset. It also makes the index Java-specific.
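
To make the tab-delimited idea concrete, here is a rough sketch of what
such a back end could look like. It is not the IndexedSequenceDB code or
its API: the class name TabIndexSketch, the methods buildIndex and fetch,
and the "id<TAB>byte offset" line layout are all made up for
illustration, and it assumes plain ASCII FASTA input. A real version
would buffer its reads and keep the index sorted or hashed rather than
scanning it linearly.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch only; none of these names come from BioJava.
class TabIndexSketch {

    // One pass over the FASTA file: for every header line, write
    // "id<TAB>byteOffset" to the index. Returns the number of entries.
    static int buildIndex(Path fasta, Path index) throws IOException {
        int count = 0;
        try (RandomAccessFile in = new RandomAccessFile(fasta.toFile(), "r");
             BufferedWriter out = Files.newBufferedWriter(index, StandardCharsets.UTF_8)) {
            long offset = in.getFilePointer(); // start of the line about to be read
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Assumption: the id is the first whitespace-delimited token.
                    String id = line.substring(1).trim().split("\\s+")[0];
                    out.write(id + "\t" + offset);
                    out.newLine();
                    count++;
                }
                offset = in.getFilePointer();
            }
        }
        return count;
    }

    // Looks up one id in the text index, then seeks straight to its record.
    // Returns the record (header plus sequence lines), or null if not indexed.
    static String fetch(Path fasta, Path index, String wantedId) throws IOException {
        long offset = -1;
        try (BufferedReader idx = Files.newBufferedReader(index, StandardCharsets.UTF_8)) {
            String line;
            while ((line = idx.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab > 0 && line.substring(0, tab).equals(wantedId)) {
                    offset = Long.parseLong(line.substring(tab + 1));
                    break;
                }
            }
        }
        if (offset < 0) {
            return null;
        }
        StringBuilder record = new StringBuilder();
        try (RandomAccessFile in = new RandomAccessFile(fasta.toFile(), "r")) {
            in.seek(offset);
            String line = in.readLine(); // the header line of the wanted record
            while (line != null) {
                record.append(line).append('\n');
                line = in.readLine();
                if (line != null && line.startsWith(">")) {
                    break; // reached the next record
                }
            }
        }
        return record.toString();
    }
}
```

The index here is ordinary text, so other languages and tools like grep
can read it, and a single fetch only needs one seek into the sequence
file instead of loading a serialized hash.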

Scream now, or I will make the changes...

Matthew