[Bioperl-l] Performance of Bio::Index methods

Wed Aug 11 13:29:15 EDT 2004

Greetings

I'm working on a task in which I need to index GenBank files. (Alas,
putting them in MySQL is not an option at the moment.) The indexing option
out of the box (SDBM?) is slow. Spreading the index across several files
speeds it up significantly, which surprises me. Can anyone tell me the
algorithm used? I would expect indexing time to be linear in the number of
records, but it seems to be otherwise.

The documentation recommends Berkeley DB as being faster. Does anyone have
any experience of how much faster it is? I've used Berkeley DB in the past,
but never tried to compare the two.

Cheers

Mike