[Bioperl-l] Comparing DB_FILE and SDBM
Josh Lauricha
laurichj at bioinfo.ucr.edu
Thu Aug 12 17:10:48 EDT 2004
On Thu 08/12/04 13:44, Mike Muratet wrote:
> Greetings
>
> I did a comparison myself of Bio::Index::GenBank between DB_FILE and SDBM
> on the latest version of the files from the Genbank primate division using
> a Compaq with 376K of memory and a 2.4GHz Pentium 4 Xeon. I used the
Wow, now that's a machine in desperate need of a memory upgrade.
> environment variable to control the indexer. I got the latest release of
> Berkeley from SleepyCat.
>
> Using DB_FILE
>
> real 38m32.751s
> user 6m36.070s
> sys 1m16.650s
>
> Using SDBM
>
> real 46m13.856s
> user 6m34.400s
> sys 1m15.010s
How loaded was that machine? (I'm assuming just the tests.)
>
> A negligible difference. Has anyone tried to compare the libraries (or
> knows where someone has?)
I never compared the DB libraries, but I did compare regexp parsing
against XML::SAX for tigr.pm, and found that XML::SAX on top of Expat
was horribly slow. Start-to-finish times were <5m for the regexps and
>25m for XML::SAX, even though the regexps were pulling considerably
more data out at the time. The regexps took 4-5m of CPU time and
XML::SAX took 5-8m, meaning that XML::SAX sat idle for quite a while
during its run.
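
The same gap between wall-clock and CPU time shows up in the DB_FILE
and SDBM timings quoted above. Here is a minimal sketch, in Python
rather than Perl, that turns `time`-style values into seconds and
works out how much of each run was spent off the CPU. The helper
names (to_seconds, off_cpu) are made up for illustration.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '38m32.751s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def off_cpu(real, user, system):
    """Wall-clock seconds not accounted for by user + sys CPU time."""
    return to_seconds(real) - (to_seconds(user) + to_seconds(system))


# Figures copied from the quoted Bio::Index::GenBank runs.
runs = {
    "DB_FILE": ("38m32.751s", "6m36.070s", "1m16.650s"),
    "SDBM": ("46m13.856s", "6m34.400s", "1m15.010s"),
}

for name, (real, user, system) in runs.items():
    idle = off_cpu(real, user, system)
    share = idle / to_seconds(real)
    print(f"{name}: {idle / 60:.1f} min off CPU ({share:.0%} of the run)")
```

On those figures both runs spend roughly 80% of their wall-clock time
off the CPU, so the CPU cost is nearly identical and the difference
between the two is almost entirely waiting.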
I have a feeling that Perl yields its timeslice for whatever reason when
switching from a library to Perl code, making libraries that do a lot of
function calls (such as XML parsers and DBs) very slow in terms of
latency.
Can anyone confirm or deny this?
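
One way to tell whether a piece of work is CPU-bound or mostly waiting
is to record wall-clock and CPU time around it. This is a rough sketch
in Python, not Perl, and it does not test the timeslice theory itself;
it only shows the measurement. The wall_and_cpu helper and the two
sample workloads are invented for illustration.

```python
import time


def wall_and_cpu(work):
    """Run work() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def blocked():
    # Stands in for waiting on disk or on another process.
    time.sleep(0.2)


def busy():
    # Pure computation, no waiting.
    sum(i * i for i in range(200_000))


for label, work in (("blocked", blocked), ("busy", busy)):
    wall, cpu = wall_and_cpu(work)
    print(f"{label}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

A workload whose CPU time is far below its wall-clock time, like the
sleeping one here, is spending most of its run waiting rather than
computing.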
Thanks,
--
------------------------------------------------------
| Josh Lauricha | Ford, you're turning |
| laurichj at bioinfo.ucr.edu | into a penguin. Stop |
| Bioinformatics, UCR | it |
|----------------------------------------------------|
| OpenPG: |
| 4E7D 0FC0 DB6C E91D 4D7B C7F3 9BE9 8740 E4DC 6184 |
|----------------------------------------------------|