[Biojava-l] Elapsed time of feature filtering

Thomas Down thomas at derkholm.net
Tue Jun 10 10:55:51 EDT 2003


Once upon a time, Y D Sun wrote:
> I find that the elapsed time of filtering CDS for a sequence is
> proportional to the total number of sequences stored in a database. For
> example, when there is only one sequence in the database, the filtering
> takes 5 seconds. If one more sequence is added to the DB, the filtering
> time for one sequence will take about 10 seconds. When there are 3
> sequences in DB, the filtering time will be about 15 seconds.

Hi...

I'm trying to reproduce this.  I've started with a clean
database and loaded the schema I sent to you last week.  I've
then inserted multiple copies of BA000040.embl (renamed each time,
obviously) and tried fetching CDS for one of the sequences.
With two copies loaded, this takes 6.3 seconds.  With three
copies it does increase slightly, but only to 6.9 seconds.
Loading a fourth copy at the moment, but so far I'm not really
seeing the problem you're reporting.
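
For anyone who wants to repeat the comparison, here is a minimal
sketch of a timing harness in plain Java.  It is not BioJava code:
the class name FilterTiming, the helpers timeSeconds and
growthPerSequence, and the placeholder task are all made up for
illustration, and the real CDS filter call would have to be dropped
in where the placeholder sits.  The only figures used are the ones
quoted in this thread.

```java
import java.util.function.Supplier;

public class FilterTiming {
    /** Runs the task once and returns the elapsed wall-clock time in seconds. */
    public static <T> double timeSeconds(Supplier<T> task) {
        long start = System.nanoTime();
        task.get();
        return (System.nanoTime() - start) / 1e9;
    }

    /** Extra seconds per additional stored sequence between two measurements. */
    public static double growthPerSequence(int n1, double t1, int n2, double t2) {
        return (t2 - t1) / (n2 - n1);
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace the body of this lambda
        // with the real "fetch CDS features for one sequence" call.
        double elapsed = timeSeconds(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        });
        System.out.printf("placeholder task took %.3f s%n", elapsed);

        // Timings as reported in the thread: 5 s with 1 sequence, 10 s with 2.
        System.out.printf("reported: %.1f s per extra sequence%n",
                growthPerSequence(1, 5.0, 2, 10.0));
        // Timings measured here: 6.3 s with 2 copies, 6.9 s with 3.
        System.out.printf("measured: %.1f s per extra sequence%n",
                growthPerSequence(2, 6.3, 3, 6.9));
    }
}
```

Comparing the growth per added sequence, rather than the absolute
times, separates the fixed cost of the query from any cost that
scales with the size of the database.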

(It would, of course, be nice if things went faster than 6
seconds.  On the other hand, I'm running this in a completely
untuned PostgreSQL installation on my laptop [256 MB memory,
slowish disk].  On a decent server with a RAID of modern disks,
the time would be negligible.  And even just doing a bit of
basic PostgreSQL tuning would help.)
 
     Thomas.

