[Biojava-l] Elapsed time of feature filtering

Thomas Down thomas at derkholm.net
Mon Jun 9 19:27:11 EDT 2003


Once upon a time, Y D Sun wrote:
> I find that the elapsed time of filtering CDS for a sequence is
> proportional to the total number of sequences stored in a database. For
> example, when there is only one sequence in the database, the filtering
> takes 5 seconds. If one more sequence is added to the DB, the filtering
> time for one sequence will take about 10 seconds. When there are 3
> sequences in DB, the filtering time will be about 15 seconds.
> 
> What is the cause of such a problem? 

Just to double-check...  I presume you're still talking about code
of the form:

    SequenceDB seqdb = new BioSQLSequenceDB(...);
    Sequence seq = seqdb.getSequence("foo");
    FeatureHolder cds = seq.filter(new FeatureFilter.ByType("CDS"));

If the time to do that shows any strong dependency on the
number of sequences in the database, I'd suggest that this
is (very) strong evidence of missing/corrupted indices.  Are
you using the tweaked schema I sent to you last week?  Are
you still on PostgreSQL, or have you tried MySQL?
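
To illustrate why a missing index would give exactly this kind of
linear growth, here is a toy sketch in plain Java.  It is not BioJava
or BioSQL code and makes no claim about how either is implemented:
the class IndexSketch, the Row type, and all the method names are
invented for illustration, and the "table" is just an in-memory list.
Without an index, finding the features of one sequence means looking
at every row for every sequence; with an index keyed on the sequence
id, only that sequence's rows are touched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexSketch {
    // Hypothetical stand-in for one row of a feature table.
    public static final class Row {
        public final int seqId;
        public final String type;
        public Row(int seqId, String type) {
            this.seqId = seqId;
            this.type = type;
        }
    }

    // Build a toy table: every sequence gets the same number of features,
    // alternating between "CDS" and "gene".
    public static List<Row> buildTable(int numSequences, int featuresPerSequence) {
        List<Row> rows = new ArrayList<>();
        for (int s = 0; s < numSequences; s++) {
            for (int f = 0; f < featuresPerSequence; f++) {
                rows.add(new Row(s, f % 2 == 0 ? "CDS" : "gene"));
            }
        }
        return rows;
    }

    // No index: every row in the table is examined, whichever sequence it
    // belongs to. Matches go into 'out'; the return value is the number of
    // rows examined.
    public static int filterWithoutIndex(List<Row> table, int seqId, String type, List<Row> out) {
        int examined = 0;
        for (Row r : table) {
            examined++;
            if (r.seqId == seqId && r.type.equals(type)) {
                out.add(r);
            }
        }
        return examined;
    }

    // Group rows by sequence id, playing the role of a database index.
    public static Map<Integer, List<Row>> buildIndex(List<Row> table) {
        Map<Integer, List<Row>> index = new HashMap<>();
        for (Row r : table) {
            index.computeIfAbsent(r.seqId, k -> new ArrayList<>()).add(r);
        }
        return index;
    }

    // With the index: only the rows of the requested sequence are examined.
    public static int filterWithIndex(Map<Integer, List<Row>> index, int seqId, String type, List<Row> out) {
        int examined = 0;
        for (Row r : index.getOrDefault(seqId, new ArrayList<>())) {
            examined++;
            if (r.type.equals(type)) {
                out.add(r);
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            List<Row> table = buildTable(n, 1000);
            int scan = filterWithoutIndex(table, 0, "CDS", new ArrayList<>());
            int indexed = filterWithIndex(buildIndex(table), 0, "CDS", new ArrayList<>());
            System.out.println(n + " sequence(s): rows examined without index = "
                    + scan + ", with index = " + indexed);
        }
    }
}
```

In this sketch the unindexed count grows as 1000, 2000, 3000 rows for
1, 2 and 3 sequences, while the indexed count stays at 1000.  That is
the same shape as the 5, 10, 15 second timings reported above, which
is why I'd look at the indices first.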

    Thomas.

