[Biojava-l] Different implementation of Sequence?
Simon Foote
simon.foote at nrc-cnrc.gc.ca
Mon Jun 9 09:01:46 EDT 2003
Hi George,
Here are the results of testing my database searcher webapp against
several genomes individually. Unfortunately, it's not a pure CDS search:
I recently modified the app to also return the DNA sequence for the
feature, which adds time depending on the size of the genome.
The search gene is dnaA, and the search filter consists of:
// Restrict to CDS features
FeatureFilter ff1 = new FeatureFilter.ByType("CDS");
// Match geneId under any of three annotation keys
FeatureFilter ff2 = new FeatureFilter.AnnotationContains("gene", geneId);
FeatureFilter ff3 = new FeatureFilter.AnnotationContains("ibs_id", geneId);
FeatureFilter ff5 = new FeatureFilter.AnnotationContains("gene_id", geneId);
FeatureFilter ff4 = new FeatureFilter.Or(ff2, ff3);
FeatureFilter ff6 = new FeatureFilter.Or(ff4, ff5);
FeatureFilter ff7 = new FeatureFilter.And(ff1, ff6);
// recurse = false: only top-level features are tested
FeatureHolder fh = seq.filter(ff7, false);
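
In case it helps to check the boolean logic without a database behind it, here is a rough stand-alone sketch of the same And/Or composition using plain java.util.function.Predicate. It is not BioJava code: the Feature class, byType, annotationContains and geneFilter are all made-up stand-ins for illustration, and the annotation match is simplified to an exact string comparison.

```java
import java.util.Map;
import java.util.function.Predicate;

public class FilterSketch {
    // Hypothetical stand-in for a feature: a type plus key/value annotations.
    public static final class Feature {
        final String type;
        final Map<String, String> annotation;

        public Feature(String type, Map<String, String> annotation) {
            this.type = type;
            this.annotation = annotation;
        }
    }

    // Stand-in for a by-type filter: true when the feature type matches.
    public static Predicate<Feature> byType(String type) {
        return f -> type.equals(f.type);
    }

    // Simplified stand-in for an annotation filter: exact match on one key.
    public static Predicate<Feature> annotationContains(String key, String value) {
        return f -> value.equals(f.annotation.get(key));
    }

    // CDS AND (gene OR ibs_id OR gene_id), the same shape as ff7 above.
    public static Predicate<Feature> geneFilter(String geneId) {
        Predicate<Feature> anyName = annotationContains("gene", geneId)
                .or(annotationContains("ibs_id", geneId))
                .or(annotationContains("gene_id", geneId));
        return byType("CDS").and(anyName);
    }

    public static void main(String[] args) {
        Predicate<Feature> ff = geneFilter("dnaA");
        Feature hit = new Feature("CDS", Map.of("gene", "dnaA"));
        Feature wrongType = new Feature("gene", Map.of("gene", "dnaA"));
        System.out.println(ff.test(hit));
        System.out.println(ff.test(wrongType));
    }
}
```

A feature only passes if it is a CDS and carries the gene name under at least one of the three keys, so a "gene" feature with the right name is still rejected.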
Bacteria         Search Time
E. coli K12      6 seconds
H. influenzae    4 seconds
C. jejuni        6 seconds
H. pylori J99    6 seconds
Note: Which version of the BioSQL schema and which version of BioJava
are you using?
Regards,
Simon
Y D Sun wrote:
>
>
>>-----Original Message-----
>>From: Simon Foote [mailto:simon.foote at nrc-cnrc.gc.ca]
>>Sent: 05 June 2003 12:59
>>To: Y D Sun
>>Cc: biojava-l at biojava.org
>>Subject: Re: [Biojava-l] Different implementation of Sequence?
>>
>>
>>Just to add my 2 cents worth.
>>
>>I'm using the latest version of the BioSQL schema within MySQL and the
>>filters are quite fast. On a database containing 18 complete bacterial
>>genomes, fetching a given gene by name which uses a combination of 5
>>filters in my case, takes approx. 1-2 seconds.
>>
>>
>>
>
>Simon,
>
>Have you tried to filter all CDS sections of a complete bacterial
>genome? In my experience with PostgreSQL, it takes only a few seconds to
>filter a simple feature. However, it needs more than one minute to
>filter thousands of CDS's in a genome.
>
>George
>
>
--
Bioinformatics Specialist
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561 [F] 613-952-9092
simon.foote at nrc-cnrc.gc.ca