[Biojava-l] Different implementation of Sequence?

Y D Sun Yudong.Sun at newcastle.ac.uk
Mon Jun 9 14:23:53 EDT 2003


Your results are very fast. I am now using the latest versions including
biojava-1.3pre4-jdk14.jar, etc. and PostgreSQL 7.3.3 on Linux 2.4.20. 

Running your filter takes 30 seconds on my system.

Cheers,

George

> -----Original Message-----
> From: Simon Foote [mailto:simon.foote at nrc-cnrc.gc.ca] 
> Sent: 09 June 2003 13:02
> To: Y D Sun
> Cc: biojava-l at biojava.org
> Subject: Re: [Biojava-l] Different implementation of Sequence?
> 
> 
> Hi George,
> 
> Here are the results of tests with my database searcher webapp against
> several genomes individually.  Unfortunately, it's not a pure CDS search,
> as I recently modified the app to also return the DNA sequence for the
> feature, so that adds additional time depending upon the size of the
> genome.
> 
> The search gene is dnaA, the search filter consists of:
> 
>             FeatureFilter ff1 = new FeatureFilter.ByType("CDS");
>             FeatureFilter ff2 = new FeatureFilter.AnnotationContains("gene", geneId);
>             FeatureFilter ff3 = new FeatureFilter.AnnotationContains("ibs_id", geneId);
>             FeatureFilter ff5 = new FeatureFilter.AnnotationContains("gene_id", geneId);
>             FeatureFilter ff4 = new FeatureFilter.Or(ff2, ff3);
>             FeatureFilter ff6 = new FeatureFilter.Or(ff4, ff5);
>             FeatureFilter ff7 = new FeatureFilter.And(ff1, ff6);
>             FeatureHolder fh = seq.filter(ff7, false);
> 
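The boolean structure of the filter quoted above is: type is CDS, AND (gene OR ibs_id OR gene_id equals geneId). The sketch below restates that logic in plain Java with java.util.function.Predicate so it can be tried without a BioSQL database. It is not BioJava code: FilterSketch, Feature, byType, annotationContains and geneFilter are hypothetical stand-ins, and the annotation check is simplified to exact string equality on a single value.

```java
import java.util.Map;
import java.util.function.Predicate;

// Stand-in for the boolean structure of the quoted filter; NOT the BioJava API.
class FilterSketch {

    // Hypothetical feature record: a type plus annotation key/value pairs.
    public static final class Feature {
        final String type;
        final Map<String, String> annotation;

        public Feature(String type, Map<String, String> annotation) {
            this.type = type;
            this.annotation = annotation;
        }
    }

    // Plays the role of ff1: accept features of the given type.
    public static Predicate<Feature> byType(String type) {
        return f -> type.equals(f.type);
    }

    // Plays the role of ff2, ff3 and ff5: accept features whose annotation
    // value for `key` equals `value` (simplifying assumption: one value per key).
    public static Predicate<Feature> annotationContains(String key, String value) {
        return f -> value.equals(f.annotation.get(key));
    }

    // Plays the role of ff7: CDS AND (gene OR ibs_id OR gene_id).
    public static Predicate<Feature> geneFilter(String geneId) {
        Predicate<Feature> anyId = annotationContains("gene", geneId)
                .or(annotationContains("ibs_id", geneId))
                .or(annotationContains("gene_id", geneId));
        return byType("CDS").and(anyId);
    }

    public static void main(String[] args) {
        Predicate<Feature> filter = geneFilter("dnaA");
        Feature hit = new Feature("CDS", Map.of("gene", "dnaA"));
        Feature wrongType = new Feature("gene", Map.of("gene", "dnaA"));
        System.out.println(filter.test(hit));       // prints true
        System.out.println(filter.test(wrongType)); // prints false
    }
}
```

Written this way, the type test and the three annotation tests are clearly separate conditions, which may help when checking which part of the filter accounts for the time.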
> Bacteria          Search Time
> E. coli K12       6 seconds
> H. influenzae     4 seconds
> C. jejuni         6 seconds
> H. pylori J99     6 seconds
> 
> Note:  Which version of the BioSQL schema are you using, and which
> version of BioJava?
> 
> Regards,
> Simon
> 
> Y D Sun wrote:
> 
> >  
> >
> >>-----Original Message-----
> >>From: Simon Foote [mailto:simon.foote at nrc-cnrc.gc.ca]
> >>Sent: 05 June 2003 12:59
> >>To: Y D Sun
> >>Cc: biojava-l at biojava.org
> >>Subject: Re: [Biojava-l] Different implementation of Sequence?
> >>
> >>
> >>Just to add my 2 cents worth.
> >>
> >>I'm using the latest version of the BioSQL schema within MySQL and the
> >>filters are quite fast.  On a database containing 18 complete bacterial
> >>genomes, fetching a given gene by name, which uses a combination of 5
> >>filters in my case, takes approx. 1-2 seconds.
> >>
> >>    
> >>
> >
> >Simon,
> >
> >Have you tried to filter all CDS sections of a complete bacterial
> >genome? In my experience with PostgreSQL, it takes only a few seconds
> >to filter a simple feature. However, it needs more than one minute to
> >filter thousands of CDSs in a genome.
> >
> >George
> >  
> >
> 
> -- 
> Bioinformatics Specialist
> Institute for Biological Sciences
> National Research Council of Canada
> [T] 613-990-0561  [F] 613-952-9092
> simon.foote at nrc-cnrc.gc.ca
> 
> 
> 


