[Biojava-l] Ensembl gene parsing

Thomas Down td2 at sanger.ac.uk
Wed Jan 29 12:48:15 EST 2003


On Wed, Jan 29, 2003 at 01:41:46PM +0100, Stein Aerts wrote:
> 
> The BioJava-Ensembl should be ideal. However, retrieving a gene with 
> flanking sequence based on gene_stable_id using the code below takes a 
> million years.
> 
>        Ensembl ens = new Ensembl(
>              org.ensembl.db.sql.SQLDatabaseAdaptor.connectSQL(
>                    dbURL, dbUser, dbPass, dbSchemaVersion)
>        );
>        SequenceDB chromos = ens.getChromosomes();
>        FeatureHolder transHolder = chromos.filter(
>              new FeatureFilter.ByAnnotation("ensembl.gene", "ENSG00000167779")
>        );

Try "ensembl.gene_id" instead (or, better still, the convenience
constant Ensembl.TRANSCRIPT_GENEID).  That said, biojava-ensembl should
be able to prove by schema comparison that no features have the
ensembl.gene property, and return the empty set instantly rather than
trawling the database.  I'll look into this later.
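
To illustrate the short-circuit idea, here is a minimal, self-contained
sketch.  It does not use the real BioJava or biojava-ensembl classes:
SchemaShortCircuit, Feature, FeatureSource, filterByAnnotation and
sampleSource are all hypothetical names invented for this example, and
the "schema" is assumed to be a plain set of known property keys.  The
point is only that a filter on a key the schema never defines can be
answered without touching the data at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: none of these types come from BioJava or biojava-ensembl.
class SchemaShortCircuit {

    /** Hypothetical feature: nothing but a map of annotation properties. */
    static final class Feature {
        final Map<String, String> annotation;

        Feature(Map<String, String> annotation) {
            this.annotation = annotation;
        }
    }

    /** Hypothetical feature source with a fixed, known set of property keys. */
    static final class FeatureSource {
        private final Set<String> schemaKeys;
        private final List<Feature> features;
        int scans = 0; // number of full scans actually performed

        FeatureSource(Set<String> schemaKeys, List<Feature> features) {
            this.schemaKeys = schemaKeys;
            this.features = features;
        }

        List<Feature> filterByAnnotation(String key, String value) {
            // Schema check: a key the schema never defines cannot match anything,
            // so return the empty result without looking at any feature.
            if (!schemaKeys.contains(key)) {
                return Collections.emptyList();
            }
            scans++; // stands in for the expensive per-contig database queries
            List<Feature> hits = new ArrayList<>();
            for (Feature f : features) {
                if (value.equals(f.annotation.get(key))) {
                    hits.add(f);
                }
            }
            return hits;
        }
    }

    /** Builds a one-feature source whose schema knows only "ensembl.gene_id". */
    static FeatureSource sampleSource() {
        Map<String, String> ann = new HashMap<>();
        ann.put("ensembl.gene_id", "ENSG00000167779");
        return new FeatureSource(
                new HashSet<>(Arrays.asList("ensembl.gene_id")),
                Arrays.asList(new Feature(ann)));
    }

    public static void main(String[] args) {
        FeatureSource src = sampleSource();
        // Misspelled key: answered from the schema alone, no scan.
        System.out.println(src.filterByAnnotation("ensembl.gene", "ENSG00000167779").size()
                + " hits, " + src.scans + " scans");
        // Correct key: the data has to be scanned.
        System.out.println(src.filterByAnnotation("ensembl.gene_id", "ENSG00000167779").size()
                + " hits, " + src.scans + " scans");
    }
}
```

With a check like this, the misspelled "ensembl.gene" key would come
back empty straight away instead of walking every contig.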

> The output gives:
> 
> Querying:  where contig_id = '592075'
> Querying:  where contig_id = '162233'
> Querying:  where contig_id = '162238'
> Querying:  where contig_id = '162241'

Are you sure you're using an up-to-date version?  That debugging
output looks old.

     Thomas.

