[Biojava-l] Ensembl gene parsing

Arne Stabenau stabenau at ebi.ac.uk
Wed Jan 29 13:17:24 EST 2003


Hi Stein,

The EMBL export function on the current website worked when we 
released the site. For some reason the mistakes you spotted were 
introduced later. We tested the new release of the website, which will 
come out next Monday, and it doesn't seem to have the problem (yet). So I 
would like to take the easy route for us and wait for the next release. 
We will, however, be careful not to reintroduce the bug in that one.

If there is any pressing reason for a fix earlier than that, please let 
us know. Please consider using ensj for what you want to do; it's as 
fast as the Perl code for most of what it does. It just doesn't give 
you BioJava objects.

Arne


Stein Aerts wrote:

>
> The BioJava-Ensembl should be ideal. However, retrieving a gene with 
> flanking sequence based on gene_stable_id using the code below takes a 
> million years.
>
>         Ensembl ens = new Ensembl(
>               org.ensembl.db.sql.SQLDatabaseAdaptor.connectSQL(
>                     dbURL, dbUser, dbPass, dbSchemaVersion)
>         );
>         SequenceDB chromos = ens.getChromosomes();
>         FeatureHolder transHolder = chromos.filter(
>               new FeatureFilter.ByAnnotation("ensembl.gene", "ENSG00000167779")
>         );
>
> The output gives:
>
> Querying:  where contig_id = '592075'
> Querying:  where contig_id = '162233'
> Querying:  where contig_id = '162238'
> Querying:  where contig_id = '162241'
> etc.
>
> So that is not very efficient.
>
> Would there be an alternative here that is similar to the export data 
> function (based on any feature: gene, contig, clone, cDNA, peptide...), 
> which runs via HTTP and is very, very fast?

If you want to see fast, construct URLs for the Mart and extract the 
data you want from the result ...
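
As a rough sketch of that approach in Java: build the query URL from a 
set of parameters, then split the tab-separated result into fields. The 
endpoint address, the parameter names (gene_stable_id, flank) and the 
sample response below are placeholders for illustration only, not the 
real Mart interface, and URLEncoder.encode(String, Charset) assumes 
Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MartUrlSketch {

    // Placeholder endpoint: the real Mart address is not given in this thread.
    static final String BASE_URL = "http://mart.example.org/query";

    /** Appends the parameters to baseUrl as a percent-encoded query string. */
    static String buildUrl(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            url.append(sep)
               .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            sep = '&';
        }
        return url.toString();
    }

    /** Splits a tab-separated response body into rows of fields, skipping empty lines. */
    static List<String[]> parseTabSeparated(String body) {
        List<String[]> rows = new ArrayList<>();
        for (String line : body.split("\r?\n")) {
            if (!line.isEmpty()) {
                rows.add(line.split("\t", -1)); // -1 keeps trailing empty fields
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("gene_stable_id", "ENSG00000167779"); // hypothetical parameter name
        params.put("flank", "1000");                     // hypothetical parameter name
        System.out.println(buildUrl(BASE_URL, params));

        // Made-up response body, used only to exercise the parser.
        String sample = "gene\tsequence\nENSG00000167779\tACGT\n";
        for (String[] row : parseTabSeparated(sample)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

To fetch a real result you would open the built URL (for example with 
java.net.URL.openStream()) and pass the body to parseTabSeparated; the 
point is that one HTTP request replaces the many per-contig queries.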

>
>     
> Stein.
>
>
> Thomas Down wrote:
>
>>On Wed, Jan 29, 2003 at 09:58:18AM +0000, Ewan Birney wrote:
>>  
>>
>>>(c) If you don't like Perl ( ... this is the biojava mailing list...) then 
>>>there is a pretty complete and stable Java binding to Ensembl - it doesn't 
>>>use BioJava - it is more just a vanilla Java binding to Ensembl. Craig 
>>>Melsopp is the lead on that. The web page is
>>>
>>>http://www.ensembl.org/java/
>>>    
>>>
>>
>>(d) There's also a completely different BioJava-based mechanism
>>for accessing Ensembl databases:
>>
>>   http://biojava.org/pipermail/biojava-l/2002-December/003418.html
>>
>>Unlike ensj, this is 100% read-only.  It does give you access
>>without an additional API, though, and as far as I know it's the
>>only thing which supports multiple versions of the Ensembl database
>>schema off a single codebase.
>>
>>     Thomas.
>>
>>  
>>
>
>-- 
>Stein Aerts BioI at SISTA
>K.U.Leuven ESAT-SCD Belgium
>http://www.esat.kuleuven.ac.be/~dna/BioI
>
>

-- 
------------------------------------------------
Arne Stabenau         Phone: +44 (0) 1223 494413
<stabenau at ebi.ac.uk>  Fax:   +44 (0) 1223 494468

EMBL-EBI
Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SD UK
------------------------------------------------




