[Biojava-l] Ensembl gene parsing
    Ewan Birney
    birney at ebi.ac.uk
    Wed Jan 29 09:58:18 EST 2003

On Wed, 29 Jan 2003, Stein Aerts wrote:
> Hi Ewan,
> I know of Mart (and I like it) but it is not suited for automated 
> sequence retrieval using gene_stable_id's (a SOAP web service for the 
> export data function would be nice). Anyway, the Mart output would have 
> currently the same faults I guess. Do you reckon that the fixing of the 
> Ensembl bugs is a short term matter?
(a) We know that people want to script against Mart and we are working 
towards this - Arek might be able to fill you in (over to Arek)
(b) Scripting against the core is probably best done with a dedicated
Ensembl Perl script that doesn't bounce through GenBank format - tell us
what you want and I suspect Arne or Graham can whip up a script
quickly
(c) If you don't like Perl ( ... this is the biojava mailing list...) then 
there is a pretty complete and stable Java binding to Ensembl - it doesn't 
use BioJava - it is just a vanilla Java binding to Ensembl. Craig 
Melsopp is the lead on that. The web page is
http://www.ensembl.org/java/
(d) Almost certainly, parsing GenBank/EMBL format is one of the worst ways 
to get information out of Ensembl - there is lots of stuff inside Ensembl 
which we can't dump due to format and/or space issues; we don't consider 
it to be a primary route of information...
... it doesn't change the fact there are bugs on our side ;) and we will 
fix those.
    
    