[Biojava-l] Ensembl gene parsing
Ewan Birney
birney at ebi.ac.uk
Wed Jan 29 09:58:18 EST 2003
On Wed, 29 Jan 2003, Stein Aerts wrote:
> Hi Ewan,
> I know of Mart (and I like it) but it is not suited for automated
> sequence retrieval using gene_stable_id's (a SOAP web service for the
> export data function would be nice). Anyway, the Mart output would have
> currently the same faults I guess. Do you reckon that the fixing of the
> Ensembl bugs is a short term matter?
(a) We know that people want to script against Mart and we are working
towards this - Arek might be able to fill you in (over to Arek)
(b) Scripting against the core is probably best done with a dedicated
Ensembl script (Perl) that doesn't bounce through GenBank format - tell us
what you want and I suspect Arne or Graham can whip up a (Perl) script
quickly
(c) If you don't like Perl (... this is the BioJava mailing list ...) then
there is a pretty complete and stable Java binding to Ensembl - it doesn't
use BioJava - it is just a vanilla Java binding to Ensembl. Craig
Melsopp is the lead on that. The web page is
http://www.ensembl.org/java/
(d) Almost certainly, parsing GenBank/EMBL format is one of the worst ways
to get information out of Ensembl - there is lots of stuff inside Ensembl
which we can't dump due to format and/or space issues; we don't consider
it to be a primary route of information...
... it doesn't change the fact there are bugs on our side ;) and we will
fix those.