[Bioperl-l] retrieve ensembl sequence using bioperl

Mark Johnson johnsonm at gmail.com
Thu Jul 24 21:38:35 UTC 2008


On Thu, Jul 24, 2008 at 3:29 PM, Laurent Manchon
<lmanchon at univ-montp2.fr> wrote:
>
> try this: ENSTRUP00000005947

That's an Ensembl peptide ID, not an EMBL accession.  As Chris pointed
out (I missed the reference to Ensembl at the top of your original
message), Ensembl != EMBL.
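
If it helps to tell the two apart programmatically: Ensembl stable IDs
follow a fixed shape (ENS, an optional species prefix, a feature-type
code, then 11 digits).  Here is a rough sketch of a check for that.  The
helper name parse_ensembl_id is made up for this example, and the
3-4 letter species prefix is an assumption that may not cover every
species.

```perl
use strict;
use warnings;

# Feature-type codes used in Ensembl stable IDs.
my %FEATURE_TYPE = (
    E  => 'exon',
    FM => 'protein family',
    G  => 'gene',
    GT => 'gene tree',
    P  => 'protein',
    R  => 'regulatory feature',
    T  => 'transcript',
);

# Hypothetical helper: split an Ensembl-style stable ID into its parts.
# Returns a hash reference, or undef if the string does not fit the pattern.
sub parse_ensembl_id {
    my ($id) = @_;
    return unless defined $id;

    # ENS, optional species prefix (assumed to be 3-4 letters),
    # feature-type code, 11 digits.
    return unless $id =~ /^ENS([A-Z]{3,4})?(FM|GT|[EGPRT])(\d{11})$/;

    return {
        species => defined $1 ? $1 : '',   # empty for human IDs
        type    => $FEATURE_TYPE{$2},
        number  => $3,
    };
}

my $parts = parse_ensembl_id('ENSTRUP00000005947');
print "$parts->{species} $parts->{type}\n" if $parts;   # prints "TRU protein"
```

Anything that parses this way belongs to Ensembl and should not be sent
to an EMBL fetch.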

This is what I get when I try to fetch that ID from EMBL:

use Bio::Perl;

my $seq = get_sequence('embl', 'ENSTRUP00000005947');

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw
/gsc/lib/perl5/site_perl/5.8.7/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::embl::next_seq
/gsc/lib/perl5/site_perl/5.8.7/Bio/SeqIO/embl.pm:189
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
/gsc/lib/perl5/site_perl/5.8.7/Bio/DB/WebDBSeqI.pm:180
STACK: Bio::Perl::get_sequence /gsc/lib/perl5/site_perl/5.8.7/Bio/Perl.pm:507
STACK: -:3
-----------------------------------------------------------

The response to a GET to this URL

http://www.ebi.ac.uk/cgi-bin/dbfetch?db=embl&style=raw&format=embl&id=ENSTRUP00000005947

is 'No entries found.'

Bio::SeqIO::embl, of course, does not consider that valid EMBL format,
hence the exception.
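
To illustrate why that response trips the parser: an EMBL flat-file
record starts with an "ID" line, and the error message suggests the
parser checks for one.  The sketch below is only an approximation of
that kind of check, not the actual Bio::SeqIO::embl code, and
looks_like_embl is a name invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: does this text start the way an EMBL flat-file
# record does, i.e. with an "ID" line?  Returns 1 or 0.
sub looks_like_embl {
    my ($text) = @_;
    return 0 unless defined $text;

    # Look at the first non-blank line only.
    my ($first) = grep { /\S/ } split /\n/, $text;
    return 0 unless defined $first;

    return $first =~ /^ID\s+\S+/ ? 1 : 0;
}

# The dbfetch response from this thread fails the check.
print looks_like_embl("No entries found.\n") ? "EMBL\n" : "not EMBL\n";
```

Running a check like this on the raw response before handing it to
SeqIO would let you report "no such entry" instead of a stack trace.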

You can query Ensembl via the Ensembl Perl API.  You can find
documentation for that here:

http://www.ensembl.org/info/using/api/index.html

Here are a couple of tutorials I found via a quick Google search:
http://www.bioperl.org/wiki/Getting_Genomic_Sequences#Using_the_Perl_API_at_ENSEMBL
http://www.ebi.ac.uk/2can/pdf/ensembl_tutorial.pdf
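
For what it's worth, here is an untested sketch of what fetching that
peptide through the Ensembl Perl API might look like.  It assumes the
Ensembl core API is installed, that the public server at
ensembldb.ensembl.org accepts anonymous connections, and that the ID
belongs to fugu (Takifugu rubripes, going by the TRU prefix).  The
function name fetch_peptide_seq is made up, and the ID may not exist in
whatever release the server currently hosts.

```perl
use strict;
use warnings;

# Hypothetical helper: return the peptide sequence for an Ensembl
# translation stable ID, or undef if it cannot be found.
sub fetch_peptide_seq {
    my ($stable_id) = @_;

    # Loaded at call time so the script still compiles without the API.
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';

    # Assumption: public Ensembl database server, anonymous login.
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    # Assumption: the species name is recognised by the registry.
    my $adaptor = $registry->get_adaptor('Takifugu rubripes', 'Core', 'Translation')
        or return;

    my $translation = $adaptor->fetch_by_stable_id($stable_id)
        or return;

    return $translation->seq;
}

# Usage: perl fetch_peptide.pl ENSTRUP00000005947
if (@ARGV) {
    my $seq = fetch_peptide_seq($ARGV[0]);
    print defined $seq ? "$seq\n" : "No translation found for $ARGV[0]\n";
}
```

Check the API documentation linked above for the version matching the
release you are querying before relying on this.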

Good luck!


