[Bioperl-l] retrieve ensembl sequence using bioperl
Mark Johnson
johnsonm at gmail.com
Thu Jul 24 21:38:35 UTC 2008
On Thu, Jul 24, 2008 at 3:29 PM, Laurent Manchon
<lmanchon at univ-montp2.fr> wrote:
>
> try this: ENSTRUP00000005947
That's an Ensembl Peptide ID. As Chris pointed out (I missed the
reference to Ensembl at the top of your original message), Ensembl !=
EMBL.
This is what I end up with:
use Bio::Perl;
my $seq = get_sequence('embl', 'ENSTRUP00000005947');
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: EMBL stream with no ID. Not embl in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw /gsc/lib/perl5/site_perl/5.8.7/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::embl::next_seq /gsc/lib/perl5/site_perl/5.8.7/Bio/SeqIO/embl.pm:189
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /gsc/lib/perl5/site_perl/5.8.7/Bio/DB/WebDBSeqI.pm:180
STACK: Bio::Perl::get_sequence /gsc/lib/perl5/site_perl/5.8.7/Bio/Perl.pm:507
STACK: -:3
-----------------------------------------------------------
The response to a GET to this URL
http://www.ebi.ac.uk/cgi-bin/dbfetch?db=embl&style=raw&format=embl&id=ENSTRUP00000005947
is 'No entries found.', which Bio::SeqIO::embl of course does not
consider valid EMBL format, hence the exception above.
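If you want a clearer error than the parser exception, one option is to check the shape of the ID before querying EMBL. The following is only a rough sketch: looks_like_ensembl_id and fetch_from_embl are hypothetical helper names, and the pattern assumes Ensembl stable IDs are "ENS", optional species and feature-type letters, then 11 digits (as in ENSTRUP00000005947), optionally followed by a version suffix.

```perl
use strict;
use warnings;

# Assumption: Ensembl stable IDs look like ENS + letters + 11 digits,
# optionally followed by ".<version>". This is a heuristic, not a validator.
sub looks_like_ensembl_id {
    my ($id) = @_;
    return $id =~ /\AENS[A-Z]*\d{11}(?:\.\d+)?\z/ ? 1 : 0;
}

# Hypothetical wrapper around Bio::Perl's get_sequence for the 'embl' database.
sub fetch_from_embl {
    my ($id) = @_;
    die "'$id' looks like an Ensembl stable ID, not an EMBL accession\n"
        if looks_like_ensembl_id($id);

    # Loaded lazily so the ID check itself works without BioPerl installed.
    require Bio::Perl;
    my $seq = eval { Bio::Perl::get_sequence('embl', $id) };
    if (!defined $seq) {
        die "EMBL lookup for '$id' failed: " . ($@ || "no sequence returned\n");
    }
    return $seq;
}
```

With this in place, fetch_from_embl('ENSTRUP00000005947') dies with a message about the ID type before any request is sent to dbfetch.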
You can query Ensembl via the Ensembl Perl API. You can find
documentation for that here:
http://www.ensembl.org/info/using/api/index.html
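As a rough illustration of what that looks like for a peptide ID, here is a minimal sketch. It assumes the Ensembl core API is installed (it is separate from BioPerl), that the public server ensembldb.ensembl.org accepts the 'anonymous' user, and that the "TRU" in the ID means the species is takifugu_rubripes. fetch_ensembl_peptide is a hypothetical helper name. The installed API version has to match the database release, and an ID this old may no longer exist in current releases, so treat this as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a peptide sequence by Ensembl translation stable ID.
sub fetch_ensembl_peptide {
    my ($stable_id, $species) = @_;

    # Loaded lazily; requires the Ensembl core Perl API on @INC.
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';

    # Assumption: the public Ensembl MySQL server and anonymous login.
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    my $adaptor = $registry->get_adaptor($species, 'Core', 'Translation')
        or die "No Translation adaptor for species '$species'\n";
    my $translation = $adaptor->fetch_by_stable_id($stable_id)
        or die "No translation found for '$stable_id'\n";

    return $translation->seq;    # peptide sequence as a plain string
}

# Only runs when an ID is given on the command line.
if (@ARGV) {
    my ($id, $species) = @ARGV;
    print fetch_ensembl_peptide($id, $species || 'takifugu_rubripes'), "\n";
}
```

Saved as a script (the file name is up to you), it would be run with the stable ID as the first argument and, optionally, the species name as the second.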
Here are a couple of tutorials I found via a quick Google search:
http://www.bioperl.org/wiki/Getting_Genomic_Sequences#Using_the_Perl_API_at_ENSEMBL
http://www.ebi.ac.uk/2can/pdf/ensembl_tutorial.pdf
Good luck!