[DAS] Ensembl via SOAP
Tony Cox
avc@sanger.ac.uk
Wed, 5 Jun 2002 13:26:25 +0100 (BST)
On Wed, 5 Jun 2002, Brian Gilman wrote:
+>This is great!!
+>
+> Do you need help with a java implementation?? I'd be willing to
+>help you out in a week or so...
Hi Brian,
Help would be welcome - as I mentioned below, I got a Java client working but
fell over on having to write an object deserializer. Hopefully you could pick it
up from there...?
Tony
+>
+> -B
+>
+>-----------------------
+>Brian Gilman <gilmanb@genome.wi.mit.edu>
+>Group Leader Medical & Population Genetics Dept.
+>MIT/Whitehead Inst. Center for Genome Research
+>One Kendall Square, Bldg. 300 / Cambridge, MA 02139-1561 USA
+>phone +1 617 252 1069 / fax +1 617 252 1902
+>
+>
+>On Wed, 5 Jun 2002, Tony Cox wrote:
+>
+>>
+>> I've been playing over the weekend with SOAP access to Ensembl objects. I have a
+>> test server running that can handle queries.
+>>
+>>
+>> Nutshell:
+>> =========
+>>
+>>
+>> use SOAP::Lite +autodispatch =>
+>>     uri   => 'Bio::EnsEMBL::Remote::Object',
+>>     proxy => 'http://services.ensembl.org:7070/cgi-bin/ensembl_rpcrouter';
+>>
+>> my $trans = Bio::EnsEMBL::Remote::Object->new(
+>>     'type'=>'transcript', 'id'=>'ENST00000225283');
+>> print $trans->seq(), "\n";
+>> print $trans->translate(), "\n";
+>>
+>>
+>>
+>> It is a pretty naive implementation but allows fairly reasonable access to
+>> Ensembl objects via RPC. A remote "proxy" class takes care of creating and
+>> manipulating Ensembl objects on the server side and allows you to make direct
+>> calls locally. General id/start/end type calls all work. Since Ensembl objects
+>> are all intimately tied to DB connections, which all goes horribly wrong over
+>> SOAP, the proxy object takes care of creating these connections as necessary on
+>> the server. Calls that return an object have been changed to return an ID, which
+>> can be used to create a remote object. I've only done the main stuff - features,
+>> SNPs etc. are missing.
+>>
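The ID-for-object convention described above can be sketched without a server. This is only a local illustration under stated assumptions: `Stub::Remote` is a hypothetical stand-in for Bio::EnsEMBL::Remote::Object, the IDs and lengths are invented, and `expand` is a helper name made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical local stand-in for the remote class; all data below is invented.
package Stub::Remote;

my %DB = (
    gene       => { G1 => { transcripts => [ 'T1', 'T2' ] } },
    transcript => { T1 => { length => 120 }, T2 => { length => 300 } },
);

# Look up a record by type and ID; return undef when it is unknown.
sub new {
    my ($class, %args) = @_;
    my $rec = $DB{ $args{type} }{ $args{id} } or return undef;
    return bless { %args, rec => $rec }, $class;
}

sub id     { $_[0]{id} }
sub length { $_[0]{rec}{length} }

# Like the server described above, hand back IDs rather than objects.
sub transcripts { @{ $_[0]{rec}{transcripts} } }

package main;

# Turn a list of IDs into objects of the given type (helper name is an assumption).
sub expand {
    my ($type, @ids) = @_;
    return map { Stub::Remote->new(type => $type, id => $_) } @ids;
}

my $gene = Stub::Remote->new(type => 'gene', id => 'G1');
for my $t (expand('transcript', $gene->transcripts)) {
    print $t->id, "\t", $t->length, "\n";
}
```

Against the real service each `new` would presumably be a separate round trip, so a helper like this mainly keeps the calling code tidy; it does nothing for speed.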
+>> The good thing is that you don't need any local databases or even ensembl code,
+>> just a working copy of SOAP::Lite from CPAN. The bad thing is that it is pretty
+>> __slow__ at the moment. The server is not running under mod_perl so most of the
+>> response time is taken up in module compilation and XML transport. I'll try to
+>> get it running under mod_perl and with transport compression enabled.
+>>
+>> I don't see this as an interface of choice for the bioinformatician! - it is too
+>> slow and anyway they will have the "real" ensembl code to turn to. This is much
+>> more of a lightweight interface for conveniently fetching sequences, genes etc
+>> where speed is not a critical issue, and the convenience of a simple programming
+>> interface is the important factor.
+>>
+>> I'd be very interested to see interoperability tested. I did write a very small
+>> Java client to make requests but rapidly got out of my depth when having to
+>> write a deserializer for the remote object. After looking into the Omnigene code
+>> I see how these work, but I'm rather hoping that somebody on the Omnigene team
+>> might have a go at doing this.
+>>
+>> Following is a simple script that provides examples of manipulating remote
+>> objects. You "get" a remote object on the server by creating a new
+>> Bio::EnsEMBL::Remote::Object and giving it a type and ID. At the moment you
+>> can only fetch virtualcontigs, genes, transcripts, exons, clones, contigs and
+>> translations (peptides). By the magic of "autodispatch", if you get a "thingy"
+>> back, you can just treat it as a normal object and make calls on it. Perl's
+>> autoloader will try to satisfy calls that are not overloaded in the remote
+>> object (I know this sucks). If they are simple get/property calls they will
+>> probably work - if the call returns an object or objects, bad things will
+>> probably happen. Trying to write to the object may work (I haven't tried it) but
+>> is unlikely to be a useful thing to do! Remember this is a transaction-type
+>> system where all the responses need to be marshalled before transport takes
+>> place, so it will not "stream" data to you as if it were a socket-style
+>> connection.
+>>
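The autoloader fallback mentioned above works roughly like this in plain Perl. A minimal local sketch, not the server's code: `Stub::Proxy` and its fields are hypothetical, and a real proxy would forward the call over SOAP rather than read a hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical proxy class: any method that is not defined explicitly falls
# through to AUTOLOAD, which treats it as a simple property read.
package Stub::Proxy;

our $AUTOLOAD;

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Explicitly defined method: Perl finds this before ever trying AUTOLOAD.
sub id { return 'id:' . $_[0]{id} }

sub AUTOLOAD {
    my ($self) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package name
    return if $name eq 'DESTROY';    # do not treat object destruction as a property
    # Unknown properties yield undef, much like a failed remote call.
    return $self->{$name};
}

package main;

my $obj = Stub::Proxy->new(id => 'X1', version => 3);
print $obj->id,      "\n";    # explicit method
print $obj->version, "\n";    # satisfied by AUTOLOAD
print defined $obj->strand ? "defined" : "undef", "\n";
```

The sketch also shows why calls that return objects are the risky case: a blind fallback like this can only pass simple values through.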
+>> In the event of an error, you usually end up with undef (the code is pretty raw
+>> at the moment). If you really want to track down errors, use the following
+>> block:
+>>
+>> if(SOAP::Lite->self->call->fault) {
+>>     print "Fault code: ", SOAP::Lite->self->call->faultcode, "\n";
+>>     print "Fault string: ", SOAP::Lite->self->call->faultstring, "\n";
+>>     print "Fault detail: ", SOAP::Lite->self->call->faultdetail, "\n";
+>>     print "Fault actor: ", SOAP::Lite->self->call->faultactor, "\n";
+>>     exit;
+>> }
+>>
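Since a failed call usually just yields undef, one option is a small wrapper that turns undef into an exception. This is a sketch only: the helper name `must` and its message format are assumptions, and the closures below are local stand-ins for remote calls, not calls to the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a code block and die with a labelled message if it returns undef.
# The helper name and message format are assumptions for illustration.
sub must {
    my ($what, $code) = @_;
    my $value = $code->();
    die "remote call failed: $what\n" unless defined $value;
    return $value;
}

# Stand-in for a call that succeeds.
my $len = must('fetch length', sub { 1234 });
print "length: $len\n";

# Stand-in for a call that "fails" by returning undef.
my $seq = eval { must('fetch missing clone', sub { undef }) };
print "caught: $@" unless defined $seq;
```

In a real client the die message could also carry the fault string from the block above.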
+>>
+>> comments and suggestions welcome,
+>>
+>> cheers
+>>
+>> Tony
+>>
+>>
+>>
+>>
+>>
+>>
+>> To try the server out, enable one or more of the following blocks (change
+>> if(0) to if(1)):
+>>
+>>
+>> #!/usr/local/bin/perl
+>>
+>> package MySoapClient;
+>>
+>> use strict;
+>> use SOAP::Lite +autodispatch =>
+>>     uri   => 'Bio::EnsEMBL::Remote::Object',
+>>     proxy => 'http://services.ensembl.org:7070/cgi-bin/ensembl_rpcrouter';
+>>
+>> if(1){
+>>     my @g = (qw(ENSG00000131591 BRCA1));
+>>     foreach my $g (@g){
+>>         print "Getting gene: $g...\n";
+>>         $g = Bio::EnsEMBL::Remote::Object->new('type'=>'gene','id'=>$g);
+>>         print "\tGene ID: ", $g->id(), "\n";
+>>         foreach my $t ($g->transcripts()){
+>>             my $t = Bio::EnsEMBL::Remote::Object->new('type'=>'transcript','id'=>$t);
+>>             print "\t\tTranscript ID: ", $t->id(), "\n";
+>>             print "\t\tTranscript length: ", $t->length(), "\n";
+>>             #print "\t\tTranscript seq: ", $t->seq(), "\n";
+>>             print "\t\tTranscript protein: ", $t->translate(), "\n";
+>>         }
+>>     }
+>> }
+>>
+>> if(0){
+>>     print "Getting remote clone AP000869...\n";
+>>     my $cl = Bio::EnsEMBL::Remote::Object->new('type'=>'clone','id'=>'AP000869');
+>>     print "Clone: ", $cl->embl_id(), "\n";
+>>     print "Version: ", $cl->version(), "\n";
+>>
+>>     foreach my $c ($cl->contigs()){
+>>         my $c = Bio::EnsEMBL::Remote::Object->new('type'=>'contig','id'=>$c);
+>>         my $id = $c->id();
+>>         if($c->is_static_golden()){
+>>             print "\tContig ID: $id (golden)\n";
+>>             print "\tContig length: ", $c->length(), "\n";
+>>             print "\tContig is golden?: yes\n";
+>>             print "\t\tContig global start: ", $c->static_golden_start(), "\n";
+>>             print "\t\tContig global end: ", $c->static_golden_end(), "\n";
+>>             print "\t\tContig global ori: ", $c->static_golden_ori(), "\n";
+>>             #print "\tContig seq: ", $c->seq(), "\n";
+>>         } else {
+>>             print "\tContig ID: $id (non-golden)\n";
+>>         }
+>>     }
+>> }
+>>
+>>
+>> if(0){
+>>     my $chr = 1;
+>>     my $start = 100000;
+>>     my $end = 200000;
+>>     print "Getting remote virtualcontig for $chr, $start-$end...\n";
+>>     my $v = Bio::EnsEMBL::Remote::Object->new('type'=>'virtualcontig','chr'=>$chr,
+>>                                               'start'=>$start, 'end'=>$end);
+>>     print "Virtual contig ID: ", $v->id(), "\n";
+>>     print "Virtual contig length: ", $v->length(), "\n";
+>>     print "Virtual contig chromosome: ", $v->_chr_name(), "\n";
+>>     print "Virtual contig chromosome length: ", $v->fetch_chromosome_length(), "\n";
+>>
+>>     foreach my $g ($v->genes()){
+>>         $g = Bio::EnsEMBL::Remote::Object->new('type'=>'gene','id'=>$g);
+>>         print "\tGene ID: ", $g->id(), "\n";
+>>         foreach my $t ($g->transcripts()){
+>>             my $t = Bio::EnsEMBL::Remote::Object->new('type'=>'transcript','id'=>$t);
+>>             print "\t\tTranscript ID: ", $t->id(), "\n";
+>>             print "\t\tTranscript length: ", $t->length(), "\n";
+>>             #print "\t\tTranscript seq: ", $t->seq(), "\n";
+>>             #print "\t\tTranscript protein: ", $t->translate(), "\n";
+>>             foreach my $e ($t->exons()){
+>>                 my $e = Bio::EnsEMBL::Remote::Object->new('type'=>'exon','id'=>$e);
+>>                 print "\t\t\tExon ID: ", $e->id(), "\n";
+>>                 print "\t\t\tExon start: ", $e->ori_start(), "\n";
+>>                 print "\t\t\tExon end: ", $e->ori_end(), "\n";
+>>                 print "\t\t\tExon strand: ", $e->strand(), "\n";
+>>                 print "\t\t\tExon seq: ", $e->seq(), "\n";
+>>             }
+>>         }
+>>     }
+>> }
+>>
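+>> if(0){
+>>     # Keep the ID and the object in separate variables instead of
+>>     # redeclaring $p in the same scope.
+>>     my $id = "ENSP00000223439";
+>>     print "Getting remote peptide $id...\n";
+>>     my $p = Bio::EnsEMBL::Remote::Object->new('type'=>'translation','id'=>$id);
+>>     print $p->seq(), "\n";
+>> }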
+>>
+>>
+>>
+>>
+>> ******************************************************
+>> Tony Cox Email:avc@sanger.ac.uk
+>> Sanger Institute WWW:www.sanger.ac.uk
+>> Wellcome Trust Genome Campus Webmaster
+>> Hinxton Tel: +44 1223 834244
+>> Cambs. CB10 1SA Fax: +44 1223 494919
+>> ******************************************************
+>>
+>
******************************************************
Tony Cox Email:avc@sanger.ac.uk
Sanger Institute WWW:www.sanger.ac.uk
Wellcome Trust Genome Campus Webmaster
Hinxton Tel: +44 1223 834244
Cambs. CB10 1SA Fax: +44 1223 494919
******************************************************