[DAS] Ensembl via SOAP

Brian gilmanb@Jforge.net
Wed, 5 Jun 2002 08:34:09 -0400 (EDT)


Yeah,

	We have lots of experience writing serializers and
deserializers. If you need examples, you can look at OmniGene's omnitide
package. There are about 20 (de)serializers checked in. 

	As an aside, we had the crazy idea of writing serializers for
biojava objects, but we looked at the amount of work involved and thought
we'd need more support to do this. 

	Is anyone else interested in helping write serializers/deserializers
for biojava/bioperl objects?

	I think we'd need to talk about the object model and then write
the XSDs. From there we could use Castor, JAXB, or Axis (I'd rather use
Axis) to get the objects flowing back and forth over the wire. Let me
know. We are very interested in doing this with a partner. 
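
	To make the idea concrete, here is a minimal sketch of a hand-written
serializer/deserializer pair in Java. It uses only the JDK's built-in XML
APIs, not Castor, JAXB, or Axis. The TranscriptRecord class and the element
names are assumptions made up for illustration; they are not taken from
OmniGene, biojava, or Ensembl, and a real object model would have to be
agreed on first and captured in an XSD.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical value object; NOT a real biojava or Ensembl class.
final class TranscriptRecord {
    final String id;
    final String sequence;

    TranscriptRecord(String id, String sequence) {
        this.id = id;
        this.sequence = sequence;
    }
}

// Hand-written serializer/deserializer pair (element names are assumptions).
final class TranscriptXml {
    private TranscriptXml() {}

    // Serialize to <transcript id="..."><seq>...</seq></transcript>.
    static String serialize(TranscriptRecord t) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("transcript");
            root.setAttribute("id", t.id);
            Element seq = doc.createElement("seq");
            seq.setTextContent(t.sequence);
            root.appendChild(seq);
            doc.appendChild(root);

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize transcript", e);
        }
    }

    // Parse the XML produced by serialize() back into a TranscriptRecord.
    static TranscriptRecord deserialize(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        Element root = doc.getDocumentElement();
        if (!"transcript".equals(root.getTagName())) {
            throw new IllegalArgumentException(
                    "unexpected root element: " + root.getTagName());
        }
        Element seq = (Element) root.getElementsByTagName("seq").item(0);
        if (seq == null) {
            throw new IllegalArgumentException("missing <seq> element");
        }
        return new TranscriptRecord(root.getAttribute("id"), seq.getTextContent());
    }
}
```

Passing new TranscriptRecord("ENST00000225283", "ATGGCC") (the sequence is a
made-up placeholder) through serialize() and then deserialize() should give
back a record with the same id and sequence. A toolkit such as Axis would
generate or register this kind of mapping from the schema instead of having
it written by hand.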
			
				Best, 

					-B


 On Wed, 5 Jun 2002, Tony Cox wrote:

> On Wed, 5 Jun 2002, Brian Gilman wrote:
> 
> +>This is great!!
> +>
> +>	Do you need help with a java implementation?? I'd be willing to
> +>help you out in a week or so...
> 
> Hi Brian,
> 
> Help would be welcome - as I mentioned below I got a java client working but I
> fell over on having to write an object deserializer. Hopefully you could pick it
> up there...?
> 
> Tony
> 
> 
> +>
> +>			-B
> +>
> +>-----------------------
> +>Brian Gilman <gilmanb@genome.wi.mit.edu>
> +>Group Leader Medical & Population Genetics Dept.
> +>MIT/Whitehead Inst. Center for Genome Research
> +>One Kendall Square, Bldg. 300 / Cambridge, MA 02139-1561 USA
> +>phone +1 617  252 1069 / fax +1 617 252 1902
> +>
> +>
> +>On Wed, 5 Jun 2002, Tony Cox wrote:
> +>
> +>> 
> +>> I've been playing over the weekend with SOAP access to Ensembl objects. I have a
> +>> test server running that can handle queries.
> +>> 
> +>> 
> +>> Nutshell:
> +>> =========
> +>> 
> +>> 
> +>> use SOAP::Lite +autodispatch =>
> +>>    uri      =>   'Bio::EnsEMBL::Remote::Object',
> +>>    proxy    =>   'http://services.ensembl.org:7070/cgi-bin/ensembl_rpcrouter';
> +>> 
> +>>    my $trans =
> +>> Bio::EnsEMBL::Remote::Object->new('type'=>'transcript','id'=>'ENST00000225283');
> +>>    print $trans->seq(), "\n";
> +>>    print $trans->translate(), "\n";
> +>> 
> +>> 
> +>> 
> +>> It is a pretty naive implementation but allows fairly reasonable access to
> +>> Ensembl objects via RPC. A remote "proxy" class takes care of creating and
> +>> manipulating ensembl objects on the server side and allows you to make direct
> +>> calls locally. General id/start/end type calls all work. Since ensembl objects
> +>> are all intimately tied to DB connections, which all goes horribly wrong over
> +>> SOAP, the proxy object takes care of creating these connections as necessary on
> +>> the server. Calls that return an object have been changed to return an ID, which
> +>> can be used to create a remote object. I've only done the main stuff; features,
> +>> SNPs, etc. are missing.
> +>> 
> +>> The good thing is that you don't need any local databases or even ensembl code,
> +>> just a working copy of SOAP::Lite from CPAN. The bad thing is that it is pretty
> +>> __slow__ at the moment. The server is not running under mod_perl so most of the
> +>> response time is taken up in module compilation and XML transport. I'll try to
> +>> get it running under mod_perl and with transport compression enabled. 
> +>> 
> +>> I don't see this as an interface of choice for the bioinformatician! - it is too
> +>> slow and anyway they will have the "real" ensembl code to turn to. This is much
> +>> more of a lightweight interface for conveniently fetching sequences, genes etc
> +>> where speed is not a critical issue, and the convenience of a simple programming
> +>> interface is the important factor.
> +>> 
> +>> I'd be very interested to see interoperability tested. I did write a very small
> +>> java client to make requests but rapidly got out of my depth when having to
> +>> write a deserializer for the remote object. After looking into the Omnigene code
> +>> I see how these work but I'm rather hoping that somebody on the omnigene team
> +>> might have a go at doing this.
> +>> 
> +>> Following is a simple script that provides examples of manipulating remote
> +>> objects. You "get" a remote object on the server by creating a new
> +>> Bio::EnsEMBL::Remote::Object and giving it a type and ID. At the moment you
> +>> can only fetch virtualcontigs, genes, transcripts, exons, clones, contigs and
> +>> translations (peptides). By the magic of "autodispatch", if you get a "thingy"
> +>> back, you can just treat it as a normal object and make calls on it. Perl's
> +>> autoloader will try and satisfy calls that are not overloaded in the remote
> +>> object (I know this sucks). If they are simple get/property calls they will
> +>> probably work - if the call returns an object/objects, bad things will probably
> +>> happen. Trying to write to the object may work (I haven't tried it) but is likely
> +>> not to be a useful thing to do! Remember this is a transaction-type system where
> +>> all the responses need to be marshalled before transport takes place so it will
> +>> not "stream" data to you as if it were a socket-style connection.
> +>> 
> +>> In the event of an error, you usually end up with undef (the code is pretty raw
> +>> at the moment). If you really want to track down errors, use the following
> +>> block:
> +>> 
> +>>    if(SOAP::Lite->self->call->fault) {
> +>>         print "Fault code: ", SOAP::Lite->self->call->faultcode, "\n";
> +>>         print "Fault string: ", SOAP::Lite->self->call->faultstring, "\n";
> +>>         print "Fault detail: ", SOAP::Lite->self->call->faultdetail, "\n";
> +>>         print "Fault actor: ", SOAP::Lite->self->call->faultactor, "\n";
> +>>         exit;
> +>>    }
> +>> 
> +>> 
> +>> comments and suggestions welcome,
> +>> 
> +>> cheers
> +>> 
> +>> Tony
> +>> 
> +>> 
> +>> 
> +>> 
> +>> 
> +>> 
> +>> To try the server out enable one or more of the following blocks:
> +>> 
> +>> 
> +>> #!/usr/local/bin/perl
> +>> 
> +>> package MySoapClient;
> +>> 
> +>> use strict;
> +>> use SOAP::Lite +autodispatch =>
> +>>    uri      =>   'Bio::EnsEMBL::Remote::Object',
> +>>    proxy    =>   'http://services.ensembl.org:7070/cgi-bin/ensembl_rpcrouter';
> +>>    
> +>> if(1){
> +>>     my @g = (qw(ENSG00000131591 BRCA1));
> +>>     foreach my $g (@g){
> +>>         print "Getting gene: $g...\n";
> +>>         $g = Bio::EnsEMBL::Remote::Object->new('type'=>'gene','id'=>$g);
> +>>         print "\tGene ID: ", $g->id(), "\n";
> +>>         foreach my $t ($g->transcripts()){
> +>>             my $t =
> +>> Bio::EnsEMBL::Remote::Object->new('type'=>'transcript','id'=>$t);
> +>>             print "\t\tTranscript ID: ", $t->id(), "\n";
> +>>             print "\t\tTranscript length: ", $t->length(), "\n";
> +>>             #print "\t\tTranscript seq: ", $t->seq(), "\n"; 
> +>>             print "\t\tTranscript protein: ", $t->translate(), "\n";
> +>>         }
> +>>     }
> +>> }
> +>> 
> +>> if(0){
> +>>     print "Getting remote clone AP000869...\n"; 
> +>>     my $cl =
> +>> Bio::EnsEMBL::Remote::Object->new('type'=>'clone','id'=>'AP000869');
> +>>     print "Clone: ", $cl->embl_id(), "\n";
> +>>     print "Version: ", $cl->version(), "\n";
> +>>     
> +>>     foreach my $c ($cl->contigs()){
> +>>         my $c = Bio::EnsEMBL::Remote::Object->new('type'=>'contig','id'=>$c);
> +>>         my $id = $c->id();
> +>>         if($c->is_static_golden()){
> +>>             print "\tContig ID: $id (golden)\n";
> +>>             print "\tContig length: ", $c->length(), "\n";
> +>>             print "\tContig is golden?: yes\n";
> +>>             print "\t\tContig global start: ", $c->static_golden_start(), "\n";
> +>>             print "\t\tContig global end: ",   $c->static_golden_end(), "\n";
> +>>             print "\t\tContig global ori: ",   $c->static_golden_ori(), "\n";
> +>>             #print "\tContig seq: ", $c->seq(), "\n"; 
> +>>         } else {
> +>>             print "\tContig ID: $id (non-golden)\n";
> +>>         }
> +>>     }
> +>> }
> +>> 
> +>> 
> +>> if(0){
> +>>     my $chr = 1;
> +>>     my $start = 100000;
> +>>     my $end = 200000;
> +>>     print "Getting remote virtualcontig for $chr, $start-$end...\n"; 
> +>>     my $v =
> +>> Bio::EnsEMBL::Remote::Object->new('type'=>'virtualcontig','chr'=>$chr,
> +>> 'start'=>$start, 'end'=>$end);
> +>>     print "Virtual contig ID: ", $v->id(), "\n"; 
> +>>     print "Virtual contig length: ", $v->length(), "\n"; 
> +>>     print "Virtual contig chromosome: ", $v->_chr_name(), "\n"; 
> +>>     print "Virtual contig chromosome length: ", $v->fetch_chromosome_length(),
> +>> "\n"; 
> +>> 
> +>>     foreach my $g ($v->genes()){
> +>>         $g = Bio::EnsEMBL::Remote::Object->new('type'=>'gene','id'=>$g);
> +>>         print "\tGene ID: ", $g->id(), "\n";
> +>>         foreach my $t ($g->transcripts()){
> +>>             my $t =
> +>> Bio::EnsEMBL::Remote::Object->new('type'=>'transcript','id'=>$t);
> +>>             print "\t\tTranscript ID: ", $t->id(), "\n";
> +>>             print "\t\tTranscript length: ", $t->length(), "\n";
> +>>             #print "\t\tTranscript seq: ", $t->seq(), "\n"; 
> +>>             #print "\t\tTranscript protein: ", $t->translate(), "\n";
> +>>             foreach my $e ($t->exons()){
> +>>                 my $e =
> +>> Bio::EnsEMBL::Remote::Object->new('type'=>'exon','id'=>$e);
> +>>                 print "\t\t\tExon ID: ", $e->id(), "\n";
> +>>                 print "\t\t\tExon start: ", $e->ori_start(), "\n";
> +>>                 print "\t\t\tExon end: ", $e->ori_end(), "\n";
> +>>                 print "\t\t\tExon strand: ", $e->strand(), "\n";
> +>>                 print "\t\t\tExon seq: ", $e->seq(), "\n";
> +>>            }
> +>>         }
> +>>     
> +>>     }
> +>> }
> +>> 
> +>> if(0){
> +>>     my $p = "ENSP00000223439";
> +>>     print "Getting remote peptide $p...\n"; 
> +>>     my $p = Bio::EnsEMBL::Remote::Object->new('type'=>'translation','id'=>$p);
> +>>     print $p->seq();
> +>> }
> +>> 
> +>> 
> +>> 
> +>> 
> +>> ******************************************************
> +>> Tony Cox			Email:avc@sanger.ac.uk
> +>> Sanger Institute		WWW:www.sanger.ac.uk
> +>> Wellcome Trust Genome Campus	Webmaster
> +>> Hinxton				Tel: +44 1223 834244
> +>> Cambs. CB10 1SA			Fax: +44 1223 494919
> +>> ******************************************************
> +>> 
> +>> _______________________________________________
> +>> DAS mailing list
> +>> DAS@biodas.org
> +>> http://biodas.org/mailman/listinfo/das
> +>> 
> +>
> 

-- 
----------------
Brian Gilman <gilmanb@jforge.net>