[MOBY-dev] Re: [MOBY] join operations

Ken Steube steube at sdsc.edu
Thu Jan 23 06:42:26 UTC 2003


Hi everybody.

I'm not sure implementing joins is that hard a problem.  Say we
want to do this (SQL pseudo-code):

  SELECT X(I) WHERE CONDITION(I)

In my example this would select proteins by keyword (service1),
subject to the condition that they have knockouts (service2).  The
join index I is any kind of ID, such as a GenBank ID, and X and
CONDITION are results from different MOBY services, possibly at
different sites.
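
To make the intended semantics concrete, here is a rough sketch in
Python.  The function name, the IDs and the values are all made up for
illustration; none of them come from MOBY.

```python
def join_on_index(x_by_id, condition_by_id):
    """Return X(I) for every index I where CONDITION(I) holds."""
    # An ID that the second service did not report counts as "no".
    return {i: x for i, x in x_by_id.items() if condition_by_id.get(i, False)}


# Hypothetical output of service1 (proteins by keyword), keyed by ID.
proteins_by_keyword = {"ID1": "protein A", "ID2": "protein B", "ID3": "protein C"}
# Hypothetical output of service2 (has knockouts?), keyed by the same ID.
has_knockout = {"ID1": True, "ID2": False, "ID4": True}

joined = join_on_index(proteins_by_keyword, has_knockout)
print(joined)
```

Only IDs that appear in the X result and satisfy the condition survive
the join.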

A SOAP message can have multiple parts, each satisfied by a different
web service.  So package the request for X, the request for CONDITION,
and the name of the join index into one multi-part SOAP message.  Then
forward it to one of the services to fill in the X values.  That
service fills in its part and sends the message on to the second
service.  The second service fills in the CONDITION values, performs
the join and returns the result to the client.
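
A minimal sketch of that flow, again in Python.  It models the
multi-part message as a plain dict and the two services as local
functions instead of real SOAP endpoints; every name, query and ID
below is an assumption made for illustration.

```python
def keyword_service(message):
    # First hop: fill in the X part, then forward the message.
    # The catalogue stands in for a real keyword search.
    catalogue = {"kinase": {"ID1": "protein A", "ID2": "protein B"}}
    part = message["parts"]["X"]
    part["values"] = dict(catalogue.get(part["query"], {}))
    return knockout_service(message)


def knockout_service(message):
    # Second hop: fill in the CONDITION part, then perform the join.
    known_knockouts = {"ID1", "ID4"}  # stands in for a real lookup
    x_values = message["parts"]["X"]["values"]
    condition = {i: i in known_knockouts for i in x_values}
    message["parts"]["CONDITION"]["values"] = condition
    message["result"] = {i: x for i, x in x_values.items() if condition[i]}
    return message  # the last service answers the client


request = {
    "join_index": "genbank_id",
    "parts": {
        "X": {"query": "kinase", "values": None},
        "CONDITION": {"query": "has_knockout", "values": None},
    },
}
response = keyword_service(request)
print(response["result"])
```

The client sends one request and receives one joined result; the
intermediate X values never travel back to it.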

The return value is no longer a simple MOBY object like X or
CONDITION, but rather it's an extension of both.  As such it can be
used as an input to any service that requires an X or a CONDITION as
input.
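
One way to picture "an extension of both" is a type that inherits from
both result types.  The class and field names below are hypothetical
and are not MOBY object definitions.

```python
from dataclasses import dataclass


@dataclass
class ProteinRecord:  # stands in for X
    protein_id: str
    name: str


@dataclass
class KnockoutRecord:  # stands in for CONDITION
    protein_id: str
    has_knockout: bool


@dataclass
class JoinedRecord(ProteinRecord, KnockoutRecord):
    """Carries the fields of both parents, joined on protein_id."""


def describe_protein(record: ProteinRecord) -> str:
    # A consumer that only knows about X.
    return f"{record.protein_id}: {record.name}"


def knockout_status(record: KnockoutRecord) -> bool:
    # A consumer that only knows about CONDITION.
    return record.has_knockout


joined = JoinedRecord(protein_id="ID1", name="protein A", has_knockout=True)
print(describe_protein(joined), knockout_status(joined))
```

Both consumers accept the joined record unchanged, which is the
property described above.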

Currently our service request contains serialized data.  With this
scheme it would have to also include the addresses of all the services
so that a request can be forwarded from one service to the next.
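
As a sketch of what the extra routing information might look like, the
request could carry an ordered list of service addresses next to the
serialized data.  The field names and the example.org URLs are
placeholders I made up, not anything defined by MOBY or SOAP.

```python
def next_hop(message):
    """Remove and return the next service address, or None when done."""
    route = message["route"]
    return route.pop(0) if route else None


message = {
    "payload": "<serialized MOBY data>",
    "route": [
        "http://service1.example.org/moby",
        "http://service2.example.org/moby",
    ],
    "reply_to": "http://client.example.org/results",
}

visited = []
hop = next_hop(message)
while hop is not None:
    visited.append(hop)  # a real service would fill in its part here
    hop = next_hop(message)

# With the route exhausted, the last service replies to the client.
destination = message["reply_to"]
print(visited, destination)
```

Each service only needs to look at the head of the list to know where
the message goes next.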

I don't believe this requires much change in our protoMOBY.  Just have
a client search through the SOAP message for the part it understands
and fill in that part.  Then forward it to some other service if
necessary.

Being new to MOBY I may have some things wrong in what I said above,
but I'm sure a solution is in there!

Ken

-------------------------------------
Ken Steube            steube at sdsc.edu
San Diego Supercomputer Center @ UCSD
San Diego, California             USA

On 22 Jan 2003, Mark Wilkinson wrote:

> Hi Ken,
>
> A bit of a rambling response... as it is 10:00 pm on my last day of work
> (Phil L. knows exactly what this means ;-) )
>
> I'm cc'ing this response to the moby-dev list because I think it jars
> a lot of nerves that need to be jarred...
>
> I have a big smile on my face right now... not because I have an answer,
> but because I *don't* have an answer and wish I did (or more
> importantly, insist that the final MOBY spec does!).
>
> At the moment MOBY handles *only* queries of the type:
>
> (discover, and) select n from foreignservicen where value=x
>
> I can't state strongly enough how crappy MOBY is at solving any more
> complex problem than that!
>
> I have submitted a grant proposal to study more complex boolean
> queries... using set theory (ack!)...  but what we really need are
> parameters in MOBY queries, and at the moment we don't have them.
>
> Quite frankly, the problem of *representing* even simple biological
> information and service type information in  our ontology is
> sufficiently large to have commandeered most of our time :-(
>
> I think the solution to parameter-based services (e.g. Blast)  perhaps
> lies somewhere in the PISE project, but again, I haven't had time to
> think through it fully - that might be complete bullshit.
>
> Regardless, for the service you want to create, given the existing MOBY
> system, it can't be done... you have to fudge it with combinations of
> simple queries.
>
> I'm sorry  :-(  One day we will have solved these problems, but so far
> we are still infants and are still teething on the deeper complex
> problems...
>
> This leads to the bane of my life right now... MOBY is wonderful at
> solving the most basic (and common) search and retrieve problem; however
> under MOBY, today, you could legitimately create a service that takes
> keywords as input and returns proteins with keyword-annotations that
> also have knockouts as output... but what do you *call* that
> service?!!?  What is the transformation type??  You are allowed (today)
> to call it whatever you like - you can register the
> "superknockoutkeywordservice" as a service type, and then register your
> service as a service of that type... but who the hell knows what your
> service actually *does* to the input data to get that output
> datatype?!?!?
>
> MOBY services (today) have the signature INPUT+TRANSFORM+OUTPUT, and
> this is supposed to be sufficient for a client to identify the desired
> service... but it is obviously not.  Your desired service is a perfect
> example of that!  How do you describe the "transform" that you are
> making?!?
>
> I think, in parallel with the fantastic work that Damian and Andrew are
> doing vis-a-vis service description and data transport technology, we need to
> spend energy thinking deeply about service type description - I see this
> as a critical problem (and I think that myGrid has a lot to teach us
> about this!!).   I believe this will be a deadly problem for MOBY,
> regardless of the data representation and/or transport layer that we
> finally decide to use - at the end of the day the service description
> must be both human readable and machine readable... ack!!!!   Damian
> mentioned that he might have a solution to this today during the
> conference call, and I would like to pursue this either on the list, or
> in the calls, or even between the two of us personally (??) because this
> is the big octopus on my brain right now and it is driving me nuts!...
> I can't sleep!!
>
> Anyway, the bad news for you personally is that the service you want to
> set up is a bugger... we can fudge it, but it will always be
> unsatisfactory.
>
> Regardless, I can help you fudge it!... or not.  Whatever you need :-)
> I'm just sure you will not be happy with your service until we have
> a clear solution to the service description problem.  I desperately want
> us to find a good solution to this problem (perhaps before all other
> problems we are dealing with?)
>
> M
>
>
>
> On Wed, 2003-01-22 at 20:17, Ken Steube wrote:
> > Hey Mark, what are your thoughts on handling join operations?
> >
> > To be specific I'll give an example:  Say I write a service that
> > provides protein data given keywords and another service that tells
> > which proteins have known knockouts.  I then want to find all the
> > proteins which have both the keywords and knockouts.  So I run both
> > services and perform a join on the results.
> >
> > Since both services reside on one server it would be really beneficial
> > to do the join on that server.  Is it possible to pass the two
> > requests to some kind of "join service" that would run the two, join
> > on the protein IDs, and return a joined object? This joined object
> > would be a superclass of the first two.
> >
> > Of course I could write a third service that takes args of keywords
> > and a "require knockouts" flag but as I understand it that's not
> > really MOBYesque.
> >
> > Ken
> >
> > -------------------------------------
> > Ken Steube            steube at sdsc.edu
> > San Diego Supercomputer Center @ UCSD
> > San Diego, California             USA
> >
> --
> =======================================================================
>                                     |--==\
> Mark Wilkinson                       \==-|       1001010010010001001010
> Bioinformatics Consultant             \=/        0010010010100101110010
> Illuminae Media                       /-\        0010101110110100100101
> 727 6th Ave. N.                      /-==|       0010100100111101010010
> Saskatoon, SK, Canada               |==-/        0101001000100101001011
> S7K 2S8                              \=/         0100100100010010010101
> +1 (306) 373 3841                     /\         1110101101110101001001
> markw at illuminae.com                  /=-\        1101001010100101010101
>                                     |--==\
> =======================================================================
>
