[moby] [MOBY-dev] Re: Problems with Biomoby services in Taverna1.2

José María Fernández González jmfernandez at cnb.uam.es
Fri Jul 8 11:12:37 UTC 2005


Hi everyone,
	the Taverna v1.1 behaviour for MOBY should be the default one for
services which take only one input parameter. But what collection
semantics should the Taverna MOBY plugin have for services with more
than one parameter? All of this is related to iteration strategies.

	For instance, imagine this scenario: we have a service which needs two
input Simple parameters, A and B. What should we do?

*	If both input parameters receive a Simple, no problem, default behaviour.
*	If one of those parameters receives a Collection with N Simples, and
the other a Simple, the default iteration strategy should be either
invoking the MOBY service N times with N separate SOAP submissions, or
invoking the MOBY service N times within a single SOAP submission (or
something intermediate).
*	If each of the parameters receives a Collection, then the iteration
strategy *should* be explicitly set by the workflow creator: either
cartesian product (first of A with each one of B, second of A with each
one of B, and so on) or dot product (first of A with first of B, second
of A with second of B, and so on, either discarding the last elements
from the longer Collection, or firing an exception/error/message/whatever).
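
	The two strategies in the last bullet can be sketched as follows. This
is only an illustration, not Taverna or BioMOBY code: plain Python lists
stand in for MOBY Collections, and the names cross_product, dot_product and
invoke_iteratively are hypothetical.

```python
from itertools import product


def cross_product(a_items, b_items):
    """Cartesian strategy: pair every element of A with every element of B."""
    return list(product(a_items, b_items))


def dot_product(a_items, b_items, strict=False):
    """Dot strategy: pair the i-th element of A with the i-th element of B.

    With strict=False the surplus elements of the longer list are discarded;
    with strict=True a length mismatch raises an error instead.
    """
    if strict and len(a_items) != len(b_items):
        raise ValueError("collections differ in length")
    return list(zip(a_items, b_items))


def invoke_iteratively(service, a_items, b_items, strategy):
    """Call a two-input service once per (a, b) pair chosen by the strategy."""
    return [service(a, b) for a, b in strategy(a_items, b_items)]
```

	Under these assumptions the second bullet (a Collection of N Simples
against a single Simple) is the cartesian case with a one-element list on
one side, which gives N invocations.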

	The other two scenarios are: a two-input service with both parameters
defined as Collection; and a two-input service with one of the parameters
defined as Collection and the other as Simple. I leave them as an exercise
for the reader 8-), so that I avoid writing the most boring e-mail in
BioMOBY history ;-)

	Best Regards,
		José María

Rebecca Ernst wrote:
> Hi Tom and others
> 
> This Taverna behaviour described below makes perfect sense to me! Does
> Taverna then check in MOBY-Central how this service is registered
> (taking simples or collections)?
> 
> I guess the way to go would be to:
> 1. change the Taverna behaviour back to how it is supposed to work (and
> did in 1.1)
> 2. - change the BioMoby Perl code to allow more than one simple in the
> MobyData AND
>    - change almost all services that output collections to the way we
> discussed yesterday (only output collections if the objects build an
> entity. )
> 
> Either of the two changes solves the problems but I believe both are
> due and should take place.
> 
> Best,
> Rebecca
> 
> 
> 
> 
> 
> Tom Oinn wrote:
> 
>> That's kind of true, meaning actually not. There are three cases
>> involving collections (Taverna 1.1 behaviour) :
>>
>> 1) Consumer declares it consumes singles, Producer emits a collection.
>> In this context Taverna iteratively calls the Consumer with each item
>> from the collection. This is probably what you'd expect to happen, the
>> result is that the Consumer effectively emits a collection of whatever
>> it would emit normally.
>>
>> 2) Consumer declares it consumes a collection, Producer emits a
>> collection. In this case Taverna will indeed split the output
>> collection (because we always do) but it will be magically reassembled
>> before being given to the Consumer.
>>
>> 3) Consumer declares it consumes a collection, Producer emits a single
>> item. Taverna wraps the single item in a single element collection and
>> gives it to the Consumer.
>>
>> The intent is that as with the other plugin types Taverna guarantees
>> that the service sees the inputs it has asked for. Our experience with
>> other service types suggests that this is an extremely powerful
>> mechanism as it allows interoperability between services that would
>> otherwise mismatch. It's worth noting that our users expect these
>> services to match: while a CS perspective regards ProteinSequence and
>> ProteinSequence[] as completely different types, most of our biologists
>> don't! Taverna behaves the way _they_ expect it to; remember who your
>> users are.
>>
>> Taverna data types are pretty much trivial, they're opaque data blobs
>> with the exception of collection structure which is exposed. We only
>> expose the collection structure to ensure the above properties, other
>> than that the framework is data agnostic (as it should be).
>>
>> Tom
>>
> 
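
	The three single-input cases Tom describes above can be summarised in a
small sketch. It is not Taverna's implementation: a Python list stands in for
a collection, and call_consumer and its parameters are hypothetical names.

```python
def call_consumer(consumer, consumes_collection, produced):
    """Hand the Producer's output to the Consumer in the shape it declared."""
    is_collection = isinstance(produced, list)
    if consumes_collection:
        # Case 2: collection to collection, passed on whole.
        # Case 3: single item wrapped in a one-element collection.
        items = produced if is_collection else [produced]
        return consumer(items)
    if is_collection:
        # Case 1: Consumer takes singles, so call it once per item and
        # collect the results.
        return [consumer(item) for item in produced]
    # Single item to a Consumer of singles: plain call.
    return consumer(produced)
```

	For example, call_consumer(str.upper, False, ["a", "b"]) calls the
consumer twice and gathers both results into a list.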

-- 
José María Fernández González		e-mail: jmfernandez at cnb.uam.es
Tlfn:	(+34) 91 585 54 50		Fax:	(+34) 91 585 45 06
Grupo de Diseño de Proteinas		Protein Design Group
Centro Nacional de Biotecnología	National Center of Biotechnology
C.P.: 28049				Zip Code: 28049
C/. Darwin nº 3 (Campus Cantoblanco, U. Autónoma), Madrid (Spain)


