[MOBY-dev] Reminder: Vote async proposal

Mon Oct 2 23:14:15 UTC 2006

Hi Johan,

On  02 Oct 2006, at 21:36, Johan Karlsson wrote:

> Hi Pieter,
>
> Sorry to hear that you have a negative opinion of the proposal.

It's not that dramatic. Like I said, I'm very happy the people at the
INB took the effort to write a proposal (twice!) for asynchronous
service behaviour. I also think using WS-Addressing and the WSRF
standards is a good idea, and the big picture of how to implement this
for BioMOBY looks good! But you have called a vote on the current
version of the proposal and I cannot vote 97% yes. It's either a
yes or a no.

> From our point of view, the details you are mentioning are minor (in
> the sense that they are simply choices where to place the  
> information).

Correct.

> The same information is sent, the number of SOAP calls are the same.

Indeed.

> It is not more (or less) optimal to send the queryIDs in the SOAP  
> header
> than it is to send it in the SOAP body, so we do not understand  
> what you
> mean with "sub-optimal".

OK, let me try to explain with another example then. One of the
strengths of BioMOBY is the Object ontology, which greatly improves
interoperability. The objects in this ontology are described using
"self-describing" XML. It is possible, though, to send for example
"legacy" tab-delimited data inside a String object. We do have such
objects for backwards compatibility, but it's not an elegant way to
send data around. Using tab-delimited data defeats the purpose of the
Object ontology, as a client cannot figure out the relationships
between the data within the tab-delimited text block. A client has no
way to figure out what is in column 1, 2, 3, etc., and the data from
column 2 cannot easily be parsed using an XML parser to send it as
input to another service. Surely you can write a rather simple piece
of code that uses a regular expression to fetch the values from
column 2, wrap them in new XML tags and send them to the next service.
Although this is not rocket science, it's not elegant and it hampers
interoperability.
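
To make the regular-expression workaround concrete, here is a minimal
sketch in Python. The tab-delimited rows, the function name and the use
of a <String> tag for the output are all made up for illustration; the
point is only that the client has to know out-of-band that column 2 is
the one it needs.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical tab-delimited payload as it might sit inside a String object.
legacy_text = "P12345\tkinase\t42\nQ67890\tphosphatase\t17\n"


def column_two_as_xml(text, tag="String"):
    """Pull column 2 out of each row with a regex and wrap it in a new tag."""
    # Per line: skip column 1, skip one tab, capture column 2.
    pattern = re.compile(r"^[^\t\n]*\t([^\t\n]*)", re.MULTILINE)
    return ["<{0}>{1}</{0}>".format(tag, escape(value))
            for value in pattern.findall(text)]


wrapped = column_two_as_xml(legacy_text)
print(wrapped)
```

Nothing in the data itself says what column 2 means, so this knowledge
lives only in the client code.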

In the current proposal the queryIDs jump around the XML: as part of
an element name, as an attribute and as part of raw text. Especially
the resource properties are problematic from my point of view. You are
doing something similar to wrapping tab-delimited data when you combine:
* the kind of resource property you want to request (status or result)
* with the ID of an individual job of a batch (queryID) and send  
these merged as a text string like in:
<GetResourceProperty>status_queryID01</GetResourceProperty>,
<GetResourceProperty>status_queryID02</GetResourceProperty>,
<GetResourceProperty>status_queryID03</GetResourceProperty>, etc.
and
<GetResourceProperty>result_queryID01</GetResourceProperty>,
<GetResourceProperty>result_queryID02</GetResourceProperty>,
<GetResourceProperty>result_queryID03</GetResourceProperty>, etc.

Surely you can write code to fetch the raw text from a node and use a
regular expression to split on "_" and hence separate the kind of
resource property requested from the job ID, but just like with
tab-delimited data it's an ugly approach to put different types of data
inside raw text. You lose the semantics and it's not necessary! Some
time ago I sent some XML examples to show that with some minor tweaks
to the XML passed around we can still use WS-Addressing and WSRF and
make sure the queryIDs remain attributes, like with the current
synchronous services. It makes the XML more consistent and less
confusing.
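
A small Python sketch of the difference, using only the standard
library. The first element is the merged form quoted above. The second
element is hypothetical markup invented purely for this illustration
(it is neither the XML from the proposal nor the XML from the earlier
examples); it only shows what "the queryID stays an attribute" means
for the code that has to read it.

```python
import xml.etree.ElementTree as ET

# Form from the current proposal: property kind and queryID merged into text.
merged = ET.fromstring(
    "<GetResourceProperty>status_queryID01</GetResourceProperty>")
# The client has to know the "_" convention to take the text apart again.
merged_kind, merged_id = merged.text.split("_", 1)

# Hypothetical markup, for illustration only: queryID kept as an attribute.
explicit = ET.fromstring(
    '<GetResourceProperty queryID="queryID01">status</GetResourceProperty>')
explicit_kind = explicit.text
explicit_id = explicit.get("queryID")

print(merged_kind, merged_id)
print(explicit_kind, explicit_id)
```

Both variants yield the same two values, but only in the second one
does the XML parser hand them over separately without any string
splitting convention.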

> As we wrote before, it is possible (you agreed with this also earlier)
> to implement a getResourcePropertyDocument operation in the future  
> with
> the approach in the proposal.

True, but with the current proposal you would have to generate a  
ResourcePropertyDocument dynamically for each service invocation,  
whereas with a few minor tweaks you can have a static  
ResourcePropertyDocument for a service which is the same for each  
invocation. It's not rocket science to create a  
ResourcePropertyDocument for each service request, but it is simply  
unnecessary overhead.

> With only two properties named "status"
> and "result", the structure would be more "fixed", but the values must
> still be put there by the service, so the document is not "static" but
> must be dynamically generated.

No, the ResourcePropertyDocument would be static if the queryIDs move  
to the EPR. Making the actual request to get resource properties  
would be a dynamic thing as the queryIDs have to be there somewhere  
in the data structure, but if they are part of the EPR a client can  
simply take them from the successful service invocation and echo them  
back. This would simply require less logic.
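
As a rough sketch of why the document is dynamic in one case and static
in the other, consider just the list of property names a service would
have to advertise. The function and variable names are hypothetical and
the "_" naming follows the status_queryID01 pattern above; the EPR
itself is not modelled here.

```python
# With queryIDs kept out of the property names, the set is fixed per service.
STATIC_PROPERTIES = ("status", "result")


def dynamic_property_names(query_ids):
    """Property names when each queryID gets its own status/result property."""
    return ["{0}_{1}".format(kind, qid)
            for qid in query_ids
            for kind in STATIC_PROPERTIES]


# Two invocations with different batches give two different name lists,
# so the property document has to be generated again for each invocation.
first_batch = dynamic_property_names(["queryID01", "queryID02"])
second_batch = dynamic_property_names(["queryID07"])
print(first_batch)
print(second_batch)
print(list(STATIC_PROPERTIES))
```

The static tuple never changes, whatever the batch looks like; only the
request that carries the queryIDs differs between invocations.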

Finally, I feel the current proposal is sub-optimal because the
result XML from an asynchronous service is slightly different from
the XML produced by a synchronous service. For an asynchronous
service result there are extra and redundant tags. Having a few more
tags does not make life impossible - the prototype surely works - but
again I think it's ugly and most of all not necessary. With a few
minor tweaks the results of asynchronous and synchronous services can
be exactly the same, which is more transparent from my point of view.

Since it requires only minor tweaks to the XML that is sent around to
make the asynchronous service behaviour more transparent, I assume it
won't be a big deal to change the proposal and the prototype, but
please correct me if I overlooked something here...

Hope it's more clear now and with kind regards,

Pi

> With dynamic property-names the client
> must construct these names by appending the queryID to status or  
> result
> but this is really, as you put it, far from "rocket-science".
>
> All these details are hidden by API functions (that we are providing),
> so it is not critical to change in the future if necessary.

> Kind regards,
> Johan
>
>
> Pieter Neerincx wrote:
>> Hi,
>>
>> Well, I read the proposal and the involved standards. I think it's
>> very important to have a standard for asynchronous services and the
>> process of getting there already took a lot of time. I also think
>> that such a standard should be very robust and ready for the future.
>> Adding things shouldn't be too much of a hassle, but once we
>> implement this it will be a pain if we have to modify it in such a
>> way that all asynchronous services "break". So I think this standard
>> should be damned good from the start :).
>>
>> As mentioned before I feel the way the queryIDs are passed around in
>> the XML is sub-optimal, making asynchronous service behaviour
>> unnecessarily complicated. More explicitly with what I understand
>> from the WSRF standard I'm not comfortable with "dynamic" resource
>> properties (individual resource properties for status / fetching
>> results for each individual queryID). Therefore - although I like the
>> big picture - I vote NO on the current proposal.
>>
>> I will support any proposal that gets accepted for the sake of
>> interoperability as this is paramount for BioMOBY, but I would prefer
>> a more elegant solution.
>>
>> Cheers,
>>
>> Pi
>>
>> On 2-Oct-2006, at 3:14 PM, Martin Senger wrote:
>>
>>
>>> Well, I am still not sure that I understand the proposal fully (not
>>> because
>>> it is a bad proposal but because I have  not spent enough time on
>>> it). But I
>>> believe fully in the wisdom of our Spanish colleagues, the wisdom I
>>> hope to
>>> learn when I will be implementing the async behaviour in Moses) -  
>>> and
>>> therefore I vote YES.
>>>
>>> Martin
>>>
>>> -- 
>>> Martin Senger
>>>    email: martin.senger at gmail.com
>>>    skype: martinsenger
>>> _______________________________________________
>>> MOBY-dev mailing list
>>> MOBY-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/moby-dev
>>>
>>
>>
>> Wageningen University and Research centre (WUR)
>> Laboratory of Bioinformatics
>> Transitorium (building 312) room 1034
>> Dreijenlaan 3
>> 6703 HA Wageningen
>> The Netherlands
>> phone: 0317-483 060
>> fax: 0317-483 584
>> mobile: 06-143 66 783
>> pieter.neerincx at wur.nl
>>
>>
>>
>>
>

Wageningen University and Research centre (WUR)
Laboratory of Bioinformatics
Transitorium (building 312) room 1038
Dreijenlaan 3
6703 HA Wageningen
phone: 0317-484 706
fax: 0317-483 584
mobile: 06-143 66 783
pieter.neerincx at wur.nl