[MOBY-dev] RFC - Asynchronous Service Call Proposal

Thu Feb 9 20:10:41 UTC 2006

(Martin's first email with comments is a good starting point. I know that
there have been discussions since he sent it and some things have
changed, but it is an excellent overview of the "hottest" points.)

Martin Senger wrote:

 > I agree with Mark that it is a good proposal - we desperately need
 > it. I just hope that more people will join this discussion (please, do
 > so!). I suggest that we postpone the voting for about two more
 > weeks. Mainly because some of my comments (see below) would require
 > more details to be put in the document. And also to allow other people
 > to jump on board (please, do so!).

Yes, this is reasonable. We still have some very fundamental discussions 
open and we should give time for more people to give comments.

 > * Data returned by xxx_async and xxx_poll
 > -----------------------------------------
 >
 > Why does the proposal treat them differently (compared to normal
 > Biomoby data)? Why introduce a new XML element 'mobyStatus' when
 > we can easily - and keeping much software untouched - send these data
 > in the usual 'mobyData' element?

The answers to both your questions are really implied by your first 
question:

The proposal treats them differently because they are very different 
(not normal Biomoby data). Status information is not Moby data (clearly
not biological data or results) and not related to service discovery. It 
is data about a service execution.

But more about "Data returned by xxx_async and xxx_poll" in a separate 
email (this email is already long enough...).

 > * Dependency on LSID service metadata
 > -------------------------------------
 >
 > I do not like it (the RDF predicate indicating that a service can be
 > asynchronous). Here is why:
 >
 > For me, the divider between service data and service metadata was
 > always that metadata are good but are not crucial for a service
 > execution. But the flag "this service can be called asynchronously" is
 > very fundamental flag for a service, and it changes its behaviour and
 > the behaviour of its clients. Also, it must be known to a registry -
 > because the registry is supposed to generate WSDL with the xxx_async and
 > xxx_poll methods included. So the registry should have it, anyway!
 >  
This is an interesting discussion (also referring to later emails about 
this subject).

Three (?) sub-topics:

1) Should the information be in the RDF?

The RDF predicate "isAsynchronous" is information about how a service
can be used, in the same way as information about a service instance's
input/output parameters, object types etc. (just some of the information
available as RDF for service instances). They are all necessary clues
about how to call a service (not just good to have but actually needed,
although they are _also_ provided in another way, as WSDL). The same
goes for whether a service can be called asynchronously: it is
needed and should be in the RDF.

We could offer this information both as WSDL and as RDF?

2) Should this information be stored in the registry?

Whether a service is asynchronous must be stored by the registry;
on this we agree (it was in the proposal). An excellent reason to keep
the information in the registry database was given by Martin in
later emails: basically, the registry can't be expected to check an
RDF file (LSID metadata) every time WSDL is generated (even if this
information is cached). So yes, this information must be in the database.
But what the proposal says about LSID resolution is not how the
registry finds out whether a service is asynchronous; it describes one
way a client can find out.

Also, regarding what Mark asked in his first comments: there is no real
value added by searching for asynchronous services if that is the
only search term (other than for statistical reasons maybe). The mere
fact that a service is asynchronous does not (or at least should not)
motivate clients to call it. But the information still belongs in the
registry.

3) How does a client tell if a service is async?

We agree that the information should be included in what Moby
Central returns (retrieveService), but as said above, it should
also be available as RDF. So, maybe it is better not to specify
how the client tells if the service is async? We need to define the places
where the information is put, but we have no control over whether a
client finds out that a service is asynchronous by reading the WSDL (or
the RDF) or by simply calling xxx_async and hoping that it will work
(not saying that this is a good option, but it would be one possible way
to do it). We have to consider how to do it, but maybe not specify it in
the RFC?

Regarding "category":

One possible place to put this information would be, as Martin 
suggested, in the "category"-field. If everyone agrees, then we will add
this to the proposal. Right now, the allowed values are "moby", "wsdl"
and "cgi"? Does "moby-async" make it clear that "yes, it is a moby-type
service, but it should be called asynchronously"? We need some form of
flag, either way. What do others think?
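
To make the idea concrete, here is a rough sketch of how a client might
interpret such a category value. The value "moby-async" is only the
suggestion under discussion, and the function name parse_category is
made up for this illustration; nothing here is part of the proposal.

```python
def parse_category(category):
    """Split a category value into (base_category, is_async).

    Assumes the hypothetical convention that an asynchronous service
    carries an "-async" suffix on its usual category, e.g. "moby-async".
    """
    suffix = "-async"
    if category.endswith(suffix):
        return category[: -len(suffix)], True
    return category, False
```

With this convention, a client that does not know about the suffix would
see an unknown category rather than silently treating the service as a
plain "moby" service, which is one thing to weigh when choosing the flag.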

 > b) "Id unique to each... in a message". No, the Id must be unique to
 >    the whole service provider (at least to the invoked service). If
 >    the unique-ness is just in a message there would be problems when I
 >    invoke the same service several times before the first one
 >    finished.  

You are completely right, this will be corrected in the RFC document.
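
As an illustration of what "unique to the whole service provider" could
mean in practice, here is a minimal sketch. It assumes the provider is
free to choose the ID format; the function name new_async_id and the
"service:uuid" layout are invented for this example and are not in the
proposal.

```python
import uuid


def new_async_id(service_name):
    """Return an ID for one asynchronous invocation.

    The random UUID part makes the ID unique across all invocations
    handled by this provider, not only within a single message, so two
    overlapping calls to the same service never share an ID.
    """
    return f"{service_name}:{uuid.uuid4().hex}"
```

Prefixing the service name is optional; it only makes the IDs easier to
read in logs.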

 > c) How many times am I allowed to call xxx_result? I suggest just once
 >    - and any next invocation fails with an exception. This means that
 >    this call also serve as a cleaning call - the service
 >    implementation can clean the session.
 >  

This has also been further discussed and the agreement was to allow 
several calls to _result?

As Robert Buels also pointed out, results could take a long time
to produce and transfer (and it would probably be very annoying for a
client to lose the result he/she has been waiting for if the connection
is broken during a call to _result because of network errors or whatever).

It should be (in principle) possible to call _result as many times as 
the client wants, but the service providers must be allowed to clean old
results after a (service provider-specific) time period. After that time 
period, calls to _result with the asyncID should return an empty
mobyData with a mobyException (saying "result not found"?). However, as 
Martin pointed out, this probably does not belong in the RFC, it is more 
an implementation detail (a service provider with large resources might 
choose to never clean the results, unless specifically asked to do so).
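
The behaviour described above could look roughly like the following on
the provider side. This is only a sketch under assumptions: the class
ResultStore, its method names and the in-memory dictionary are all
hypothetical, and the error is returned as a plain string standing in
for a mobyException.

```python
import time


class ResultStore:
    """Keeps finished results so that _result can be called repeatedly."""

    def __init__(self, retention_seconds, clock=time.monotonic):
        # retention_seconds=None means the provider never cleans results.
        self._retention = retention_seconds
        self._clock = clock
        self._results = {}  # async_id -> (stored_at, payload)

    def put(self, async_id, payload):
        self._results[async_id] = (self._clock(), payload)

    def get_result(self, async_id):
        """Return (payload, error); exactly one of the two is None."""
        entry = self._results.get(async_id)
        if entry is not None and self._retention is not None:
            stored_at, _ = entry
            if self._clock() - stored_at > self._retention:
                # Retention period is over: drop the old result.
                del self._results[async_id]
                entry = None
        if entry is None:
            return None, "result not found"
        return entry[1], None
```

A client whose connection breaks during the transfer can simply call
get_result again with the same ID, as long as the provider-specific
retention period has not passed.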

Of course, it would also be good to add an explicit xxx_clean method for
clients that want to say "ok, I will not need this result ever
again, you may remove it". Naturally, the messages to _clean and its
behavior must be documented if there is a consensus that it should be
added to the RFC. This possibility was discussed earlier in INB, and
we decided not to include it in the RFC to keep the proposal as simple
as possible. But there seems to be a need for a clean method, even if
the service provider automatically cleans old results (after some time).

Actually, we have been thinking about how to provide ways to stop, pause 
and restart service executions (in similar ways to _async, _poll, etc)
but, again, we did not want to complicate discussions. However, while 
this discussion is important, it would be better if we agreed on the
current proposal first.

 > c) More must be written about exception states: What (if any)
 >    exception to raise when a given session handler is
 >    invalid/expired. There may be other states to document by an
 >    exception code.
 >  

Yes, we are adding some example(s) to the proposal, mainly to show how
it can be done. It is, however, difficult to list all possible
exceptions; anyone with suggestions is welcome to let us know.

 >    I think that this is enough for now. Again, I thank very much INB
 > for the proposal, and please give us slightly more time to discuss and
 > make changes. Then we need a new document, and few days to read it
 > again.

Thank you for your comments and suggestions. We are working on a new 
version, so please stay tuned!

Kind regards,
Johan