[MOBY-dev] BioMOBY Asynchronous Service Call Proposal v2.2 - The location of queryIDs

Wed Sep 6 14:13:27 UTC 2006

Pieter,

Thank you for a well-written letter (and sorry for the delay in answering).
> There is only one thing that I don't like about the current proposal:  
> the location of queryID. For our current synchronous services it's an  
> attribute of the mobyData element. In the current async services  
> proposal the queryID jumps around the XML taking several identities:
> * in <GetResourceProperty>status_queryID01</GetResourceProperty> it  
> is part of raw text.
> * in <lsae:status_queryID01><!-- LSAE block --></lsae:status_queryID01>
> it is part of an element name in the lsae namespace.
> 	(By the way: Should this element really be in the lsae namespace? I  
> don't think our status_queryIDxx elements are part of the LSAE specs...)
>   
True, we need a better namespace. Perhaps moby?
> It would be  
> much more convenient if the result from an asynchronous service  
> invocation would contain both the ServiceInvocationID *AND* the  
> associated queryIDs. In that case I only have to parse the service  
> response to create GetResourceProperty requests. Therefore I propose  
> to supply the queryIDs as wsa:ReferenceParameters just like the  
> ServiceInvocationID.
>   
I am not sure that I understand the problem completely. The clients must, 
somehow, internally store the connection between each input (identified 
by its queryID) and its output anyway. The jobs could potentially take a 
very long time to finish, and an output is of little interest without 
knowing which input produced it. In any case, handling the queryIDs is 
not complicated for the client (see the example client code at the 
prototype page). Maybe you are describing a different situation from the 
one in the example? Can you give some examples where it would be 
necessary to return the queryIDs?

http://bioinfo.pcm.uam.es/prototype/
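
To make the bookkeeping I have in mind concrete, here is a minimal sketch (in Python rather than the Perl of the prototype client). The table layout and the function name record_result are only illustrative assumptions, not part of the proposal:

```python
# Hypothetical client-side table: queryID -> the input that was sent and,
# once the job has finished, the output that came back.
pending = {
    "queryID01": {"input": "first input", "output": None},
    "queryID02": {"input": "second input", "output": None},
}

def record_result(jobs, query_id, output):
    """Store the output for a queryID and return the matching (input, output) pair."""
    if query_id not in jobs:
        raise KeyError("unknown queryID: " + query_id)
    jobs[query_id]["output"] = output
    return jobs[query_id]["input"], output

# When a result arrives, the queryID alone tells the client which input it belongs to.
pair = record_result(pending, "queryID02", "second output")
```

Because the client fills this table when it sends the batch-call, it does not need the service to send the queryIDs back.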

> 2.
> WSRF contains an *optional* method to request a resource properties  
> document. With this method a client can figure out which resource  
> properties are available and hence what it can request. Although this  
> method is optional and the current proposal doesn't mention it, I  
> think it would be good to keep the option open to supply such a method.  
> WSRF does not put any limitations on how a service generates and  
> provides such a document, so you can generate it dynamically or it  
> can be a static thing. If we would want to supply such a resource  
> properties document in the future it would be the easiest if it can  
> be a static one. However in the current proposal the queryIDs are  
> part of the resource properties (status_queryIDxx and  
> result_queryIDxx). This means that the available resource properties  
> depend on the amount of queries/jobs that were sent to a service and  
> hence we can not use a static resource properties document. It would  
> be more convenient if we can strip the queryIDs from the resource  
> properties and provide them as wsa:ReferenceParameters. In that case  
> there are only two resource properties (status and result) and we can  
> describe those in a static resource properties document.  

At least until now, we have tried to include only what is strictly needed 
and to avoid the many potentially useful, but more complicated, WSRF methods.

Yes, the WSRF method GetResourcePropertyDocument could be useful but it 
is possible to manage without it since the clients would always be able 
to construct the property qnames as long as they keep track of the 
queryIDs. But of course, if there is a great demand for this optional 
WSRF-method we could add it to the documentation.
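
As a rough sketch of what "construct the property qnames" means (again Python, not the prototype's Perl): the status_ and result_ prefixes are the ones used in the proposal, while the function name property_names is made up for this example, and namespace prefixes are left out for brevity.

```python
def property_names(query_ids):
    """Map each queryID to the (status, result) property names a client would request."""
    return {qid: ("status_" + qid, "result_" + qid) for qid in query_ids}

# The client already knows the queryIDs it sent, so it can rebuild the names
# at any time without asking the service for a resource properties document.
names = property_names(["queryID01", "queryID02"])
status_name, result_name = names["queryID01"]
```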

> Therefore I propose a translocation of BioMOBY queryIDs from the  
> resource properties to wsa:ReferenceParameters. As far as I  
> understand, with all the specifications involved this would be legal,  
> but please correct me if I am wrong. Below I included some examples  
> of what the XML might look like when the queryIDs are moved to the  
> SOAP header as wsa:ReferenceParameters. Let me know what you think.... 
The problem (?) is that the EPR is supposed to be opaque, or in 
particular, the ReferenceParameter (<moby:ServiceInvocation>) should be 
"assumed to be opaque" for the clients.

"Reference parameters are also provided by the issuer of the endpoint 
reference and are otherwise assumed to be opaque to consuming applications."

(quoting from the WS-Addressing standard that WSRF builds upon)

At least my interpretation of this is that clients are not supposed to 
understand or parse or manipulate the reference-parameter but instead 
just echo it back (if I am confused please correct me)? Yes, the 
reference-parameter can be given as XML but this XML should not be 
modified by the clients (I assume that you mean that the clients should 
just include the <moby:Job> tags that they need to find status or 
results for particular jobs in the batch-call). The issuer of the 
endpoint reference naturally must handle the EPR but the clients should 
not try to understand the EPR.

Also, conceptually, the EPR refers to a specific resource (in this case 
what we call "batch-call", many jobs). If we manipulate the EPR we 
"change" its original reference. We tried to clearly define in the 
proposal what the EPR referred to (what the "resource" was). Manipulating 
the EPR in some way confuses what it refers to.
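
A small sketch of the "just echo it back" behaviour, as I read the standard: the client copies whatever sits inside wsa:ReferenceParameters into the SOAP header without looking at it. This is an illustration under assumptions, not the prototype code: the WS-Addressing 1.0 and SOAP 1.1 namespaces are my assumption about the versions involved, and the moby namespace URI, service address and token value are placeholders.

```python
import copy
import xml.etree.ElementTree as ET

WSA = "http://www.w3.org/2005/08/addressing"        # assumed: WS-Addressing 1.0
SOAP = "http://schemas.xmlsoap.org/soap/envelope/"  # assumed: SOAP 1.1
MOBY = "http://example.org/moby"                    # placeholder, not the real moby namespace

def echo_reference_parameters(epr):
    """Build a SOAP header that repeats the EPR's reference parameters unchanged."""
    header = ET.Element("{%s}Header" % SOAP)
    params = epr.find("{%s}ReferenceParameters" % WSA)
    if params is not None:
        for child in params:
            # Copy each parameter verbatim; the client never inspects its content.
            header.append(copy.deepcopy(child))
    return header

# An EPR as the service might issue it (address and token are made up).
epr = ET.fromstring(
    '<wsa:EndpointReference xmlns:wsa="%s" xmlns:moby="%s">'
    '<wsa:Address>http://example.org/service</wsa:Address>'
    '<wsa:ReferenceParameters>'
    '<moby:ServiceInvocation>opaque-token</moby:ServiceInvocation>'
    '</wsa:ReferenceParameters>'
    '</wsa:EndpointReference>' % (WSA, MOBY)
)
header = echo_reference_parameters(epr)
```

Note that the client code never needs to know what is inside <moby:ServiceInvocation>. Adding <moby:Job> elements on the client side would break exactly this property.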

-------------------

Regarding "dynamic" property names (status_{queryID}): the official WSRF 
specification mandates that all properties of a resource MUST be 
described by an XML Schema, but this is not strictly enforced in the 
library we used for the Perl examples (WSRF::Lite), or at least the 
WSRF::Lite examples that I have seen include no such XML Schema file.

To give an idea of what I am talking about, here is a (non-BioMOBY) 
example:

<!-- Resource property element declarations -->
<xsd:element name="NumberOfBlocks" type="xsd:integer"/>
<xsd:element name="BlockSize" type="xsd:integer" />
<xsd:element name="Manufacturer" type="xsd:string" />
<xsd:element name="StorageCapability" type="xsd:string" />

<!-- Resource properties document declaration -->
<xsd:element name="GenericDiskDriveProperties">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:element ref="tns:NumberOfBlocks"/>
            <xsd:element ref="tns:BlockSize" />
            <xsd:element ref="tns:Manufacturer" />
            <xsd:any minOccurs="0" maxOccurs="unbounded" />
            <xsd:element ref="tns:StorageCapability" minOccurs="0" maxOccurs="unbounded" />
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>

This resource has four properties (tns:NumberOfBlocks, tns:BlockSize, 
tns:Manufacturer and finally tns:StorageCapability). The qnames of these 
four properties are predefined and fixed, unlike the names we need 
("status_q1", "status_q2", etc.), which depend on the queryIDs.

We would need the resource properties schema to allow open content 
(using an xsd:any element). This means that the list of valid qnames for 
the resource properties is "open". See "3.3.1.1 Establishing a List of 
Valid Resource Properties" in "WSRF Application Notes" 
(http://docs.oasis-open.org/wsrf/wsrf-application_notes-1.2-cd-02.pdf) 
for more information.
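
To illustrate the difference between a fixed and an "open" list, here is a small sketch that inspects a resource properties document declaration and reports the declared property references and whether an xsd:any wildcard is present. The element name BatchCallProperties and the function name describe_properties are invented for this example; the proposal does not define such a schema.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

def describe_properties(properties_doc):
    """Return (declared property refs, True if the sequence contains xsd:any)."""
    seq = properties_doc.find("{%s}complexType/{%s}sequence" % (XSD, XSD))
    declared = [e.get("ref") for e in seq.findall("{%s}element" % XSD)]
    is_open = seq.find("{%s}any" % XSD) is not None
    return declared, is_open

# Hypothetical declaration with no fixed properties, only open content, so
# names such as status_q1 or result_q2 would not have to be listed in advance.
doc = ET.fromstring(
    '<xsd:element xmlns:xsd="%s" name="BatchCallProperties">'
    '<xsd:complexType><xsd:sequence>'
    '<xsd:any minOccurs="0" maxOccurs="unbounded"/>'
    '</xsd:sequence></xsd:complexType></xsd:element>' % XSD
)
declared, is_open = describe_properties(doc)
```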

Kind regards,
Johan Karlsson

-- 
Johan Karlsson
Instituto Nacional de Bioinformática (INB)
Integrated Bioinformatics Node (GNV-5)
Dpto. de Arquitectura de Computadores
Campus Universitario de Teatinos, despacho 2.3.9a
29071 Málaga (Spain) 
+34 95 213 3387