[MOBY-dev] data by reference - a request for comments

Martin Senger martin.senger at gmail.com
Tue Jul 1 18:41:49 UTC 2008


Hi all,

Yesterday Mark, Eddie and I spent some time evaluating what was
proposed during our last meeting about sending data by reference. Here are
some thoughts that may crystallize into a request for comments.

*What is the purpose of sending data by reference?*
Well, the first purpose (A) is obvious: we want to be able to deliver huge
data from a service. So the service returns only a reference instead of the
real data and the client can fetch the real data using some memory-friendly
protocol (usually a simple HTTP/GET).

But it appeared that this was not the only purpose. The second purpose (B) is
to be able to send around already existing references (such as URLs of
EMBL or NCBI records). The existence of purpose B makes the problem a
bit harder, but it is a valid purpose. So the first question is:

*Do you agree that we pursue both purposes in this request?*
*The machinery*
a) A service claims (at registration time) that it can provide data by
reference.

b) A client asks to get references back by including an "acceptRefs"
attribute in the mobyData tag. The attribute lists one or more protocol names
that the client can accept.

c) A service *can* obey such a request and send one or more *primitive
data* as references (the focus on primitive types is new; originally we
thought about allowing references on any level, but now, mainly because of
purpose B, we no longer propose that). It can use any of the protocols
mentioned in the client's "acceptRefs" attribute. It can send references
only if at least one protocol matches.
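
As an illustration of steps b) and c), here is a minimal sketch in Python of
how a service might pick a protocol. It assumes that "acceptRefs" holds a
whitespace-separated list of protocol names (the separator is not decided in
this proposal); the function name choose_protocol and the protocol lists in
the usage note are hypothetical.

```python
def choose_protocol(accept_refs, service_protocols):
    """Pick the first protocol the client accepts that the service supports.

    accept_refs: value of the client's "acceptRefs" attribute, assumed here
                 to be a whitespace-separated list of protocol names
                 (None or "" if the client did not ask for references).
    service_protocols: protocol names the service is able to serve.
    Returns the matching protocol name in lower case, or None when nothing
    matches, in which case the service must not send references.
    """
    if not accept_refs:
        return None
    supported = {name.lower() for name in service_protocols}
    # Keep the client's order, treating it as the client's preference.
    for name in accept_refs.split():
        if name.lower() in supported:
            return name.lower()
    return None
```

For example, choose_protocol("ftp http", ["http"]) yields "http". When the
result is None, the service would simply return the data inside the Moby
message, as it does today.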

*How does a client know what protocols a service supports?*

This is a fundamental question that goes closely with "use existing
standards rather than inventing your own". An ideal solution is perhaps
this: A service returns not a reference to data itself but a reference to a
WSDL document that contains all supported protocols, including the endpoints
for this particular data. It is a nice idea but it breaks the purpose B - we
cannot use existing references without wrapping them first in a WSDL
document. The WSDL is strong because it actually gives us an *interface* for
getting data, but it is weak because the references cannot be used as
*indexes* (e.g. for further caching). Also, it does not solve the client
side: the "acceptRefs" attribute still needs to use a list of protocol names
(and not a WSDL document, because clients cannot make WSDL documents visible
to the world).

After going back and forth, we concluded (and this is now our proposal for
the request for comments) that the service returns a reference to the data,
and clients can deduce what protocol to use by looking at the protocol part
of the returned URL. We are aware that this is fine for the usual protocols,
such as HTTP and FTP, but it cannot serve data via, for example, SOAP. But,
as Eddie pointed out, if somebody wants SOAP for data, she can return the
data directly in the Moby message.
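
Deducing the protocol from the returned URL could look roughly like the
sketch below, which uses Python's standard urllib.parse module. The function
names, the default list of accepted protocols and the example.org URLs are
made up for illustration; they are not part of the proposal.

```python
from urllib.parse import urlsplit


def protocol_of(reference):
    """Return the protocol (URL scheme) of a reference, in lower case."""
    return urlsplit(reference).scheme.lower()


def can_fetch(reference, accepted=("http", "ftp")):
    """Tell whether the client can fetch this reference itself.

    'accepted' stands for the protocol names the client listed in its
    "acceptRefs" attribute (hypothetical default shown here).
    """
    return protocol_of(reference) in accepted
```

For example, protocol_of("ftp://ftp.example.org/pub/record.dat") gives "ftp",
so a client that listed "ftp" in "acceptRefs" knows it can fetch the data
with a plain FTP download. An already existing URL (purpose B) needs no
wrapping for this to work.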

*The remaining questions*
Dmitry suggested using WSRF. We think that he meant something else: it
could be used instead of the whole Moby message, but that is not what we
are looking for. We are looking to replace just the data part with
references, and we still want to keep the original Moby message as it is
used now. So we have concluded: no WSRF.

*How can a client tell a service that she is sending a reference instead of
data?* This could be useful for chaining services. We have not talked about
it. Ideas welcome.

The machinery described above may not allow a client to find out, in advance,
what protocols a service is able to provide. It depends on what a service can
register in the Moby Central registry. It can be just a boolean flag ("I can
provide references"), or a list of supported protocols ("names"), or actually
nothing. The last option has the advantage that no change in the registry
is needed. *Can we live with this simple option?*

I am not sure if I covered everything, but it is better to send it now and
wait for your comments.

Cheers,
Martin

-- 
Martin Senger
email: martin.senger at gmail.com, m.senger at cgiar.org
skype: martinsenger


