[MOBY-l] Re: MOBY and the "REST/SOAP" debate

Lincoln Stein lstein at cshl.org
Fri Oct 4 18:06:20 UTC 2002


Alas, the common practice of POSTing form data to a CGI script is not RESTful 
because:

	1) it is not performing an update operation, but really performing a GET
	2) it cannot be correctly cached by caching proxies
	3) it is neither stable nor unique; the same POST may result in a different
		 document each time it is executed (see the sketch below)
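
A minimal sketch of the difference, in Python, against a purely hypothetical 
sequence server (the host and paths are invented for illustration):

	import urllib.request, urllib.parse

	# A lookup disguised as a POST to a CGI script: a cache or proxy sees
	# only "POST /cgi-bin/fetch" and must not store the reply, and the
	# document that comes back has no URL of its own to bookmark or link to.
	form = urllib.parse.urlencode({"id": "AC012345"}).encode()
	opaque = urllib.request.urlopen(
	    "http://example.org/cgi-bin/fetch", data=form).read()

	# The same lookup as a GET on a URL that names the resource: any cache
	# along the way may store the reply, and the URL itself can be linked,
	# bookmarked, or handed to another program.
	linkable = urllib.request.urlopen(
	    "http://example.org/sequence/AC012345").read()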

Indeed, POST was misused almost from day one.  In the original documentation, 
POST was described as creating a "subdocument" below a URL, and was designed 
with the posting of news articles in mind.

Actually, the debate is between the document-centric mindset of the hypertext 
and hyperpublishing communities, and the call-centric mindset of the remote 
procedure call community.  From my point of view, the hard part is agreeing 
on a set of data formats that correctly balance these criteria (a small 
sketch follows the list):

	- general:  They will satisfy the needs of at least two research groups.
	- extensible: Other research groups can extend them to meet their needs
		without breaking their utility to others.
	- modular: More complex formats can be built up from simpler ones in
		Lego fashion.
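
As a concrete (and entirely hypothetical) sketch of the "extensible" point, 
XML namespaces let one group layer its own elements onto a shared core record 
without breaking consumers that only understand the core; the names and URIs 
below are invented:

	import xml.etree.ElementTree as ET

	CORE = "http://example.org/moby/core"     # hypothetical shared vocabulary
	EXT = "http://example.org/lab-x/extras"   # one lab's private extension

	# A core sequence record that every participating group understands.
	seq = ET.Element("{%s}sequence" % CORE, id="AC012345")
	ET.SubElement(seq, "{%s}residues" % CORE).text = "ATGGCGTTA"

	# Lab X adds its own annotation in its own namespace; consumers that
	# only know the core vocabulary simply ignore the extra element.
	ET.SubElement(seq, "{%s}curation-note" % EXT).text = "manually reviewed"

	# A consumer written against the core format alone still works:
	for residues in seq.findall("{%s}residues" % CORE):
	    print(residues.text)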

If we have the data formats in hand, it doesn't matter whether we fetch them 
by GETting a URL or POSTing a SOAP message.

Lincoln

On Thursday 03 October 2002 03:32 pm, Andrew D. Farmer wrote:
> Hi all-
>
> I wanted to try to get some discussion going here with respect to some
> interesting debates that have been going on in various places around the
> basic theme of web services. Here, I'll try to provide an overview of the
> issues (based on my own limited understanding of them at this point), and
> give some links to useful starting points for further details. Despite
> its being at times maddeningly subtle and hair-splitting, I do think there
> are points raised by the debate that are quite relevant to MOBY, especially
> in terms of our choice of the best technologies to fit our needs. I know
> Mark has already been quite skeptical of UDDI, and it's interesting to see
> that many others are voicing concerns about some of the other components of
> the "standard" web services approach (or indeed, questioning whether in
> fact the whole idea of "web services" should be thought of as distinct from
> "the web", and whether we need these new standards, or to re-think our use
> of the old standards).
>
> My thoughts on the subject are still rather nebulous, but I have the sense
> that a lot of the fundamental issues are very much related
> to those involved in the different approaches to service description that
> are embodied in the "static service" vs.  "dynamic service" approaches
> which I have described with respect to ISYS.
>
> The basic focus of the debate seems to center on the SOAP protocol, and the
> question of how it relates to the basic design of the web architecture.
> The criticisms of SOAP largely stem from a view that it represents
> "yet another" CORBA/DCOM/RMI/RPC architecture, that is (ab)using HTTP as a
> convenient way of being firewall-friendly and universally available.
> Their criticisms of SOAP as yet another RPC mechanism run deeper than
> complaints about the "heaviness" or lack of openness of the older
> approaches, which most folks seem to agree will be alleviated by the use of
> XML-derived standards.
> Instead, they seem to claim that the very nature of an RPC-based approach
> is at odds with the fundamental architectural principles of the web that
> made it "successful"- or at least enabled it to have a more significant
> impact than any distributed processing model has ever done... To quote Paul
> Prescod (from "Second Generation Web Services" at
> http://www.xml.com/pub/a/2002/02/06/rest.html):
> "These technologies achieved only limited success before they adapted for
> the Web. Some believe that the problem was that Microsoft and the OMG
> supporters could not get along. I disagree. There is a deeper issue. RPC
> models are great for closed-world problems. A closed world problem is one
> where you know all of the users, you can share a data model with them, and
> you can all communicate directly as to your needs. Evolution is
> comparatively easy in such an environment: you just tell everybody that the
> RPC API is going to change on such and such a date and perhaps you have
> some changeover period to avoid downtime. When you want to integrate a new
> system you do so by building a point-to-point integration.
>
> On the other hand, when your user base is too large to communicate
> coherently you need a different strategy. You need a pre-arranged framework
> that allows for evolution on both the client and server sides. You need to
> depend less on a shared, global understanding of the rights and
> responsibilities of a participant. You need to put in hooks where compliant
> clients and servers can innovate without contacting you. You need to leave
> in explicit mechanisms for interoperating with systems that do not have the
> same API. RPC protocols are usually poorly suited for this kind of
> evolution. Changing interfaces tends to be extremely difficult. Integrating
> services typically takes complicated software "glue".
>
> I believe this is the reason no enterprise has ever successfully unified
> all of their systems with DCOM, CORBA, or RMI.
>
> Now we come to the crux of the problem: SOAP RPC is DCOM for the
> Internet...."
>
> Now, while the rhetoric along these lines tends to get a bit thick, I think
> they have some good insights into some fundamental (and non-obvious)
> differences between this paradigm and alternative approaches to the same
> "web services" problems that are more consistent with the architecture of
> the web (and are therefore claimed to be more ripe for the same explosive
> and innovative growth seen by the web).
>
> The SOAP critics rally around yet another acronym: REST. Unlike the myriad
> of other acronyms floating around "web services", however, REST is not
> a proposed standard, but rather an "architectural style". It stands for
> REpresentational State Transfer, and was coined in the PhD dissertation of
> Roy Fielding (co-founder and director of the Apache Software Foundation,
> and one of the leading lights of the W3C- for example, he coauthored the
> HTTP specs). The meaning of this phrase is probably not worth explaining
> here (see the references if you're really interested in getting a view of
> web applications as finite state machines), but its "disciples" argue that
> it represents the cornerstones of the web architecture, and that the main
> reason for the web's "success" lies in its embodiment of REST principles.
> So, what are these principles? The goals of the REST "style" are stated as:
>
> 	-scalability of component interactions
> 	-generality of interfaces
> 	-independent deployment of components
> 	-intermediary components to reduce latency, enforce security and
> 		encapsulate legacy systems
>
> Phrased in this way, it may or may not sound like an obvious fit for our
> situation, but my take is that the essential theme here is a system that is
> as decentralized as possible (in the sense of not having a central authority
> prescribing how components do their business), while at the same time
> allowing interesting and unforeseen interactions to evolve between
> components, and making it possible for infrastructural components (caches,
> proxies, gateways) to do their job without needing to know the details of
> every component.
>
> It is claimed that the web architecture achieves these goals
> by relying on these "core components":
>
> 	-The URI as a universal addressing scheme for resources.
> 	-HTTP as an ultra-generic stateless protocol for accessing and
> 		manipulating resources.
> 	-Representation of resources as self-descriptive and linked
> 		hypertext (mostly HTML now, but transitioning to XML is perceived
> 		as fundamentally necessary)
>
> Another way it is often articulated is that the "web" has been designed as
> a way of representing data-centric "resources" (anything designated by a
> URI, and ideally dereferenceable to give back some representation of that
> resource, e.g. an XML document). The set of "operations" on those
> "resources" has been designed to be extremely limited- the 4 basic HTTP
> methods, GET, POST, PUT, DELETE are seen to correspond to the basic generic
> operations on a piece of data: retrieve, update, create, destory. The idea
> seems to be that by focusing on the data (and developing rich XML
> vocabularies for representing it), and keeping the "operations" restricted
> to basic data manipulation operations, one has a much better chance of
> enabling integration/interoperation in a wildly decentralized, "open world"
> situation, than by trying to get people to agree on proper "behavioral"
> semantics.
> (Note that by embedding transitions to other resources into the
> representations given back, the creator of the resource is essentially
> building a sort of operation set into the document.)
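>
> A rough sketch of that mindset, as I understand it (the URL, the method
> wrapper, and the <annotations> link below are all invented for
> illustration): every piece of data is a resource named by a URI, and the
> only verbs a client ever needs are the generic HTTP ones.
>
> 	import urllib.request
>
> 	def operate(method, url, body=None):
> 	    """Issue one of the four generic HTTP operations on a resource."""
> 	    req = urllib.request.Request(url, data=body, method=method)
> 	    return urllib.request.urlopen(req).read()
>
> 	url = "http://example.org/sequence/AC012345"   # hypothetical resource
> 	doc = operate("GET", url)                      # retrieve
> 	operate("PUT", url, b"<sequence>new</sequence>")       # create
> 	operate("POST", url, b"<sequence>revised</sequence>")  # update
> 	operate("DELETE", url)                         # destroy
>
> 	# The document that comes back from GET can itself embed links, e.g.
> 	# <annotations href="http://example.org/sequence/AC012345/annotations"/>,
> 	# which is how the provider builds an "operation set" into the data.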
>
> The problem they see with SOAP as I understand it is that it
> is essentially a "framework" for developing application-specific
> protocols that live in their own shadowy
> world, invisible to components that are outside the agreed-upon standards
> of the "subprotocol". The classic example seems to be the use of a
> SOAP-RPC call for the equivalent of an HTTP "GET". Using the SOAP approach,
> one would define some "special message" (getSequence) which took an
> identifier parameter (in the namespace of the service provider) and
> returned an XML document describing the data. Some of the "ill-effects" of
> this seemingly straightforward approach are:
>
> 	Since there is no URI for the returned document, it effectively
> 	does not belong to the web. This implies that you can't use it
> 	in webby ways, e.g.: bookmark it; create links to it in documents;
> 	use it in URI-based standards such as RDF, XLink, XInclude, etc.
> 	In effect, only SOAP-enabled technologies can get at the data, and
> 	further than that, only those who have through some process learned
> 	about your custom SOAP messaging semantics. Furthermore, it's not
> 	clear how you would "guide" consumers of the data to
> 	related information (e.g. the annotations of the sequence) accessible
> 	via other SOAP messages by encoding these "related SOAP messages"
> 	into the returned XML data, as you could by including URI links
> 	to the related info in your document.  The only way to "discover" the
> 	existence of the data is via the rather complicated journey through
> 	UDDI/WSDL/SOAP or some point-to-point agreement between requester
> 	and provider.
>
> 	The infrastructural components of the web have no insight into
> 	the SOAP messaging semantics. They only know it is an HTTP POST
> 	to a certain URL. Thus, even though it's "really only a GET",
> 	it won't be seen that way by caches and proxies.
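>
> To make the contrast concrete, here is a schematic sketch (the endpoint,
> the getSequence message, and the documents are all hypothetical, and the
> SOAP envelope is heavily abbreviated):
>
> 	import urllib.request
>
> 	# SOAP-RPC style: every operation is a POST of an envelope to one
> 	# endpoint.  An intermediary sees only "POST /soap/endpoint"; the
> 	# sequence itself has no URI, so nothing outside the SOAP world
> 	# (bookmarks, links, RDF statements, caches) can point at it.
> 	envelope = b"""<Envelope><Body>
> 	  <getSequence><id>AC012345</id></getSequence>
> 	</Body></Envelope>"""
> 	opaque = urllib.request.urlopen(
> 	    "http://example.org/soap/endpoint", data=envelope).read()
>
> 	# Resource style: the sequence has its own URI, the GET is cacheable,
> 	# and the document that comes back can carry links (e.g. to its
> 	# annotations) that lead a client onward with no prior agreement
> 	# about method names.
> 	doc = urllib.request.urlopen(
> 	    "http://example.org/sequence/AC012345").read()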
>
> Well, I'm probably not doing a very good job at articulating the arguments
> (especially given my own naive understanding of "the web", I'm hoping more
> experienced webheads will weigh in!);
> perhaps I'll just refer you to the resources at the end of this letter.
>
> I should note however that it seems important to distinguish between
> "the REST approach" and "the way any given website happens to work"- just
> as they claim that SOAP uses HTTP without being RESTful, so too you'll find
> a lot of discussion of points that are clearly not enforced or observed by
> most websites.
>
> Finally, I just wanted to throw out a few thoughts on how this whole
> business relates to MOBY.
>
> First off, one of the key points in the discussion is that there is general
> agreement that no matter what approach you take, you're going to have to
> develop a common "data" language, so it seems reasonable to pursue that
> thread regardless of how we proceed with respect to the other issues
> of how best to represent "services" on the data.
>
> Second, in keeping with the general themes of "keep it simple", and "keep
> the entry-bar low" and a general approach of incremental evolution from
> how service providers are doing things now, it seems quite appealing to
> be able to think about developing a system around the core concepts of
> the web that everyone has basically already bought into, rather than
> introducing new-fangled, relatively untried and still evolving
> technologies.
>
> Finally, the basic tenets of the REST approach seem to lean towards an
> extreme skepticism about the extent to which you can (or should even
> try to) get consensus on "operational" sorts of issues; on the other hand,
> it is acknowledged (by some REST proponents) that if this agreement
> can be reached by having some reasonably like-minded community, it can
> be effective, and that it is more straightforward for "traditional desktop
> programmers" (as opposed to "network programmers") to think in these terms.
> This same distinction seems very much related to the whole question
> of "static vs dynamic services" that we've been talking about (i.e.
> static service representation creating a very RPC-like representation,
> vs. dynamic service representation being a more ultra-generic and
> encapsulated, yet self-descriptive, approach). Just as I've been
> trying to explore the idea of how these two approaches might be
> connected, I think it's worth thinking about whether there is a similar
> way of bridging the gap embodied by the RPC interface vs resource-centric
> views of SOAP and REST.
>
> If you made it this far, thanks for your patience; I wish I could have
> organized my own thoughts a little better, but it seemed appropriate
> to get this out on the table for discussion...
>
>
>
>
> Various references:
>
> Some good starting points:
> reasonably high-level articles:
> http://www.xml.com/pub/a/2002/02/06/rest.html
> http://www.xml.com/pub/a/2002/02/20/rest.html
> http://www.xml.com/pub/a/2002/07/10/rest.html
> http://www.xml.com/pub/a/2001/10/03/webservices.html
> http://www.xml.com/pub/a/2002/05/08/deviant.html
>
>
>
> Sites with plenty of info:
> Many essays by one of the main REST proponents:
> http://www.prescod.net/rest/
> in particular,
> http://www.prescod.net/rest/standardization.html
> http://www.prescod.net/rest/rest_vs_soap_overview/
>
>
> "Wiki pages":
> http://conveyor.com/RESTwiki/moin.cgi
>
> Some REST "tutorials" and skeptical views on migrating to web services:
> http://www.xfront.com
>
>
> Some dedicated discussion lists:
> http://groups.yahoo.com/group/rest-discuss/
> http://groups.yahoo.com/group/rest-explore/
>
> Andrew Farmer
> adf at ncgr.org
> (505) 995-4464
> Database Administrator/Software Developer
> National Center for Genome Resources

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein at cshl.org			                  Cold Spring Harbor, NY
========================================================================



