[DAS2] arbitrary data in writeback

Thu Feb 9 18:28:48 UTC 2006

The DAS spec for features looks something like this:

<FEATURES>
  <FEATURE>
   ...
   <PROP key="name" value="some data goes here" />
   <PROP key="homepage" href="http://blah/" />
   <PROP key="icon" mimetype="image/png">
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAFAQAAAABUH0DFAAAANUlEQVR4nGIM1AcAAAD//2LiYgAA
AAD//2I6wAAAAAD//2JycAAAAAD//2L6dgAAAAD//wMAEoUDipTjFscAAAAASUVORK5CYII=
   </PROP>

   <some_non_das_namespace:curation-history>
     ...
   </some_non_das_namespace:curation-history>

   <flybase:substitution>
      ..
   </flybase:substitution>
  </FEATURE>
</FEATURES>

There are two extension points.  One is the PROP table,
which is meant to be simple.  Clients can do substring searches
of PROP elements that have a "value", as in

    prop-name=blah+blah

All clients should be able to understand these data formats, though
there is no constraint for the key names.  They are convention only.
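
To make the substring search concrete, here is a minimal sketch in
Python.  It is not spec text.  It assumes that the "prop-" prefix in the
query names the PROP key, that "+" is the usual URL encoding for a
space, and that features carry an "id" attribute; the function names
are made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

SAMPLE = """\
<FEATURES>
  <FEATURE id="f1">
    <PROP key="name" value="some data goes here" />
    <PROP key="homepage" href="http://blah/" />
  </FEATURE>
  <FEATURE id="f2">
    <PROP key="name" value="something else" />
  </FEATURE>
</FEATURES>
"""

def parse_prop_query(query):
    """Split 'prop-KEY=TEXT' fields into (KEY, TEXT) pairs (assumed convention)."""
    pairs = []
    for field, values in parse_qs(query).items():
        if field.startswith("prop-"):
            for text in values:
                pairs.append((field[len("prop-"):], text))
    return pairs

def features_matching(doc, key, needle):
    """Return FEATUREs that have a PROP with this key whose value contains needle."""
    hits = []
    for feature in doc.iter("FEATURE"):
        for prop in feature.findall("PROP"):
            value = prop.get("value")
            # PROPs that carry an href or a binary body have no "value"; skip them.
            if prop.get("key") == key and value is not None and needle in value:
                hits.append(feature)
                break
    return hits

doc = ET.fromstring(SAMPLE)
matched_ids = [f.get("id") for f in features_matching(doc, "name", "data goes")]
print(parse_prop_query("prop-name=blah+blah"))
print(matched_ids)
```

Only PROPs with a "value" take part in the search, which matches the
idea that the href and binary forms are not searchable text.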

Right now a key gets either a string, a URL, or a chunk of binary data
which is base64-encoded, as in the example above.  (The key can be
present many times; is that a problem with Apollo?)  The latter two
(URL and binary data) are *proposals*.  They are neat, but not based
on user demand.  No one has told me that they will use them.

Allen wants one more possibility, "existence", with no associated
value at all.  Nomi says that Apollo can't round-trip that data
except by also tracking the input XML.  I don't want an "it just
exists" field and would prefer those stored with an empty string.

Then there is the support for non-DAS elements as extensions.
These can contain arbitrary XML, so long as they are not in the
DAS XML namespace.

A client can ignore elements it doesn't understand.  However,
if it does writeback of a feature it *MUST* include all elements
it doesn't understand.  I can write that into the spec.

It doesn't need to do anything with that data.  It can keep it
around as a chunk of text.  It just needs to send it back to
the server when it does the writeback.
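
A rough sketch of the "keep it around as a chunk of text" approach,
in Python.  The namespace URIs are placeholders I made up for the
example (the real ones are whatever the documents declare), and
split_feature / build_writeback are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

DAS_NS = "http://example.org/das"          # hypothetical placeholder URI
FLYBASE_NS = "http://example.org/flybase"  # hypothetical placeholder URI
ET.register_namespace("flybase", FLYBASE_NS)  # keep the prefix readable on output

SAMPLE = f"""\
<FEATURE xmlns="{DAS_NS}" xmlns:flybase="{FLYBASE_NS}">
  <PROP key="name" value="some data goes here" />
  <flybase:substitution>opaque to this client</flybase:substitution>
</FEATURE>
"""

def split_feature(feature):
    """Return (DAS children as elements, everything else as serialized text)."""
    known, opaque = [], []
    for child in feature:
        if child.tag.startswith("{" + DAS_NS + "}"):
            known.append(child)
        else:
            # The client never interprets this; it only stores the text.
            opaque.append(ET.tostring(child, encoding="unicode"))
    return known, opaque

def build_writeback(known, opaque):
    """Rebuild a FEATURE from the edited DAS children plus the stored text chunks."""
    feature = ET.Element("{" + DAS_NS + "}FEATURE")
    feature.extend(known)
    for text in opaque:
        feature.append(ET.fromstring(text))
    return feature

known, opaque = split_feature(ET.fromstring(SAMPLE))
known[0].set("value", "renamed")  # the client edits only what it understands
writeback = build_writeback(known, opaque)
print(ET.tostring(writeback, encoding="unicode"))
```

Each stored chunk carries its own namespace declaration, so it can be
dropped back into the outgoing document without the client knowing
anything about what is inside.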

For that matter, it doesn't even need to keep it around.  It
can throw the unknown data to the wind and work with the stuff
it does know.  Just before doing the writeback, go back to the
server and get the features again.  From the documents get the
unknown extension elements and insert them into the data - as
text! - to be sent back to the server.
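
The re-fetch variant might look like the following sketch.  The
fetch step is passed in as a callable because how a client gets the
current feature from the server is outside this example; fake_fetch,
the namespace URIs, and merge_before_writeback are all made up here.
For brevity it copies parsed elements rather than raw text, which
amounts to the same thing as long as the client never looks inside them.

```python
import copy
import xml.etree.ElementTree as ET

DAS_NS = "http://example.org/das"      # hypothetical placeholder URI
EXT_NS = "http://example.org/flybase"  # hypothetical placeholder URI

def is_das(elem):
    """True if the element is in the DAS namespace."""
    return elem.tag.startswith("{" + DAS_NS + "}")

def merge_before_writeback(edited, fetch_current):
    """Add the server's current non-DAS children to the client's edited FEATURE.

    edited: FEATURE holding only the DAS elements the client kept.
    fetch_current: callable returning the server's current FEATURE element.
    """
    current = fetch_current()
    merged = copy.deepcopy(edited)
    for child in current:
        if not is_das(child):
            merged.append(copy.deepcopy(child))
    return merged

SERVER_COPY = f"""\
<FEATURE xmlns="{DAS_NS}" xmlns:flybase="{EXT_NS}">
  <PROP key="name" value="some data goes here" />
  <flybase:substitution>..</flybase:substitution>
</FEATURE>
"""

def fake_fetch():
    # Stand-in for a real request to the server.
    return ET.fromstring(SERVER_COPY)

edited = ET.fromstring(
    f'<FEATURE xmlns="{DAS_NS}"><PROP key="name" value="renamed" /></FEATURE>'
)
merged = merge_before_writeback(edited, fake_fetch)
print([child.tag for child in merged])
```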

Clients may mess up and commit records without these elements.
The server will treat that as a delete of those elements, because
it cannot tell whether the client really knows what to do with that
data.
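
On the server side, that rule could be sketched like this in Python.
It is one possible reading, not spec text: extension elements are
compared by qualified tag name only, and the namespace URIs and
function names are invented for the example.

```python
import xml.etree.ElementTree as ET

DAS_NS = "http://example.org/das"      # hypothetical placeholder URI
EXT_NS = "http://example.org/flybase"  # hypothetical placeholder URI

def extension_tags(feature):
    """Qualified tags of the children that are not in the DAS namespace."""
    return [c.tag for c in feature if not c.tag.startswith("{" + DAS_NS + "}")]

def dropped_extensions(stored, submitted):
    """Extension tags present in the stored feature but missing from the upload.

    Counts matter: if the stored feature has two elements with the same tag
    and the upload has one, one of them is reported as dropped.
    """
    remaining = extension_tags(submitted)
    dropped = []
    for tag in extension_tags(stored):
        if tag in remaining:
            remaining.remove(tag)
        else:
            dropped.append(tag)
    return dropped

stored = ET.fromstring(
    f'<FEATURE xmlns="{DAS_NS}" xmlns:flybase="{EXT_NS}">'
    '<PROP key="name" value="x" />'
    '<flybase:substitution>..</flybase:substitution>'
    '</FEATURE>'
)
submitted = ET.fromstring(
    f'<FEATURE xmlns="{DAS_NS}"><PROP key="name" value="y" /></FEATURE>'
)
dropped = dropped_extensions(stored, submitted)
print(dropped)  # the server would treat these as deleted
```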

This is the easiest solution for me as a spec writer.  We have nearly
all of the format for that transaction, excepting a bit about
being able to delete.

NOTE: a server may ignore the uploaded data.  For example, it
may modify the transaction history and throw out whatever the
client sent to it -- if that's how the <transaction-history>
element is specified.

The other solution is to be more fine grained, so that clients
send deltas, like

<FEATURES>
  <FEATURE>
   ...
   <PROP key="name" value="some data goes here" />
   <PROP key="homepage" href="http://blah/" />
   <PROP key="icon" mimetype="image/png">
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAFAQAAAABUH0DFAAAANUlEQVR4nGIM1AcAAAD//2LiYgAA
AAD//2I6wAAAAAD//2JycAAAAAD//2L6dgAAAAD//wMAEoUDipTjFscAAAAASUVORK5CYII=
   </PROP>

   <delete>
     <some_non_das_namespace:curation-history />
   </delete>

   <replace>
     <flybase:substitution>
        ..
     </flybase:substitution>
   </replace>
  </FEATURE>
</FEATURES>

but that gets complex.  You end up with a grammar for the
deltas.  Eg, "delete the first 'some_non_das_namespace:curation-history'
but not the others".  It's a harder grammar to write and a
harder semantic to implement on client and server.
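
To show where the extra grammar creeps in, here is a small Python
sketch of applying a delete.  The "which" argument is hypothetical and
stands for exactly the piece the delta format would have to be able to
express; the namespace URI is a placeholder.

```python
import xml.etree.ElementTree as ET

CURATION = "{http://example.org/curation}curation-history"  # hypothetical tag

def apply_delete(feature, tag, which="first"):
    """Remove matching children; 'which' is 'first' or 'all'. Returns count removed."""
    matches = [child for child in feature if child.tag == tag]
    targets = matches[:1] if which == "first" else matches
    for target in targets:
        feature.remove(target)
    return len(targets)

feature = ET.fromstring(
    '<FEATURE xmlns:c="http://example.org/curation">'
    '<c:curation-history>a</c:curation-history>'
    '<c:curation-history>b</c:curation-history>'
    '<c:curation-history>c</c:curation-history>'
    '</FEATURE>'
)
removed = apply_delete(feature, CURATION, which="first")
left = [child.text for child in feature]
print(removed, left)
```

A bare <delete> wrapping an empty element does not say which of the
three to remove, so the format needs some way to carry that choice,
and both sides need to agree on what it means.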

I don't understand the case where complete writeback is a problem.
There was mention of a client deleting a feature when it shouldn't
have, because of extra data that it just didn't know about.

I didn't follow that at all.

Please enlighten me!  :)

					Andrew
					dalke at dalkescientific.com