[DAS2] Fwd: DAS2 meeting today (Sunday), 12:30-2:30

Tue Dec 14 01:34:37 UTC 2004

Begin forwarded message:

From: Andrew Dalke <dalke at dalkescientific.com>
Date: September 26, 2004 2:09:12 AM PDT
Subject: Re: DAS2 meeting today (Sunday), 12:30-2:30

BTW, when is Monday's meeting?

Gregg:
> Topics I'd personally like to cover:
>     Spec overview:
>         General coherence

I found the spec somewhat confusing to read.  As written there are two
parts to the spec -- one an overview and the other the details of the
request.  But some of the details (like the use of /0 to mean the
current database, which I still don't like ;) are in the first part
but not in the second.  So there isn't one place to read to learn
everything about a given interface.

I would rather the overview section be much shorter and not contain
specification information.  I would also like it to include a
concrete example showing how the layout looks for a (hypothetical)
database.

   $entry                      -- main entry point to a DAS/2 server
   http://www.biodas.org/das2/das-genome

   $entry/$sourceid            -- data about a given source
   http://www.biodas.org/das2/das-genome/volvox

   $entry/$sourceid/$version  -- data about a given version of the source
   http://www.biodas.org/das2/das-genome/volvox/1

   $entry/$sourceid/$version/type -- descriptions of genomic feature types
   http://www.biodas.org/das2/das-genome/volvox/1/type

    ...

and perhaps with an abbreviation like $db = $entry/$sourceid/$version
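
A minimal sketch of that layout in Python, assuming the example
entry point above and the $entry/$sourceid/$version/type pattern.
The helper names (source_url, version_url, type_url) are made up
for illustration and are not part of the spec.

```python
from urllib.parse import quote

# Example entry point from the overview above (illustrative only).
ENTRY = "http://www.biodas.org/das2/das-genome"


def source_url(entry, sourceid):
    """$entry/$sourceid -- data about a given source."""
    return f"{entry.rstrip('/')}/{quote(str(sourceid), safe='')}"


def version_url(entry, sourceid, version):
    """$entry/$sourceid/$version -- the '$db' abbreviation."""
    return f"{source_url(entry, sourceid)}/{quote(str(version), safe='')}"


def type_url(entry, sourceid, version):
    """$db/type -- descriptions of the feature types."""
    return f"{version_url(entry, sourceid, version)}/type"


db = version_url(ENTRY, "volvox", 1)
print(db)                            # http://www.biodas.org/das2/das-genome/volvox/1
print(type_url(ENTRY, "volvox", 1))  # http://www.biodas.org/das2/das-genome/volvox/1/type
```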

The final version should support in-document hyperlinks to
the actual section.  That's a detail understandably left out
of the current version.

One topic I would like to revisit is the use of DTDs.  My
DAS/1 and my NCBI EUtils client both used DTDs to make parsers
for the data returned from the server.  That was part of
what made the client code easy to implement.  The problems
I ran into were 1) the old DTDs weren't correct, 2) DTDs
don't support non-text fields (like start/stop sequence
positions) and 3) the DTD-based parser I used was validating
and I ran into problems dealing with extensions and other
variances from the spec.

I want to investigate the use of RelaxNG instead.  That
supports datatypes, to say that the sequence start/stop
positions can be longs (64 bit integers).  It should help
handle these non-text fields better, which is better for Java
people.  It might help with my validation.  And I see there's
a way to annotate fields so it might be possible to
convert from a RelaxNG schema directly into documentation.
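
To show what the missing datatypes cost a client today, here is a
rough sketch of the hand conversion a DTD leaves to the caller.
The element names are a simplified, hypothetical feature record,
not the normative format, and the 64-bit range check mirrors the
"longs" idea above.

```python
import xml.etree.ElementTree as ET

# Range of a signed 64-bit integer (a Java "long").
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def parse_position(text):
    """Convert the text of a start/stop field to a checked integer."""
    if text is None:
        raise ValueError("missing position field")
    value = int(text.strip())  # raises ValueError for non-numeric text
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError(f"position out of 64-bit range: {value}")
    return value


def feature_range(feature_xml):
    """Return (start, end) from a simplified, hypothetical feature element."""
    elem = ET.fromstring(feature_xml)
    start = parse_position(elem.findtext("START"))
    end = parse_position(elem.findtext("END"))
    return start, end


print(feature_range("<FEATURE><START>1000</START><END>2000</END></FEATURE>"))
```

A schema language with datatypes would let the parser reject bad
values before client code like this ever sees them.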

That may slide too much into implementation details.  I'll
restrict myself to the question "should the spec declare that DTDs
are going to be provided and will be considered normative?"

>         Complexity comparison to DAS/1 and other alternatives

I only wrote a client for DAS/1, not a server.

Looking over my DAS/1 client code, I'm happy to see that we
no longer have "X-DAS-Capabilities".  My code basically ignored
it.  It also ignored the X-DAS-Version because just about
every server seemed to return a different version number.

The client-side implementation was pretty easy (except that
servers didn't meet the spec and several didn't even return
valid XML).

I don't know about other alternatives.

One of the fuzzy places in the spec is how to deal with issues
related to the underlying HTTP protocol.  I see that several
servers didn't implement the stylesheet request, so the HTTP
request ended up with a '404 Not Found' instead of a DAS error code.
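
A sketch of how a client might keep the two layers apart.  It
assumes the DAS-level code arrives in an X-DAS-Status response
header, as I recall from DAS/1; the function name and the returned
labels are invented for this example.

```python
def classify_response(http_status, headers):
    """Say which layer reported the outcome of a request.

    Returns a (layer, code) tuple where layer is one of
    "http", "das", "ok" or "unknown".
    """
    # Anything other than HTTP 200 never reached the DAS layer cleanly.
    if http_status != 200:
        return ("http", http_status)

    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-das-status")  # assumed header name
    if raw is None:
        return ("unknown", None)

    try:
        das_code = int(raw.split()[0])
    except (ValueError, IndexError):
        return ("unknown", None)

    if das_code == 200:
        return ("ok", 200)
    return ("das", das_code)


print(classify_response(404, {}))
print(classify_response(200, {"X-DAS-Status": "401"}))
```

The spec could state which of these outcomes a server is allowed to
produce for an unimplemented request.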

I see the DAS/1 spec talked about being able to leverage
things like content-encoding for sending compressed data back
and forth.  DAS/2 doesn't yet mention that.
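
On the client side, honoring Content-Encoding can be small.  A
sketch, assuming the client already has the raw body bytes and the
header value in hand; decode_body is a made-up helper, and only
gzip and zlib-wrapped deflate are handled.

```python
import gzip
import zlib


def decode_body(body, content_encoding=None):
    """Undo the Content-Encoding applied to an HTTP response body."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("", "identity"):
        return body
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        # Assumes the zlib-wrapped form; raw deflate streams would fail here.
        return zlib.decompress(body)
    raise ValueError(f"unsupported Content-Encoding: {content_encoding!r}")


payload = b"<DASGFF/>"
print(decode_body(gzip.compress(payload), "gzip") == payload)
```

A client would advertise support by sending an "Accept-Encoding:
gzip" request header.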

>         Technology choices -- I'd like to have some coherent
>  strategy for explaining DAS/2 technology choices to others

What I just mentioned about interactions with HTTP might go
in this document.  I would also like to see comments like
"we considered WebDAV but because of its complexity and our
lack of experience ... " and "this locking scheme was chosen
because it is an easy one to conceptualize and implement.
Better possibilities exist, including ideas in source code
version control systems like CVS, subversion, and arch.
We hope people experiment with them to get experience that
may influence the next generation of DAS-like tools."

Okay, time for a 2.5hr nap before the meeting.  *yawn*

					Andrew
					dalke at dalkescientific.com