[DAS2] DAS/2 questions

Gregg Helt gregghelt at gmail.com
Mon Feb 25 18:21:17 UTC 2008


On Mon, Feb 25, 2008 at 7:51 AM, Cyrus Harmon <ch-das at bobobeach.com> wrote:

>
> Hello DAS folks,
>
> I've been looking at the DAS/2 spec and have a few questions/comments:
>
> [0. I'm assuming that DAS2 is the thing to use and that the version of
> the spec found here: http://biodas.org/documents/das2/das2_get.html is
> as close to normative/current/etc... as can be found.]


 Yep, that's the current version of the DAS/2 genome annotation spec.

> 1. The table at the beginning of
> http://biodas.org/documents/das2/das2_get.html
>  lists 4 types: sources, segments, types and features. Section 1.2
> says "Each of the five new formats has its own MIME type." and then
> goes on to list three: "application/x-das-sources+xml, application/x-
> das-features+xml, application/x-das-types+xml". Are there three, four
> or five types?


Four different formats: sources, segments, types, features.  I'll fix the
spec doc.
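
To make the four-format answer concrete, here is a minimal sketch of how a
client might map a response's Content-Type to one of the four DAS/2 formats.
The three MIME types for sources, features and types are the ones quoted from
section 1.2 above; the segments entry is an assumption made by analogy and
should be checked against the spec. The names DAS2_MIME_TYPES and das2_format
are hypothetical, chosen only for illustration.

```python
# Format name -> MIME type. Three entries are quoted from section 1.2 of the
# spec; the "segments" entry is an ASSUMPTION made by analogy with the others.
DAS2_MIME_TYPES = {
    "sources": "application/x-das-sources+xml",
    "features": "application/x-das-features+xml",
    "types": "application/x-das-types+xml",
    "segments": "application/x-das-segments+xml",  # assumed, not quoted
}


def das2_format(content_type_header):
    """Return the DAS/2 format name for a Content-Type value, or None."""
    # Drop parameters such as "; charset=UTF-8" and normalize case.
    mime = content_type_header.split(";")[0].strip().lower()
    for name, expected in DAS2_MIME_TYPES.items():
        if mime == expected:
            return name
    return None
```

For example, das2_format("application/x-das-features+xml; charset=UTF-8")
gives "features", and an unrelated type such as "text/html" gives None.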


> 2. It seems to me that it would be worth splitting up the transport
> issues from the filespec. Why not have a spec for the XML and a spec
> for DAS-over-http(s)? This seems trivial (although it of course
> requires a bit more work to maintain two resources rather than one),
> but I could be wrong. Clearly, some of the sensible values that a DAS
> server would return are based on things that are established at the
> time of the request, but the spec should still allow for construction
> of DAS/2 files without regard to the particular transport layer.
> Perhaps there's a need for establishing some set of criteria like well-
> formed-ness and validity that describe increasing levels of
> "correctness" and one could enforce the transport related issues at
> one of the higher levels.


At one point the spec was split up more, and the consensus among the
contributors was that it needed to be consolidated.  Hence the current
organization.  This may be worth revisiting.  The readability and flow of
the current doc could definitely be improved.


> 3. DTDs? Searching for the string DTD in the document turns up empty.
> Is this by design?



> 4. Without a DTD it's a bit hard to read (well, the DTD might not help
> too much, but it does give some constraints) the specs for things like
> SOURCES. I'd suggest that the detailed document sections lead off with
> some sort of formal-ish representation of what is in that document (or
> document element) and then follow up with the examples. It seems that
> there is a fairly small number of document elements for each document
> type. Can we list those in their own sections in section 3.X as 3.X.Y
> and in these sections be explicit about what is in each document
> element here?


I apologize for the missing links from the HTML spec doc to the formal
schemas!  Somewhere along the way in our efforts to improve the HTML doc the
links got dropped out.  I'll add them back in.  In the meantime here's the
link I use to the CVS head for the schema:
http://cvs.biodas.org/cgi-bin/viewcvs/viewcvs.cgi/das/das2/das2_schemas.rnc?rev=HEAD&cvsroot=biodas&content-type=text/vnd.viewcvs-markup

There are no DTDs for DAS/2.  Instead we use the RELAX-NG schema language to
formally describe DAS/2 XML.

Why RELAX-NG instead of DTD, XML-Schema or other alternatives?
Quick answer: James Clark.
Longer answer: search the web, there's plenty of debate about what is the
best XML schema language.

You can sort of convert a RELAX-NG schema to a DTD, but there are many useful
constraints you can specify in RELAX-NG that you can't in a DTD, which
therefore get dropped in the conversion process.  We felt when designing
DAS/2 that we shouldn't include a down-converted DTD spec because that might
encourage developers to use DTD validators etc. to determine whether an XML
doc is valid DAS/2, when in fact there are plenty of ways a doc could pass a
DTD validation but not a RELAX-NG validation.  And there are plenty of
RELAX-NG validators and other RELAX-NG tools out there.  Andrew Dalke has
written a web-based DAS/2 validator that utilizes RELAX-NG validation,
though it's currently offline.  (Andrew, are you out there?  Let me know if
we need to move the validation service to a more permanent host.)
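
As an illustration of a constraint that a DTD cannot express, here is a small
sketch using the third-party lxml library. The toy schema below is
hypothetical and is NOT the DAS/2 schema; it only requires that a SEGMENT
element carry an integer length attribute. A DTD could declare that attribute
only as CDATA, so it would accept both documents in the usage note. Note also
that the official schema is written in RELAX-NG compact syntax (.rnc), while
this sketch uses the XML syntax, so the real schema would first need
converting with a tool such as Trang.

```python
from lxml import etree  # third-party dependency: pip install lxml

# Hypothetical toy schema in RELAX-NG XML syntax (NOT the real DAS/2 schema).
# It uses an XML Schema datatype to require an integer-valued attribute.
TOY_SCHEMA = b"""\
<element name="SEGMENT" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <attribute name="length">
    <data type="integer"/>
  </attribute>
</element>
"""

validator = etree.RelaxNG(etree.XML(TOY_SCHEMA))


def is_valid(xml_bytes):
    """Parse the document and check it against the toy schema."""
    return validator.validate(etree.XML(xml_bytes))
```

With this schema, is_valid(b'<SEGMENT length="1200"/>') passes, while
is_valid(b'<SEGMENT length="long"/>') fails on the datatype check.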

There is also an XML-Schema version of the DAS/2 schema, but it is derived from
the RELAX-NG schema -- first autoconverted, then edited by hand to correct
some conversion problems.  The RELAX-NG schema is the official DAS/2 schema.


      Gregg


