[DAS2] das2 comments

Andrew Dalke dalke at dalkescientific.com
Wed Mar 16 05:30:10 UTC 2005


In going through the spec I noticed some things that seem questionable.

We have "SOURCES" which contains a list of "SOURCE" elements.
         "NAMESPACES"     ....              "NAMESPACE" elements.
         "TYPES"          ....              "TYPE" elements

But we have "FEATURELIST" which contains "FEATURE" elements.

And "REGION-LIST" which contains "REGION" elements.
     "PROPERTY-LIST"       ...    "PROPERTY"

There's also "CAPABILITIES", which contains "METHOD" elements.

I suggest we normalize these to use the same style.  My
preference is for the English plurals
   FEATURELIST --> FEATURES
   REGION-LIST --> REGIONS
   PROPERTY-LIST --> PROPERTIES
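
For illustration only, here is a minimal Python sketch of that
renaming applied to a parsed document.  The function name is made
up for this example, and it assumes the element names carry no XML
namespace prefix.

```python
import xml.etree.ElementTree as ET

# Proposed container renames from the list above.
RENAMES = {
    "FEATURELIST": "FEATURES",
    "REGION-LIST": "REGIONS",
    "PROPERTY-LIST": "PROPERTIES",
}

def normalize_container_names(root):
    """Rename the '-LIST' style containers in place; return the root."""
    # iter() visits the root element itself as well as all descendants.
    for elem in root.iter():
        elem.tag = RENAMES.get(elem.tag, elem.tag)
    return root
```

The child elements (FEATURE, REGION, PROPERTY) are left alone, and
CAPABILITIES/METHOD is not touched.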

I'm not sure if CAPABILITIES/METHOD should be changed
and if so, to what?


The FEATURELIST/FEATURE/XID documentation says

     A typical feature will either have a single <LOC> tag or a
     single <XID> tag, although it is possible (and sensible)
     to have one or more of both.

If I understand it correctly then it's equivalent to the
following, which I think is clearer

     A typical feature will have at least one <LOC> tag or
     one <XID> tag.  It is possible (and sensible) to have
     one or more of both.


There's a FEATURE/PROP example that includes a bit of base64
encoded data that purports to be a jpeg.  It isn't.

When I decode it, 'file' says it's an MS Office document.  When
I look at the byte stream I see something that looks like a
big-endian Unicode BOM (UTF-16 or UTF-32) and the letters
"abcdefghijkl" at 4-byte intervals.

Any reason we couldn't have a real gif/png/jpeg/whatever here?
Besides the need to make one.
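
A check along these lines would catch that sort of mismatch.  This
is only a sketch: the function name is invented for this example,
and it compares the decoded bytes against the well-known leading
signatures of JPEG, PNG and GIF files and nothing else.

```python
import base64

# Leading "magic" bytes of a few common image formats.
MAGIC = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}

def sniff_image_type(b64_text):
    """Return 'jpeg', 'png', 'gif', or None for base64-encoded data."""
    # b64decode ignores embedded newlines and other non-alphabet
    # characters by default, so wrapped example data works.
    data = base64.b64decode(b64_text)
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return None
```

Data that starts with a byte order mark, like the current example,
would come back as None.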


Speaking of which, do the prop fields each need a "name"
or "description" attribute?  How is a user supposed to
distinguish between these two images?

<PROP  ptype = "property/image"
     href = "http://www.wormbase.org/db/seq/gbrowse_img?name=cTel54X.1" 
/>

<PROP  ptype = "property/image"
        mime_type = "image/jpeg"
        content_encoding = "base64">
BASE64-ENCODED-DATA-HERE
</PROP>


Some months ago we had the discussion on date representations.
I thought we decided on ISO 8601 dates instead of RFC dates.
Looking through my emails I see there wasn't a conclusion.  In
private email to Lincoln I said

    Were we to go this route I would say we define that all
    dates be given as
       YYYY-MM-DD
    all datetimes be given as
       YYYY-MM-DDTHH:MM:SS(.ss*)?(Z|[+-]hh:mm)
    (timezone required, fractions of a second optional),

     0001 <= YYYY <= 9999, 00<=HH<=23 and leap second support
    is implementation dependent.

    This is compatible with ISO 8601, compatible with XML Schema,
    supportable by the likely DAS/2 clients and servers, and not
    dependent on any external specification.

ISO dates or RFC dates?  I vote ISO dates.
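
To show that the format above is easy to support, here is a rough
Python validator.  The function names are made up for this
example.  It is a sketch under a few assumptions: it rejects a
seconds value of 60 (leap seconds are implementation dependent),
and it does not range-check the timezone offset.

```python
import re
from datetime import date, datetime

_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})", re.ASCII)
_DATETIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.\d+)?"             # optional fractions of a second
    r"(?:Z|[+-]\d{2}:\d{2})",  # required timezone
    re.ASCII,
)

def is_valid_date(text):
    """True if text is YYYY-MM-DD and names a real calendar date."""
    m = _DATE.fullmatch(text)
    if m is None:
        return False
    try:
        # date() enforces 0001 <= YYYY <= 9999 and valid month/day.
        date(*map(int, m.groups()))
    except ValueError:
        return False
    return True

def is_valid_datetime(text):
    """True if text matches the datetime form with a timezone."""
    m = _DATETIME.fullmatch(text)
    if m is None:
        return False
    try:
        # datetime() also enforces 00<=HH<=23, 00<=MM<=59, 00<=SS<=59.
        datetime(*map(int, m.groups()))
    except ValueError:
        return False
    return True
```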


If we go the ISO route we could more easily fit in with
the Dublin Core metadata elements.  For example we could have

   dc:created = "1987-06-05"
   dc:description = "Volvox Example Database"

However, I do not think this is needed, in part because I think
the dates and the description fields are the only things that
would be affected.  Someone would need to present a good enough
use case for it.  The downside is that it adds another namespace
to the system and another layer to understand.


I've been trying to make sense of the XLink spec since that
seems relevant to what we're doing.   One possibility is
to replace things that point to URIs with "xlink:href".  But
is that a good idea?  Looking around I found Tim Berners-Lee's
commentary "When should I use XLink?"
   http://www.w3.org/DesignIssues/XLink.html

His answer is
    2. You should use xlink whenever your application is one
    of hypertext linking, as xlink functionality such as
    power to control user interface behavior on link traversal
    is useful and should be implemented in a standard way to
    allow interoperability

(BTW, there's the old joke that in a multiple choice answer
you should pick the longest one.  That's true in this case.)

No one, I think, will browse this data directly.  There will
be some intermediate translation going on in between, from
a web-based middle layer, a dedicated client, or an XSLT
transformation.  Those will be able to add xlink fields if
needed.

So again I mention it here as a possibility and for the
record, but I don't think it's something to use.


					Andrew
					dalke at dalkescientific.com



