properties and key/value data (was Re: [DAS2] Spec issues)

Andrew Dalke dalke at dalkescientific.com
Tue Nov 29 00:01:08 UTC 2005


Steve:
> To clarify a point of possible confusion, there are really two sets of
> key-value pairs to keep in mind:
>
> 1. The key-value pair for the property type.
> 2. The key-value pair for the property itself.

I don't see that #1 is a useful distinction.

> So in this example:
>
>   <PROP das:ptype="property/genefinder-score">29</PROP>
>
> The key for the type is 'das:ptype' and its value is
> 'property/genefinder-score', and this value is a relative URL based on
> xml:base in the enclosing PROPERTIES element (or in its grandparent or
> great-grandparent element, etc.). The value of the property itself is
> 29, and its key is the whole key-value pair for the type
> (das:ptype="property/genefinder-score").
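To check my reading of the relative-URL part, here is a minimal Python
sketch of the resolution step.  The base URL is made up (example.org is
only a placeholder for whatever xml:base the server really sends), and
I'm assuming ordinary relative-URL resolution is what's meant.

```python
from urllib.parse import urljoin

# Hypothetical xml:base value; not taken from any real DAS2 server.
base = "http://example.org/das2/"
ptype = "property/genefinder-score"

# With a trailing slash on the base, the relative part is appended.
resolved = urljoin(base, ptype)
print(resolved)

# Without the trailing slash, the last path segment of the base is replaced.
resolved_no_slash = urljoin("http://example.org/das2", ptype)
print(resolved_no_slash)
```

Note that the result depends on whether the base ends in a slash, so
the same das:ptype can name different things under different xml:base
values.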

How do I make an extension type?  For example, I want to add
a new property for 3D structure depiction, which can be one of
"cartoon", "ribbons", or "wires".

Let's say it's under my company web site in
   http://www.dalkescientific.com/das-types/rep3d

How do I write it?  I tried but couldn't figure it out.

What does that URL resolve to, if anything?


> In Andrew's Relax-NG equivalent:
>
>   <prop:genefinder-score>29</prop:genefinder-score>
>
> the element name contains both the key ('prop:') and the value of the
> property type ('genefinder-score'), while the element name as a whole
> serves as the key for the property itself (value=29). The
> 'prop:genefinder-score' string is not a relative URL, but is just a
> namespace-scoped element name, with 'prop:' serving merely to make
> 'genefinder-score' globally unique, relative to the URI defined by:
>
>   xmlns:prop="http://www.biodas.org/ns/das/genome/2.00/properties"

It took me a while to understand XML namespaces.  James Clark's
explanation helped:
   http://www.jclark.com/xml/xmlns.htm

He uses (for purposes of explanation) the so-called "Clark notation".
An example from that document is

    <cars:part xmlns:cars="http://www.cars.com/xml"/>
      maps to
   <{http://www.cars.com/xml}part/>

"""The role of the URI in a universal name is purely to allow
applications to recognize the name. There are no guarantees about
the resource identified by the URI."""

Clark notation makes that easy to remember: { and } are not valid
characters in a URL, so the braced name can't be mistaken for something
you are meant to resolve.

The element name "prop:genefinder-score" is a convenient way to
write the full element name, and that's all.  There is no meaning
to the parts of the name.  "prop:" is not a key, since given these
two namespace definitions

   <... xmlns:prop="http://www.dalkescientific.com/"
        xmlns:wash="http://www.dalkescientific.com/">

then these two elements are identical

     <prop:genefinder-score>29</prop:genefinder-score>
     <wash:genefinder-score>29</wash:genefinder-score>

I think Steve is saying the same thing as I am - I wanted to rephrase
it to make sure.
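
A quick way to see this is with Python's ElementTree, which reports
element names in Clark notation.  This is only a sketch: the <root>
wrapper element is something I made up to hold the two namespace
declarations.

```python
import xml.etree.ElementTree as ET

# The <root> wrapper is hypothetical; it only carries the two prefix
# declarations, which both point at the same namespace URI.
doc = """\
<root xmlns:prop="http://www.dalkescientific.com/"
      xmlns:wash="http://www.dalkescientific.com/">
  <prop:genefinder-score>29</prop:genefinder-score>
  <wash:genefinder-score>29</wash:genefinder-score>
</root>"""

root = ET.fromstring(doc)
first, second = list(root)

# ElementTree stores the universal name, not the prefix.
print(first.tag)
print(first.tag == second.tag)
```

Both children come back with the tag
{http://www.dalkescientific.com/}genefinder-score; the prefix used in
the source text is gone after parsing.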


> A potential drawback of the Relax-NG approach, as discussed in today's
> conf call, is that the value of the property type is not resolvable as
> in the other approach using the PROPERTIES parent element.
>
> Andrew doesn't see a need for resolvability, e.g., for a dynamically
> discoverable schema fragment. But I thought of another use case besides
> the one mentioned in today's call (determining data type such as int or
> float, which isn't of much use in practice). The URL for the type could
> point to a human readable definition of the term. A user may not need
> clarification of 'genefinder-score' but might for something like
> 'softberry-ztuple'.

Who is the user that would want the clarification?  That is, what human
will be doing the reading?

Once clarified, what does that user do with the information?

In my opinion, the only people who care about this are developers,
and more specifically, developers who will extend a client to support
new data types.  Users of, say, the web front end or of IGB don't care.

That's a relatively small number of people.  And the use case is
solved by having the doc_href for the versioned source include a
link to any extensions served.


Here's another solution. Somewhere early in the results include

<link ref="das.format_href" href="http://example.org/blah">

where the schema includes links for each of the fields, including
any extensions.  It doesn't need to be a <link>, just something
meant as a shout out to developer people.


> One could still satisfy such a use case under the Relax-NG approach by
> providing a resolvable URL based on the element name + namespace such
> as:
>
> http://www.biodas.org/ns/das/genome/2.00/properties#genefinder-score
>
> True, there's no XML spec that says this is legal, but we could declare
> that such a convention will hold for all biodas.org-based properties.
> One problem with the above convention is that it's not obvious what the
> URL resolves to. So we could have something like:
>
> http://www.biodas.org/ns/das/genome/2.00/properties?prop=genefinder-score&define=true
>
> http://www.biodas.org/ns/das/genome/2.00/properties?prop=genefinder-score&schema=true

We could do this, though it's a bit complicated with tools that
represent element names in Clark notation: it needs a bit of string
munging.
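
For what it's worth, the munging would look something like the sketch
below.  The function names are mine, and the "#" and "?prop=" URL
conventions are only the ones proposed in the quoted text; nothing in
any XML or DAS spec defines them.

```python
from urllib.parse import urlencode

def split_clark(tag):
    """Split a Clark-notation name '{namespace}local' into its two parts."""
    if not tag.startswith("{") or "}" not in tag:
        raise ValueError("not a namespaced name: %r" % (tag,))
    namespace, _, local = tag[1:].partition("}")
    return namespace, local

def fragment_url(tag):
    # Proposed convention: namespace URI + '#' + local name.
    namespace, local = split_clark(tag)
    return namespace + "#" + local

def query_url(tag, action):
    # Proposed convention: ?prop=<local name>&<action>=true
    namespace, local = split_clark(tag)
    return namespace + "?" + urlencode({"prop": local, action: "true"})

tag = "{http://www.biodas.org/ns/das/genome/2.00/properties}genefinder-score"
print(fragment_url(tag))
print(query_url(tag, "define"))
print(query_url(tag, "schema"))
```

It's not much code, but every client that wants the definition has to
carry something like it.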

I suggest that the reason why "it's not obvious what the URL resolves
to" is because there's nothing which will actually use this.

It is easier to just have a human-readable link, either on the doc_href
page or via some special "if you're a developer, look here" reference,
and not worry about automating it further.

					Andrew
					dalke at dalkescientific.com



