[DAS2] Spec issues

Chervitz, Steve Steve_Chervitz at affymetrix.com
Fri Nov 4 20:32:22 UTC 2005


As Gregg noted in this week's DAS/2 meeting, xml:base and
XML namespace (xmlns) are complementary technologies:
 
  * xml:base is for resolving relative URLs occurring within attribute
     values or the CDATA content of elements
  * xmlns is for resolving names of attributes and elements.
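
To make the distinction concrete, here is a rough Python sketch (not part
of the spec). It assumes the standard library's urljoin and ElementTree
behave like a conforming xml:base / xmlns processor for these simple
cases. The URLs are the ones from the examples in this thread.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

XML_BASE = "http://www.wormbase.org/das/genome/volvox/1/"
DAS_NS = "http://www.biodas.org/ns/das/genome/2.00"

# xml:base: relative URLs found in attribute values resolve against the base.
feature_url = urljoin(XML_BASE, "feature/cTel54X.1.2")
# A value starting with "/" replaces the whole path of the base URL.
protein_url = urljoin(XML_BASE, "/das/protein/volvox/2/feature/CTEL54X.1")

# xmlns: element (and prefixed attribute) names are qualified by a namespace URI.
root = ET.fromstring('<FEATURES xmlns="%s"/>' % DAS_NS)
element_name = root.tag  # ElementTree writes this as "{namespace-uri}localname"
```

So xml:base changes what a value points at, while xmlns changes what a
name means.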

So bearing this in mind, here's my take:

On Thursday 27 October 2005, Lincoln Stein wrote:
>
> On Wednesday 26 October 2005 07:29 pm, Chervitz, Steve wrote:
> >
> > <snip>
> >
> > Next issue: Feature properties example (only showing relevant attributes):
> >
> > Description: Properties are typed using the ptype attribute. The value of
> > the property may be indicated by a URL given by the href attribute, or may
> > be given inline as the CDATA content of the <PROP> section.
> >
> > <FEATURES xml:base="http://www.wormbase.org/das/genome/volvox/1/">
> >   <FEATURE id="feature/cTel54X.1.2"
> >                    type="type/curated_exon">
> >     <PROP ptype="property/genefinder-score">29</PROP>
> >     <PROP ptype="das:phase">2</PROP>
> >     <PROP ptype="property/protein_translation"
> >                href="/das/protein/volvox/2/feature/CTEL54X.1" />
> >   </FEATURE>
> > </FEATURES>
> >
> > So in contrast to the TYPE properties which are restricted to being simple
> > string-based key:value pairs, FEATURE properties can be more complex, which
> > seems reasonable, given the wild world of features. We might consider using
> > 'key' rather than 'ptype' for FEATURE properties, for consistency with TYPE
> > prop elements (however, read on).
> 
> I'm not so happy with "key" since it is nondescript. Originally this was
> "type" but the word collided with feature type.
> 
> I am getting uncomfortable with the dichotomy we've (I've?) created between
> XML base keys/properties and namespace-based keys/properties. It seems nasty
> to have the ptype attribute be either a relative URI
> (property/genefinder-score), or a controlled vocabulary member (das:phase).
> Is there any reason we shouldn't choose one or the other?
> 
> For example, does this work?
> 
>  <FEATURES xmlns:das="http://www.biodas.org/ns/das/genome/2.00"
>         xmlns:dasprop="http://www.biodas.org/ns/das/genome/2.00/properties"
>         xmlns:type="http://www.wormbase.org/das/genome/volvox/1/type"
>         xmlns:id="http://www.wormbase.org/das/genome/volvox/1/feature">
>         xmlns:prop="http://www.wormbase.org/das/genome/volvox/1/property">
>       <FEATURE das:id="id:cTel54X.1.2"
>                   das:type="type:curated_exon">
>              <PROP das:ptype="prop:genefinder-score">29</PROP>
>              <PROP das:ptype="dasprop:phase">2</PROP>
>              <PROP das:ptype="dasprop:protein_translation"
>  das:href="http://www.wormbase.org/das/protein/volvox/2/feature/CTEL54X.1" />
>       </FEATURE>
> 
> This looks so much cleaner to me.

Here's a new version of this example using xml:base, a default xmlns,
and a special attribute to define the URL for the controlled
vocabulary of DAS property keys. I'm also using xlink for the href:

  <FEATURES xmlns="http://www.biodas.org/ns/das/genome/2.00"
            xmlns:das="http://www.biodas.org/ns/das/genome/2.00"
            xml:base="http://www.wormbase.org/das/genome/volvox/1/"
            xmlns:xlink="http://www.w3.org/1999/xlink"
            das:prop="http://www.biodas.org/ns/das/genome/2.00/properties"
            >
    <FEATURE das:id="feature/cTel54X.1.2"
             das:type="type/curated_exon">
      <PROP das:ptype="property/genefinder-score">29</PROP>
      <PROP das:ptype="das:prop#phase">2</PROP>
      <PROP das:ptype="das:prop#protein_translation"
            xlink:type="simple"
            xlink:href="http://www.wormbase.org/das/protein/volvox/2/feature/CTEL54X.1" />
    </FEATURE>
  </FEATURES>

According to the W3C XML namespace spec, the default namespace applies
only to elements, not to attributes, which is why there is a separate
'xmlns:das' pointing to the same URL as the default namespace. This
permits assigning a namespace to the attributes.
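
A quick way to see this is with Python's ElementTree, which I am assuming
here as a stand-in for any namespace-aware parser. The attribute 'id' in
this cut-down document is deliberately left unprefixed for illustration:

```python
import xml.etree.ElementTree as ET

DAS_NS = "http://www.biodas.org/ns/das/genome/2.00"
doc = (
    '<FEATURES xmlns="%s" xmlns:das="%s">'
    '<FEATURE id="feature/cTel54X.1.2" das:type="type/curated_exon"/>'
    '</FEATURES>' % (DAS_NS, DAS_NS)
)

feature = ET.fromstring(doc)[0]
element_name = feature.tag            # picks up the default namespace
attrib_names = sorted(feature.attrib)  # only the prefixed attribute is qualified
```

The FEATURE element comes back namespace-qualified, the unprefixed 'id'
attribute comes back as plain 'id', and only 'das:type' carries the DAS
namespace.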

The above example avoids using xmlns in a non-standard way (i.e.,
referring to a namespace within attribute values, as in Lincoln's
example). The interpretation is as follows:

 * the 'das:prop' attribute gives the URL of the controlled vocabulary
   for property types occurring in this document

 * 'das:id', 'das:type', and 'das:ptype' attribute keys are defined
   within the xmlns:das namespace (i.e., the full id of 'type' is
   derived by appending '#type' to the xmlns:das URL).

 * the values of the 'das:id', 'das:type', and 'das:ptype' attributes
   are URLs relative to xml:base unless they begin with 'das:prop#', in
   which case they are relative to the das:prop namespace.
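
The resolution rule in the last bullet could be sketched as follows. This
is only an illustration of the proposal, not spec'd behavior; the function
name resolve_value is made up, and the two URLs are the xml:base and
das:prop values from the example above.

```python
from urllib.parse import urljoin

XML_BASE = "http://www.wormbase.org/das/genome/volvox/1/"
DAS_PROP = "http://www.biodas.org/ns/das/genome/2.00/properties"
PREFIX = "das:prop#"

def resolve_value(value, base=XML_BASE, prop_url=DAS_PROP):
    """Expand an attribute value to an absolute URL (hypothetical helper)."""
    # Check the prefix first: a generic URL resolver would otherwise
    # treat "das:" as a URL scheme.
    if value.startswith(PREFIX):
        return prop_url + "#" + value[len(PREFIX):]
    # Everything else is an ordinary URL relative to xml:base.
    return urljoin(base, value)

phase_url = resolve_value("das:prop#phase")
score_url = resolve_value("property/genefinder-score")
```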

So, for example, the 'das:prop#phase' attribute value is really
shorthand for this absolute, globally unique URL (which, if it
existed, could provide metadata about the property type):

  http://www.biodas.org/ns/das/genome/2.00/properties#phase

The value of the phase property for this feature is given inline by
the CDATA (2). A value can instead be specified via an xlink:href
attribute, as in the protein_translation property above, in which case
the URL must be resolved to get the actual value.
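
A client reading PROP elements under this scheme might do something like
the following. It is a minimal sketch: prop_value is a hypothetical helper,
the document is a cut-down version of the example above, and actually
fetching the href is left out.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

def prop_value(prop):
    """Return ("href", url) or ("inline", text) for one PROP element."""
    href = prop.get(XLINK_HREF)
    if href is not None:
        return ("href", href)  # caller must resolve the URL to get the value
    return ("inline", (prop.text or "").strip())

doc = """
<FEATURE xmlns:xlink="http://www.w3.org/1999/xlink">
  <PROP>29</PROP>
  <PROP>2</PROP>
  <PROP xlink:type="simple"
        xlink:href="http://www.wormbase.org/das/protein/volvox/2/feature/CTEL54X.1"/>
</FEATURE>
""".strip()

values = [prop_value(p) for p in ET.fromstring(doc)]
```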

What do folks think about this scheme? We could do a similar thing
with type properties.

Also, how do folks feel about using xlink for all of our href
attributes as shown above? Seems more correct to me. We refer to the
xlink namespace already in our XML examples, but don't actually use it
anywhere.

Steve