[DAS2] Spec issues

Lincoln Stein lstein at cshl.edu
Thu Oct 27 15:41:30 UTC 2005


On Wednesday 26 October 2005 07:29 pm, Chervitz, Steve wrote:
> In the spec for DAS/2 retrievals, there are some open issues regarding
> types and features that I'd like to solicit feedback on. This is kind of a
> long message, so feel free to pick and choose what you want to respond to.
>
> For reference, here's the latest retrieval spec:
> http://biodas.org/documents/das2/das2_get.html
>
> Type properties example (only showing relevant attributes):
>
> Description: A set of machine-readable configuration information as
> key/value pairs
>
> <TYPES xml:base="http://www.wormbase.org/dase/genome/volvox/1/type">
>   <TYPE id="curated_gene"
>           ontology="http://song.sf.net/ontologies/sofa#gene"
>           source="curated"
>           xml:base="gene/">
>     <PROP key="bg:glyph"   value="arrow" />
>     <PROP key="das:editable" value="yes" />
>   </TYPE>
> </TYPES>
>
> The spec currently describes the key attribute as "the name of the
> property. Elaborate on how to interpret the name". So how should name be
> interpreted? Can it be a URI/URL? If we want it to be just a simple string
> that can derive from some controlled vocabulary, how does one specify which
> vocabulary it derives from? (e.g.,
> http://www.biodas.org/ns/das/properties/2.00)

I thought that the "bg:" and "das:" were straight XML namespaces, i.e.:

	<TYPES
		xmlns:bg="http://www.biodas.org/ns/properties"
		..etc...
	
So that bg:glyph is interpreted as http://www.biodas.org/ns/properties#glyph
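
A minimal sketch of that reading, assuming the key is split at the first colon and the local part is appended to the namespace URI with a "#" (the only joining rule shown in this message). The function name expand_key and the dictionary of prefixes are hypothetical, not part of the DAS/2 spec:

```python
def expand_key(key, namespaces, sep="#"):
    """Expand 'prefix:local' to a full URI using the declared prefixes."""
    prefix, colon, local = key.partition(":")
    if not colon or prefix not in namespaces:
        raise KeyError(f"undeclared prefix in {key!r}")
    return namespaces[prefix] + sep + local


# Prefix map as it would be declared on <TYPES>; only bg: is given above.
namespaces = {"bg": "http://www.biodas.org/ns/properties"}

expanded = expand_key("bg:glyph", namespaces)
print(expanded)  # http://www.biodas.org/ns/properties#glyph
```

An undeclared prefix raises an error here rather than passing through, which is one possible policy; the spec would need to say what a client should do in that case.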

> Also, we might want to allow some controlled vocabulary terms to be used
> for the value of type.source (e.g., "das:curated"), to ensure that
> different users use the same term to specify that a feature type is
> produced by curation.

Same idea, but see below.

>
> The spec also seems alarmed by the existence of a xml:base attribute in the
> TYPE element. The idea is that any relative URL within this element would
> be resolved using that element's xml:base attribute. How would folks be
> with having the DAS/2 spec fully support the XML Base spec (
> http://www.w3.org/TR/xmlbase/ )? The result of this would be to add an
> optional xml:base attribute to all elements that contain URLs or
> subelements with URLs.

> For an example of how this would work, in the above XML snippet, the
> absolute URL for TYPE.id would be
> http://www.wormbase.org/dase/genome/volvox/1/type/gene/curated_gene

I'm ok with this.
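
As a rough check of the nested xml:base example, here is a sketch that resolves the references the way a generic URI library would. It uses Python's urllib.parse.urljoin as a stand-in for XML Base resolution; the helper name resolve is made up for illustration. It suggests the outer base needs a trailing slash to produce the URL quoted above, because standard relative resolution replaces the last path segment of the base:

```python
from urllib.parse import urljoin


def resolve(*parts):
    """Resolve each relative reference against the result so far."""
    url = parts[0]
    for part in parts[1:]:
        url = urljoin(url, part)
    return url


outer = "http://www.wormbase.org/dase/genome/volvox/1/type"

# Outer xml:base exactly as written in the example (no trailing slash).
as_written = resolve(outer, "gene/", "curated_gene")

# Same chain, but with a trailing slash on the outer base.
with_slash = resolve(outer + "/", "gene/", "curated_gene")

print(as_written)  # http://www.wormbase.org/dase/genome/volvox/1/gene/curated_gene
print(with_slash)  # http://www.wormbase.org/dase/genome/volvox/1/type/gene/curated_gene
```

If the spec adopts XML Base, its examples would probably want base URLs that end in "/" so that the "type" segment is kept.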

> Next issue: Feature properties example (only showing relevant attributes):
>
> Description: Properties are typed using the ptype attribute. The value of
> the property may be indicated by a URL given by the href attribute, or may
> be given inline as the CDATA content of the <PROP> section.
>
> <FEATURES xml:base="http://www.wormbase.org/das/genome/volvox/1/">
>   <FEATURE id="feature/cTel54X.1.2"
>                    type="type/curated_exon">
>     <PROP ptype="property/genefinder-score">29</PROP>
>     <PROP ptype="das:phase">2</PROP>
>     <PROP ptype="property/protein_translation"
>                href="/das/protein/volvox/2/feature/CTEL54X.1" />
>   </FEATURE>
> </FEATURES>
>
> So in contrast to the TYPE properties which are restricted to being simple
> string-based key:value pairs, FEATURE properties can be more complex, which
> seems reasonable, given the wild world of features. We might consider using
> 'key' rather than 'ptype' for FEATURE properties, for consistency with TYPE
> prop elements (however, read on).

I'm not so happy with "key" since it is nondescript. Originally this was 
"type" but the word collided with feature type.

I am getting uncomfortable with the dichotomy we've (I've?) created between 
XML base keys/properties and namespace-based keys/properties. It seems nasty 
to have the ptype attribute be either a relative URI 
(property/genefinder-score), or a controlled vocabulary member (das:phase). 
Is there any reason we shouldn't choose one or the other?

For example, does this work?

	<FEATURES xmlns:das="http://www.biodas.org/ns/das/genome/2.00"
		  xmlns:dasprop="http://www.biodas.org/ns/das/genome/2.00/properties"
		  xmlns:type="http://www.wormbase.org/das/genome/volvox/1/type"
		  xmlns:id="http://www.wormbase.org/das/genome/volvox/1/feature"
		  xmlns:prop="http://www.wormbase.org/das/genome/volvox/1/property">
		<FEATURE das:id="id:cTel54X.1.2"
			 das:type="type:curated_exon">
			<PROP das:ptype="prop:genefinder-score">29</PROP>
			<PROP das:ptype="dasprop:phase">2</PROP>
			<PROP das:ptype="dasprop:protein_translation"
			      das:href="http://www.wormbase.org/das/protein/volvox/2/feature/CTEL54X.1" />
		</FEATURE>
	</FEATURES>

This looks so much cleaner to me.
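
To see what a client would have to do with this form, here is a sketch using Python's standard xml.etree.ElementTree. The prefixes inside attribute values are not expanded by the XML parser, so the client has to collect the declarations itself and expand them. The "/" used to join namespace URI and local name is an assumption (the message does not fix a joining rule for these values), and read_props and expand are hypothetical names:

```python
import io
import xml.etree.ElementTree as ET

DAS = "http://www.biodas.org/ns/das/genome/2.00"

DOC = """\
<FEATURES xmlns:das="http://www.biodas.org/ns/das/genome/2.00"
          xmlns:dasprop="http://www.biodas.org/ns/das/genome/2.00/properties"
          xmlns:type="http://www.wormbase.org/das/genome/volvox/1/type"
          xmlns:id="http://www.wormbase.org/das/genome/volvox/1/feature"
          xmlns:prop="http://www.wormbase.org/das/genome/volvox/1/property">
  <FEATURE das:id="id:cTel54X.1.2" das:type="type:curated_exon">
    <PROP das:ptype="prop:genefinder-score">29</PROP>
    <PROP das:ptype="dasprop:phase">2</PROP>
  </FEATURE>
</FEATURES>
"""


def expand(value, prefixes, sep="/"):
    """Expand 'prefix:local' using the collected xmlns declarations."""
    prefix, _, local = value.partition(":")
    return prefixes[prefix] + sep + local  # sep="/" is an assumption


def read_props(xml_text):
    """Map each PROP's expanded ptype to its text content."""
    prefixes, props = {}, {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "end")):
        if event == "start-ns":
            prefix, uri = payload  # e.g. ("prop", "http://.../property")
            prefixes[prefix] = uri
        elif payload.tag == "PROP":
            ptype = payload.get("{%s}ptype" % DAS)
            props[expand(ptype, prefixes)] = payload.text
    return props


props = read_props(DOC)
for key, value in props.items():
    print(key, "=", value)
```

The attribute names (das:ptype) are resolved by the parser, but the values (prop:genefinder-score) are plain strings as far as XML is concerned, which is the main cost of this approach compared with xml:base.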

> In the feature filter section, the property-based filter describes feature
> properties as being string-based, a la TYPE properties. More complex
> feature properties would not necessarily be filterable, so this should be
> expanded upon, stating that property-based feature filters will only work
> for feature properties that are simple strings (not properties where the
> value is a URL or is a CDATA with MIME type not equal to text/plain).
>
> One issue that comes up here, which actually pertains to the spec as a
> whole, is that there are various attributes that are intended to be URLs
> but are named quite different things. In the FEATURE snippet above, there
> are four different attributes that are URLs: id, type, ptype, and href.
> There is a bugzilla entry requesting that all attributes named 'id' which
> are in fact URLs be named 'uri':
> http://bugzilla.open-bio.org/show_bug.cgi?id=1788

> This seems like a good move to me, since it flags these attributes as
> resolvable. Does anyone have objections to this?

Ok with me, but I'd like to hear what people think about throwing out the base 
idea entirely and using namespaces as described above.
>
> For other attributes that are URLs but are not named 'id' or 'href' (such
> as type, ptype above), we could either leave as-is, or we could append
> '_uri' to their name to flag their resolvability. Feature's PROP.ptype is
> an interesting case, since it is both an identifier (equivalent to type
> PROP.key) and a URL for describing the property. For this reason, I would
> also propose either renaming it 'uri' (to capture this dual role) or 'key'
> (for consistency with type properties). Thoughts?
>
> The feature example DASXML above also shows a way to attach a protein
> translation to a feature as a property. Since this will be a common task,
> I'd vote for having a feature property of
> "das:property/protein_translation" among the list of built-in feature
> properties in the das namespace. Anyone want to take a stab at defining the
> full list of built-in properties within the "das:" and "bg:" namespaces? I
> think it's a key requirement for interoperability.

Absolutely -- I'd like to start work on that.

Lincoln

>
> Steve
>
>

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
FOR URGENT MESSAGES & SCHEDULING, 
PLEASE CONTACT MY ASSISTANT, 
SANDRA MICHELSEN, AT michelse at cshl.edu



