[DAS2] today's sprint meeting

Andrew Dalke dalke at dalkescientific.com
Tue Mar 14 16:21:54 UTC 2006


Gregg can't make it this morning and asked that I lead today's
meeting.  Here are the things I would like to talk about:

== segment identifier.

Quoting from my email yesterday

   - do not use segment "name" as an identifier
       - rename it "title" (human readable only)
       - allow a new optional "alias-of" attribute which is the
            link to the primary identifier for this segment

<SEGMENTS>
  <SEGMENT uri="http://whatever.com/ChromosomeA" length="2000"
     alias-of="http://www.ncbi.nlm.nih.gov/human/v32/Chromosome/A"
     title="Chromosome A" />
</SEGMENTS>


   - change the feature location to use the segment uri

<FEATURES>
   <FEATURE id="F0001" type_id="T0001">
     <LOC segment_uri="http://whatever.com/ScaffoldA" range="200:300"/>
   </FEATURE>
</FEATURES>
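
As a rough sketch of how a client might use these two documents together,
the Python below maps each local segment uri to its primary identifier
(the "alias-of" value when present, otherwise the uri itself) and then
resolves a feature location through that table.  The function names are
made up for illustration, and the feature here points at ChromosomeA
(not ScaffoldA as above) purely so the lookup has something to find.
The range is split on ":" only; nothing is assumed about whether the
ends are inclusive.

```python
import xml.etree.ElementTree as ET

SEGMENTS_XML = """
<SEGMENTS>
  <SEGMENT uri="http://whatever.com/ChromosomeA" length="2000"
     alias-of="http://www.ncbi.nlm.nih.gov/human/v32/Chromosome/A"
     title="Chromosome A" />
</SEGMENTS>
"""

# Assumption: this feature refers to ChromosomeA so the alias lookup succeeds.
FEATURES_XML = """
<FEATURES>
  <FEATURE id="F0001" type_id="T0001">
    <LOC segment_uri="http://whatever.com/ChromosomeA" range="200:300"/>
  </FEATURE>
</FEATURES>
"""


def primary_ids(segments_xml):
    """Map each local segment uri to its primary identifier."""
    table = {}
    for seg in ET.fromstring(segments_xml).iter("SEGMENT"):
        uri = seg.get("uri")
        # "alias-of" is optional; without it the segment is its own primary.
        table[uri] = seg.get("alias-of", uri)
    return table


def feature_locations(features_xml, table):
    """Yield (feature id, primary segment id, start, stop) tuples."""
    for feat in ET.fromstring(features_xml).iter("FEATURE"):
        for loc in feat.iter("LOC"):
            start, stop = (int(n) for n in loc.get("range").split(":"))
            seg = loc.get("segment_uri")
            # Segments missing from the table are passed through unchanged.
            yield feat.get("id"), table.get(seg, seg), start, stop


locations = list(feature_locations(FEATURES_XML, primary_ids(SEGMENTS_XML)))
```

Two servers that alias the same primary identifier would then produce
locations a client can compare directly, which is the point of not using
the human-readable title as the key.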

   - change the feature filter range searches so there is a new "segment"
      keyword and so the "includes", "overlaps", etc. only work on
      the given segment, as
         segment=<uri>
         inside=$start:$stop
         overlaps=$start:$stop
         contains=$start:$stop
         identical=$start:$stop

http://biodas.org/feature.cgi?segment=http://whatever.com/ChromosomeD;inside=5000:6000

With URL escaping rules for the query string, that is

...feature.cgi?segment=http%3A%2F%2Fwhatever.com%2FChromosomeD&inside=5000%3A6000

   - If 'includes', 'overlaps', etc. are given then the 'segment'
       must be given (do we need this restriction?  It doesn't make
        sense to me to ask for "annotations on 1000 to 2000 of anything")

   - only allow at most one each of includes, overlaps,
       contains, or identical (do we need this restriction?  Then again,
       Gregg only needs a single includes and a single overlaps; perhaps
       make this even more restrictive?)

   - multiple segments may be given, but then range searches
       are not supported (do we need this restriction?)
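
To make the escaping worry and the proposed restrictions concrete, here is
a minimal Python sketch of a client-side query builder.  The function name
and the (start, stop) tuple convention are my own assumptions, not part of
the spec.  It uses "&" as the separator, as in the escaped example above,
and it takes the restrictions as written: a range filter needs exactly one
segment, and each range keyword may appear at most once (keyword arguments
enforce that for free).

```python
from urllib.parse import urlencode

# Range keywords as listed in the proposal above.
RANGE_KEYS = ("inside", "overlaps", "contains", "identical")


def feature_query(segments, **ranges):
    """Build an escaped query string for a feature filter request.

    segments: list of segment uris
    ranges:   at most one (start, stop) pair per range keyword
    """
    unknown = set(ranges) - set(RANGE_KEYS)
    if unknown:
        raise ValueError(f"unknown range filter(s): {sorted(unknown)}")
    if ranges and len(segments) != 1:
        # Covers two proposed rules: a range search requires a segment,
        # and range searches are not supported with multiple segments.
        raise ValueError("range filters need exactly one segment")
    params = [("segment", uri) for uri in segments]
    for key in RANGE_KEYS:
        if key in ranges:
            start, stop = ranges[key]
            params.append((key, f"{start}:{stop}"))
    # urlencode percent-escapes ":" and "/" inside each value.
    return urlencode(params)


query = feature_query(["http://whatever.com/ChromosomeD"], inside=(5000, 6000))
print("http://biodas.org/feature.cgi?" + query)
```

If the restrictions are relaxed later, only the two checks at the top of
the function would need to change; the escaping stays the same.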

The consensus on this side seems to be that this is fine.  The biggest
worry is the increasing use of URIs in URL query strings.


== coordinate systems

Quoting from an email I wrote recently

   - move the COORDINATE element inside of the
         CAPABILITY[type="segments"] element

   - add a 'created' timestamp to the COORDINATE (for sorting by time)

   - add a unique 'uri' identifier attribute to the COORDINATE
      (two coordinates are equal if and only if they have the same uri)

Result looks like

<CAPABILITY type="segments"
      query_uri="http://localhost/das2/h.sapiens/v22/segments">
   <COORDINATES uri="http://das.sanger.ac.uk/registry/coordinates/ABC123"
      source="Chromosome" authority="NCBI" version="v22" taxid="9606"
      created="2006-03-14T07:27:49" />
</CAPABILITY>


   - have that identifier be resolvable, to get information about
       the coordinate system (but perhaps leave the contents for a
       future spec)
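
A small Python sketch of how a client could read this, assuming the layout
shown in the example above: pull the COORDINATES out of the segments
CAPABILITY, compare them by 'uri' alone, and sort by 'created'.  The helper
names are hypothetical, and I am assuming the timestamp is always in the
ISO form used in the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

CAPABILITY_XML = """
<CAPABILITY type="segments"
      query_uri="http://localhost/das2/h.sapiens/v22/segments">
   <COORDINATES uri="http://das.sanger.ac.uk/registry/coordinates/ABC123"
      source="Chromosome" authority="NCBI" version="v22" taxid="9606"
      created="2006-03-14T07:27:49" />
</CAPABILITY>
"""


def coordinates(capability_xml):
    """Return the COORDINATES attribute dicts, oldest 'created' first."""
    cap = ET.fromstring(capability_xml)
    if cap.get("type") != "segments":
        return []
    found = [dict(el.attrib) for el in cap.iter("COORDINATES")]
    # Assumes 'created' is an ISO timestamp like 2006-03-14T07:27:49.
    return sorted(found, key=lambda c: datetime.fromisoformat(c["created"]))


def same_coordinates(a, b):
    """Two coordinate systems are equal if and only if their uris match."""
    return a["uri"] == b["uri"]


coords = coordinates(CAPABILITY_XML)
```

Note that same_coordinates deliberately ignores source, authority, version
and taxid; under the proposal those are descriptive only.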

== use 'uri' instead of 'id' in the spec

I've decided to go with 'uri' instead of 'id' (or 'url' or 'iri')
in its various uses in the spec.

== churn

My feeling is this is the last major churn.  I'm not able to keep
up with the documentation writing, which makes it hard for people
to get things done.

Should I work with people today on getting data sources working
and developing example data files for people to review?  That is,
examples which show and explain the various elements in the spec?
I figure more people work from example than from spec description.

					Andrew
					dalke at dalkescientific.com



