[DAS] DAS1.6: coordinate systems
thomas.a.down at gmail.com
Wed Aug 11 19:14:41 UTC 2010
The current spec, as I read it, is a bit vague about how we should refer to
coordinate systems.
There seem to be three ways to represent a CS:
- Comma-separated list, e.g. NCBI_36,Chromosome,Homo sapiens
- URI, e.g.
- XML, e.g.:
source="Chromosome" authority="NCBI" test_range="1:1,1000"
The XML representation seems to be the most complete.
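
To make the comparison concrete, here is a rough Python sketch of pulling
the comma-separated form apart. It assumes the layout is
AUTHORITY_VERSION,SOURCE,ORGANISM, which is my guess from the example above
rather than anything the spec states; CoordSys and parse_cs are made-up
names.

```python
from typing import NamedTuple, Optional


class CoordSys(NamedTuple):
    authority: str
    version: Optional[str]
    source: str
    organism: Optional[str]


def parse_cs(text: str) -> CoordSys:
    """Split 'AUTHORITY_VERSION,SOURCE,ORGANISM' into parts (assumed layout)."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"not a coordinate system string: {text!r}")
    # Assumption: the version, if present, follows the first underscore.
    authority, _, version = parts[0].partition("_")
    organism = parts[2] if len(parts) > 2 and parts[2] else None
    return CoordSys(authority, version or None, parts[1], organism)


cs = parse_cs("NCBI_36,Chromosome,Homo sapiens")
print(cs)
```

Note that nothing here recovers a URI or a test range, which is part of why
the XML form looks like the most complete of the three.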
The URIs don't really get discussed much in the spec. Should they resolve
to anything in particular? Or should they just be treated as opaque
strings? The example I've given resolves to an HTML document with a
Vitruvian Man icon and some human-readable details, but probably isn't going
to be any help to a client.
If you restrict yourself to single-genome DAS (sequence, features, etc.),
this all works out fine -- the only interaction you need with the coordinate
system infrastructure is to filter out suitable sources from a registry, and
in that case you can either filter on the XML COORDINATES elements -- which
is fairly straightforward -- or you can ask the registry to filter for you
(using a data model which is a reasonably close match to the XML).
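
For what it's worth, the client-side filtering can be sketched in a few
lines of Python. The sample document below is invented for illustration
(source URIs, titles and the second coordinate system are all made up), and
it only uses the COORDINATES attributes shown in the example above;
sources_matching is a hypothetical helper, not part of any DAS library.

```python
import xml.etree.ElementTree as ET

# Invented sample, loosely shaped like a registry sources listing.
SAMPLE = """<SOURCES>
  <SOURCE uri="example_features" title="Example features">
    <VERSION uri="example_features_1">
      <COORDINATES source="Chromosome" authority="NCBI"
                   test_range="1:1,1000">NCBI_36,Chromosome,Homo sapiens</COORDINATES>
    </VERSION>
  </SOURCE>
  <SOURCE uri="example_genes" title="Example genes">
    <VERSION uri="example_genes_1">
      <COORDINATES source="Gene_ID" authority="Ensembl">Ensembl,Gene_ID,Homo sapiens</COORDINATES>
    </VERSION>
  </SOURCE>
</SOURCES>"""


def sources_matching(xml_text, **wanted):
    """Return the uri of every SOURCE with a COORDINATES element whose
    attributes equal all of the requested values."""
    root = ET.fromstring(xml_text)
    hits = []
    for source in root.iter("SOURCE"):
        for coords in source.iter("COORDINATES"):
            if all(coords.get(key) == value for key, value in wanted.items()):
                hits.append(source.get("uri"))
                break  # one matching coordinate system is enough
    return hits


print(sources_matching(SAMPLE, authority="NCBI", source="Chromosome"))
```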
However, working with coordinate systems seems to be pretty much essential
once you start working with alignments, and this is where things start to
The returned alignment XML defines the CS of each sequence in the alignment
using the comma-separated form. My assumption is that you're meant to treat
this as an opaque string and correlate it with data from a registry, but
this isn't 100% clear.
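
If the opaque-string reading is right, the correlation step would look
something like the Python sketch below. It assumes the text content of each
COORDINATES element in the registry data is exactly the comma-separated
string that alignments return, which is my assumption and not something I
can point to in the spec. The wrapper element and index_by_label are
invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented registry extract; the <REGISTRY> wrapper is just a placeholder.
REGISTRY_XML = """<REGISTRY>
  <COORDINATES source="Chromosome" authority="NCBI"
               test_range="1:1,1000">NCBI_36,Chromosome,Homo sapiens</COORDINATES>
</REGISTRY>"""


def index_by_label(xml_text):
    """Map each COORDINATES element's text (treated as an opaque key)
    to a dict of that element's attributes."""
    root = ET.fromstring(xml_text)
    return {
        el.text.strip(): dict(el.attrib)
        for el in root.iter("COORDINATES")
        if el.text and el.text.strip()
    }


index = index_by_label(REGISTRY_XML)

# String as it might appear on a sequence in an alignment response.
label = "NCBI_36,Chromosome,Homo sapiens"
print(index.get(label))  # None would mean the registry doesn't know this CS
```

An exact string match like this is fragile: any difference in whitespace or
capitalisation between the two documents breaks the lookup, which is one
more reason it would be good to have the intended behaviour spelled out.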
On the other hand, if you want to specify a coordinate system in the
alignment QUERY, you're supposed to provide a URI. It's not at all clear to
me what a server is supposed to be doing with this. Again, opaque string?
Is it too late to ask if there's any chance of rationalizing this (and maybe
providing a few concrete examples in the spec) before 1.6-final?