[DAS2] Refinements to range attribute and query filters in spec

Andrew Dalke dalke at dalkescientific.com
Fri Feb 10 10:47:26 UTC 2006


Gregg:
> In the latest spec, the format for range queries is
>       seqid/min:max:strand
> and the format for range attributes in feature elements is
>       min:max:strand


> I personally find these kind of ranges confusing and not particularly
> useful, and would rather make min and max required for both the range
> attribute and range-based query filters.

Agreed on this side.  All clients can easily get the upper limit,
and the lower limit is always 0.

> My main point though is we need to be explicit about how strand info or
> lack thereof affects features queries with range-based filters.

It was a confusion on my part.  There are three places which
refer to location + strand.

   1. specifying a feature location
   2. fetching a sequence
   3. doing a range search

"1. specifying a feature location"

We've been talking here about limiting the use of strands
for these.  Features definitely need a strand.  If the
strand is not specified then the feature is either on both
strands or the strand has no meaning.  If needed, resolve
the ambiguity by looking at the type (or other property).
If you really, really want to specify that it's on both
strands then use a strand of 0.

The location element currently looks like this
   <LOC id="some_url_for_sequence"/>  <!-- on whole sequence -->
   <LOC id="some_url_for_sequence" range="300:500" />
   <LOC id="some_url_for_sequence" range="300:500:-1" />  <!-- on strand -->

Given the decision yesterday that segments are special,
in terms of identification, I propose using the short id,
so these look like, respectively

   <LOC segment="Chr1"/>
   <LOC segment="Chr1/300:500"/>
   <LOC segment="Chr1/300:500:-1"/>
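A client could split the proposed segment value along these lines.
This is only a sketch: the names Segment and parse_segment are made
up for illustration, and it assumes the short id never contains a "/"
and that min and max are plain integers.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    seqid: str
    start: Optional[int]   # None means "whole sequence"
    end: Optional[int]
    strand: Optional[int]  # 1, -1, 0 (both strands), or None (not given)


def parse_segment(value: str) -> Segment:
    """Parse 'Chr1', 'Chr1/300:500' or 'Chr1/300:500:-1'."""
    seqid, sep, rng = value.partition("/")
    if not sep:
        # No range part: the location covers the whole sequence.
        return Segment(seqid, None, None, None)
    fields = rng.split(":")
    if len(fields) not in (2, 3):
        raise ValueError("expected min:max or min:max:strand, got %r" % rng)
    start, end = int(fields[0]), int(fields[1])
    if start > end:
        raise ValueError("min is greater than max in %r" % rng)
    strand = int(fields[2]) if len(fields) == 3 else None
    if strand not in (None, 1, -1, 0):
        raise ValueError("strand must be 1, -1 or 0, got %r" % fields[2])
    return Segment(seqid, start, end, strand)
```

For example, parse_segment("Chr1/300:500:-1") gives seqid "Chr1",
start 300, end 500 and strand -1.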

"2. fetching a sequence"

Why does the server need to support a reverse complement feature?
Let's leave it out and make the client do the reverse complement
itself if it needs it.
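To show how little the client has to do, here is a sketch of a
client-side reverse complement.  Note that it complements each base
as well as reversing the string.  The function name is made up, and
it assumes plain A/C/G/T sequence; any other character (such as N)
is passed through unchanged.

```python
# Translation table for the four DNA bases, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence string."""
    # Complement every base, then reverse the result.
    return seq.translate(_COMPLEMENT)[::-1]
```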

"3. doing a range search"

Is there any reason to specify the strandedness when doing
a feature query?

Discussion here seems to be "would be nice, but its absence
is one of the things people have never complained about
in DAS1".

I propose removing strandedness from the features query.

If others disagree then here are two solutions:

   A. Have a "strand=" parameter, so that the strandedness
      is separate from the ranges.  If you want a query for
      the union of range Chr1/A:B:-1 and range Chr1/X:Y:1
      then tough - make two requests, one for each strand.

   B. Ranges may specify the strand (as now) but if not
      specified then it means "of any strand".

We worked on a few cases where it might be useful to
make mixed strand queries.  There weren't any compelling
reasons.  Without strand support in the features query,
the worst case is that you get on average twice the
number of features back, and the worst case for option A
is the need to make two queries.
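If strandedness is dropped from the features query, a client that
only cares about one strand can filter the results itself.  A sketch,
assuming the client has already turned the response into
(feature_id, min, max, strand) tuples; that tuple layout and the
function name are illustrative only, not part of the spec.

```python
def features_on_strand(features, strand, include_both=False):
    """Keep only the features on the given strand (1 or -1).

    features is an iterable of (feature_id, min, max, strand) tuples.
    If include_both is true, features marked with strand 0 (on both
    strands) are kept as well.
    """
    wanted = {strand, 0} if include_both else {strand}
    return [f for f in features if f[3] in wanted]
```

This throws away about half of what came back, which is the "twice
the number of features" cost mentioned above.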


					Andrew
					dalke at dalkescientific.com



