[DAS] Adjacent feature extension
Jonathan Warren
jw12 at sanger.ac.uk
Mon Mar 7 14:11:20 UTC 2011
On 7 Mar 2011, at 12:43, Andy Jenkinson wrote:
> On 7 Mar 2011, at 11:51, Jonathan Warren wrote:
>
>> On 7 Mar 2011, at 11:19, Andy Jenkinson wrote:
>>
>>> On 7 Mar 2011, at 10:57, Jonathan Warren wrote:
>>>
>>>>
>>>> My vote would ideally be to change feature_by_id to return one
>>>> feature and to have adjacent_feature return one feature as well.
>>>> In my opinion this would mean these capabilities on servers do
>>>> "exactly as they say on the tin" and are easier for data providers
>>>> to implement, and thus more likely to be implemented?
>>>> If the feature_id capability as it stands is needed, it could be
>>>> renamed to something closer to what it means, like
>>>> feature_id_region, but I would bet no one would bother to change
>>>> it/use it?
>>>>
>>>> However the reality is that we are too late to change the old
>>>> feature_by_id, but I don't think we need to make the same mistake
>>>> twice by repeating it for adjacent_features?
>>>
>>> I disagree. I think the problems with feature-by-id are that a)
>>> the name of the capability implies singular, and b) the concept
>>> itself (i.e. getting a feature by its ID) is such a common
>>> operation that is otherwise missing in DAS. I don't think either
>>> of those apply to an "adjacent" capability unless you specifically
>>> choose to call it "adjacent-feature" as opposed to
>>> "adjacent-features". I honestly don't think a capability called
>>> "adjacent-features" with a query structure like
>>> "/das/features?adjacent=foo:1" implies singular, rather the
>>> opposite in fact. To me that query
>>> suggests "get me the features adjacent to foo:1". True that 2
>>> features is plural which still leaves a "one feature either side"
>>> interpretation possible, but IMO certainly not so implicit that it
>>> would stop anyone implementing it from actually reading the
>>> specification/documentation. Add to that the fact that this is an
>>> entirely new behaviour that we have the chance to document
>>> properly, making it clear exactly what the server must do.
>>>
>>> So IMO we have a clear choice.
>> I still think it's simpler to implement it for one feature either
>> side and keep complexity in the client. Generally how many people
>> stay awake after line 10 when reading the spec? :) Let's see if there
>> are more votes...
>
> It probably is simpler to implement (well, to implement with maximum
> efficiency) and I am not advocating one over the other, but IMO the
> implementation considerations are a separate part of our choice and
> are orthogonal to whether it's confusing for those implementing it
> and consequently whether we see divergence from the spec like we do
> with feature-by-id. As Gustavo says, he'd implement feature-by-id as
> one feature because that's what he thinks it means, not because it's
> difficult. I'd posit that it'd be a one line change for any server
> maintainer to fix theirs to implement it correctly (i.e. use the
> feature's start/end to resubmit the query), it's just that it'd be
> more complicated to do it in a single step from the beginning.
>
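To make the two readings of feature-by-id discussed above concrete, here is a minimal sketch in Python. It assumes an in-memory list of features; the Feature class, the function names, and the sample data are all made up for illustration and are not taken from any DAS server. The second function shows the "use the feature's start/end to resubmit the query" step as a separate pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    # Hypothetical minimal feature record, for illustration only.
    id: str
    segment: str
    start: int
    end: int


def overlapping(features, segment, start, end):
    """Features on the segment that overlap [start, end] (inclusive)."""
    return [f for f in features
            if f.segment == segment and f.start <= end and f.end >= start]


def by_id_single(features, feature_id):
    """The 'says on the tin' reading: only the feature(s) with this ID."""
    return [f for f in features if f.id == feature_id]


def by_id_with_overlaps(features, feature_id):
    """The reading described in the thread: look the feature up, then
    resubmit its start/end as a region query and return everything
    that overlaps it."""
    result = []
    for hit in by_id_single(features, feature_id):
        for f in overlapping(features, hit.segment, hit.start, hit.end):
            if f not in result:  # avoid duplicates if several hits overlap
                result.append(f)
    return result


# Made-up sample data.
SAMPLE = [
    Feature("foo:1", "chr1", 100, 200),
    Feature("bar:7", "chr1", 150, 300),
    Feature("baz:3", "chr1", 500, 600),
]
```

With this sample, by_id_single(SAMPLE, "foo:1") yields only foo:1, while by_id_with_overlaps(SAMPLE, "foo:1") also yields bar:7 because it overlaps foo:1.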
> We should be under no illusions though that people are going to be
> able to implement this easily without reading the documentation
> carefully, no matter which option is chosen.
Good template methods and/or examples in tutorials will encourage use
of this command.
> In particular, I can foresee servers not interpreting the "type"
> filter appropriately, being likely to process the adjacent query
> then apply the type filter, which would be wrong. I have a feeling
> most sources implement the type filter as a passive "post filter"
> rather than an active one. I can tell you right now that it is going
> to be really quite difficult for me to implement "adjacent"
> correctly for the ASTD gene/transcript/exon sources, and I suspect
> the same will be true for retrofitting lots of other sources.
This is an optional capability though, right?
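The "post filter" concern above can be sketched as follows. This is only an illustration under stated assumptions: "adjacent" is taken to mean the nearest non-overlapping feature on each side (the thread has not settled this), all features are on one segment, and the Feature type, function names, and sample data are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    # Hypothetical minimal feature record, for illustration only.
    id: str
    start: int
    end: int
    type: str


def nearest_either_side(features, ref):
    """Nearest feature wholly left and wholly right of ref.
    Assumed meaning of 'adjacent' for this sketch."""
    left = [f for f in features if f.end < ref.start]
    right = [f for f in features if f.start > ref.end]
    result = []
    if left:
        result.append(max(left, key=lambda f: f.end))
    if right:
        result.append(min(right, key=lambda f: f.start))
    return result


def adjacent_post_filter(features, ref, wanted_type):
    """Passive filter: find neighbours first, then drop wrong types.
    This is the behaviour the thread warns would be wrong."""
    return [f for f in nearest_either_side(features, ref)
            if f.type == wanted_type]


def adjacent_active_filter(features, ref, wanted_type):
    """Active filter: restrict to the wanted type, then find neighbours."""
    candidates = [f for f in features if f.type == wanted_type]
    return nearest_either_side(candidates, ref)


# Made-up sample data: exons separated by introns.
EXON1 = Feature("exon1", 10, 50, "exon")
EXON2 = Feature("exon2", 200, 300, "exon")
EXON3 = Feature("exon3", 410, 500, "exon")
SAMPLE = [EXON1, Feature("intron1", 60, 190, "intron"), EXON2,
          Feature("intron2", 310, 400, "intron"), EXON3]
```

Asking for exons adjacent to exon2, the post filter returns nothing (both immediate neighbours are introns and get discarded), whereas the active filter returns exon1 and exon3.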
>
>>>
>>> As to feature-by-id, I know changing behaviour is potentially a
>>> very disruptive change, but I think we can potentially do this
>>> purely because servers don't tend to implement it correctly
>>> anyway. Clients can happily filter out any additional features
>>> returned by old servers, and if any clients are reliant on the
>>> server including all overlapping features then as far as I am
>>> concerned they are either a) targeting specific servers rather
>>> than DAS-wide and thus unaffected, or b) already broken :)
>> So you agree feature-by-id should be changed if we have the stomach
>> for it? Good, and Gustavo too. Well done Andy, you have just
>> agreed to write Spec 1.7 or 3??? ;) Your argument above could also
>> be used for leaving the spec as it is, but ideally I agree, and I
>> guess we can call it spec 1.61 assuming other people agree.
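The point that clients can filter out additional features returned by old servers might look like this on the client side. It is a sketch only: the response below is a trimmed, hand-written fragment for illustration, not output from a real server, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed fragment for illustration; real responses
# carry more elements and attributes than shown here.
SAMPLE_RESPONSE = """<DASGFF>
  <GFF>
    <SEGMENT id="1" start="100" stop="300">
      <FEATURE id="foo:1"><START>100</START><END>200</END></FEATURE>
      <FEATURE id="bar:7"><START>150</START><END>300</END></FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def requested_features_only(xml_text, requested_ids):
    """Keep only FEATURE elements whose id was actually asked for,
    discarding any overlapping extras an older server included."""
    wanted = set(requested_ids)
    root = ET.fromstring(xml_text)
    return [f for f in root.iter("FEATURE") if f.get("id") in wanted]
```

A client written this way gets the same result whether the server returns one feature or the feature plus everything overlapping it.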
>
> I already have a small list of changes for DAS 1.7 or whatever and
> think it's fine for that context. In any case, let's keep these two
> issues separate as Thomas says.
I was really hoping not to do another major spec revision for at least
3 years and to focus on extensions giving new capabilities; otherwise
for the core capabilities everyone is always playing catch-up! This
may be something to discuss at some point soon.
>
>>>
>>> I have to admit that the feature-by-id capability is one of the
>>> (many) things I loathe having to explain and would love to change
>>> it. Doing so would be consistent with what we were trying to do
>>> with 1.6 (i.e. rationalise existing use of the spec) but I
>>> chickened out really.
>>>
>>> Cheers,
>>> Andy
>>
>> Jonathan Warren
>> Senior Developer and DAS coordinator
>> blog: http://biodasman.wordpress.com/
>> jw12 at sanger.ac.uk
>> Ext: 2314
>> Telephone: 01223 492314
>>
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome
>> Research Limited, a charity registered in England with number
>> 1021457 and a company registered in England with number 2742969,
>> whose registered office is 215 Euston Road, London, NW1 2BE.