[DAS] plans for 1.6 release

Gregg Helt gregghelt at gmail.com
Fri Sep 3 12:18:57 UTC 2010


On Fri, Sep 3, 2010 at 3:02 AM, Jonathan Warren <jw12 at sanger.ac.uk> wrote:

> Hi Thomas and Andy
>
> Thanks for posting this to the list.
> I don't think from our brief discussions at the Sanger/EBI that people here
> would object to moving to the DAS2 way for parents/parts. My impression of
> the DAS2 spec was that it was thought about and discussed in great detail
> and thus there would be very good reasons for doing it this way?
>

The main reason that DAS/2 required full parent/part feature hierarchies in
the feature response is exactly what Thomas mentioned -- assurance that a
client gets a full feature graph back from a query and thus can avoid
additional query roundtripping (unfortunate), or (much worse) presentation
of partial data as if it were complete.
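
To make the difference concrete, here is a minimal Python sketch.  It is not
taken from either spec: the Feature class, the function names and the toy
coordinates are all made up for illustration.  It contrasts a range query
that returns only the overlapping features with one that also returns every
feature reachable through parent/part links:

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    # Hypothetical in-memory feature record, not a DAS data structure.
    id: str
    start: int
    end: int
    parents: list = field(default_factory=list)  # ids of parent features
    parts: list = field(default_factory=list)    # ids of part features


def overlaps(f, start, end):
    """True if the feature overlaps the closed interval [start, end]."""
    return f.start <= end and f.end >= start


def overlap_only(features, start, end):
    """Ids of features that overlap the query range, and nothing else."""
    return {f.id for f in features.values() if overlaps(f, start, end)}


def complete_graphs(features, start, end):
    """Overlapping features plus everything linked to them as parent or part."""
    seen = set()
    stack = list(overlap_only(features, start, end))
    while stack:
        fid = stack.pop()
        if fid in seen:
            continue
        seen.add(fid)
        f = features[fid]
        stack.extend(f.parents + f.parts)
    return seen


features = {f.id: f for f in [
    Feature("gene1", 100, 900, parts=["exon1", "exon2", "exon3"]),
    Feature("exon1", 100, 200, parents=["gene1"]),
    Feature("exon2", 400, 500, parents=["gene1"]),
    Feature("exon3", 800, 900, parents=["gene1"]),
]}

print(sorted(overlap_only(features, 350, 550)))     # ['exon2', 'gene1']
print(sorted(complete_graphs(features, 350, 550)))  # ['exon1', 'exon2', 'exon3', 'gene1']
```

With the overlap-only rule, a client querying 350-550 receives the gene with
one of its three exons, and must either issue a second query or draw a
one-exon gene.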

> Also an advantage of moving to this way is that it has been tested by people
> implementing DAS2 extensively already?
>

Yes, the parent/part protocol of DAS/2 has been pretty extensively tested,
and it's certainly the way I'd like to see parent/part hierarchies handled
in DAS 1.6.  Probably the main disadvantage of this approach is that it can
make the server implementation of feature range queries more complicated.
However, this kind of query is not unique to DAS, and in my experience most
databases already have optimizations in place to facilitate retrieval of
complete feature graphs in response to range queries.


> As the DAS 1.6 spec is in pretty much every other respect (apart from minor
> things like cvIds) a consolidation of the way DAS "IS" being used in the
> community, surely it has been tested already?
>
> The main principles of the 1.6 spec have been largely unchanged for over a
> year and were agreed by everyone on the 3rd day of the workshop in 2009;
> most changes over the last 6 months have, I believe, largely been
> clarifications. For the last year or so the DAS 1.6 spec was supposed to be
> in a testing phase, but I agree that maybe not many clients have tested it
> yet. There has been much work at the EBI on clients and servers for proteins
> with 1.6.
>

I think one problem is that there has been less testing of DAS 1.6 clients
and servers for genome annotations than for proteins.  The issue of how to
handle retrieval of parent/part feature hierarchies for range queries that
overlap a parent but not all of its parts doesn't arise in most DAS protein
clients, since they typically query for features across the whole length of
a protein rather than a range.
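
As a rough illustration of what a genome client would have to guard against,
the sketch below checks a feature response for parent/part references that
point at features missing from the response.  The dict layout and the name
missing_links are assumptions made for this example, not part of any DAS
library:

```python
def missing_links(response):
    """Map each feature id to the parent/part ids it names but the response lacks."""
    present = set(response)
    missing = {}
    for fid, links in response.items():
        absent = sorted(set(links["parents"] + links["parts"]) - present)
        if absent:
            missing[fid] = absent
    return missing


# Hypothetical response to a range query that overlapped gene1 and one exon.
partial = {
    "gene1": {"parents": [], "parts": ["exon1", "exon2", "exon3"]},
    "exon2": {"parents": ["gene1"], "parts": []},
}

print(missing_links(partial))  # {'gene1': ['exon1', 'exon3']}
```

A whole-length protein query would normally leave this check with nothing to
report, which may be why the problem has not shown up in protein clients.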

Looking at the current DAS 1.6 registry, I've only found two servers that
support feature queries on chromosomes.  One of these (DS_180) ignores the
range restriction given in the feature query.  The other (DS_876, "Test 1.6
sources") does pay attention to the query range, so is usable for testing,
but I'm not sure what the corresponding 1.5 source is for comparison -- is
there one?  For the biologists I'm working with, the two most relevant DAS
genomic data sources are Ensembl and UCSC.  So as a client developer what
I'd really like for testing is a DAS 1.6 Ensembl source that has a
corresponding 1.5 source to compare to.

Gregg

> I would like the spec to be set in stone, but I don't think it needs to be
> that way and I don't see that it can't be tweaked if needs be shortly after
> initial release. I think by releasing the spec we are saying "this is how
> the DAS community thinks the system should work and we have provided servers
> that conform to that spec, code that will help clients process the XML and a
> registry that can validate and hold 1.6 data sources". Given the lack of
> testing by clients so far, I'm not sure how we can move DAS forward any
> other way?
>
> The current situation is that Andy has kindly agreed to postpone the
> release for another week or so for developers to look at the spec and
> preferably do some testing with it.
>
> It would be really great to get some more feedback from client developers
> and some more 1.6 clients out there using 1.6 - thanks for your wise words
> Thomas.
>
>
> On 2 Sep 2010, at 17:35, Thomas Down wrote:
>
>> On Tue, Aug 31, 2010 at 5:28 PM, Andy Jenkinson
>> <andy.jenkinson at ebi.ac.uk> wrote:
>>
>>  Hi all,
>>>
>>> As many of you know, DAS version 1.6 has been in development for a while
>>> now, and we would really like to get it out of the door and "official". To
>>> this end, we aim to do the following:
>>> 1. resolve the remaining questions over the current draft (principally,
>>>    alignments and categorize) *
>>> 2. ensure core Java and Perl software libraries support the specification **
>>> 3. officially switch over ***
>>> 4. developers of solid client and server implementations will announce
>>>    dates for 1.6 support
>>>
>>>
>> I agree that it'll be nice to get this out, although I'm still kind-of
>> nervous about setting changes in stone when there aren't already
>> implementations out there (client-side, in particular).  Are there any
>> client developers who've had a good poke around and can reassure me?
>>
>> The other concern (which I've only just noticed, and had some discussions
>> off-list with Andy) is that -- in the new PART/PARENT system -- it is
>> explicitly stated that you'll only get features overlapping the query
>> region.  Features outside the query region will be excluded even if they're
>> PARTs of something that is included (e.g., you'll only get a subset of
>> exons for a gene).
>>
>> To my mind, the biggest advantage of a feature hierarchy system over the
>> GROUP system it replaces is that you can guarantee that you'll get the
>> complete dataset for a complex feature in a single fetch.  I note that the
>> DAS/2 spec (which has a very similar PART/PARENT system) takes the opposite
>> approach and states that you'll always get complete graphs of features
>> back.
>>
>> Given that this is a new (to DAS/1) system, is there any reason not to do
>> things the DAS/2 way?
>>
>>              Thomas.
>> _______________________________________________
>> DAS mailing list
>> DAS at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/das
>>
>
> Jonathan Warren
> Senior Developer and DAS coordinator
> blog: http://biodasman.wordpress.com/
> jw12 at sanger.ac.uk
> Ext: 2314
> Telephone: 01223 492314
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited,
> a charity registered in England with number 1021457 and a company registered
> in England with number 2742969, whose registered office is 215 Euston Road,
> London, NW1 2BE.


