[DAS] 1.6 draft 7

Andy Jenkinson andy.jenkinson at ebi.ac.uk
Wed Sep 22 23:29:14 UTC 2010


Hi David,

For the avoidance of doubt, I am aware of no great reluctance to add alternative formats to DAS/1. In fact it is something I and others want to do, and have wanted to do for a long time; we just haven't got around to it yet. So far, the maxbins extension has got us near enough, and in truth just getting where we are with 1.6 (which is for the most part a consolidation of existing usage, with some exceptions) has been hard enough. That it is not in the 1.6 specification should not be discouraging: for years, major features of DAS as used by many, especially in Europe, were merely extensions to the core specification. Some of those are now becoming "core".

The opportunity is there to implement it, and the spec can be evolved accordingly so long as it is done "in the open". I would be happy to collaborate on it. One of the first things we wanted to do, as a proof of principle and to establish a robust specification for negotiating content types, was to have a JSON version of every command. IMO this would be a useful exercise in its own right which, as an addition to the binary formats, I daresay could drive quick adoption of alternative content types as a whole. Once the mechanism is decided, this example would also be quite easy to implement in servers and in JavaScript-based clients such as those emerging recently.
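
To make "negotiating content types" concrete, here is a minimal sketch of one possible mechanism, assuming the client states its preference in a standard HTTP Accept header. This is an illustration only: it is not part of DAS 1.6 or the 1.6E proposal, the function name choose_format and the two media types are hypothetical, and falling back to XML when nothing matches is just one possible policy.

```python
def choose_format(accept_header, supported=("application/xml", "application/json")):
    """Pick a response media type from an HTTP Accept header.

    Hypothetical helper: the first entry in `supported` (XML here) is
    treated as the default, mirroring today's XML-only behaviour.
    """
    default = supported[0]
    if not accept_header:
        return default

    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # q defaults to 1 when the client omits it
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not acceptable"
        # Sort by descending quality, then by the order the client listed them.
        candidates.append((-quality, position, media_type))

    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 means the client explicitly refuses this type
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return default

    # Assumption: fall back to XML rather than rejecting the request.
    # A stricter server could answer "406 Not Acceptable" instead.
    return default
```

A JSON-capable client would then send "Accept: application/json", while existing clients that send no Accept header would continue to receive XML unchanged.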

Our approach at Hinxton these days is somewhat pragmatic: if the spec needs to change to accomplish a real world goal, we will engage with the community to change it. But we do not tend to make speculative changes in the hope of them being used. In short, the possibility is there, it just needs someone to put their hand up and set about implementing it.

By the way, what is the 2MB/s limitation you refer to?

Cheers,
Andy

On 22 Sep 2010, at 21:33, David Nix wrote:

> Bugger, that's unfortunate.  Without alternative formats, it is impossible to distribute genomic data from next generation sequencing and microarray experiments in a timely manner.  Even then, the data transfer is very slow due to the <2MB/s bandwidth limitation.
> 
> I wonder if folks should be encouraged to use DAS/2 for genomic data distribution and DAS/1 for everything else?  There seems to be a great reluctance to add alternative formats to DAS/1.  I can understand the advantages of having a standardized data distribution format.  Unfortunately this won't work for us: even compressed, DAS GFF XML is about 100x larger than some of the other binary genomic data formats, such as bar and useq.
> 
> I'm afraid DAS is going to get left behind as other data distribution models are adopted that can accommodate the ever-growing density of genomic data.
> 
> What do you think?
> 
> -cheers, D
> 
> On 9/22/10 10:51 AM, "Andy Jenkinson" <andy.jenkinson at ebi.ac.uk> wrote:
> 
> Hi David,
> 
> It is not part of DAS 1.6, but it was discussed at the DAS workshop, where we came up with a couple of sensible proposals for an extension to 1.6 to cover it. If my memory serves me, we agreed on an outline proposal, and there is a write-up on http://www.biodas.org/wiki/DAS1.6E (courtesy of Gregg? Is that correct?), but as far as I know there are no implementations as yet.
> 
> Cheers,
> Andy
> 
> On 22 Sep 2010, at 17:17, David Nix wrote:
> 
>> Hello Andy,
>> 
>> I'm looking at the latest and trying to find out if alternative file formats were added to 1.6.  Can one respond to DAS/1 queries with binary data formats, or is it still XML?  If the latter, is there any time frame for when this will be added?
>> 
>> -cheers, D
>> 
>> --
>> David Austin Nix, PhD | Bioinformatics Shared Resource | Huntsman Cancer Institute | 2000 Circle of Hope | SLC, UT 84112 | Rm: 3165 | Vc: 801.587.4611 | Fx: 801.585.6458 | david.nix at hci.utah.edu | Skype/iChat: LiveNix | WebSite: http://bioserver.hci.utah.edu | DAS/2: http://bioserver.hci.utah.edu:8080/DAS2DB/genome
>> 
>> 
>> 
>> On 9/22/10 9:44 AM, "Andy Jenkinson" <andy.jenkinson at ebi.ac.uk> wrote:
>> 
>> Hi all,
>> 
>> I have updated the 1.6 specification to draft 7 in light of recent discussions on the list:
>> 1. All features in a parent/part hierarchy must be returned if any overlap a query segment.
>> 2. The alignment command is back to extension status, in anticipation of a revamp (see the 1.6E page on the wiki).
>> 
>> Also in this draft is a previous change that was missed: the start and end attributes of a SEGMENT element in the features, types and entry_points commands are now optional. This makes it possible for servers without access to detailed information about the segments they are annotating to comply with the specification. Previously, it was impossible for such servers to respond in a compliant fashion to requests in which the client does not specify a start/end position.
>> 
>> If my understanding is correct, no further changes to the specification are anticipated, which means we can consider this the final draft...
>> 
>> See here for details:
>> http://www.biodas.org/wiki/DAS1.6
>> 
>> Cheers,
>> Andy
>> _______________________________________________
>> DAS mailing list
>> DAS at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/das
>> 
> 
> 




