[DAS] 1.6 draft 7

David Nix David.Nix at hci.utah.edu
Thu Sep 23 16:49:16 UTC 2010


Hello Thomas,

Yes, us too!  BAM file distribution has been built into the command-line GenoViz Genometry DAS/2 server by the BioViz folks.  It needs a bit of testing before being added to the production release.  Provided there are no glitches, we'll add this functionality to the GenoPub GUI too.  This will allow one to visualize sequence reads in the Integrated Genome Browser via DAS/2 distribution.

IGB is pretty slick; the BioViz folks have done and continue to do an amazing job with this long-running project.  Their implementation of short read visualization is quite good.  Check out http://www.bioviz.org/igb/ and their "View Short Read Alignments in IGB" screencast.

I don't see any inherent reason that DAS/1 cannot broker the BAM distribution.  We'd just need to implement the 1.6E alternative format.  I don't know how long it would take to modify the GenoViz code.  IGB currently supports DAS/1, so hopefully it wouldn't be too much work: 1-2 months with testing?

-cheers, D


On 9/23/10 8:44 AM, "Thomas Down" <thomas.a.down at gmail.com> wrote:

Hi David,

Just getting in touch to let you know that we're very interested in getting big piles of sequencing data onto genome browsers, and are certainly interested in alternative formats.  My current thoughts are that -- at least for what we're doing -- a slightly more concise XML schema, plus richer control of server-side summarization/binning (think maxbins++) might be the sweet spot, but very interested to see other options as well.
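
To make the binning idea concrete, here is a minimal sketch of server-side summarization in Python. It assumes features are plain (start, end) pairs in 1-based inclusive coordinates and that the client passes a maxbins-style cap on the number of bins; the function name and return shape are hypothetical and not taken from any DAS server.

```python
import math

def summarize(features, start, end, maxbins):
    """Count features per bin over [start, end], using at most maxbins bins.

    features: iterable of (feature_start, feature_end), 1-based inclusive.
    Returns (bin_width, counts). A feature spanning several bins is
    counted once in every bin it overlaps.
    """
    span = end - start + 1
    width = math.ceil(span / maxbins)   # bases per bin
    nbins = math.ceil(span / width)     # never exceeds maxbins
    counts = [0] * nbins
    for fs, fe in features:
        if fe < start or fs > end:
            continue                    # outside the query region
        first = (max(fs, start) - start) // width
        last = (min(fe, end) - start) // width
        for b in range(first, last + 1):
            counts[b] += 1
    return width, counts
```

A server could then return one summary value per bin instead of one XML element per read, so the response size is bounded by maxbins rather than by the number of reads in the region.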

Also, just to check... are you currently gzipping your DAS XML?  This makes a huge (at least 10-fold, potentially more) difference, and while I agree it's still not quite optimal, this has got us a long way so far.  It's possible to negotiate compression purely in the HTTP layer (using the Accept-Encoding header), so it doesn't necessarily impact the DAS spec at all.
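
As a rough illustration of negotiating compression in the HTTP layer, the Python sketch below asks for a gzip-encoded response and decompresses it only if the server actually applied gzip. It uses only the standard library; the helper names are hypothetical, and any URL passed to fetch is whatever DAS features request the client would normally make.

```python
import gzip
import urllib.request

def build_request(url):
    # Advertise that this client can accept a gzip-compressed body.
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})

def decode_body(body, content_encoding):
    # Servers that ignore Accept-Encoding reply uncompressed, so check
    # the Content-Encoding response header before decompressing.
    if (content_encoding or "").lower() == "gzip":
        return gzip.decompress(body)
    return body

def fetch(url):
    # Returns the raw (decompressed) XML bytes of the response.
    with urllib.request.urlopen(build_request(url)) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```

Because the negotiation lives entirely in request and response headers, the XML payload itself is unchanged, which is why the DAS spec need not be touched.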

Best wishes,

               Thomas.

On Wed, Sep 22, 2010 at 9:33 PM, David Nix <David.Nix at hci.utah.edu> wrote:
Bugger, that's unfortunate.  Without alternative formats, it is impossible to distribute genomic data from next generation sequencing and microarray experiments in a timely manner.  Even then, the data transfer is very slow due to the <2MB/s bandwidth limitation.

I wonder if folks should be encouraged to use DAS/2 for genomic data distribution and DAS/1 for everything else?  There seems to be a great reluctance to add alternative formats to DAS/1.  I can understand the advantages of having a standardized data distribution format.  Unfortunately this won't work for us: even compressed, DAS GFF XML is about 100x larger than some of the other binary genomic data formats such as bar and useq.

I'm afraid DAS is going to get left behind as other data distribution models are adopted that can accommodate the ever growing density of genomic data.

What do you think?

-cheers, D

On 9/22/10 10:51 AM, "Andy Jenkinson" <andy.jenkinson at ebi.ac.uk> wrote:

Hi David,

It is not part of DAS 1.6 but was discussed at the DAS workshop. During the workshop we had some discussion on the topic and came up with a couple of sensible proposals for an extension to 1.6 to cover it. If my memory serves me, we agreed on an outline proposal, and there is a write-up on http://www.biodas.org/wiki/DAS1.6E (courtesy of Gregg? Is that correct?) but as far as I know there are no implementations as yet.

Cheers,
Andy

On 22 Sep 2010, at 17:17, David Nix wrote:

> Hello Andy,
>
> I'm looking at the latest and trying to find out if alternative file formats were added to 1.6?  Can one respond to DAS/1 queries with binary data formats or is it still XML?  If the latter, any time frame for when this will be added?
>
> -cheers, D
>
> --
> David Austin Nix, PhD | Bioinformatics Shared Resource | Huntsman Cancer Institute | 2000 Circle of Hope | SLC, UT 84112 | Rm: 3165 | Vc: 801.587.4611 | Fx: 801.585.6458 | david.nix at hci.utah.edu | Skype/iChat: LiveNix | WebSite: http://bioserver.hci.utah.edu | DAS/2: http://bioserver.hci.utah.edu:8080/DAS2DB/genome
>
>
>
> On 9/22/10 9:44 AM, "Andy Jenkinson" <andy.jenkinson at ebi.ac.uk> wrote:
>
> Hi all,
>
> I have updated the 1.6 specification to draft 7 in light of recent discussions on the list:
> 1. All features in a parent/part hierarchy must be returned if any overlap a query segment.
> 2. The alignment command is back to extension status, in anticipation of a revamp (see the 1.6E page on the wiki).
>
> Also in this draft is a previous change that was missed: the start and end attributes of a SEGMENT element in the features, types and entry_points commands are now optional. This makes it possible for servers without access to detailed information about the segments they are annotating to comply with the specification. Previously, it was impossible for such servers to respond in a compliant fashion to requests in which the client does not specify a start/end position.
>
> If my understanding is correct, no further changes to the specification are anticipated, which means we can consider this the final draft...
>
> See here for details:
> http://www.biodas.org/wiki/DAS1.6
>
> Cheers,
> Andy
> _______________________________________________
> DAS mailing list
> DAS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das
>


