[DAS2] default format for a single segment

Andrew Dalke dalke at dalkescientific.com
Thu Aug 17 12:29:19 UTC 2006


Two proposals here:
   1) change the default format for a single segment
       request from FASTA -> das2xml
   2) add optional <FORMAT> elements to each segment

   == Proposal 1 ==

Currently every DAS2 service returns an application/x-das-*+xml
document by default except for the segment document.  A request
on a segment URI returns its FASTA sequence.

I would like to change that, so that a request on a segment URI
returns a das-segment document by default.  For example, if
this is the segments document


<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />

  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI" />
  <SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
      reference="http://dalkescientific.com/yeast1/ChrII" />
  <SEGMENT uri="segment/chrIII" title="Chromosome III" length="316617"
      reference="http://dalkescientific.com/yeast1/ChrIII" />
</SEGMENTS>

then a request for "segment/chrI" should return

<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />

  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI" />
</SEGMENTS>
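
As a rough illustration of Proposal 1, a server could derive the
single-segment response from the full segments document by keeping the
top-level FORMAT elements and dropping every SEGMENT except the one
requested.  This is only a sketch: the function name
single_segment_document is made up for this example and is not part of
the DAS2 spec or of any existing server.

```python
import xml.etree.ElementTree as ET

DAS2_NS = "http://biodas.org/documents/das2"

# Abbreviated copy of the segments document shown above.
SEGMENTS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />
  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI" />
  <SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
      reference="http://dalkescientific.com/yeast1/ChrII" />
</SEGMENTS>
"""


def single_segment_document(segments_xml, segment_uri):
    """Hypothetical helper: return the SEGMENTS element reduced to one segment."""
    root = ET.fromstring(segments_xml.encode("utf-8"))
    # findall() returns a list, so removing children while looping is safe.
    for seg in root.findall(f"{{{DAS2_NS}}}SEGMENT"):
        if seg.get("uri") != segment_uri:
            root.remove(seg)
    return root


# Serialize with the DAS2 namespace as the default (no prefix).
ET.register_namespace("", DAS2_NS)
doc = single_segment_document(SEGMENTS_XML, "segment/chrI")
print(ET.tostring(doc, encoding="unicode"))
```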

   == Proposal 2 ==

My server implements a "raw" sequence format which contains only
sequence data and does not even contain the FASTA header.
The raw format only works for a single segment and not for the
list of segments.

In the current spec the "FORMAT" entry is somewhat ambiguous.
Does it work for the set of segments or for a single given segment?
That is,

    segments?format=das2xml --> the segments document for all of the segments
    segment/chrI?format=das2xml --> the segments document for the given segment

    segments?format=fasta --> all sequences, in FASTA format
    segment/chrI?format=fasta --> the FASTA sequence for the given segment

However,  segments?format=raw  makes no sense.  No one will
use that one for real.
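
For concreteness, here is how a client might build the request URLs
listed above.  The base URL and the helper name format_url are
assumptions made for this sketch; only the "format" query parameter and
the relative paths come from the examples in this message.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical service root; a real client would take this from the
# server's sources document.
BASE = "http://example.org/das2/yeast1/"


def format_url(base, path, fmt):
    """Join a relative DAS2 path onto the base and add ?format=<fmt>."""
    return urljoin(base, path) + "?" + urlencode({"format": fmt})


for path, fmt in [
    ("segments", "das2xml"),
    ("segment/chrI", "das2xml"),
    ("segments", "fasta"),
    ("segment/chrI", "fasta"),
]:
    print(format_url(BASE, path, fmt))
```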

I propose that the SEGMENT elements also get an optional
FORMAT element which looks like this


<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />

  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI">
     <FORMAT name="raw" />
  </SEGMENT>
  <SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
      reference="http://dalkescientific.com/yeast1/ChrII">
     <FORMAT name="raw" />
  </SEGMENT>
  <SEGMENT uri="segment/chrIII" title="Chromosome III" length="316617"
      reference="http://dalkescientific.com/yeast1/ChrIII">
     <FORMAT name="raw" />
  </SEGMENT>
</SEGMENTS>

The formats for a given segment are the union of its own <FORMAT>
elements and those at the top level.  That is, each segment
here implements the "raw", "fasta" and "das2xml" formats.
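
A client could compute that union along these lines.  This is a minimal
sketch assuming the document layout proposed above; the function name
formats_by_segment is invented for the example, and the sample document
is shortened to two segments, one of which has no FORMAT child.

```python
import xml.etree.ElementTree as ET

NS = {"das": "http://biodas.org/documents/das2"}

# Shortened sample in the proposed layout; chrII has no per-segment FORMAT.
EXAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />
  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI">
     <FORMAT name="raw" />
  </SEGMENT>
  <SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
      reference="http://dalkescientific.com/yeast1/ChrII" />
</SEGMENTS>
"""


def formats_by_segment(segments_xml):
    """Map each segment URI to the set of format names it supports."""
    root = ET.fromstring(segments_xml.encode("utf-8"))
    # "das:FORMAT" matches direct children only, i.e. the top-level formats.
    shared = {f.get("name") for f in root.findall("das:FORMAT", NS)}
    result = {}
    for seg in root.findall("das:SEGMENT", NS):
        own = {f.get("name") for f in seg.findall("das:FORMAT", NS)}
        result[seg.get("uri")] = shared | own  # union, as proposed
    return result


for uri, names in formats_by_segment(EXAMPLE_XML).items():
    print(uri, sorted(names))
```

With this reading, "raw" shows up only for segments that list it
themselves and never for the segments list as a whole.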


					Andrew
					dalke at dalkescientific.com



