[DAS2] default format for a single segment
Andrew Dalke
dalke at dalkescientific.com
Thu Aug 17 12:29:19 UTC 2006
Two proposals here:
1) change the default format for a single segment
request from FASTA -> das2xml
2) add optional <FORMAT> elements to each segment
== Proposal 1 ==
Currently every DAS2 service returns an application/x-das-*+xml
document by default, except for the segment document. A request
for a segment URI returns its FASTA sequence.
I would like to change that, so that a request for a segment URI
returns a das-segment document by default. For example, if
this is the segments document
<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
<FORMAT name="das2xml" />
<FORMAT name="fasta" />
<SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
reference="http://dalkescientific.com/yeast1/ChrI" />
<SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
reference="http://dalkescientific.com/yeast1/ChrII" />
<SEGMENT uri="segment/chrIII" title="Chromosome III" length="316617"
reference="http://dalkescientific.com/yeast1/ChrIII" />
</SEGMENTS>
then a request for "segment/chrI" should return
<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
<FORMAT name="das2xml" />
<FORMAT name="fasta" />
<SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
reference="http://dalkescientific.com/yeast1/ChrI" />
</SEGMENTS>
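As a rough sketch of what Proposal 1 asks of a server, here is one way to
derive the single-segment document from the full segments document. It uses
only Python's standard xml.etree.ElementTree. The function name
single_segment_document and the SEGMENTS_XML constant are made up for this
illustration and are not part of the DAS2 spec or of any existing server.

```python
import xml.etree.ElementTree as ET

DAS2_NS = "http://biodas.org/documents/das2"

# Serialize DAS2 elements with a default xmlns instead of an "ns0:" prefix.
ET.register_namespace("", DAS2_NS)

# The segments document from the example above (hypothetical constant name).
SEGMENTS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />
  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI" />
  <SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
      reference="http://dalkescientific.com/yeast1/ChrII" />
  <SEGMENT uri="segment/chrIII" title="Chromosome III" length="316617"
      reference="http://dalkescientific.com/yeast1/ChrIII" />
</SEGMENTS>"""


def single_segment_document(segments_xml, segment_uri):
    """Return a SEGMENTS document holding only the requested SEGMENT.

    Top-level FORMAT elements are kept as they are; every SEGMENT whose
    uri attribute differs from segment_uri is dropped.
    """
    root = ET.fromstring(segments_xml.encode("utf-8"))
    # findall() returns a list, so removing children while looping is safe.
    for segment in root.findall("{%s}SEGMENT" % DAS2_NS):
        if segment.get("uri") != segment_uri:
            root.remove(segment)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(single_segment_document(SEGMENTS_XML, "segment/chrI"))
```

A real server would more likely build the response from its own data
store than filter the full document, so treat this only as a statement
of the expected result.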
== Proposal 2 ==
My server implements a "raw" sequence format which contains only
sequence data and does not even contain the FASTA header.
The raw format only works for a single segment and not for the
list of segments.
In the current spec the "FORMAT" entry is somewhat ambiguous.
Does it work for the set of segments or for a single given segment?
That is,
segments?format=das2xml --> the segments document for all of the
segments
segment/chrI?format=das2xml --> the segments document for the given
segment
segments?format=fasta --> all sequences, in FASTA format
segment/chrI?format=fasta --> the FASTA sequence for the given
segment
However, segments?format=raw makes no sense, since the raw format
has no header to tell one sequence from the next. No one will
use that one for real.
I propose that the SEGMENT elements also get an optional
FORMAT element, which looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
<FORMAT name="das2xml" />
<FORMAT name="fasta" />
<SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
reference="http://dalkescientific.com/yeast1/ChrI">
<FORMAT name="raw" />
</SEGMENT>
<SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
reference="http://dalkescientific.com/yeast1/ChrII">
<FORMAT name="raw" />
</SEGMENT>
<SEGMENT uri="segment/chrIII" title="Chromosome III" length="316617"
reference="http://dalkescientific.com/yeast1/ChrIII">
<FORMAT name="raw" />
</SEGMENT>
</SEGMENTS>
The formats for a given segment are the union of its own <FORMAT>
elements and those at the top level. That is, each segment
here implements the "raw", "fasta" and "das2xml" formats.
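
To make the union rule concrete, here is a minimal sketch of how a client
might compute the formats available for each segment. It uses Python's
standard xml.etree.ElementTree. The function name formats_by_segment and
the shortened example document are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Prefix map for ElementTree path lookups; "das" is an arbitrary local prefix.
NS = {"das": "http://biodas.org/documents/das2"}

# Shortened version of the proposed document (hypothetical constant name).
EXAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<SEGMENTS xmlns="http://biodas.org/documents/das2">
  <FORMAT name="das2xml" />
  <FORMAT name="fasta" />
  <SEGMENT uri="segment/chrI" title="Chromosome I" length="230209"
      reference="http://dalkescientific.com/yeast1/ChrI">
    <FORMAT name="raw" />
  </SEGMENT>
  <SEGMENT uri="segment/chrII" title="Chromosome II" length="813179"
      reference="http://dalkescientific.com/yeast1/ChrII" />
</SEGMENTS>"""


def formats_by_segment(segments_xml):
    """Map each segment uri to the set of format names it supports.

    A segment's formats are the union of the top-level FORMAT elements
    and any FORMAT elements nested inside that SEGMENT.
    """
    root = ET.fromstring(segments_xml.encode("utf-8"))
    # Only direct children of SEGMENTS count as top-level formats.
    top_level = {f.get("name") for f in root.findall("das:FORMAT", NS)}
    result = {}
    for segment in root.findall("das:SEGMENT", NS):
        own = {f.get("name") for f in segment.findall("das:FORMAT", NS)}
        result[segment.get("uri")] = top_level | own
    return result


if __name__ == "__main__":
    for uri, formats in sorted(formats_by_segment(EXAMPLE_XML).items()):
        print(uri, sorted(formats))
```

In this shortened document segment/chrII has no nested <FORMAT>, so it
ends up with only the two top-level formats, which is how an unchanged
server would look to a client that understands the new element.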
Andrew
dalke at dalkescientific.com