[DAS2] Sequence retrieval proposal

Andrew Dalke dalke at dalkescientific.com
Sun Dec 11 19:26:21 UTC 2005


Steve:
> I am also somewhat loath to add yet another sequence file format to the
> world. Seems reasonable to state that a DAS/2 server can supply sequence
> in an alternative format via requests such as:
>
>   http://www.wormbase.org/das/genome/volvox/1/sequence?format=GAME

That makes good sense to me.
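
A minimal sketch of how a client might build such a request, assuming the
alternate format is selected by a "format" query parameter as in Steve's
example URL. The helper name sequence_url is hypothetical and is not part
of any DAS/2 spec or library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sequence_url(base, fmt=None):
    """Return the sequence URL, optionally asking for an alternate format.

    `base` is the plain sequence URL; `fmt` is a format id such as "GAME".
    The "format" parameter name is taken from the example URL above.
    """
    if fmt is None:
        return base  # no parameter: the server's default format
    parts = urlsplit(base)
    # Keep any existing query parameters, but drop an earlier "format".
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "format"]
    query.append(("format", fmt))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(sequence_url(
        "http://www.wormbase.org/das/genome/volvox/1/sequence", "GAME"))
```

Leaving the parameter off returns the URL unchanged, so a client that only
understands the default format never has to know about the extension.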

> Here's a brief tour of some possibly extensible candidates:

Do you want to say this as:
   "The server must implement these sequence formats"
or
   "If the server implements one or more of these sequence formats then
     it must use the corresponding id and content-type."
?

Or say nothing, wait until several different servers implement
this, and then standardize on what they do?

I don't think anyone here seriously wants the first. :)

The last is my favorite, then the middle one.
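
To make the middle option concrete, here is a rough sketch of the check it
implies: a format that appears in the list must be served with its listed
content-type, while unlisted formats are left alone. The table entries and
the name check_format are placeholders invented for illustration; no spec
text quoted here defines these ids or content-types.

```python
# Hypothetical registry. The ids and content-types below are placeholders,
# not values taken from the DAS/2 spec.
KNOWN_FORMATS = {
    "fasta": "text/x-fasta",
    "game": "application/x-game+xml",
}


def check_format(format_id, content_type):
    """Return True if the id/content-type pairing is acceptable.

    Listed formats must use the listed content-type; the rule says
    nothing about formats that are not in the table.
    """
    expected = KNOWN_FORMATS.get(format_id.lower())
    if expected is None:
        return True  # unlisted format: the server may do as it likes
    # Ignore parameters such as "; charset=utf-8" when comparing.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == expected
```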


My stronger preference is to get a complete 2.0 spec out.  Do
you or other users need checksum validation of the sequence and/or
alternate sequence formats in 2.0?  What prevents you from extending
existing HTTP headers or experimenting with extensions then
submitting your experience for inclusion in future versions of
the spec?

My sense is that this can wait.

> We might consider prescribing some conventions for what DAS considers
> proper fasta format. I put in a little bit of description of a
> DAS-acceptable fasta format here in the retrieval spec:
> http://biodas.org/documents/das2/das2_get.html#sequence

Do current DAS clients even use the header?

Will future ones use it?  If so, why?  Shouldn't all the information
in a header be available as an annotation?

The Wikipedia entry for FASTA is pretty good.
   http://en.wikipedia.org/wiki/Fasta_format

A few months ago I had my students find different FASTA definitions.
Some disagreed with others.  Wikipedia's was the most complete.
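
For reference, a small sketch of the one reading of the header that most
definitions seem to share: the text after ">" up to the first whitespace
is an identifier, and the rest of the line is a free-text description.
This is an assumption about common practice, not the DAS-acceptable format
from the retrieval spec, and parse_fasta is a hypothetical helper.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (id, description, sequence) tuples.

    Assumes the common convention that the first whitespace-delimited
    token of the header is the id and the remainder is a description.
    """
    records = []
    header = None
    chunks = []

    def finish():
        parts = header.split(None, 1)
        ident = parts[0] if parts else ""
        desc = parts[1] if len(parts) > 1 else ""
        records.append((ident, desc, "".join(chunks)))

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                finish()  # close the previous record
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        finish()
    return records
```

A client that only wants the residues can ignore the first two fields
entirely, which is the case where the header carries nothing that an
annotation could not.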

					Andrew
					dalke at dalkescientific.com



