[DAS2] proposed April 17 agenda
Andrew Dalke
dalke at dalkescientific.com
Tue Apr 18 07:36:39 UTC 2006
Summary of today's conference call.
> 2. status reports
The biggest one is that the new version of IGB is out and
the Affy DAS server is available at
http://netaffxdas.affymetrix.com/das2/sequence
Steve and Ed (as I recall) tracked down a problem with
that server which might affect other implementations. The
problem is knowing the public/external URL for the DAS service.
In theory it can be determined by looking at various CGI
headers, but with things like an Apache rewrite and forwards
to the actual server it can get complicated. The solution
seems to be either to use relative links or to have a configuration
option in the server specifying the base URL.
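To make the two options concrete, here is a minimal sketch, not taken from any DAS server. The function name public_base_url and the configured_base parameter are hypothetical. HTTP_HOST, SERVER_NAME, SERVER_PORT and SCRIPT_NAME are the usual CGI variables. HTTP_X_FORWARDED_HOST and HTTPS are only set by some proxy and Apache setups, so relying on them is an assumption.

```python
from urllib.parse import urljoin


def public_base_url(environ, configured_base=None):
    """Return the externally visible base URL of the service, ending in '/'."""
    if configured_base:
        # An explicit configuration option always wins over guessing.
        return configured_base.rstrip("/") + "/"
    # Assumption: "HTTPS=on" is how the front end marks secure requests.
    scheme = "https" if environ.get("HTTPS") == "on" else "http"
    # Assumption: a forwarding proxy passes the original host in this header.
    host = environ.get("HTTP_X_FORWARDED_HOST") or environ.get("HTTP_HOST")
    if not host:
        host = environ["SERVER_NAME"]
        port = environ.get("SERVER_PORT", "80")
        if port != "80":
            host += ":" + port
    path = environ.get("SCRIPT_NAME", "")
    return (scheme + "://" + host + path).rstrip("/") + "/"


# Behind a rewrite the guessed URL is the internal one, which is the problem.
env = {"HTTP_HOST": "internal:8080", "SCRIPT_NAME": "/cgi-bin/das2"}
guessed = public_base_url(env)
configured = public_base_url(env, "http://netaffxdas.affymetrix.com/das2")

# Relative links in a response resolve against whatever URL the client used.
resolved = urljoin(configured, "sequence")
```

With relative links the server never needs to know its public URL, which is why that option avoids the problem entirely.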
Lincoln's been working on reference names. Allen's been
working on how the writeback server might work. I've been
working on the spec, and have not gone further with the
validator.
> 3. who maintains the list of reference names for different
> genomes (starting with the list Lincoln developed)?
Lincoln proposed, to broad acceptance, that we set up a
wiki page with the reference names.
The easiest way is to use the OBF wiki, at
http://open-bio.org/wiki/Main_Page
because that is already set up. I can ask the OBF about
the appropriateness of that - I think it's fine.
> 4. resolve some questions with the spec (see my previous email)
Here are the resolutions:
1) type ontology URI
I've emailed Suzi asking whether GO, the Gene Ontology
Consortium, or anyone else plans to come up with standardized,
public ontology URLs. Allen's cc'ed on it, and we'll discuss this
off the DAS list.
2) Feature strand.
I stand corrected. The definitions are
1 for positive
-1 for negative
0 both strands
not don't know or does not have meaning
3) taxid
There seems to be no reason to keep the 'taxid' in the SOURCE
element. We'll only have it in the COORDINATES element.
4) 'writeable'
We'll defer this (leaving it as-is) until we have the writeback
defined a bit better.
5) content-type for FASTA records
We'll recommend "text/x-fasta" or "text/plain" as the content-type
for FASTA responses. There is no widely accepted community standard.
6) response document too large
There is no automatic way for a client to narrow its request.
This must be done by a person, depending on what the search
criteria are. Servers should support large requests so that
this isn't a problem.
7) styles
We'll shift to using a stylesheet. This will be listed in the
versioned source record as
<CAPABILITY type="stylesheet" query_uri="blah_blah.xml" />
As a rough sketch the document will look like
<STYLE uri="http://url/for/feature/type"
zoom="high" fgcolor="red" bgcolor="black">
<BOX />
<LABEL font_family="monospace" />
</STYLE>
<STYLE uri="http://url/for/feature/type"
zoom="medium" fgcolor="red" bgcolor="black">
<LINE line_width="3px" />
<LABEL font_family="monospace" />
</STYLE>
<STYLE uri="http://url/for/feature/type"
zoom="medium" fgcolor="red" bgcolor="black">
<LINE line_width="1px" />
</STYLE>
The STYLE elements add a new "uri" attribute which is
the URI of the feature type being styled.
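A client could index such a stylesheet by feature type URI and zoom level. The following is only a sketch under assumptions: the sketch above shows no enclosing element, so the STYLES wrapper is hypothetical, the child elements are written self-closed so that the text parses as XML, and load_styles is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <STYLES> wrapper is not part of the proposal.
SKETCH = """<STYLES>
  <STYLE uri="http://url/for/feature/type"
         zoom="high" fgcolor="red" bgcolor="black">
    <BOX />
    <LABEL font_family="monospace" />
  </STYLE>
  <STYLE uri="http://url/for/feature/type"
         zoom="medium" fgcolor="red" bgcolor="black">
    <LINE line_width="3px" />
    <LABEL font_family="monospace" />
  </STYLE>
</STYLES>"""


def load_styles(xml_text):
    """Map (feature type URI, zoom) to the STYLE attributes and its glyphs."""
    styles = {}
    for style in ET.fromstring(xml_text).iter("STYLE"):
        key = (style.get("uri"), style.get("zoom"))
        # Each child element (BOX, LINE, LABEL) is kept as (tag, attributes).
        glyphs = [(child.tag, dict(child.attrib)) for child in style]
        styles[key] = {"attrs": dict(style.attrib), "glyphs": glyphs}
    return styles


styles = load_styles(SKETCH)
medium = styles[("http://url/for/feature/type", "medium")]
```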
In theory this could also include the feature uri (to define
the style for a single feature) or an ontology uri (sets the
style for all features with that ontology term or its descendants).
However, with that comes problems of precedence. If the
feature type and the feature and the ontology each have
styles, which one wins? I think feature beats type beats ontology.
But I also think we can ignore this because no one has asked
for this sort of flexibility.
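If that flexibility were ever added, the "feature beats type beats ontology" order could be a simple ordered lookup. This is a sketch only. pick_style and its arguments are hypothetical, and it assumes the caller supplies the ontology term followed by its ancestors, nearest first.

```python
def pick_style(feature_uri, type_uri, ontology_uris, styles):
    """Return the most specific style: feature, then type, then ontology."""
    # ontology_uris is assumed to be ordered from the term itself outward.
    for uri in [feature_uri, type_uri, *ontology_uris]:
        if uri in styles:
            return styles[uri]
    return None


styles = {
    "http://example.org/type/exon": "type style",
    "http://example.org/so/region": "ontology style",
}
chosen = pick_style(
    "http://example.org/feature/42",
    "http://example.org/type/exon",
    ["http://example.org/so/exon", "http://example.org/so/region"],
    styles,
)
```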
(More flexibility would be support for a query language selecting
which features, types, sources, ontologies, feature alias, etc.
should get a given style. Not going there. :)
8) the "count" format
This should be the number of feature elements returned, not
the number of "annotations" (which would count the multiple
features of a complex annotation as 1).
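The difference between the two counts can be shown with a small sketch. The dictionaries below are made-up stand-ins for feature elements, and the "root" key is a hypothetical way to group the features of one complex annotation.

```python
# Made-up result set: one gene with two exons, plus one simple feature.
features = [
    {"uri": "gene1", "root": "gene1"},
    {"uri": "exon1", "root": "gene1"},
    {"uri": "exon2", "root": "gene1"},
    {"uri": "snp1", "root": None},
]


def feature_count(features):
    """What the "count" format should report: every feature element."""
    return len(features)


def annotation_count(features):
    """What it should not report: complex annotations collapsed to 1."""
    return len({f["root"] or f["uri"] for f in features})


n_features = feature_count(features)
n_annotations = annotation_count(features)
```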
9) alignments
Lincoln will provide examples.
10) CIGAR string
We'll use the EBI style CIGAR strings, and the documentation will
be based on the GFF3 description at
http://song.sourceforge.net/gff3.shtml
10.5) Do we need a REGION element?
No. Deleted from the spec.
11) XID
On Ed's recommendation I'm looking at MAGE XML. I am not a
good UML reader so it's slow going. My view so far is that
what I sketched out is on the right track and we can simplify
things compared to MAGE, eg, we don't need full bibliographic
records.
The other idea is to defer finalizing this until people start
providing data with XIDs, so we know what's needed.
12) complex features
Lincoln will come up with some examples.
13) "root" attribute
There are two changes here:
- complex annotations must have a single root feature
- all features which are in complex annotations must have
a link to the root element
There's some worry about the first requirement, in that
some complex annotations may not have a "real" root. I
argue that having a synthetic one is okay. There were no
strong arguments against having a single root.
We decided to defer finalizing this until we have some
examples of complex annotations.
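The two proposed requirements could be checked mechanically once examples exist. The sketch below assumes a hypothetical in-memory form in which each feature lists its parents and names its root. It also assumes the parent links contain no cycles.

```python
def top_ancestors(fid, features):
    """Collect the parentless ancestors reachable from one feature."""
    parents = features[fid]["parents"]
    if not parents:
        return {fid}
    tops = set()
    for parent in parents:
        tops |= top_ancestors(parent, features)
    return tops


def root_problems(features):
    """List (feature id, problem) pairs that violate the two requirements."""
    problems = []
    for fid, feature in features.items():
        if not feature["parents"]:
            continue  # a simple feature, or the root itself
        tops = top_ancestors(fid, features)
        if len(tops) != 1:
            problems.append((fid, "more than one root"))
        elif feature.get("root") not in tops:
            problems.append((fid, "missing or wrong root link"))
    return problems


good = {
    "gene": {"parents": [], "root": "gene"},
    "mrna": {"parents": ["gene"], "root": "gene"},
    "exon": {"parents": ["mrna"], "root": "gene"},
}
bad = {
    "mrna1": {"parents": [], "root": "mrna1"},
    "mrna2": {"parents": [], "root": "mrna2"},
    "exon": {"parents": ["mrna1", "mrna2"], "root": "mrna1"},
}
```

In the "bad" case a synthetic root feature above mrna1 and mrna2 would make the annotation pass.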
14) features have a 'STYLE' element
no, they don't.
15) "*" and "?" in the query string
The proposal here is to say that the interpretation of
"*" other than at the start and/or end of the query
string is implementation defined, as is the use of "?".
Previously, any other use of "*" had to be treated
as a literal asterisk, so "***" finds all strings containing
a "*".
It looks like people are fine with this looseness.
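As an illustration, one possible server-side interpretation is sketched below. The function name matches is hypothetical. The choice to treat any remaining "*" or "?" literally and the case-sensitive comparison are just this sketch's implementation-defined choices, not something the spec requires.

```python
def matches(pattern, value):
    """Match value against a query pattern with optional edge wildcards."""
    leading = pattern.startswith("*")
    if leading:
        pattern = pattern[1:]
    trailing = pattern.endswith("*")
    if trailing:
        pattern = pattern[:-1]
    # Any "*" or "?" still left in pattern is implementation defined;
    # this sketch simply treats those characters literally.
    if leading and trailing:
        return pattern in value
    if leading:
        return value.endswith(pattern)
    if trailing:
        return value.startswith(pattern)
    return value == pattern


prefix_hit = matches("BRCA*", "BRCA2")
substring_hit = matches("*kinase*", "protein kinase C")
literal_star = matches("***", "a*b")
```

Under this particular choice "***" still finds strings containing a "*", but a client can no longer rely on that across servers.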
> 5. get a volunteer to come up with best-practices examples
> of how to represent various complex annotations
That's Lincoln.
> 6. writeback planning
Allen will take the implementation lead on this, funding
willing. He's currently working on how to associate an
identifier with a new feature.
One thought is to progress in stages:
- upload completely new features / complex annotations to the server
- modify an existing feature, though not the parent/part relationship
(eg, change the location)
- delete a simple feature
- delete a complex annotation
- modify an existing complex annotation, or turn a simple feature
into a complex annotation
- do 'em all at once
The work will need to be server driven as the current clients
can't handle this before the end of the funding period. The
clients will mostly be library code.
Andrew
dalke at dalkescientific.com