[DAS2] Sanger/EBI trip report
Andreas Kahari
ak at ebi.ac.uk
Tue Oct 18 22:39:30 UTC 2005
On Tue, Oct 18, 2005 at 10:33:30PM +0200, Andrew Dalke wrote:
> I visited EBI and Sanger last week to talk with the
> people there about their use of DAS, the ongoing work
> with the DAS/2 spec, and the future directions, including
> structure DAS.
>
> One meeting was with Andreas, the other Andreas (there are
> too many Andre* in the UK - I think I need to change my
> name), Eugene and Stefan.
Hi, I'm one of them Andreases. In this particular email I'm just
commenting on a very small number of things that Andrew is writing.
> Andreas has a service registry system. I don't know where
That's Andreas Prlic, the other one (depending on your point of view),
not me.
> it is though. The registration includes metadata about the
It is at
http://das.sanger.ac.uk/registry/
(final slash is essential, it seems)
> server. I would like some way for the DAS/2 server to
> provide the metadata so the registry could extract most
> of what it needs by querying the base server. As Andreas
> pointed out, that data could be wrong or incomplete so the
> registry could override it. I mentioned the idea that
> the DAS/2 spec as it is now lets the registry server
> provide the top-level das/genome XML and is free to point
> clients to the real databases. This is one of the
> advantages of a ReST architecture.
>
>
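To make the override idea concrete, here is a minimal sketch of a registry that starts from whatever metadata the server reports and layers its own curated values on top. The field names and the `merge_metadata` helper are invented for illustration; neither the registry nor the DAS/2 spec defines them.

```python
def merge_metadata(server_reported, registry_overrides):
    """Return the registry's view: server metadata with overrides applied.

    Both arguments are plain dicts. The registry's values win whenever
    the same key appears in both, and the inputs are left unmodified.
    """
    merged = dict(server_reported)      # start from what the server says
    merged.update(registry_overrides)   # curated registry values take priority
    return merged


# Hypothetical metadata as a DAS/2 server might report it (keys are assumptions).
server_reported = {
    "title": "Example source",
    "coordinate_system": "NCBI35",
    "contact": "",                      # incomplete on the server side
}

# Hypothetical corrections held only by the registry.
registry_overrides = {"contact": "curator@example.org"}

registry_view = merge_metadata(server_reported, registry_overrides)
```

The point of keeping the two dicts separate is that the registry can re-query the server at any time without losing its own corrections.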
> An interesting thing I learned was the wide use of stylesheets.
> There are about 15 stylesheet types in use on the campus,
> and Ensembl uses a version which is not quite compatible.
> Andreas Prlic pointed out that the stylesheet needs extensions
> for 3D because the annotation styles are different than for
> a 2D plot. Thomas Down apparently has a version which puts
> a color scale on a field, eg, so that better scores are shown
> differently from worse scores.
>
> Stylesheets also came up when talking with Ed (or is that the
> other Ed :) and Roy. They are developing zmap, a replacement
> for fmap. It's a C app (gtk-based using the FooCanvas to
> display huge numbers of elements) designed to speak the same
> xremote API as fmap. They want annotations which can be
> individually annotatable, that is, annotated on more than the type.
>
> The example they gave was using three tracks - annotation,
> transcript and homology. They want to copy from the latter two
> into the first track and preserve the original color and style.
> Sadly, that's what my notes say, but I don't understand it from
> there. What I took from it was the need to have different
> ways to determine a style for an annotation, like on a
> per-track or perhaps per-annotation mechanism.
>
> The obvious one which comes to mind, which we talked about
> as a possibility, was to take ideas from CSS.
>
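One possible reading of the CSS idea is a cascade in which a per-type style is the default, a per-track style overrides it, and a per-annotation style overrides both. The sketch below is only an illustration under that assumption; the `resolve_style` function, the dict layout and the style keys are all hypothetical and not part of any DAS stylesheet format.

```python
def resolve_style(annotation, type_styles, track_styles, annotation_styles):
    """Resolve a display style, most general rule first, most specific last."""
    style = {}
    style.update(type_styles.get(annotation["type"], {}))      # per-type default
    style.update(track_styles.get(annotation["track"], {}))    # per-track override
    style.update(annotation_styles.get(annotation["id"], {}))  # per-annotation override
    return style


# Hypothetical rules for the three-track example.
type_styles = {"exon": {"color": "blue", "glyph": "box"}}
track_styles = {"homology": {"color": "green"}, "annotation": {"color": "black"}}
annotation_styles = {}

original = {"id": "h1", "type": "exon", "track": "homology"}
original_style = resolve_style(original, type_styles, track_styles, annotation_styles)

# Copy the feature into the annotation track and pin its original look
# by recording that look as a per-annotation rule for the copy.
copied = {"id": "h1-copy", "type": "exon", "track": "annotation"}
annotation_styles[copied["id"]] = dict(original_style)
copied_style = resolve_style(copied, type_styles, track_styles, annotation_styles)
```

With only per-type rules the copy would pick up the annotation track's colour; the per-annotation layer is what lets it keep the colour and glyph it had in the homology track.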
> Ed (I think) asked about how to handle assembly data.
> I pointed out the section in the spec which says it can be
> fetched by asking for it in BED format. He wanted to
> know more about how to know if a given element was a clone
> or a transcript. At this point I said he needed to ask
> a real scientist. :)
>
> James Gilbert also came by during the discussion. He
> asked about how we deal with hierarchical features, and
> wanted to know more about how our data model fits with
> the one in Otter.
> http://www.sanger.ac.uk/Users/jgrg/otter_xml.html
> I don't know the answer to that question.
>
> In both meetings people liked that we refrained from making
> new XML for everything, using "format=" instead.
>
> Andreas et al. asked about computational services which
> might take a non-trivial time. I mentioned the solution
> we talked about during BOSC where the server returns a
> "202 Accepted" and a bit of XML saying "you can check on
> the status at this URL but it'll probably take about 5
> minutes to figure out." The client should be able to
> ask the server to halt the computation.
This is related to something that mainly Tom Oinn here at the EBI
has been working on: Distributed Annotation with Lazily Evaluated
Computation (DALEC), a kind of DAS frontend to Taverna workflows.
http://taverna.sourceforge.net/projects/dalec/
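For what it's worth, the "202 Accepted" idea quoted above could look roughly like this from the client side. This is a sketch under assumptions: the only thing taken from the discussion is the 202 status and the existence of a status URL. The `fetch` callable, the example URL, the XML snippets and the use of 200 for a finished result are placeholders, and the server here is faked in memory so the example runs without a network.

```python
import time


def poll_until_done(fetch, status_url, interval=1.0, max_polls=10, sleep=time.sleep):
    """Poll status_url until the server stops answering 202 Accepted.

    fetch(url) must return a (status_code, body) pair. Any status other
    than 202 is treated as the final answer and handed back to the caller.
    """
    for _ in range(max_polls):
        status, body = fetch(status_url)
        if status != 202:
            return status, body
        sleep(interval)          # still computing, wait before asking again
    raise TimeoutError(f"no result from {status_url} after {max_polls} polls")


def make_fake_fetch(polls_needed):
    """Stand-in for a server that needs polls_needed requests to finish."""
    calls = {"n": 0}

    def fetch(url):
        calls["n"] += 1
        if calls["n"] < polls_needed:
            return 202, "<status>still computing</status>"
        return 200, "<result/>"

    return fetch


# Hypothetical status URL; sleep is stubbed out so the example returns at once.
status, body = poll_until_done(
    make_fake_fetch(3), "http://example.org/das/status/42", sleep=lambda s: None
)
```

Asking the server to halt the computation would need one more request against the same status URL; how that request should look was not settled, so it is left out here.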
> In general there was a good reception to the use of the
> "format=" parameter, instead of making new XML formats.
>
> It does look like we need to spend more time on the
> format extensibility. It seems much of what the UK folks
> do is based on extending DAS/1 in various ways. DAS/2
> doesn't and cannot capture all of them. I've been looking
> at the ATOM spec.
> http://www.intertwingly.net/wiki/pie/RestEchoApiDiscuss
> http://atompub.org/2005/07/11/draft-ietf-atompub-format-10.html
I need to read this.
> It has a very nice way to embed data in the atom:content
> field, where the data can be inline text, html, xml, or
> "other", or be a link to an external href.
>
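As a rough illustration of the content model described above, the sketch below sorts a content element into inline text, inline html, inline xml or an external link. It assumes the attribute names `type` and `src` used by the Atom format; the `describe_content` helper and the sample elements are hypothetical, and the samples omit the Atom namespace to keep things short.

```python
import xml.etree.ElementTree as ET


def describe_content(content):
    """Classify a content element and return a (kind, payload) pair."""
    src = content.get("src")
    if src is not None:
        return ("external", src)              # data lives at another URL
    kind = content.get("type", "text")        # a missing type is treated as text
    if kind in ("text", "html"):
        return (kind, content.text or "")
    children = list(content)
    if children:
        return ("xml", children[0])           # embedded XML element
    return ("other", content.text or "")      # e.g. encoded non-XML data


# Hypothetical samples (not taken from any real feed).
samples = [
    '<content type="text">plain note</content>',
    '<content type="application/xml"><FEATURE id="f1"/></content>',
    '<content type="image/png" src="http://example.org/plot.png"/>',
]

described = [describe_content(ET.fromstring(s)) for s in samples]
kinds = [kind for kind, _ in described]
```

The attraction for format extensibility is that a client which does not understand an embedded payload can still tell what kind of thing it is and skip it.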
> Along those lines, I think the Atom publication protocol
> has some nice ideas to help with the writeback spec.
>
> Ed described the locking model that they use. It's
> unchanged since last year's discussion. The annotators
> decide on who gets a region, which is locked for that
> person. In their case it's exported into a local AceDB
> instance, edited via fmap. When done that database (as
> a whole) is sent back to the main database for integration.
> The region is locked, preventing resolution conflicts.
>
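The region locking described above can be pictured as a table of locked intervals in which a request is refused whenever it overlaps a region held by someone else. This is only a sketch of that idea, not a description of the actual Sanger system; the `RegionLocks` class, the inclusive coordinates and the annotator names are assumptions.

```python
class RegionLocks:
    """Track which annotator holds which region of which sequence."""

    def __init__(self):
        self._locks = []  # list of (seq, start, end, owner), inclusive coordinates

    def acquire(self, seq, start, end, owner):
        """Lock seq:start-end for owner unless another owner holds an overlap."""
        for held_seq, held_start, held_end, held_owner in self._locks:
            overlaps = held_seq == seq and start <= held_end and held_start <= end
            if overlaps and held_owner != owner:
                return False
        self._locks.append((seq, start, end, owner))
        return True

    def release(self, seq, start, end, owner):
        """Drop a lock once the edited region has been merged back."""
        self._locks.remove((seq, start, end, owner))


locks = RegionLocks()
got_ed = locks.acquire("chr1", 1000, 5000, "ed")          # region is free
got_roy = locks.acquire("chr1", 4000, 6000, "roy")        # overlaps ed's region
locks.release("chr1", 1000, 5000, "ed")                   # ed has merged back
got_roy_later = locks.acquire("chr1", 4000, 6000, "roy")  # now free again
```

Because only one person can hold a region at a time, whatever comes back from the local database can be integrated without having to reconcile competing edits.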
> Andreas et al. mentioned an interesting annotation - annotate
> a region to say it's been looked at but there are no
> annotations for the region. "This region intentionally
> left blank."
Yes, this was something that confused me at first but that makes perfect
sense to me now. Groups sometimes need to say they've looked at a
region (protein/gene/whatever) because the fact that they are explicitly
not annotating something is as much an annotation as actually annotating
something with a box. Covering the region with an annotation saying
"there's nothing here" does not seem quite right to me.
> I talked as well with Tony, mostly on organization issues.
> One of the things he said they might want to do in the
> future is a 2D image DAS.
>
> I brought up the idea of having a DAS sprint - once the
> spec w/ writeback starts to congeal, get the implementers
> together in a room for a few days and work on code, then
> use the experience to improve the spec. Keepin' it real.
>
> I talked about some of the disconnect between the DAS/2
> dev folks (all in the US) and the UK folks. The phone
> conference call is at 8pm UK time and rather little of
> what we talk about gets written up. When the UK people
> ask questions (cf. James Gilbert's question "Nested features?"
> from Sept. 28, 2005) there's no response. Similarly,
> the DAS/1 extensions in the UK aren't written down so
This is not *quite* true. The alignment and structure extensions to
DAS/1 by Andreas Prlic are well documented here:
http://www.efamily.org.uk/xml/das/documentation/
> it's hard to know what's useful for DAS/2. My being in
> Europe for the next few months should help a bit with
> that, and I've always had a wacky schedule anyway so I'll
> be in on the conf. calls (now that I'm back to easily
> available broadband). But I'm not enough of a domain
> expert to be able to answer or address the scientific
> points.
>
Stefan and I enjoyed Andrew's visit and would certainly like to see
some sort of dialogue or collaboration, or whatever else may help in
getting further with specifying and implementing DAS/2.
>
> I've missed a few things so if anyone else here wants
> to, feel free to add comments.
>
> Andrew
> dalke at dalkescientific.com
Regards,
Andreas
--
Andreas Kähäri
EMBL-EBI/ensembl
------{ www.embl.org }----{ www.ebi.ac.uk }----{ www.ensembl.org }------