From Gregg_Helt at affymetrix.com Thu Oct 6 14:02:44 2005
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Thu, 6 Oct 2005 11:02:44 -0700
Subject: [DAS2] Problems with biopackages DAS2 server
Message-ID:

I'm having some problems with feature responses from the DAS/2 server at das.biopackages.net. It looks like in the das2feature XML at least some features are now pointing to themselves as parents. For example, as part of response to
http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/19301959:21567303;type=SO:mRNA

...
...
...

Another problem I'm seeing is server internal errors in response to combination of an overlaps and inside feature filter, for example:
http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25108802:26550995;inside=chr21/21567303:46976097;type=SO:mRNA
returns an HTTP status code of 500, "Internal Server Error". Has the combination of overlaps/inside filter not been implemented yet? Or is this possibly because of the size of the inside filter's region?

I haven't noticed either of these problems before, but for the last several weeks I've only been testing the higher-level responses (sources, regions, types) while redoing the client GUI, so I'm not sure when this started happening.

gregg

From allenday at ucla.edu Thu Oct 6 15:49:05 2005
From: allenday at ucla.edu (Allen Day)
Date: Thu, 6 Oct 2005 12:49:05 -0700 (PDT)
Subject: [DAS2] Re: Problems with biopackages DAS2 server
In-Reply-To:
References:
Message-ID:

On Thu, 6 Oct 2005, Helt,Gregg wrote:

> I'm having some problems with feature responses from the DAS/2 server at
> das.biopackages.net. It looks like in the das2feature XML at least some
> features are now pointing to themselves as parents. For example, as

Yes, I know about this one. It's an artifact of the performance improvements I pushed to the production server (the one you use). It can be fixed quickly -- like by Friday.
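The self-parent problem described above is the kind of thing a small check can catch once a response has been parsed. The following is a minimal sketch, not the server's or IGB's actual code. It assumes each feature has already been reduced to a hypothetical mapping from feature id to the ids it names as parents, and it makes no assumption about the das2feature XML element names. All ids in the usage lines are placeholders.

```python
from typing import Dict, List


def find_parent_problems(parents: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return features whose parent links look wrong.

    `parents` maps a feature id to the ids it names as parents
    (a hypothetical, already-parsed form of a feature response).
    """
    problems: Dict[str, List[str]] = {"self_parent": [], "unknown_parent": []}
    for feature_id, parent_ids in parents.items():
        for parent_id in parent_ids:
            if parent_id == feature_id:
                # A feature that names itself as its own parent.
                problems["self_parent"].append(feature_id)
            elif parent_id not in parents:
                # A parent that is referenced but absent from the response.
                problems["unknown_parent"].append(feature_id)
    return problems


# Placeholder ids, not taken from a real response.
example = {"mrna1": [], "exon1": ["mrna1"], "exon2": ["exon2"], "exon3": ["mrna9"]}
print(find_parent_problems(example))
```

Whether an absent parent is really an error depends on how the server is expected to treat parents that fall outside the queried range, so that second check is only a suggestion.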
> part of response to
> http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/19301959:21567303;type=SO:mRNA
>
> ...
>
> ...
>
> ...
>
> Another problem I'm seeing is server internal errors in response to
> combination of an overlaps and inside feature filter, for example:

Now that you've mentioned it, I could have predicted this error. I'll add a unit test for overlaps+inside combined queries. So the good news is that once I fix it and add a test for this type of error, it won't happen again. The bad news is that this fix may take several days as I have several other urgent things in my queue.

> http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25108802:26550995;inside=chr21/21567303:46976097;type=SO:mRNA
>
> returns an HTTP status code of 500, "Internal Server Error".
> Has the combination of overlaps/inside filter not been implemented yet?
> Or is this possibly because of the size of the inside filter's region?
>
> I haven't noticed either of these problems before, but for the last
> several weeks I've only been testing the higher-level responses
> (sources, regions, types) while redoing the client GUI, so I'm not sure
> when this started happening.

These are new errors as of last week, when I pushed out the new range query optimizations.

-Allen

> gregg

From allenday at ucla.edu Fri Oct 7 20:24:05 2005
From: allenday at ucla.edu (Allen Day)
Date: Fri, 7 Oct 2005 17:24:05 -0700 (PDT)
Subject: [DAS2] Re: Problems with biopackages DAS2 server
In-Reply-To:
References:
Message-ID:

The parentage bug has been repaired on the prod server. overlaps+inside will have to wait until next week.

-allen

From suzi at fruitfly.org Mon Oct 17 10:56:02 2005
From: suzi at fruitfly.org (Suzanna Lewis)
Date: Mon, 17 Oct 2005 07:56:02 -0700
Subject: [DAS2] missing conference call
Message-ID: <9d630486cf99f871bffa2550bc38a23b@fruitfly.org>

I won't be able to be on the call this week. Gregg, I hope we can talk tomorrow or Wednesday.

-S

From Gregg_Helt at affymetrix.com Mon Oct 17 15:41:43 2005
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Mon, 17 Oct 2005 12:41:43 -0700
Subject: [DAS2] Problem with parent/child features in biopackages server feature response
Message-ID:

I finally got back to testing DAS/2 feature requests/responses in my IGB client. I'm seeing a new problem in responses from the biopackages server: there are no <PARENT> or <PART> elements for features at all.

See for example the returned XML from my standard test query:
http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26027736:26068042;type=SO:mRNA

...
...
...

But I know that "auto316726" should be a child of "BC001178".

Andrew, would your DAS/2 validator catch problems like this?
Thanks,
Gregg

From allenday at ucla.edu Mon Oct 17 17:50:14 2005
From: allenday at ucla.edu (Allen Day)
Date: Mon, 17 Oct 2005 14:50:14 -0700 (PDT)
Subject: [DAS2] Re: Problem with parent/child features in biopackages server feature response
In-Reply-To:
References:
Message-ID:

Hi Gregg,

There was a logic inversion in my code. PARENT/PART relationships should now be restored.

Andrew, I'd also like to know if there is some code already written that can be written into my regression tests.

I understand that it's really irritating to have non-fatal errors of different types continuously appearing. Maybe I should spend some time beefing up the regression test suite to do things like diff feature graphs and make sure they are identical? Something to discuss on the conference call...

-Allen

From Gregg_Helt at affymetrix.com Mon Oct 17 23:36:47 2005
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Mon, 17 Oct 2005 20:36:47 -0700
Subject: [DAS2] RE: Problem with parent/child features in biopackages server feature response
Message-ID:

Thanks for the quick fix!

Now I'm seeing another problem though. Some feature queries with a combination of one "overlaps" and one "inside" filter are giving weird errors. But some return correctly.
For example,
http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA
returns correctly.

But
http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25854314:26027736;inside=chr21/0:26027736;type=SO:mRNA
does not, instead returning these headers:

  HTTP/1.1 200 OK
  Date: Mon, 17 Oct 2005 22:40:54 GMT
  Server: Apache/2.0.51 (Fedora)
  X-DAS-Version: DAS/2.0
  X-DAS-Server: GMOD/0.0
  X-DAS-Content-Type: text/x-das-feature+xml
  X-DAS-Status-Details: Died at /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425.

and then what appears to be more headers but is designated as content (so I'm guessing a header-terminating blank line got inserted before these):

  X-DAS-Status: 500
  X-DAS-Status: 200
  Content-Length: 0
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Content-Type: text/xml

The combination of an "overlaps" and "inside" filter is an important part of IGB's client-side query optimizations. I can turn this optimization off for now, but I'm hoping you can diagnose and fix the problem.

Thanks again,
Gregg

From allenday at ucla.edu Tue Oct 18 02:15:18 2005
From: allenday at ucla.edu (Allen Day)
Date: Mon, 17 Oct 2005 23:15:18 -0700 (PDT)
Subject: [DAS2] RE: Problem with parent/child features in biopackages server feature response
In-Reply-To:
References:
Message-ID:

Okay, give it a shot now. You uncovered a bug in my SQL query where the parent feature was outside your overlaps+inside ranges and wasn't being retrieved, leading to an error. These two are now in my regression test suite.

-Allen

From dalke at dalkescientific.com Tue Oct 18 16:33:30 2005
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue, 18 Oct 2005 22:33:30 +0200
Subject: [DAS2] Sanger/EBI trip report
Message-ID: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com>

I visited EBI and Sanger last week to talk with the people there about their use of DAS, the ongoing work with the DAS/2 spec, and the future directions, including structure DAS.

One meeting was with Andreas, the other Andreas (there are too many Andre* in the UK - I think I need to change my name), Eugene and Stefan.

Andreas has a service registry system. I don't know where it is though. The registration includes metadata about the server. I would like some way for the DAS/2 server to provide the metadata so the registry could extract most of what it needs by querying the base server. As Andreas pointed out, that data could be wrong or incomplete so the registry could override it. I mentioned the idea that the DAS/2 spec as it is now lets the registry server provide the top-level das/genome XML and is free to point clients to the real databases. This is one of the advantages of a ReST architecture.

An interesting thing I learned was the wide use of stylesheets.
There are about 15 stylesheet types in use on the campus, and Ensembl uses a version which is not-quite compatible. Andreas Prlic pointed out that the stylesheet needs extensions for 3D because the annotation styles are different than for a 2D plot. Thomas Down apparently has a version which puts a color scale on a field, e.g., so that better scores are shown differently from worse scores.

Stylesheets also came up when talking with Ed (or is that the other Ed :) and Roy. They are developing zmap, a replacement for fmap. It's a C app (gtk-based using the FooCanvas to display huge numbers of elements) designed to speak the same xremote API as fmap. They want annotations which can be individually annotatable, that is, annotated on more than the type.

The example they gave was using three tracks - annotation, transcript and homology. They want to copy from the latter two into the first track and preserve the original color and style. Sadly, that's what my notes say, but I don't understand it from there. What I took from it was the need to have different ways to determine a style for an annotation, like on a per-track or perhaps per-annotation mechanism.

The obvious one which comes to mind, which we talked about as a possibility, was to take ideas from CSS.

Ed (I think) asked about how to handle assembly data. I pointed out the section in the spec which says it can be fetched by asking for it in BED format. He wanted to know more about how to know if a given element was a clone or a transcript. At this point I said he needed to ask a real scientist. :)

James Gilbert also came by during the discussion. He asked about how we deal with hierarchical features, and wanted to know more about how our data model fits with the one in Otter.
http://www.sanger.ac.uk/Users/jgrg/otter_xml.html
I don't know the answer to that question.

In both meetings people liked that we refrained from making new XML for everything, using "format=" instead.

Andreas et al.
asked about computational services which might take a non-trivial time. I mentioned the solution we talked about during BOSC where the server returns a "202 Accepted" and a bit of XML saying "you can check on the status at this URL but it'll probably take about 5 minutes to figure out." The client should be able to ask the server to halt the computation.

In general there was a good reception to the use of the "format=" parameter, instead of making new XML formats.

It does look like we need to spend more time on the format extensibility. It seems much of what the UK folks do is based on extending DAS/1 in various ways. DAS/2 doesn't and cannot capture all of them. I've been looking at the ATOM spec.
http://www.intertwingly.net/wiki/pie/RestEchoApiDiscuss
http://atompub.org/2005/07/11/draft-ietf-atompub-format-10.html
It has a very nice way to embed data in the atom:content field, where the data can be inline text, html, xml, or "other", or be a link to an external href.

Along those lines, I think the Atom publication protocol has some nice ideas to help with the writeback spec.

Ed described the locking model that they use. It's unchanged since last year's discussion. The annotators decide on who gets a region, which is locked for that person. In their case it's exported into a local AceDB instance, edited via fmap. When done that database (as a whole) is sent back to the main database for integration. The region is locked, preventing resolution conflicts.

Andreas et al. mentioned an interesting annotation - annotate a region to say it's been looked at but there are no annotations for the region. "This region intentionally left blank."

I talked as well with Tony, mostly on organization issues. One of the things he said they might want to do in the future is a 2D image DAS.
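The "202 Accepted" idea mentioned earlier in this report, where the client is handed a status URL to check back on, can be sketched from the client side roughly as below. This is only an illustration under stated assumptions: the report does not say what the status URL returns when the work is done, so the sketch assumes a 200 with the result as the body, and it takes a caller-supplied `fetch` function (hypothetical) instead of naming any real DAS/2 client API. Halting a computation is not covered.

```python
import time
from typing import Callable, Tuple

# fetch(url) -> (HTTP status code, response body); supplied by the caller.
Fetch = Callable[[str], Tuple[int, str]]


def wait_for_result(fetch: Fetch, status_url: str,
                    interval: float = 1.0, max_polls: int = 10) -> str:
    """Poll `status_url` until the server stops answering 202 Accepted."""
    for _ in range(max_polls):
        status, body = fetch(status_url)
        if status == 200:
            return body            # assumed: 200 means the result is ready
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        time.sleep(interval)       # still running; ask again later
    raise TimeoutError(f"no result after {max_polls} polls")


# Stand-in for a server: two "still working" replies, then the result.
replies = iter([(202, "working"), (202, "working"), (200, "<result/>")])
print(wait_for_result(lambda url: next(replies),
                      "http://example.org/status/1", interval=0))
```

A real client would probably take the polling interval from the server's own estimate (the "about 5 minutes" in the example reply) instead of a fixed value.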
I brought up the idea of having a DAS sprint - once the spec w/ writeback starts to congeal, get the implementers together in a room for a few days and work on code, then use the experience to improve the spec. Keepin' it real.

I talked about some of the disconnect between the DAS/2 dev folks (all in the US) and the UK folks. The phone conference call is at 8pm UK time and rather little of what we talk about gets written up. When the UK people ask questions (cf. James Gilbert's question "Nested features?" from Sept. 28, 2005) there's no response. Similarly, the DAS/1 extensions in the UK aren't written down so it's hard to know what's useful for DAS/2. My being in Europe for the next few months should help a bit with that, and I've always had a wacky schedule anyway so I'll be in on the conf. calls (now that I'm back to easily available broadband). But I'm not enough of a domain expert to be able to answer or address the scientific points.

I've missed a few things so if anyone else here wants to, feel free to add comments.

Andrew
dalke at dalkescientific.com

From ak at ebi.ac.uk Tue Oct 18 18:39:30 2005
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Tue, 18 Oct 2005 23:39:30 +0100
Subject: [DAS2] Sanger/EBI trip report
In-Reply-To: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com>
References: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com>
Message-ID: <20051018223930.GC6854@ebi.ac.uk>

On Tue, Oct 18, 2005 at 10:33:30PM +0200, Andrew Dalke wrote:
> I visited EBI and Sanger last week to talk with the
> people there about their use of DAS, the ongoing work
> with the DAS/2 spec, and the future directions, including
> structure DAS.
>
> One meeting was with Andreas, the other Andreas (there are
> too many Andre* in the UK - I think I need to change my
> name), Eugene and Stefan.

Hi,

I'm one of them Andreases. In this particular email I'm just commenting on a very small number of things that Andrew is writing.

> Andreas has a service registry system.
> I don't know where

That's Andreas Prlic, the other one (depending on your point of view), not me.

> it is though. The registration includes metadata about the

It is at http://das.sanger.ac.uk/registry/ (final slash is essential, it seems)

> server. I would like some way for the DAS/2 server to
> provide the metadata so the registry could extract most
> of what it needs by querying the base server. As Andreas
> pointed out, that data could be wrong or incomplete so the
> registry could override it. I mentioned the idea that
> the DAS/2 spec as it is now lets the registry server
> provide the top-level das/genome XML and is free to point
> clients to the real databases. This is one of the
> advantages of a ReST architecture.
>
> An interesting thing I learned was the wide use of stylesheets.
> There are about 15 stylesheet types in use on the campus,
> and Ensembl uses a version which is not-quite compatible.
> Andreas Prlic pointed out that the stylesheet needs extensions
> for 3D because the annotation styles are different than for
> a 2D plot. Thomas Down apparently has a version which puts
> a color scale on a field, e.g., so that better scores are shown
> differently from worse scores.
>
> Stylesheets also came up when talking with Ed (or is that the
> other Ed :) and Roy. They are developing zmap, a replacement
> for fmap. It's a C app (gtk-based using the FooCanvas to
> display huge numbers of elements) designed to speak the same
> xremote API as fmap. They want annotations which can be
> individually annotatable, that is, annotated on more than the type.
>
> The example they gave was using three tracks - annotation,
> transcript and homology. They want to copy from the latter two
> into the first track and preserve the original color and style.
> Sadly, that's what my notes say, but I don't understand it from
> there.
> What I took from it was the need to have different
> ways to determine a style for an annotation, like on a
> per-track or perhaps per-annotation mechanism.
>
> The obvious one which comes to mind, which we talked about
> as a possibility, was to take ideas from CSS.
>
> Ed (I think) asked about how to handle assembly data.
> I pointed out the section in the spec which says it can be
> fetched by asking for it in BED format. He wanted to
> know more about how to know if a given element was a clone
> or a transcript. At this point I said he needed to ask
> a real scientist. :)
>
> James Gilbert also came by during the discussion. He
> asked about how we deal with hierarchical features, and
> wanted to know more about how our data model fits with
> the one in Otter.
> http://www.sanger.ac.uk/Users/jgrg/otter_xml.html
> I don't know the answer to that question.
>
> In both meetings people liked that we refrained from making
> new XML for everything, using "format=" instead.
>
> Andreas et al. asked about computational services which
> might take a non-trivial time. I mentioned the solution
> we talked about during BOSC where the server returns a
> "202 Accepted" and a bit of XML saying "you can check on
> the status at this URL but it'll probably take about 5
> minutes to figure out." The client should be able to
> ask the server to halt the computation.

This is related to something that mainly Tom Oinn here at the EBI has been working on: Distributed Annotation with Lazily Evaluated Computation (DALEC), a kind of DAS frontend to Taverna workflows.

http://taverna.sourceforge.net/projects/dalec/

> In general there was a good reception to the use of the
> "format=" parameter, instead of making new XML formats.
>
> It does look like we need to spend more time on the
> format extensibility. It seems much of what the UK folks
> do is based on extending DAS/1 in various ways. DAS/2
> doesn't and cannot capture all of them. I've been looking
> at the ATOM spec.
> http://www.intertwingly.net/wiki/pie/RestEchoApiDiscuss
> http://atompub.org/2005/07/11/draft-ietf-atompub-format-10.html

I need to read this.

> It has a very nice way to embed data in the atom:content
> field, where the data can be inline text, html, xml, or
> "other", or be a link to an external href.
>
> Along those lines, I think the Atom publication protocol
> has some nice ideas to help with the writeback spec.
>
> Ed described the locking model that they use. It's
> unchanged since last year's discussion. The annotators
> decide on who gets a region, which is locked for that
> person. In their case it's exported into a local AceDB
> instance, edited via fmap. When done that database (as
> a whole) is sent back to the main database for integration.
> The region is locked, preventing resolution conflicts.
>
> Andreas et al. mentioned an interesting annotation - annotate
> a region to say it's been looked at but there are no
> annotations for the region. "This region intentionally
> left blank."

Yes, this was something that confused me at first but that makes perfect sense to me now. Groups sometimes need to say they've looked at a region (protein/gene/whatever) because the fact that they are explicitly not annotating something is as much an annotation as actually annotating something with a box. Covering the region with an annotation saying "there's nothing here" does not seem quite right to me.

> I talked as well with Tony, mostly on organization issues.
> One of the things he said they might want to do in the
> future is a 2D image DAS.
>
> I brought up the idea of having a DAS sprint - once the
> spec w/ writeback starts to congeal, get the implementers
> together in a room for a few days and work on code, then
> use the experience to improve the spec. Keepin' it real.
>
> I talked about some of the disconnect between the DAS/2
> dev folks (all in the US) and the UK folks.
> The phone
> conference call is at 8pm UK time and rather little of
> what we talk about gets written up. When the UK people
> ask questions (cf. James Gilbert's question "Nested features?"
> from Sept. 28, 2005) there's no response. Similarly,
> the DAS/1 extensions in the UK aren't written down so

This is not *quite* true. The alignment and structure extensions to DAS/1 by Andreas Prlic are well documented here:
http://www.efamily.org.uk/xml/das/documentation/

> it's hard to know what's useful for DAS/2. My being in
> Europe for the next few months should help a bit with
> that, and I've always had a wacky schedule anyway so I'll
> be in on the conf. calls (now that I'm back to easily
> available broadband). But I'm not enough of a domain
> expert to be able to answer or address the scientific
> points.

Me and Stefan enjoyed Andrew's visit and would certainly like to see some sort of dialogue or collaboration or whatever may help getting further with specifying and implementing DAS/2.

> I've missed a few things so if anyone else here wants
> to, feel free to add comments.
>
> Andrew
> dalke at dalkescientific.com

Regards,
Andreas

--
Andreas Kähäri
EMBL-EBI/ensembl
------{ www.embl.org }----{ www.ebi.ac.uk }----{ www.ensembl.org }------

From ap3 at sanger.ac.uk Wed Oct 19 04:56:25 2005
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Wed, 19 Oct 2005 09:56:25 +0100
Subject: [DAS2] Sanger/EBI trip report
In-Reply-To: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com>
References: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com>
Message-ID: <9d0b4d6d8bde3942d590b5c1aa78efd2@sanger.ac.uk>

Hi Andrew!

Thanks for the very good summary.

I wanted to comment on the DAS registry we have here at
http://das.sanger.ac.uk/registry/

So far there has not been a mechanism by which DAS clients can programmatically discover available DAS sources that can be shared between different DAS clients. This registry addresses that issue (among others).
One thing that was necessary to do for this was to provide "coordinate systems" that describe the data that has been annotated. E.g., Ensembl can project data from different coordinate systems into one display, so it is important to know which coordinate system the data is being served in. Also, some coordinate systems are not supported and therefore such DAS sources have to be ignored.

What I understood from our discussion is that there is no convention on globally unique coordinate systems in DAS2 so far, but I think that would be a nice feature to have.

There is some documentation on the registry available at
http://das.sanger.ac.uk/registry/help_index.jsp

Greetings,
Andreas

-----------------------------------------------------------------------
Andreas Prlic
Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA, UK
+44 (0) 1223 49 6891

From Gregg_Helt at affymetrix.com Wed Oct 19 14:18:25 2005
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Wed, 19 Oct 2005 11:18:25 -0700
Subject: [DAS2] Sanger/EBI trip report
Message-ID:

> -----Original Message-----
> From: das2-bounces at portal.open-bio.org [mailto:das2-bounces at portal.open-bio.org] On Behalf Of Andrew Dalke
> Sent: Tuesday, October 18, 2005 1:34 PM
> To: DAS/2
> Subject: [DAS2] Sanger/EBI trip report
>
...
> I talked about some of the disconnect between the DAS/2
> dev folks (all in the US) and the UK folks. The phone
> conference call is at 8pm UK time and rather little of
> what we talk about gets written up. When the UK people
> ask questions (cf. James Gilbert's question "Nested features?"
> from Sept. 28, 2005) there's no response. Similarly,
> the DAS/1 extensions in the UK aren't written down so
> it's hard to know what's useful for DAS/2. My being in
> Europe for the next few months should help a bit with
> that, and I've always had a wacky schedule anyway so I'll
> be in on the conf. calls (now that I'm back to easily
> available broadband).
> But I'm not enough of a domain
> expert to be able to answer or address the scientific
> points.
...

As Andrew mentioned, I've been hosting a weekly DAS/2 conference call over here in the US, every Thursday at 12 noon Pacific time. It'd be great if Sanger/EBI folk could join in. Would moving it to 9 AM Pacific (5 PM UK) help? And I'll see about getting someone to take notes for the meetings and post them to the list. I agree we definitely need to be more active about responding to questions on the DAS/2 mailing list. There's plenty to talk about...

Thanks,
Gregg Helt

From Gregg_Helt at affymetrix.com Wed Oct 19 15:05:06 2005
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Wed, 19 Oct 2005 12:05:06 -0700
Subject: [DAS2] Sanger/EBI trip report
Message-ID:

> -----Original Message-----
> From: das2-bounces at portal.open-bio.org [mailto:das2-bounces at portal.open-bio.org] On Behalf Of Andreas Prlic
> Sent: Wednesday, October 19, 2005 1:56 AM
> To: Andrew Dalke
> Cc: DAS/2
> Subject: Re: [DAS2] Sanger/EBI trip report
>
> Hi Andrew!
>
> Thanks for the very good summary.
>
> I wanted to comment on the DAS -registry we are having here at
> http://das.sanger.ac.uk/registry/
>
> So far there has not been a mechanism how DAS clients can
> programmatically discover available DAS sources, that can be
> shared between different DAS clients. This registry addresses
> that issue (among others).

As part of the work under the DAS/2 grant, we've had several meetings trying to sketch out a strategy for registry/discovery of DAS/2 servers. Andreas presented his DAS registry at one of these back in September 2004. Looks very nice. There's still a lot of differing opinions about the best way to do this though. Maybe in November we can devote at least one of the DAS/2 teleconference calls to revisiting registry/discovery strategies, if Andreas could join in.
> One thing that was necessary to do for this, was to provide > "coordinate systems", that describe the data that has been > annotated. E.g. Ensembl can project data from different > coordinate systems into one display so it is important to know > what the data is being served in. - also some coordinate > systems are not supported and therefore such DAS sources have > to be ignored. > > What I understood from our discussion there is no convention > on globally unique coordinate systems in DAS2 so far, but I think > that would be a nice feature to have. > > There is some documentation on the registry available at > http://das.sanger.ac.uk/registry/help_index.jsp > > Greetings, > Andreas > As far as globally unique coordinate systems, the current plan in DAS/2 is that this should be indicated as a URI id attribute in an element for the versioned source. For example, as part of the response to http://server/das/genome: and allowing multiple tags to indicate different URIs for the same genome assembly. Preferably the assembly id would be a URI controlled by the institution that performed the assembly, and be resolvable to some meaningful information about that assembly. In practice that might not be possible (for example the above snippet refers to a download page at UCSC but the assembly was done at NCBI). gregg From Gregg_Helt at affymetrix.com Thu Oct 20 13:39:23 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 20 Oct 2005 10:39:23 -0700 Subject: [DAS2] RE: Problem with parent/child features in biopackages server feature response Message-ID: Thanks, whatever you did fixed the problem! gregg > -----Original Message----- > From: Allen Day [mailto:allenday at ucla.edu] > Sent: Monday, October 17, 2005 11:15 PM > To: Helt,Gregg > Cc: das2 at portal.open-bio.org > Subject: RE: Problem with parent/child features in biopackages server > feature response > > Okay, give it a shot now. 
You uncovered a bug in my SQL query where the > parent feature was outside your overlaps+inside ranges and wasn't being > retrieved, leading to an error. These two are now in my regression test > suite. > > -Allen > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > Thanks for the quick fix! > > > > Now I'm seeing another problem though. Some feature queries with a > > combination of one "overlaps" and one "inside" filter are giving weird > > errors. But some return correctly. > > > > For example, > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > 104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA > > returns correctly. > > > > But > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25 > > 854314:26027736;inside=chr21/0:26027736;type=SO:mRNA > > does not, instead returning these headers: > > HTTP/1.1 200 OK > > Date: Mon, 17 Oct 2005 22:40:54 GMT > > Server: Apache/2.0.51 (Fedora) > > X-DAS-Version: DAS/2.0 > > X-DAS-Server: GMOD/0.0 > > X-DAS-Content-Type: text/x-das-feature+xml > > X-DAS-Status-Details: Died at > > /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. > > > > and then what appears to be more headers but is designated as content > > (so I'm guessing a header-terminating blank line got inserted before > > these): > > X-DAS-Status: 500 > > X-DAS-Status: 200 > > Content-Length: 0 > > Keep-Alive: timeout=15, max=100 > > Connection: Keep-Alive > > Content-Type: text/xml > > > > The combination of an "overlaps" and "inside" filter is an important > > part of IGB's client-side query optimizations. I can turn this > > optimization off for now, but I'm hoping you can diagnose and fix the > > problem. 
> > > > Thanks again, > > Gregg > > > > > -----Original Message----- > > > From: Allen Day [mailto:allenday at ucla.edu] > > > Sent: Monday, October 17, 2005 2:50 PM > > > To: Helt,Gregg > > > Cc: das2 at portal.open-bio.org > > > Subject: Re: Problem with parent/child features in biopackages server > > > feature response > > > > > > Hi Gregg, > > > > > > There was a logic inversion in my code. PARENT/PART relationships > > should > > > now be restored. > > > > > > Andrew, I'd also like to know if there is some code already written > > that > > > can be written into my regression tests. > > > > > > I understand that it's really irritating to have non-fatal errors of > > > different types continuously appearing. Maybe I should spend some > > time > > > beefing up the regression test suite to do things like diff feature > > graphs > > > and make sure they are identical? Something to discuss on the > > conference > > > call... > > > > > > -Allen > > > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > > > I finally got back to testing DAS/2 feature requests/responses in my > > IGB > > > > client. I'm seeing a new problem in responses from the biopackages > > > > server, there are no or elements for features at > > all. > > > > > > > > See for example the returned XML from my standard test query: > > > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > > 027736:26068042;type=SO:mRNA > > > > > > > > ... > > > > > > > > > > > > ... > > > > > > > > > > > name="auto316726"> > > > > > > > > > > > > ... > > > > > > > > But I know that "auto316726" should be a child of "BC001178". > > > > > > > > Andrew, would your DAS/2 validator catch problems like this? 
> > > > > > > > > > > > Thanks, > > > > Gregg > > > > > > > > > > From Gregg_Helt at affymetrix.com Thu Oct 20 14:07:18 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 20 Oct 2005 11:07:18 -0700 Subject: [DAS2] (no subject) Message-ID: I've been revising the DAS/2 GUI in IGB to use a tree view for looking at server / source / version hierarchy. I've attached a screen shot. It's still a little clunky to overlay data from multiple sources (DAS/2, DAS/1, quickload), but I'm hoping to make that smoother in time for the Genome Informatics meeting next week. gregg -------------- next part -------------- A non-text attachment was scrubbed... Name: igb_das2_example2.JPG Type: image/jpeg Size: 159366 bytes Desc: igb_das2_example2.JPG URL: From lstein at cshl.edu Thu Oct 20 16:23:02 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 20 Oct 2005 16:23:02 -0400 Subject: [DAS2] (no subject) In-Reply-To: References: Message-ID: <200510201623.03566.lstein@cshl.edu> That's looking pretty good. Lincoln On Thursday 20 October 2005 02:07 pm, Helt,Gregg wrote: > I've been revising the DAS/2 GUI in IGB to use a tree view for looking > at server / source / version hierarchy. I've attached a screen shot. > It's still a little clunky to overlay data from multiple sources (DAS/2, > DAS/1, quickload), but I'm hoping to make that smoother in time for the > Genome Informatics meeting next week. > > gregg -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From suzi at fruitfly.org Thu Oct 20 22:41:10 2005 From: suzi at fruitfly.org (Suzanna Lewis) Date: Thu, 20 Oct 2005 22:41:10 -0400 Subject: [DAS2] (no subject) In-Reply-To: <200510201623.03566.lstein@cshl.edu> References: <200510201623.03566.lstein@cshl.edu> Message-ID: <0d95651c0e3be0c941dae38a5185166a@fruitfly.org> Lincoln, See you tomorrow I hope? 
We should strongly encourage Jason and Zhirong to have Gregg talk. -S On Oct 20, 2005, at 4:23 PM, Lincoln Stein wrote: > That's looking pretty good. > > Lincoln > > On Thursday 20 October 2005 02:07 pm, Helt,Gregg wrote: >> I've been revising the DAS/2 GUI in IGB to use a tree view for looking >> at server / source / version hierarchy. I've attached a screen shot. >> It's still a little clunky to overlay data from multiple sources >> (DAS/2, >> DAS/1, quickload), but I'm hoping to make that smoother in time for >> the >> Genome Informatics meeting next week. >> >> gregg > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu > _______________________________________________ > DAS2 mailing list > DAS2 at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/das2 From allenday at ucla.edu Fri Oct 21 13:48:54 2005 From: allenday at ucla.edu (Allen Day) Date: Fri, 21 Oct 2005 10:48:54 -0700 (PDT) Subject: [DAS2] DAS server responses cached In-Reply-To: References: Message-ID: I set up Apache's mod_cache module last night -- you should now get nearly instantaneous response for the common query types ( /region, /type, etc), as well as for other queries that have been previously issued. -Allen On Thu, 20 Oct 2005, Helt,Gregg wrote: > Thanks, whatever you did fixed the problem! > > gregg > > > -----Original Message----- > > From: Allen Day [mailto:allenday at ucla.edu] > > Sent: Monday, October 17, 2005 11:15 PM > > To: Helt,Gregg > > Cc: das2 at portal.open-bio.org > > Subject: RE: Problem with parent/child features in biopackages server > > feature response > > > > Okay, give it a shot now. You uncovered a bug in my SQL query where > the > > parent feature was outside your overlaps+inside ranges and wasn't > being > > retrieved, leading to an error. 
These two are now in my regression > test > > suite. > > > > -Allen > > > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > Thanks for the quick fix! > > > > > > Now I'm seeing another problem though. Some feature queries with a > > > combination of one "overlaps" and one "inside" filter are giving > weird > > > errors. But some return correctly. > > > > > > For example, > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > 104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA > > > returns correctly. > > > > > > But > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25 > > > 854314:26027736;inside=chr21/0:26027736;type=SO:mRNA > > > does not, instead returning these headers: > > > HTTP/1.1 200 OK > > > Date: Mon, 17 Oct 2005 22:40:54 GMT > > > Server: Apache/2.0.51 (Fedora) > > > X-DAS-Version: DAS/2.0 > > > X-DAS-Server: GMOD/0.0 > > > X-DAS-Content-Type: text/x-das-feature+xml > > > X-DAS-Status-Details: Died at > > > /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. > > > > > > and then what appears to be more headers but is designated as > content > > > (so I'm guessing a header-terminating blank line got inserted before > > > these): > > > X-DAS-Status: 500 > > > X-DAS-Status: 200 > > > Content-Length: 0 > > > Keep-Alive: timeout=15, max=100 > > > Connection: Keep-Alive > > > Content-Type: text/xml > > > > > > The combination of an "overlaps" and "inside" filter is an important > > > part of IGB's client-side query optimizations. I can turn this > > > optimization off for now, but I'm hoping you can diagnose and fix > the > > > problem. 
> > > > > > Thanks again, > > > Gregg > > > > > > > -----Original Message----- > > > > From: Allen Day [mailto:allenday at ucla.edu] > > > > Sent: Monday, October 17, 2005 2:50 PM > > > > To: Helt,Gregg > > > > Cc: das2 at portal.open-bio.org > > > > Subject: Re: Problem with parent/child features in biopackages > server > > > > feature response > > > > > > > > Hi Gregg, > > > > > > > > There was a logic inversion in my code. PARENT/PART relationships > > > should > > > > now be restored. > > > > > > > > Andrew, I'd also like to know if there is some code already > written > > > that > > > > can be written into my regression tests. > > > > > > > > I understand that it's really irritating to have non-fatal errors > of > > > > different types continuously appearing. Maybe I should spend some > > > time > > > > beefing up the regression test suite to do things like diff > feature > > > graphs > > > > and make sure they are identical? Something to discuss on the > > > conference > > > > call... > > > > > > > > -Allen > > > > > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > > > > > I finally got back to testing DAS/2 feature requests/responses > in my > > > IGB > > > > > client. I'm seeing a new problem in responses from the > biopackages > > > > > server, there are no or elements for features > at > > > all. > > > > > > > > > > See for example the returned XML from my standard test query: > > > > > > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > > > 027736:26068042;type=SO:mRNA > > > > > > > > > > ... > > > > > name="BC001178"> > > > > > > > > > > ... > > > > > > > > > > > > > > name="auto316726"> > > > > > > > > > > > > > > > ... > > > > > > > > > > But I know that "auto316726" should be a child of "BC001178". > > > > > > > > > > Andrew, would your DAS/2 validator catch problems like this? 
> > > > > > > > > > > > > > > Thanks, > > > > > Gregg > > > > > > > > > > > > > > From Steve_Chervitz at affymetrix.com Fri Oct 21 20:48:57 2005 From: Steve_Chervitz at affymetrix.com (Chervitz, Steve) Date: Fri, 21 Oct 2005 17:48:57 -0700 Subject: [DAS2] DAS/2 weekly meeting notes Message-ID: Notes from the weekly DAS/2 teleconference, 21 Oct 2005. $Id: das2-teleconf-2005-10-20.txt,v 1.1 2005/10/21 23:55:05 sac Exp $ * Gregg and Lincoln hashed out some details regarding specific aims for the DAS/2 continuation grant, refining these aims: - Client enhancements + includes client-side caching - Spec enhancements + caching (IF_MODIFIED_SINCE support) + dealing with lengthy server operations + extensions to stylesheet capabilities + embedding arbitrary snippets of XML (e.g., RDF) - IGB-Apollo interactions + Gregg has been doing some background research on HTTP and UDP protocols * Gregg plans to have a draft ready to circulate for review by the participating organizations by Tues next week (10/25). * DAS/2 meeting at CSHL Genome Informatics meeting: - Planned for 10/28 evening (7:30pm EDT) in Lincoln's office - Teleconf will be available * Proposal to move the weekly DAS/2 meeting to 9:00am PDT/PST Thursday so that UK folks can participate - Folks will check schedules to see if this works - Chime in if it doesn't work for you Status Reports -------------- Gregg: * Completed treeview of DAS/2 sources in IGB. See his recent post with GUI snapshot. This will be included by default in the next IGB release next week, to coincide with the CSHL GI meeting. * Invited to talk at CSHL GI meeting on genome viz. Good opportunity to present IGB. May accept the invitation if it's later during the meeting. * Is making an effort to respond to questions on the DAS/2 list. * IGB release planned for next week, coinciding with CSHL GI meeting. - Lots of new additions (DAS/2 support, new GUI preferences setting). Might be a 4.0 release.
- Access via Java Web Start - Users will be able to launch previous version of IGB, if they have problems with the new version. Discussion: - Ann Loraine puts in a strong plug for JWS-based application launching. It's just so convenient, even for someone who is comfortable building and running java code with command-line tools (but doesn't always have time to do so these days). - Gregg would like to set up an auto-build system for IGB code on SourceForge (or open-bio.org, if SF doesn't support this). This would allow users to download a pre-built jar. A JWS system could then be built on top of this if we want. Ed E: * Lots of IGB commits last week. - Can now set persistent preferences via GUI rather than pref file (bg colors, feature colors, number of levels of a given tier, show/hide prefs, etc.). Much nicer user experience. - No documentation yet. - Gregg adds: Deb Coulton (sp?) is now revising the IGB user guide. Targeting end of Nov for release. Steve C: * Working on improving the DAS/2 spec, resolving bugzilla issues, consistent wording, etc. - See http://biodas.org/documents/das2/das2_get.html - Be sure to force your browser to reload the page daily to make sure you have the latest version. - Will post notable changes to the DAS/2 list. - Will post any issues requiring broader discussion to list before modifying spec. - Gregg adds: See his response to Andreas on the list regarding how to deal with server agreement on what sequence coordinate system is being used. Do folks agree with Gregg's proposal? Allen D: * Bug fixes on biopackages.net DAS/2 server - A DAS2XML parser would help catch bugs in advance. Will ask Andrew about setting this up. * Discussed performance issues. - Feature request is as fast as it's going to get now (in terms of SQL optimization). - Has a SQUID proxy set up to help with relatively static data, but you must configure your client to use it. (also see Allen's post on 10/21 regarding Apache mod_cache module).
- Gregg adds: Would like to get support in spec for 'if modified since' tags. - Gregg: What about using a different schema underneath, such as GFF-db as used by gbrowse which is quite fast? - Allen: It's not fast for large segments. The DAS/2 server uses a chado schema with feature indexing optimizations by Allen. Its performance improves for big queries, but degrades for small queries. So the DAS/2 server's response is more flat over a range of query sizes relative to gbrowse. Possibly could improve the DAS/2 server by upgrading from Postgres 7.4 to 8.0, which has improvements on left joins. - Lincoln: How could gbrowse performance be improved? - Allen: Via partial indexing. Should improve for queries covering 200+ MB. From Steve_Chervitz at affymetrix.com Wed Oct 26 21:29:38 2005 From: Steve_Chervitz at affymetrix.com (Chervitz, Steve) Date: Wed, 26 Oct 2005 18:29:38 -0700 Subject: [DAS2] Spec issues Message-ID: In the spec for DAS/2 retrievals, there are some open issues regarding types and features that I'd like to solicit feedback on. This is kind of a long message, so feel free to pick and choose what you want to respond to. For reference, here's the latest retrieval spec: http://biodas.org/documents/das2/das2_get.html Type properties example (only showing relevant attributes): Description: A set of machine-readable configuration information as key/value pairs The spec currently describes the key attribute as "the name of the property. Elaborate on how to interpret the name". So how should name be interpreted? Can it be a URI/URL? If we want it to be just a simple string that can derive from some controlled vocabulary, how does one specify which vocabulary it derives from? (e.g., http://www.biodas.org/ns/das/properties/2.00) Also, we might want to allow some controlled vocabulary terms to be used for the value of type.source (e.g., "das:curated"), to ensure that different users use the same term to specify that a feature type is produced by curation.
The spec also seems alarmed by the existence of a xml:base attribute in the TYPE element. The idea is that any relative URL within this element would be resolved using that element's xml:base attribute. How would folks be with having the DAS/2 spec fully support the XML Base spec ( http://www.w3.org/TR/xmlbase/ )? The result of this would be to add an optional xml:base attribute to all elements that contain URLs or subelements with URLs. For an example of how this would work, in the above XML snippet, the absolute URL for TYPE.id would be http://www.wormbase.org/dase/genome/volvox/1/type/gene/curated_gene Next issue: Feature properties example (only showing relevant attributes): Description: Properties are typed using the ptype attribute. The value of the property may be indicated by a URL given by the href attribute, or may be given inline as the CDATA content of the section. 29 2 So in contrast to the TYPE properties which are restricted to being simple string-based key:value pairs, FEATURE properties can be more complex, which seems reasonable, given the wild world of features. We might consider using 'key' rather than 'ptype' for FEATURE properties, for consistency with TYPE prop elements (however, read on). In the feature filter section, the property-based filter describes feature properties as being string-based, a la TYPE properties. More complex feature properties would not necessarily be filterable, so this should be expanded upon, stating that property-based feature filters will only work for feature properties that are simple strings (not properties where the value is a URL or is a CDATA with MIME type not equal to text/plain). One issue that comes up here, which actually pertains to the spec as a whole, is that there are various attributes that are intended to be URLs but are named quite different things. In the FEATURE snippet above, there are four different attributes that are URLs: id, type, ptype, and href. 
There is a bugzilla entry requesting that all attributes named 'id' which are in fact URLs be named 'uri': http://bugzilla.open-bio.org/show_bug.cgi?id=1788 This seems like a good move to me, since it flags these attributes as resolvable. Does anyone have objections to this? For other attributes that are URLs but are not named 'id' or 'href' (such as type, ptype above), we could either leave as-is, or we could append '_uri' to their name to flag their resolvability. Feature's PROP.ptype is an interesting case, since it is both an identifier (equivalent to type PROP.key) and a URL for describing the property. For this reason, I would also propose either renaming it 'uri' (to capture this dual role) or 'key' (for consistency with type properties). Thoughts? The feature example DASXML above also shows a way to attach a protein translation to a feature as a property. Since this will be a common task, I'd vote for having a feature property of "das:property/protein_translation" among the list of built-in feature properties in the das namespace. Anyone want to take a stab at defining the full list of built-in properties within the "das:" and "bg:" namespaces? I think it's a key requirement for interoperability. Steve From Gregg_Helt at affymetrix.com Thu Oct 27 10:50:17 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 27 Oct 2005 07:50:17 -0700 Subject: [DAS2] DAS/2 meeting on Friday, no meeting today Message-ID: Since we're having a DAS/2 meeting at CSHL on Friday, I'm canceling our regular meeting today. I'll send out more details later about tomorrow's meeting, but it probably won't start till the evening (Eastern time). 
Thanks, Gregg From lstein at cshl.edu Thu Oct 27 11:14:55 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 27 Oct 2005 09:14:55 -0600 Subject: [DAS2] DAS/2 meeting on Friday, no meeting today In-Reply-To: References: Message-ID: <200510270914.55947.lstein@cshl.edu> Gregg, Isn't the DAS/2 meeting at CSHL scheduled for Saturday evening at 6? Room details will follow. Lincoln On Thursday 27 October 2005 08:50 am, Helt,Gregg wrote: > Since we're having a DAS/2 meeting at CSHL on Friday, I'm canceling our > regular meeting today. I'll send out more details later about > tomorrow's meeting, but it probably won't start till the evening > (Eastern time). > > Thanks, > Gregg > > > _______________________________________________ > DAS2 mailing list > DAS2 at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/das2 -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From lstein at cshl.edu Thu Oct 27 11:41:30 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 27 Oct 2005 09:41:30 -0600 Subject: [DAS2] Spec issues In-Reply-To: References: Message-ID: <200510270941.30528.lstein@cshl.edu> On Wednesday 26 October 2005 07:29 pm, Chervitz, Steve wrote: > In the spec for DAS/2 retrievals, there are some open issues regarding > types and features that I'd like to solicit feedback on. This is kind of a > long message, so feel free to pick and choose what you want to respond to. > > For reference, here's the latest retrieval spec: > http://biodas.org/documents/das2/das2_get.html > > Type properties example (only showing relevant attributes): > > Description: A set of machine-readable configuration information as > key/value pairs > > > ontology="http://song.sf.net/ontologies/sofa#gene" > source="curated" > xml:base="gene/"> > > > > The spec currently describes the key attribute as "the name of the > property. 
Elaborate on how to interpret the name". So how should name be > interpreted? Can it be a URI/URL? If we want it to be just a simple string > that can derive from some controlled vocabulary, how does one specify which > vocabulary it derives from? (e.g., > http://www.biodas.org/ns/das/properties/2.00) I thought that the "bg:" and "das:" were straight XML namespaces. I.e: Also, we might want to allow some controlled vocabulary terms to be used > for the value of type.source (e.g., "das:curated"), to ensure that > different users use the same term to specify that a feature type is > produced by curation. Same idea, but see below. > > The spec also seems alarmed by the existence of a xml:base attribute in the > TYPE element. The idea is that any relative URL within this element would > be resolved using that element's xml:base attribute. How would folks be > with having the DAS/2 spec fully support the XML Base spec ( > http://www.w3.org/TR/xmlbase/ )? The result of this would be to add an > optional xml:base attribute to all elements that contain URLs or > subelements with URLs. > For an example of how this would work, in the above XML snippet, the > absolute URL for TYPE.id would be > http://www.wormbase.org/dase/genome/volvox/1/type/gene/curated_gene I'm ok with this. > Next issue: Feature properties example (only showing relevant attributes): > > Description: Properties are typed using the ptype attribute. The value of > the property may be indicated by a URL given by the href attribute, or may > be given inline as the CDATA content of the section. > > > type="type/curated_exon"> > 29 > 2 > href="/das/protein/volvox/2/feature/CTEL54X.1" /> > > > > So in contrast to the TYPE properties which are restricted to being simple > string-based key:value pairs, FEATURE properties can be more complex, which > seems reasonable, given the wild world of features. 
We might consider using > 'key' rather than 'ptype' for FEATURE properties, for consistency with TYPE > prop elements (however, read on). I'm not so happy with "key" since it is nondescript. Originally this was "type" but the word collided with feature type. I am getting uncomfortable with the dichotomy we've (I've?) created between XML base keys/properties and namespace-based keys/properties. It seems nasty to have the ptype attribute be either a relative URI (property/genefinder-score), or a controlled vocabulary member (das:phase). Is there any reason we shouldn't choose one or the other? For example, does this work? xmlns:prop="http://www.wormbase.org/das/genome/volvox/1/property"> 29 2 This looks so much cleaner to me. > In the feature filter section, the property-based filter describes feature > properties as being string-based, a la TYPE properties. More complex > feature properties would not necessarily be filterable, so this should be > expanded upon, stating that property-based feature filters will only work > for feature properties that are simple strings (not properties where the > value is a URL or is a CDATA with MIME type not equal to text/plain). > > One issue that comes up here, which actually pertains to the spec as a > whole, is that there are various attributes that are intended to be URLs > but are named quite different things. In the FEATURE snippet above, there > are four different attributes that are URLs: id, type, ptype, and href. > There is a bugzilla entry requesting that all attributes named 'id' which > are in fact URLs be named 'uri': > http://bugzilla.open-bio.org/show_bug.cgi?id=1788 > This seems like a good move to me, since it flags these attributes as > resolvable. Does anyone have objections to this? Ok with me, but I'd like to hear what people think about throwing out the base idea entirely and using namespaces as described above. 
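For readers following the xml:base side of this thread, here is a minimal sketch of how a relative id or href would resolve under the XML Base rules Steve describes. It is only an illustration, not part of the DAS/2 spec: the enclosing document URL and the id value "curated_gene" are assumptions (the example XML was stripped from the archived message), and resolve() is a hypothetical helper built on Python's standard urljoin.

```python
from urllib.parse import urljoin


def resolve(document_url, xml_bases, relative_ref):
    """Resolve relative_ref against nested xml:base values, outermost first.

    document_url is the URL the XML was fetched from; xml_bases lists the
    xml:base attributes met on the way down to the element holding the ref.
    """
    base = document_url
    for xml_base in xml_bases:
        # Each xml:base is itself resolved against the base in effect so far.
        base = urljoin(base, xml_base)
    return urljoin(base, relative_ref)


# Assumed values for illustration only: the document URL and the TYPE id.
DOCUMENT_URL = "http://www.wormbase.org/das/genome/volvox/1/type/"

# A TYPE element carrying xml:base="gene/" and a relative id.
type_url = resolve(DOCUMENT_URL, ["gene/"], "curated_gene")
print(type_url)

# An href that starts with "/" ignores the path part of the base entirely.
protein_url = resolve(DOCUMENT_URL, [], "/das/protein/volvox/2/feature/CTEL54X.1")
print(protein_url)
```

Under these assumptions the first call yields the .../type/gene/curated_gene form given in Steve's message, which is the behaviour a client would need to implement if xml:base is kept rather than replaced by namespaces.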
> > For other attributes that are URLs but are not named 'id' or 'href' (such > as type, ptype above), we could either leave as-is, or we could append > '_uri' to their name to flag their resolvability. Feature's PROP.ptype is > an interesting case, since it is both an identifier (equivalent to type > PROP.key) and a URL for describing the property. For this reason, I would > also propose either renaming it 'uri' (to capture this dual role) or 'key' > (for consistency with type properties). Thoughts? > > The feature example DASXML above also shows a way to attach a protein > translation to a feature as a property. Since this will be a common task, > I'd vote for having a feature property of > "das:property/protein_translation" among the list of built-in feature > properties in the das namespace. Anyone want to take a stab at defining the > full list of built-in properties within the "das:" and "bg:" namespaces? I > think it's a key requirement for interoperability. Absolutely -- I'd like to start work on that. Lincoln > > Steve > > > _______________________________________________ > DAS2 mailing list > DAS2 at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/das2 -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From lstein at cshl.edu Fri Oct 28 15:58:30 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Fri, 28 Oct 2005 13:58:30 -0600 Subject: [DAS2] Revise meeting time In-Reply-To: References: Message-ID: <200510281558.31204.lstein@cshl.edu> Hi, I totally missed the fact that there's a concert at 6 pm Saturday. Would people prefer to meeting over lunch tomorrow after the end of the morning session? Lincoln -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From Gregg_Helt at affymetrix.com Fri Oct 28 16:20:06 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Fri, 28 Oct 2005 13:20:06 -0700 Subject: [DAS2] RE: Revise meeting time Message-ID: Either time works for me. gregg > -----Original Message----- > From: Lincoln Stein [mailto:lstein at cshl.edu] > Sent: Friday, October 28, 2005 3:59 PM > To: das2 at portal.open-bio.org > Cc: Chervitz, Steve; Helt,Gregg; michelse at cshl.edu > Subject: Revise meeting time > > Hi, > > I totally missed the fact that there's a concert at 6 pm Saturday. Would > people prefer to meeting over lunch tomorrow after the end of the morning > session? > > Lincoln > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu
> > gregg > > > From suzi at fruitfly.org Mon Oct 17 14:56:02 2005 From: suzi at fruitfly.org (Suzanna Lewis) Date: Mon, 17 Oct 2005 07:56:02 -0700 Subject: [DAS2] missing conference call Message-ID: <9d630486cf99f871bffa2550bc38a23b@fruitfly.org> I won't be able to be on the call this week. Gregg. I hope we can talk tomorrow or Wednesday. -S From Gregg_Helt at affymetrix.com Mon Oct 17 19:41:43 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 17 Oct 2005 12:41:43 -0700 Subject: [DAS2] Problem with parent/child features in biopackages server feature response Message-ID: I finally got back to testing DAS/2 feature requests/responses in my IGB client. I'm seeing a new problem in responses from the biopackages server, there are no or elements for features at all. See for example the returned XML from my standard test query: http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 027736:26068042;type=SO:mRNA ... ... ... But I know that "auto316726" should be a child of "BC001178". Andrew, would your DAS/2 validator catch problems like this? Thanks, Gregg From allenday at ucla.edu Mon Oct 17 21:50:14 2005 From: allenday at ucla.edu (Allen Day) Date: Mon, 17 Oct 2005 14:50:14 -0700 (PDT) Subject: [DAS2] Re: Problem with parent/child features in biopackages server feature response In-Reply-To: References: Message-ID: Hi Gregg, There was a logic inversion in my code. PARENT/PART relationships should now be restored. Andrew, I'd also like to know if there is some code already written that can be written into my regression tests. I understand that it's really irritating to have non-fatal errors of different types continuously appearing. Maybe I should spend some time beefing up the regression test suite to do things like diff feature graphs and make sure they are identical? Something to discuss on the conference call... 
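As a rough illustration of the "diff feature graphs" idea above (this is not the biopackages test code; the dict-of-parents layout and the function names are assumptions made only for this sketch, and only the two feature IDs come from Gregg's report), a regression check could compare parent/child edges between a known-good response and a fresh one, and also flag features that list themselves as a parent:

```python
# Hypothetical sketch: each feature graph is a plain dict mapping a
# feature id to the list of its parent ids (already parsed from a response).

def feature_graph(features):
    """Return the set of (parent, child) edges in a {child: [parents]} mapping."""
    return {(parent, child)
            for child, parents in features.items()
            for parent in parents}

def self_parented(features):
    """Return the ids of features that list themselves as a parent."""
    return sorted(fid for fid, parents in features.items() if fid in parents)

def diff_graphs(expected, actual):
    """Report edges that disappeared and edges that newly appeared."""
    exp, act = feature_graph(expected), feature_graph(actual)
    return {"missing": sorted(exp - act), "unexpected": sorted(act - exp)}

# Known-good graph: auto316726 is a child of BC001178.
expected = {"BC001178": [], "auto316726": ["BC001178"]}
# A broken response in which the child points at itself.
broken = {"BC001178": [], "auto316726": ["auto316726"]}

if __name__ == "__main__":
    print(self_parented(broken))
    print(diff_graphs(expected, broken))
```

Comparing edge sets rather than raw XML keeps such a check insensitive to element order or whitespace, so only genuine changes in parentage would make it fail.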
-Allen On Mon, 17 Oct 2005, Helt,Gregg wrote: > I finally got back to testing DAS/2 feature requests/responses in my IGB > client. I'm seeing a new problem in responses from the biopackages > server, there are no or elements for features at all. > > See for example the returned XML from my standard test query: > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > 027736:26068042;type=SO:mRNA > > ... > > > ... > > name="auto316726"> > > > ... > > But I know that "auto316726" should be a child of "BC001178". > > Andrew, would your DAS/2 validator catch problems like this? > > > Thanks, > Gregg > > From Gregg_Helt at affymetrix.com Tue Oct 18 03:36:47 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 17 Oct 2005 20:36:47 -0700 Subject: [DAS2] RE: Problem with parent/child features in biopackages server feature response Message-ID: Thanks for the quick fix! Now I'm seeing another problem though. Some feature queries with a combination of one "overlaps" and one "inside" filter are giving weird errors. But some return correctly. For example, http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA returns correctly. But http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25 854314:26027736;inside=chr21/0:26027736;type=SO:mRNA does not, instead returning these headers: HTTP/1.1 200 OK Date: Mon, 17 Oct 2005 22:40:54 GMT Server: Apache/2.0.51 (Fedora) X-DAS-Version: DAS/2.0 X-DAS-Server: GMOD/0.0 X-DAS-Content-Type: text/x-das-feature+xml X-DAS-Status-Details: Died at /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. 
and then what appears to be more headers but is designated as content (so I'm guessing a header-terminating blank line got inserted before these): X-DAS-Status: 500 X-DAS-Status: 200 Content-Length: 0 Keep-Alive: timeout=15, max=100 Connection: Keep-Alive Content-Type: text/xml The combination of an "overlaps" and "inside" filter is an important part of IGB's client-side query optimizations. I can turn this optimization off for now, but I'm hoping you can diagnose and fix the problem. Thanks again, Gregg > -----Original Message----- > From: Allen Day [mailto:allenday at ucla.edu] > Sent: Monday, October 17, 2005 2:50 PM > To: Helt,Gregg > Cc: das2 at portal.open-bio.org > Subject: Re: Problem with parent/child features in biopackages server > feature response > > Hi Gregg, > > There was a logic inversion in my code. PARENT/PART relationships should > now be restored. > > Andrew, I'd also like to know if there is some code already written that > can be written into my regression tests. > > I understand that it's really irritating to have non-fatal errors of > different types continuously appearing. Maybe I should spend some time > beefing up the regression test suite to do things like diff feature graphs > and make sure they are identical? Something to discuss on the conference > call... > > -Allen > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > I finally got back to testing DAS/2 feature requests/responses in my IGB > > client. I'm seeing a new problem in responses from the biopackages > > server, there are no or elements for features at all. > > > > See for example the returned XML from my standard test query: > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > 027736:26068042;type=SO:mRNA > > > > ... > > > > > > ... > > > > > name="auto316726"> > > > > > > ... > > > > But I know that "auto316726" should be a child of "BC001178". > > > > Andrew, would your DAS/2 validator catch problems like this? 
> > > > > > Thanks, > > Gregg > > > > From allenday at ucla.edu Tue Oct 18 06:15:18 2005 From: allenday at ucla.edu (Allen Day) Date: Mon, 17 Oct 2005 23:15:18 -0700 (PDT) Subject: [DAS2] RE: Problem with parent/child features in biopackages server feature response In-Reply-To: References: Message-ID: Okay, give it a shot now. You uncovered a bug in my SQL query where the parent feature was outside your overlaps+inside ranges and wasn't being retrieved, leading to an error. These two are now in my regression test suite. -Allen On Mon, 17 Oct 2005, Helt,Gregg wrote: > Thanks for the quick fix! > > Now I'm seeing another problem though. Some feature queries with a > combination of one "overlaps" and one "inside" filter are giving weird > errors. But some return correctly. > > For example, > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > 104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA > returns correctly. > > But > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25 > 854314:26027736;inside=chr21/0:26027736;type=SO:mRNA > does not, instead returning these headers: > HTTP/1.1 200 OK > Date: Mon, 17 Oct 2005 22:40:54 GMT > Server: Apache/2.0.51 (Fedora) > X-DAS-Version: DAS/2.0 > X-DAS-Server: GMOD/0.0 > X-DAS-Content-Type: text/x-das-feature+xml > X-DAS-Status-Details: Died at > /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. > > and then what appears to be more headers but is designated as content > (so I'm guessing a header-terminating blank line got inserted before > these): > X-DAS-Status: 500 > X-DAS-Status: 200 > Content-Length: 0 > Keep-Alive: timeout=15, max=100 > Connection: Keep-Alive > Content-Type: text/xml > > The combination of an "overlaps" and "inside" filter is an important > part of IGB's client-side query optimizations. I can turn this > optimization off for now, but I'm hoping you can diagnose and fix the > problem. 
> > Thanks again, > Gregg > > > -----Original Message----- > > From: Allen Day [mailto:allenday at ucla.edu] > > Sent: Monday, October 17, 2005 2:50 PM > > To: Helt,Gregg > > Cc: das2 at portal.open-bio.org > > Subject: Re: Problem with parent/child features in biopackages server > > feature response > > > > Hi Gregg, > > > > There was a logic inversion in my code. PARENT/PART relationships > should > > now be restored. > > > > Andrew, I'd also like to know if there is some code already written > that > > can be written into my regression tests. > > > > I understand that it's really irritating to have non-fatal errors of > > different types continuously appearing. Maybe I should spend some > time > > beefing up the regression test suite to do things like diff feature > graphs > > and make sure they are identical? Something to discuss on the > conference > > call... > > > > -Allen > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > I finally got back to testing DAS/2 feature requests/responses in my > IGB > > > client. I'm seeing a new problem in responses from the biopackages > > > server, there are no or elements for features at > all. > > > > > > See for example the returned XML from my standard test query: > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > 027736:26068042;type=SO:mRNA > > > > > > ... > > > > > > > > > ... > > > > > > > > name="auto316726"> > > > > > > > > > ... > > > > > > But I know that "auto316726" should be a child of "BC001178". > > > > > > Andrew, would your DAS/2 validator catch problems like this? 
> > > > > > > > > Thanks, > > > Gregg > > > > > > > From dalke at dalkescientific.com Tue Oct 18 20:33:30 2005 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 18 Oct 2005 22:33:30 +0200 Subject: [DAS2] Sanger/EBI trip report Message-ID: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com> I visited EBI and Sanger last week to talk with the people there about their use of DAS, the ongoing work with the DAS/2 spec, and the future directions, including structure DAS. One meeting was with Andreas, the other Andreas (there are too many Andre* in the UK - I think I need to change my name), Eugene and Stefan. Andreas has a service registry system. I don't know where it is though. The registration includes metadata about the server. I would like some way for the DAS/2 server to provide the metadata so the registry could extract most of what it needs by querying the base server. As Andreas pointed out, that data could be wrong or incomplete so the registry could override it. I mentioned the idea that the DAS/2 spec as it is now lets the registry server provide the top-level das/genome XML and is free to point clients to the real databases. This is one of the advantages of a ReST architecture. An interesting thing I learned was the wide use of stylesheets. There are about 15 stylesheet types in use on the campus, and Ensembl uses a version which is not quite compatible. Andreas Prlic pointed out that the stylesheet needs extensions for 3D because the annotation styles are different than for a 2D plot. Thomas Down apparently has a version which puts a color scale on a field, e.g., so that better scores are shown differently from worse scores. Stylesheets also came up when talking with Ed (or is that the other Ed :) and Roy. They are developing zmap, a replacement for fmap. It's a C app (gtk-based using the FooCanvas to display huge numbers of elements) designed to speak the same xremote API as fmap.
They want annotations which can be individually annotatable, that is, annotated on more than the type. The example they gave was using three tracks - annotation, transcript and homology. They want to copy from the latter two into the first track and preserve the original color and style. Sadly, that's what my notes say, but I don't understand it from there. What I took from it was the need to have different ways to determine a style for an annotation, like on a per-track or perhaps per-annotation mechanism. The obvious one which comes to mind, which we talked about as a possibility, was to take ideas from CSS. Ed (I think) asked about how to handle assembly data. I pointed out the section in the spec which says it can be fetched by asking for it in BED format. He wanted to know more about how to know if a given element was a clone or a transcript. At this point I said he needed to ask a real scientist. :) James Gilbert also came by during the discussion. He asked about how we deal with hierarchical features, and wanted to know more about how our data model fits with the one in Otter. http://www.sanger.ac.uk/Users/jgrg/otter_xml.html I don't know the answer to that question. In both meetings people liked that we refrained from making new XML for everything, using "format=" instead. Andreas et al. asked about computational services which might take a non-trivial time. I mentioned the solution we talked about during BOSC where the server returns a "202 Accepted" and a bit of XML saying "you can check on the status at this URL but it'll probably take about 5 minutes to figure out." The client should be able to ask the server to halt the computation. In general there was a good reception to the use of the "format=" parameter, instead of making new XML formats. It does look like we need to spend more time on the format extensibility. It seems much of what the UK folks do is based on extending DAS/1 in various ways. DAS/2 doesn't and cannot capture all of them.
I've been looking at the Atom spec. http://www.intertwingly.net/wiki/pie/RestEchoApiDiscuss http://atompub.org/2005/07/11/draft-ietf-atompub-format-10.html It has a very nice way to embed data in the atom:content field, where the data can be inline text, html, xml, or "other", or be a link to an external href. Along those lines, I think the Atom publication protocol has some nice ideas to help with the writeback spec. Ed described the locking model that they use. It's unchanged since last year's discussion. The annotators decide on who gets a region, which is locked for that person. In their case it's exported into a local AceDB instance, edited via fmap. When done, that database (as a whole) is sent back to the main database for integration. The region is locked, preventing resolution conflicts. Andreas et al. mentioned an interesting annotation - annotate a region to say it's been looked at but there are no annotations for the region. "This region intentionally left blank." I talked as well with Tony, mostly on organization issues. One of the things he said they might want to do in the future is a 2D image DAS. I brought up the idea of having a DAS sprint - once the spec w/ writeback starts to congeal, get the implementers together in a room for a few days and work on code, then use the experience to improve the spec. Keepin' it real. I talked about some of the disconnect between the DAS/2 dev folks (all in the US) and the UK folks. The phone conference call is at 8pm UK time and rather little of what we talk about gets written up. When the UK people ask questions (cf. James Gilbert's question "Nested features?" from Sept. 28, 2005) there's no response. Similarly, the DAS/1 extensions in the UK aren't written down so it's hard to know what's useful for DAS/2. My being in Europe for the next few months should help a bit with that, and I've always had a wacky schedule anyway so I'll be in on the conf. calls (now that I'm back to easily available broadband).
But I'm not enough of a domain expert to be able to answer or address the scientific points. I've missed a few things so if anyone else here wants to, feel free to add comments. Andrew dalke at dalkescientific.com From ak at ebi.ac.uk Tue Oct 18 22:39:30 2005 From: ak at ebi.ac.uk (Andreas Kahari) Date: Tue, 18 Oct 2005 23:39:30 +0100 Subject: [DAS2] Sanger/EBI trip report In-Reply-To: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com> References: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com> Message-ID: <20051018223930.GC6854@ebi.ac.uk> On Tue, Oct 18, 2005 at 10:33:30PM +0200, Andrew Dalke wrote: > I visited EBI and Sanger last week to talk with the > people there about their use of DAS, the ongoing work > with the DAS/2 spec, and the future directions, including > structure DAS. > > One meeting was with Andreas, the other Andreas (there are > too many Andre* in the UK - I think I need to change my > name), Eugene and Stefan. Hi, I'm one of them Andreases. In this particular email I'm just commenting on a very small number of things that Andrew is writing. > Andreas has a service registry system. I don't know where That's Andreas Prlic, the other one (depending on your point of view), not me. > it is though. The registration includes metadata about the It is at http://das.sanger.ac.uk/registry/ (final slash is essential, it seems) > server. I would like some way for the DAS/2 server to > provide the metadata so the registry could extract most > of what it needs by querying the base server. As Andreas > pointed out, that data could wrong or incomplete so the > registry could override it. I mentioned the idea that > the DAS/2 spec as it is now lets the registry server > provide the top-level das/genome XML and is free to point > clients to the real databases. This is one of the > advantages of a ReST architecture. > > > An interesting thing I learned was the wide use of stylesheets. 
> There are about 15 stylesheet types in use on the campus, > and Ensemble uses a version which is not-quite compatible. > Andreas Prlic pointed out that the stylesheet needs extensions > for 3D because the annotation styles are different than for > a 2D plot. Thomas Down apparently has a version which puts > a color scale on a field, eg, so that better scores are shown > differently from worse scores. > > Stylesheets also came up when talking with Ed (or is that the > other Ed :) and Roy. They are developing zmap, a replacement > for fmap. It's a C app (gtk-based using the FooCanvas to > display huge numbers of elements) designed to speak the same > xremote API as fmap. They want annotations which can be > individually annotatable, that is, annotated on more than the type. > > The example they gave was using three tracks - annotation, > transcript and homology. They want to copy from the later two > into the first track and preserve the original color and style. > Sadly, that's what my notes say, but I don't understand it from > there. What I took from it was the need to have different > ways to determine a style for an annotation, like on a > pre-track or perhaps per-annotation mechanism. > > The obvious one which comes to mind, which we talked about > as a possibility, was to take ideas from CSS. > > Ed (I think) asked about how to handle assembly data. > I pointed out the section in the spec which says it can be > fetched by asking for it in BED format. He wanted to > know more about how to know if a given element was a clone > or a transcript. At this point I said he needed to ask > a real scientist. :) > > James Gilbert also came by during the discussion. He > asked about how we deal with hierarchical features, and > wanted to know more about how our data model fits with > the one in Otter. > http://www.sanger.ac.uk/Users/jgrg/otter_xml.html > I don't know the answer to that question. 
> > In both meetings people like that we refrained from making > new XML for everything, using "format=" instead. > > Andreas et al. asked about computational services which > might take a non-trivial time. I mentioned the solution > we talked about during BOSC where the server returns a > "202 Accepted" and a bit of XML saying "you can check on > the status at this URL but it'll probably take about 5 > minutes to figure out." The client should be able to > ask the server to halt the computation. This is related to something that mainly Tom Oinn here at the EBI has been working on: Distributed Annotation with Lazily Evaluated Computation (DALEC), a kind of DAS frontend to Taverna workflows. http://taverna.sourceforge.net/projects/dalec/ > In general there was a good reception to the use of the > "format=" parameter, instead of making new XML formats. > > It does look like we need to spend more time on the > format extensibility. It seems much of what the UK folks > do is based on extending DAS/1 in various ways. DAS/2 > doesn't and cannot capture all of them. I've been looking > at the ATOM spec. > http://www.intertwingly.net/wiki/pie/RestEchoApiDiscuss > http://atompub.org/2005/07/11/draft-ietf-atompub-format-10.html I need to read this. > It has a very nice way to embed data in the atom:content > field, where the data can be inline text, html, xml, or > "other", or be a link to an external href. > > Along those lines, I think the Atom publication protocol > has some nice ideas to help with the writeback spec. > > Ed described the locking model that they use. It's > unchanged since last year's dicussion. The annotators > decide on who gets a region, which is locked for that > person. In their case it's exported into a local AceDB > instance, edited via fmap. When done that database (as > a whole) is sent back to the main database for integration. > The region is locked, preventing resolution conflicts. > > Andreas et. 
al mentioned an interesting annotation - annotate > a region to say it's been looked at but there are no > annotations for the region. "This region intentionally > left blank." Yes, this was something that confused me at first but that makes perfect sense to me now. Groups sometimes need to say they've looked at a region (protein/gene/whatever) because the fact that they are explicitly not annotating something is as much an annotation as actually annotating something with a box. Covering the region with an annotation saying "there's nothing here" does not seem quite right to me. > I talked as well with Tony, mostly on organization issues. > One of the things he said they might want to do in the > future is a 2D image DAS. > > I brought up the idea of having a DAS sprint - once the > spec w/ writeback starts to congeal, get the implementers > together in a room for a few days and work on code, then > use the experience to improve the spec. Keepin' it real. > > I talked about some of the disconnect between the DAS/2 > dev folks (all in the US) and the UK folks. The phone > conference call is at 8pm UK time and rather little of > what we talk about gets written up. When the UK people > ask questions (cf. James Gilbert's question "Nested features?" > from Sept. 28, 2005) there's no response. Similarly, > the DAS/1 extensions in the UK aren't written down so This is not *quite* true. The alignment and structure extensions to DAS/1 by Andreas Prlic are well documented here: http://www.efamily.org.uk/xml/das/documentation/ > it's hard to know what's useful for DAS/2. My being in > Europe for the next few months should help a bit with > that, and I've always had a wacky schedule anyway so I'll > be in on the conf. calls (now that I'm back to easily > available broadband). But I'm not enough of a domain > expert to be able to answer or address the scientific > points. 
> Me and Stefan enjoyed Andrew's visit and would certainly like to see some sort of dialogue or collaboration or whatever may help getting further with specifying and implementing DAS/2. > > I've missed a few things so if anyone else here wants > to, feel free to add comments. > > Andrew > dalke at dalkescientific.com Regards, Andreas -- Andreas Kähäri EMBL-EBI/ensembl ------{ www.embl.org }----{ www.ebi.ac.uk }----{ www.ensembl.org }------ From ap3 at sanger.ac.uk Wed Oct 19 08:56:25 2005 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Wed, 19 Oct 2005 09:56:25 +0100 Subject: [DAS2] Sanger/EBI trip report In-Reply-To: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com> References: <2c2048500487fb8e44532a89eee83c7d@dalkescientific.com> Message-ID: <9d0b4d6d8bde3942d590b5c1aa78efd2@sanger.ac.uk> Hi Andrew! Thanks for the very good summary. I wanted to comment on the DAS registry we have here at http://das.sanger.ac.uk/registry/ So far there has not been a mechanism by which DAS clients can programmatically discover available DAS sources, and that can be shared between different DAS clients. This registry addresses that issue (among others). One thing that was necessary to do for this was to provide "coordinate systems" that describe the data that has been annotated. E.g. Ensembl can project data from different coordinate systems into one display, so it is important to know which coordinate system the data is being served in. Also, some coordinate systems are not supported and therefore such DAS sources have to be ignored. As I understood from our discussion, there is no convention on globally unique coordinate systems in DAS2 so far, but I think that would be a nice feature to have.
There is some documentation on the registry available at http://das.sanger.ac.uk/registry/help_index.jsp Greetings, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Wed Oct 19 18:18:25 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Wed, 19 Oct 2005 11:18:25 -0700 Subject: [DAS2] Sanger/EBI trip report Message-ID: > -----Original Message----- > From: das2-bounces at portal.open-bio.org [mailto:das2-bounces at portal.open- > bio.org] On Behalf Of Andrew Dalke > Sent: Tuesday, October 18, 2005 1:34 PM > To: DAS/2 > Subject: [DAS2] Sanger/EBI trip report > ... > I talked about some of the disconnect between the DAS/2 > dev folks (all in the US) and the UK folks. The phone > conference call is at 8pm UK time and rather little of > what we talk about gets written up. When the UK people > ask questions (cf. James Gilbert's question "Nested features?" > from Sept. 28, 2005) there's no response. Similarly, > the DAS/1 extensions in the UK aren't written down so > it's hard to know what's useful for DAS/2. My being in > Europe for the next few months should help a bit with > that, and I've always had a wacky schedule anyway so I'll > be in on the conf. calls (now that I'm back to easily > available broadband). But I'm not enough of a domain > expert to be able to answer or address the scientific > points. ... As Andrew mentioned, I've been hosting a weekly DAS/2 conference call over here in the US, every Thursday at 12 noon Pacific time. It'd be great if Sanger/EBI folk could join in. Would moving it to 9 AM Pacific (5 PM UK) help? And I'll see about getting someone to take notes for the meetings and posting to the list. I agree we definitely need to be more active about responding to questions on the DAS/2 mailing list. There's plenty to talk about... 
Thanks, Gregg Helt From Gregg_Helt at affymetrix.com Wed Oct 19 19:05:06 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Wed, 19 Oct 2005 12:05:06 -0700 Subject: [DAS2] Sanger/EBI trip report Message-ID: > -----Original Message----- > From: das2-bounces at portal.open-bio.org [mailto:das2-bounces at portal.open- > bio.org] On Behalf Of Andreas Prlic > Sent: Wednesday, October 19, 2005 1:56 AM > To: Andrew Dalke > Cc: DAS/2 > Subject: Re: [DAS2] Sanger/EBI trip report > > Hi Andrew! > > Thanks for the very good summary. > > I wanted to comment on the DAS -registry we are having here at > http://das.sanger.ac.uk/registry/ > > So far there has not been a mechanism how DAS clients can > programmatically discover available DAS sources, that can be > shared between different DAS clients. This registry addresses > that issue (among others). As part of the work under the DAS/2 grant, we've had several meetings trying to sketch out a strategy for registry/discovery of DAS/2 servers. Andreas presented his DAS registry at one of these back in September 2004. Looks very nice. There's still a lot of differing opinions about the best way to do this though. Maybe in November we can devote at least one of the DAS/2 teleconference calls to revisiting registry/discovery strategies, if Andreas could join in. > One thing that was necessary to do for this, was to provide > "coordinate systems", that describe the data that has been > annotated. E.g. Ensembl can project data from different > coordinate systems into one display so it is important to know > what the data is being served in. - also some coordinate > systems are not supported and therefore such DAS sources have > to be ignored. > > What I understood from our discussion there is no convention > on globally unique coordinate systems in DAS2 so far, but I think > that would be a nice feature to have. 
> > There is some documentation on the registry available at > http://das.sanger.ac.uk/registry/help_index.jsp > > Greetings, > Andreas > As far as globally unique coordinate systems, the current plan in DAS/2 is that this should be indicated as a URI id attribute in an element for the versioned source. For example, as part of the response to http://server/das/genome: and allowing multiple tags to indicate different URIs for the same genome assembly. Preferably the assembly id would be a URI controlled by the institution that performed the assembly, and be resolvable to some meaningful information about that assembly. In practice that might not be possible (for example the above snippet refers to a download page at UCSC but the assembly was done at NCBI). gregg From Gregg_Helt at affymetrix.com Thu Oct 20 17:39:23 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 20 Oct 2005 10:39:23 -0700 Subject: [DAS2] RE: Problem with parent/child features in biopackages server feature response Message-ID: Thanks, whatever you did fixed the problem! gregg > -----Original Message----- > From: Allen Day [mailto:allenday at ucla.edu] > Sent: Monday, October 17, 2005 11:15 PM > To: Helt,Gregg > Cc: das2 at portal.open-bio.org > Subject: RE: Problem with parent/child features in biopackages server > feature response > > Okay, give it a shot now. You uncovered a bug in my SQL query where the > parent feature was outside your overlaps+inside ranges and wasn't being > retrieved, leading to an error. These two are now in my regression test > suite. > > -Allen > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > Thanks for the quick fix! > > > > Now I'm seeing another problem though. Some feature queries with a > > combination of one "overlaps" and one "inside" filter are giving weird > > errors. But some return correctly. 
> > > > For example, > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > 104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA > > returns correctly. > > > > But > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25 > > 854314:26027736;inside=chr21/0:26027736;type=SO:mRNA > > does not, instead returning these headers: > > HTTP/1.1 200 OK > > Date: Mon, 17 Oct 2005 22:40:54 GMT > > Server: Apache/2.0.51 (Fedora) > > X-DAS-Version: DAS/2.0 > > X-DAS-Server: GMOD/0.0 > > X-DAS-Content-Type: text/x-das-feature+xml > > X-DAS-Status-Details: Died at > > /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. > > > > and then what appears to be more headers but is designated as content > > (so I'm guessing a header-terminating blank line got inserted before > > these): > > X-DAS-Status: 500 > > X-DAS-Status: 200 > > Content-Length: 0 > > Keep-Alive: timeout=15, max=100 > > Connection: Keep-Alive > > Content-Type: text/xml > > > > The combination of an "overlaps" and "inside" filter is an important > > part of IGB's client-side query optimizations. I can turn this > > optimization off for now, but I'm hoping you can diagnose and fix the > > problem. > > > > Thanks again, > > Gregg > > > > > -----Original Message----- > > > From: Allen Day [mailto:allenday at ucla.edu] > > > Sent: Monday, October 17, 2005 2:50 PM > > > To: Helt,Gregg > > > Cc: das2 at portal.open-bio.org > > > Subject: Re: Problem with parent/child features in biopackages server > > > feature response > > > > > > Hi Gregg, > > > > > > There was a logic inversion in my code. PARENT/PART relationships > > should > > > now be restored. > > > > > > Andrew, I'd also like to know if there is some code already written > > that > > > can be written into my regression tests. > > > > > > I understand that it's really irritating to have non-fatal errors of > > > different types continuously appearing. 
Maybe I should spend some > > time > > > beefing up the regression test suite to do things like diff feature > > graphs > > > and make sure they are identical? Something to discuss on the > > conference > > > call... > > > > > > -Allen > > > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > > > I finally got back to testing DAS/2 feature requests/responses in my > > IGB > > > > client. I'm seeing a new problem in responses from the biopackages > > > > server, there are no or elements for features at > > all. > > > > > > > > See for example the returned XML from my standard test query: > > > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > > 027736:26068042;type=SO:mRNA > > > > > > > > ... > > > > > > > > > > > > ... > > > > > > > > > > > name="auto316726"> > > > > > > > > > > > > ... > > > > > > > > But I know that "auto316726" should be a child of "BC001178". > > > > > > > > Andrew, would your DAS/2 validator catch problems like this? > > > > > > > > > > > > Thanks, > > > > Gregg > > > > > > > > > > From Gregg_Helt at affymetrix.com Thu Oct 20 18:07:18 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 20 Oct 2005 11:07:18 -0700 Subject: [DAS2] (no subject) Message-ID: I've been revising the DAS/2 GUI in IGB to use a tree view for looking at server / source / version hierarchy. I've attached a screen shot. It's still a little clunky to overlay data from multiple sources (DAS/2, DAS/1, quickload), but I'm hoping to make that smoother in time for the Genome Informatics meeting next week. gregg -------------- next part -------------- A non-text attachment was scrubbed... 
Name: igb_das2_example2.JPG Type: image/jpeg Size: 159366 bytes Desc: igb_das2_example2.JPG URL: From lstein at cshl.edu Thu Oct 20 20:23:02 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 20 Oct 2005 16:23:02 -0400 Subject: [DAS2] (no subject) In-Reply-To: References: Message-ID: <200510201623.03566.lstein@cshl.edu> That's looking pretty good. Lincoln On Thursday 20 October 2005 02:07 pm, Helt,Gregg wrote: > I've been revising the DAS/2 GUI in IGB to use a tree view for looking > at server / source / version hierarchy. I've attached a screen shot. > It's still a little clunky to overlay data from multiple sources (DAS/2, > DAS/1, quickload), but I'm hoping to make that smoother in time for the > Genome Informatics meeting next week. > > gregg -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From suzi at fruitfly.org Fri Oct 21 02:41:10 2005 From: suzi at fruitfly.org (Suzanna Lewis) Date: Thu, 20 Oct 2005 22:41:10 -0400 Subject: [DAS2] (no subject) In-Reply-To: <200510201623.03566.lstein@cshl.edu> References: <200510201623.03566.lstein@cshl.edu> Message-ID: <0d95651c0e3be0c941dae38a5185166a@fruitfly.org> Lincoln, See you tomorrow I hope? We should strongly encourage Jason and Zhirong to have Gregg talk. -S On Oct 20, 2005, at 4:23 PM, Lincoln Stein wrote: > That's looking pretty good. > > Lincoln > > On Thursday 20 October 2005 02:07 pm, Helt,Gregg wrote: >> I've been revising the DAS/2 GUI in IGB to use a tree view for looking >> at server / source / version hierarchy. I've attached a screen shot. >> It's still a little clunky to overlay data from multiple sources >> (DAS/2, >> DAS/1, quickload), but I'm hoping to make that smoother in time for >> the >> Genome Informatics meeting next week. >> >> gregg > > -- > Lincoln D. 
Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu > _______________________________________________ > DAS2 mailing list > DAS2 at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/das2 From allenday at ucla.edu Fri Oct 21 17:48:54 2005 From: allenday at ucla.edu (Allen Day) Date: Fri, 21 Oct 2005 10:48:54 -0700 (PDT) Subject: [DAS2] DAS server responses cached In-Reply-To: References: Message-ID: I set up Apache's mod_cache module last night -- you should now get nearly instantaneous response for the common query types ( /region, /type, etc), as well as for other queries that have been previously issued. -Allen On Thu, 20 Oct 2005, Helt,Gregg wrote: > Thanks, whatever you did fixed the problem! > > gregg > > > -----Original Message----- > > From: Allen Day [mailto:allenday at ucla.edu] > > Sent: Monday, October 17, 2005 11:15 PM > > To: Helt,Gregg > > Cc: das2 at portal.open-bio.org > > Subject: RE: Problem with parent/child features in biopackages server > > feature response > > > > Okay, give it a shot now. You uncovered a bug in my SQL query where > the > > parent feature was outside your overlaps+inside ranges and wasn't > being > > retrieved, leading to an error. These two are now in my regression > test > > suite. > > > > -Allen > > > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > Thanks for the quick fix! > > > > > > Now I'm seeing another problem though. Some feature queries with a > > > combination of one "overlaps" and one "inside" filter are giving > weird > > > errors. But some return correctly. > > > > > > For example, > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > 104961:26271647;inside=chr21/26104961:46976097;type=SO:mRNA > > > returns correctly. 
> > > > > > But > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/25 > > > 854314:26027736;inside=chr21/0:26027736;type=SO:mRNA > > > does not, instead returning these headers: > > > HTTP/1.1 200 OK > > > Date: Mon, 17 Oct 2005 22:40:54 GMT > > > Server: Apache/2.0.51 (Fedora) > > > X-DAS-Version: DAS/2.0 > > > X-DAS-Server: GMOD/0.0 > > > X-DAS-Content-Type: text/x-das-feature+xml > > > X-DAS-Status-Details: Died at > > > /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. > > > > > > and then what appears to be more headers but is designated as > content > > > (so I'm guessing a header-terminating blank line got inserted before > > > these): > > > X-DAS-Status: 500 > > > X-DAS-Status: 200 > > > Content-Length: 0 > > > Keep-Alive: timeout=15, max=100 > > > Connection: Keep-Alive > > > Content-Type: text/xml > > > > > > The combination of an "overlaps" and "inside" filter is an important > > > part of IGB's client-side query optimizations. I can turn this > > > optimization off for now, but I'm hoping you can diagnose and fix > the > > > problem. > > > > > > Thanks again, > > > Gregg > > > > > > > -----Original Message----- > > > > From: Allen Day [mailto:allenday at ucla.edu] > > > > Sent: Monday, October 17, 2005 2:50 PM > > > > To: Helt,Gregg > > > > Cc: das2 at portal.open-bio.org > > > > Subject: Re: Problem with parent/child features in biopackages > server > > > > feature response > > > > > > > > Hi Gregg, > > > > > > > > There was a logic inversion in my code. PARENT/PART relationships > > > should > > > > now be restored. > > > > > > > > Andrew, I'd also like to know if there is some code already > written > > > that > > > > can be written into my regression tests. > > > > > > > > I understand that it's really irritating to have non-fatal errors > of > > > > different types continuously appearing. 
Maybe I should spend some > > > time > > > > beefing up the regression test suite to do things like diff > feature > > > graphs > > > > and make sure they are identical? Something to discuss on the > > > conference > > > > call... > > > > > > > > -Allen > > > > > > > > On Mon, 17 Oct 2005, Helt,Gregg wrote: > > > > > > > > > I finally got back to testing DAS/2 feature requests/responses > in my > > > IGB > > > > > client. I'm seeing a new problem in responses from the > biopackages > > > > > server, there are no or elements for features > at > > > all. > > > > > > > > > > See for example the returned XML from my standard test query: > > > > > > > > > http://das.biopackages.net/das/genome/human/17/feature?overlaps=chr21/26 > > > > > 027736:26068042;type=SO:mRNA > > > > > > > > > > ... > > > > > name="BC001178"> > > > > > > > > > > ... > > > > > > > > > > > > > > name="auto316726"> > > > > > > > > > > > > > > > ... > > > > > > > > > > But I know that "auto316726" should be a child of "BC001178". > > > > > > > > > > Andrew, would your DAS/2 validator catch problems like this? > > > > > > > > > > > > > > > Thanks, > > > > > Gregg > > > > > > > > > > > > > > From Steve_Chervitz at affymetrix.com Sat Oct 22 00:48:57 2005 From: Steve_Chervitz at affymetrix.com (Chervitz, Steve) Date: Fri, 21 Oct 2005 17:48:57 -0700 Subject: [DAS2] DAS/2 weekly meeting notes Message-ID: Notes from the weekly DAS/2 teleconference, 21 Oct 2005. 
$Id: das2-teleconf-2005-10-20.txt,v 1.1 2005/10/21 23:55:05 sac Exp $

* Gregg and Lincoln hashed out some details regarding specific aims
  for the DAS/2 continuation grant, refining these aims:
  - Client enhancements
    + includes client-side caching
  - Spec enhancements
    + caching (IF_MODIFIED_SINCE support)
    + dealing with lengthy server operations
    + extensions to stylesheet capabilities
    + embedding arbitrary snippets of XML (e.g., RDF)
  - IGB-Apollo interactions
    + Gregg has been doing some background research on HTTP and UDP
      protocols

* Gregg plans to have a draft ready to circulate for review by the
  participating organizations by Tues next week (10/25).

* DAS/2 meeting at CSHL Genome Informatics meeting:
  - Planned for 10/28 evening (7:30pm EDT) in Lincoln's office
  - Teleconf will be available

* Proposal to move the weekly DAS/2 meeting to 9:00am PDT/PST Thursday
  so that UK folks can participate
  - Folks will check schedules to see if this works
  - Chime in if it doesn't work for you

Status Reports
--------------

Gregg:
* Completed treeview of DAS/2 sources in IGB. See his recent post with
  GUI snapshot. This will be included by default in the next IGB
  release next week, to coincide with the CSHL GI meeting.
* Invited to talk at CSHL GI meeting on genome viz. Good opportunity
  to present IGB. May accept the invitation if it's later during the
  meeting.
* Is making an effort to respond to questions on the DAS/2 list.
* IGB release planned for next week, coinciding with CSHL GI meeting.
  - Lots of new additions (DAS/2 support, new GUI preferences
    setting). Might be a 4.0 release.
  - Access via Java Web Start
  - Users will be able to launch previous version of IGB, if they
    have problems with the new version.

  Discussion:
  - Ann Loraine puts in a strong plug for JWS-based application
    launching. It's just so convenient, even for someone who is
    comfortable building and running Java code with command-line
    tools (but doesn't always have time to do so these days).
  - Gregg would like to set up an auto-build system for IGB code on
    SourceForge (or open-bio.org, if SF doesn't support this). This
    would allow users to download a pre-built jar. A JWS system could
    then be built on top of this if we want.

Ed E:
* Lots of IGB commits last week.
  - Can now set persistent preferences via GUI rather than pref file
    (bg colors, feature colors, number of levels of a given tier,
    show/hide prefs, etc.). Much nicer user experience.
  - No documentation yet.
  - Gregg adds: Deb Coulton (sp?) is now revising the IGB user guide.
    Targeting end of Nov for release.

Steve C:
* Working on improving the DAS/2 spec, resolving bugzilla issues,
  consistent wording, etc.
  - See http://biodas.org/documents/das2/das2_get.html
  - Be sure to force your browser to reload the page daily to make
    sure you have the latest version.
  - Will post notable changes to the DAS/2 list.
  - Will post any issues requiring broader discussion to list before
    modifying spec.
  - Gregg adds: See his response to Andreas on the list regarding how
    to deal with server agreement on what sequence coordinate system
    is being used. Do folks agree with Gregg's proposal?

Allen D:
* Bug fixes on biopackages.net DAS/2 server
  - A DAS2XML parser would help catch bugs in advance. Will ask
    Andrew about setting this up.
* Discussed performance issues.
  - Feature request is as fast as it's going to get now (in terms of
    SQL optimization).
  - Has a SQUID proxy set up to help with relatively static data, but
    you must configure your client to use it. (Also see Allen's post
    on 10/21 regarding the Apache mod_cache module.)
  - Gregg adds: Would like to get support in spec for 'if modified
    since' tags.
  - Gregg: What about using a different schema underneath, such as
    GFF-db as used by gbrowse which is quite fast?
  - Allen: It's not fast for large segments. The DAS/2 server uses a
    chado schema with feature indexing optimizations by Allen. Its
    performance improves for big queries, but degrades for small
    queries.
    So the DAS/2 server's response is more flat over a range of query
    sizes relative to gbrowse. Possibly could improve the DAS/2
    server by upgrading from Postgres 7.4 to 8.0, which has
    improvements on left joins.
  - Lincoln: How could gbrowse performance be improved?
  - Allen: Via partial indexing. Should improve for queries covering
    200+ MB.

From Steve_Chervitz at affymetrix.com Thu Oct 27 01:29:38 2005
From: Steve_Chervitz at affymetrix.com (Chervitz, Steve)
Date: Wed, 26 Oct 2005 18:29:38 -0700
Subject: [DAS2] Spec issues
Message-ID:

In the spec for DAS/2 retrievals, there are some open issues regarding
types and features that I'd like to solicit feedback on. This is kind
of a long message, so feel free to pick and choose what you want to
respond to.

For reference, here's the latest retrieval spec:
http://biodas.org/documents/das2/das2_get.html

Type properties example (only showing relevant attributes):

Description: A set of machine-readable configuration information as
key/value pairs

The spec currently describes the key attribute as "the name of the
property. Elaborate on how to interpret the name". So how should name
be interpreted? Can it be a URI/URL? If we want it to be just a simple
string that can derive from some controlled vocabulary, how does one
specify which vocabulary it derives from? (e.g.,
http://www.biodas.org/ns/das/properties/2.00)

Also, we might want to allow some controlled vocabulary terms to be
used for the value of type.source (e.g., "das:curated"), to ensure
that different users use the same term to specify that a feature type
is produced by curation.

The spec also seems alarmed by the existence of a xml:base attribute
in the TYPE element. The idea is that any relative URL within this
element would be resolved using that element's xml:base attribute.
How would folks be with having the DAS/2 spec fully support the XML
Base spec ( http://www.w3.org/TR/xmlbase/ )?
The result of this would be to add an optional xml:base attribute to
all elements that contain URLs or subelements with URLs.
For an example of how this would work, in the above XML snippet, the
absolute URL for TYPE.id would be
http://www.wormbase.org/dase/genome/volvox/1/type/gene/curated_gene

Next issue: Feature properties example (only showing relevant
attributes):

Description: Properties are typed using the ptype attribute. The value
of the property may be indicated by a URL given by the href attribute,
or may be given inline as the CDATA content of the section.

    29
    2

So in contrast to the TYPE properties which are restricted to being
simple string-based key:value pairs, FEATURE properties can be more
complex, which seems reasonable, given the wild world of features. We
might consider using 'key' rather than 'ptype' for FEATURE properties,
for consistency with TYPE prop elements (however, read on).

In the feature filter section, the property-based filter describes
feature properties as being string-based, a la TYPE properties. More
complex feature properties would not necessarily be filterable, so
this should be expanded upon, stating that property-based feature
filters will only work for feature properties that are simple strings
(not properties where the value is a URL or is a CDATA with MIME type
not equal to text/plain).

One issue that comes up here, which actually pertains to the spec as a
whole, is that there are various attributes that are intended to be
URLs but are named quite different things. In the FEATURE snippet
above, there are four different attributes that are URLs: id, type,
ptype, and href. There is a bugzilla entry requesting that all
attributes named 'id' which are in fact URLs be named 'uri':
http://bugzilla.open-bio.org/show_bug.cgi?id=1788
This seems like a good move to me, since it flags these attributes as
resolvable. Does anyone have objections to this?
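
Editorial illustration of the xml:base resolution discussed earlier in
this message: a minimal Python sketch, not taken from the DAS/2 spec
or any server code. Only the attribute value xml:base="gene/" appears
in the thread; the TYPES wrapper element, its outer base URL on
example.org, the second TYPE entry, and the helper name resolve_ids
are assumptions made up for the example. It resolves each relative id
against the nearest enclosing xml:base, in the way the XML Base spec
describes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical response fragment. Only xml:base="gene/" comes from the
# thread; the outer base URL and element layout are assumptions.
DOC = """
<TYPES xml:base="http://example.org/das/genome/volvox/1/type/">
  <TYPE id="curated_gene" xml:base="gene/"/>
  <TYPE id="est"/>
</TYPES>
"""


def resolve_ids(element, inherited_base=""):
    """Map each relative id to its absolute URL, honoring nested xml:base."""
    # An element's own xml:base is resolved against the inherited base
    # and then applies to the element itself and its descendants.
    base = urljoin(inherited_base, element.get(XML_BASE, ""))
    resolved = {}
    if "id" in element.attrib:
        resolved[element.get("id")] = urljoin(base, element.get("id"))
    for child in element:
        resolved.update(resolve_ids(child, base))
    return resolved


if __name__ == "__main__":
    for rel, absolute in resolve_ids(ET.fromstring(DOC)).items():
        print(rel, "->", absolute)
```

With xml:base="gene/" on the TYPE element, the relative id
"curated_gene" picks up the extra "gene/" path segment, while a
sibling without its own xml:base resolves directly against the outer
base.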
For other attributes that are URLs but are not named 'id' or 'href' (such as type, ptype above), we could either leave as-is, or we could append '_uri' to their name to flag their resolvability. Feature's PROP.ptype is an interesting case, since it is both an identifier (equivalent to type PROP.key) and a URL for describing the property. For this reason, I would also propose either renaming it 'uri' (to capture this dual role) or 'key' (for consistency with type properties). Thoughts? The feature example DASXML above also shows a way to attach a protein translation to a feature as a property. Since this will be a common task, I'd vote for having a feature property of "das:property/protein_translation" among the list of built-in feature properties in the das namespace. Anyone want to take a stab at defining the full list of built-in properties within the "das:" and "bg:" namespaces? I think it's a key requirement for interoperability. Steve From Gregg_Helt at affymetrix.com Thu Oct 27 14:50:17 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 27 Oct 2005 07:50:17 -0700 Subject: [DAS2] DAS/2 meeting on Friday, no meeting today Message-ID: Since we're having a DAS/2 meeting at CSHL on Friday, I'm canceling our regular meeting today. I'll send out more details later about tomorrow's meeting, but it probably won't start till the evening (Eastern time). Thanks, Gregg From lstein at cshl.edu Thu Oct 27 15:14:55 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 27 Oct 2005 09:14:55 -0600 Subject: [DAS2] DAS/2 meeting on Friday, no meeting today In-Reply-To: References: Message-ID: <200510270914.55947.lstein@cshl.edu> Gregg, Isn't the DAS/2 meeting at CSHL scheduled for Saturday evening at 6? Room details will follow. Lincoln On Thursday 27 October 2005 08:50 am, Helt,Gregg wrote: > Since we're having a DAS/2 meeting at CSHL on Friday, I'm canceling our > regular meeting today. 
I'll send out more details later about > tomorrow's meeting, but it probably won't start till the evening > (Eastern time). > > Thanks, > Gregg > > > _______________________________________________ > DAS2 mailing list > DAS2 at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/das2 -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From lstein at cshl.edu Thu Oct 27 15:41:30 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Thu, 27 Oct 2005 09:41:30 -0600 Subject: [DAS2] Spec issues In-Reply-To: References: Message-ID: <200510270941.30528.lstein@cshl.edu> On Wednesday 26 October 2005 07:29 pm, Chervitz, Steve wrote: > In the spec for DAS/2 retrievals, there are some open issues regarding > types and features that I'd like to solicit feedback on. This is kind of a > long message, so feel free to pick and choose what you want to respond to. > > For reference, here's the latest retrieval spec: > http://biodas.org/documents/das2/das2_get.html > > Type properties example (only showing relevant attributes): > > Description: A set of machine-readable configuration information as > key/value pairs > > > ontology="http://song.sf.net/ontologies/sofa#gene" > source="curated" > xml:base="gene/"> > > > > The spec currently describes the key attribute as "the name of the > property. Elaborate on how to interpret the name". So how should name be > interpreted? Can it be a URI/URL? If we want it to be just a simple string > that can derive from some controlled vocabulary, how does one specify which > vocabulary it derives from? (e.g., > http://www.biodas.org/ns/das/properties/2.00) I thought that the "bg:" and "das:" were straight XML namespaces. 
I.e: Also, we might want to allow some controlled vocabulary terms to be used > for the value of type.source (e.g., "das:curated"), to ensure that > different users use the same term to specify that a feature type is > produced by curation. Same idea, but see below. > > The spec also seems alarmed by the existence of a xml:base attribute in the > TYPE element. The idea is that any relative URL within this element would > be resolved using that element's xml:base attribute. How would folks be > with having the DAS/2 spec fully support the XML Base spec ( > http://www.w3.org/TR/xmlbase/ )? The result of this would be to add an > optional xml:base attribute to all elements that contain URLs or > subelements with URLs. > For an example of how this would work, in the above XML snippet, the > absolute URL for TYPE.id would be > http://www.wormbase.org/dase/genome/volvox/1/type/gene/curated_gene I'm ok with this. > Next issue: Feature properties example (only showing relevant attributes): > > Description: Properties are typed using the ptype attribute. The value of > the property may be indicated by a URL given by the href attribute, or may > be given inline as the CDATA content of the section. > > > type="type/curated_exon"> > 29 > 2 > href="/das/protein/volvox/2/feature/CTEL54X.1" /> > > > > So in contrast to the TYPE properties which are restricted to being simple > string-based key:value pairs, FEATURE properties can be more complex, which > seems reasonable, given the wild world of features. We might consider using > 'key' rather than 'ptype' for FEATURE properties, for consistency with TYPE > prop elements (however, read on). I'm not so happy with "key" since it is nondescript. Originally this was "type" but the word collided with feature type. I am getting uncomfortable with the dichotomy we've (I've?) created between XML base keys/properties and namespace-based keys/properties. 
It seems nasty to have the ptype attribute be either a relative URI (property/genefinder-score), or a controlled vocabulary member (das:phase). Is there any reason we shouldn't choose one or the other? For example, does this work? xmlns:prop="http://www.wormbase.org/das/genome/volvox/1/property"> 29 2 This looks so much cleaner to me. > In the feature filter section, the property-based filter describes feature > properties as being string-based, a la TYPE properties. More complex > feature properties would not necessarily be filterable, so this should be > expanded upon, stating that property-based feature filters will only work > for feature properties that are simple strings (not properties where the > value is a URL or is a CDATA with MIME type not equal to text/plain). > > One issue that comes up here, which actually pertains to the spec as a > whole, is that there are various attributes that are intended to be URLs > but are named quite different things. In the FEATURE snippet above, there > are four different attributes that are URLs: id, type, ptype, and href. > There is a bugzilla entry requesting that all attributes named 'id' which > are in fact URLs be named 'uri': > http://bugzilla.open-bio.org/show_bug.cgi?id=1788 > This seems like a good move to me, since it flags these attributes as > resolvable. Does anyone have objections to this? Ok with me, but I'd like to hear what people think about throwing out the base idea entirely and using namespaces as described above. > > For other attributes that are URLs but are not named 'id' or 'href' (such > as type, ptype above), we could either leave as-is, or we could append > '_uri' to their name to flag their resolvability. Feature's PROP.ptype is > an interesting case, since it is both an identifier (equivalent to type > PROP.key) and a URL for describing the property. For this reason, I would > also propose either renaming it 'uri' (to capture this dual role) or 'key' > (for consistency with type properties). 
Thoughts? > > The feature example DASXML above also shows a way to attach a protein > translation to a feature as a property. Since this will be a common task, > I'd vote for having a feature property of > "das:property/protein_translation" among the list of built-in feature > properties in the das namespace. Anyone want to take a stab at defining the > full list of built-in properties within the "das:" and "bg:" namespaces? I > think it's a key requirement for interoperability. Absolutely -- I'd like to start work on that. Lincoln > > Steve > > > _______________________________________________ > DAS2 mailing list > DAS2 at portal.open-bio.org > http://portal.open-bio.org/mailman/listinfo/das2 -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From lstein at cshl.edu Fri Oct 28 19:58:30 2005 From: lstein at cshl.edu (Lincoln Stein) Date: Fri, 28 Oct 2005 13:58:30 -0600 Subject: [DAS2] Revise meeting time In-Reply-To: References: Message-ID: <200510281558.31204.lstein@cshl.edu> Hi, I totally missed the fact that there's a concert at 6 pm Saturday. Would people prefer to meeting over lunch tomorrow after the end of the morning session? Lincoln -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From Gregg_Helt at affymetrix.com Fri Oct 28 20:20:06 2005 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Fri, 28 Oct 2005 13:20:06 -0700 Subject: [DAS2] RE: Revise meeting time Message-ID: Either time works for me. 
gregg > -----Original Message----- > From: Lincoln Stein [mailto:lstein at cshl.edu] > Sent: Friday, October 28, 2005 3:59 PM > To: das2 at portal.open-bio.org > Cc: Chervitz, Steve; Helt,Gregg; michelse at cshl.edu > Subject: Revise meeting time > > Hi, > > I totally missed the fact that there's a concert at 6 pm Saturday. Would > people prefer to meeting over lunch tomorrow after the end of the morning > session? > > Lincoln > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu