From Steve_Chervitz at affymetrix.com Thu Nov 2 18:40:15 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Thu, 02 Nov 2006 15:40:15 -0800
Subject: [DAS2] community annotation bee genome sequence
In-Reply-To: <83722dde0611020704h610239tab7acdbe5a961b99@mail.gmail.com>
Message-ID:

> http://www.genome.org/cgi/content/full/16/11/1329

Thanks for the link, Ann. They give a nice review of different annotation models. Interesting to see how they made use of centralized resources to enable their decentralized annotation effort.

They say:

"... the DAS system does not yet involve incorporating the community annotation data into an official set of gene models."

Note the optimistic "yet". We're working on it!

So presumably, they didn't use a DAS-based genome browser largely because of lack of editing support. They did use Apollo, but it's not clear how much they relied on its editing vs read-only viewing functionality.

They cite a need for annotation mapping between different assembly versions. UCSC provides liftOver for this (but curiously, they don't provide an apiMel1 to apiMel2 chain file). Gregg has genometry-based tools for doing this, but they're not part of Genoviz/IGB at present.

Steve

From Gregg_Helt at affymetrix.com Mon Nov 6 10:18:59 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Mon, 6 Nov 2006 07:18:59 -0800
Subject: [DAS2] Adding an optional "searchable" attribute to element
Message-ID:

In the last DAS/2 teleconference I brought up again the idea of an optional "searchable" or "filter" attribute for the <TYPE> elements returned from a types query -- if present and "false", then that type should not be used in a feature query filter.
Here are snippets about this from discussion during the last code sprint (I've tried to strip it down to just the relevant parts):

> -----Original Message-----
> Sent: Monday, August 21, 2006 2:43 PM
> To: DAS/2
> Subject: [DAS2] Notes from the weekly DAS/2 teleconference, 21 Aug 2006
>
> Notes from the weekly DAS/2 teleconference, 21 Aug 2006
> Note taker: Steve Chervitz
> ...
> [The note taker apologizes for attending late (~30min)]
>
> gh: could a server in the types doc restrict the types. just say
> 'transcripts'?
> ls: yes. if not going to allow for searching for feature, only via
> parent, then types doc should only include parent.
>
> gh: types doc specifies which types you can query on.
> ls: ontology gives you access to all types that might come back
> ad: and how to depict them.
> gh: yes, but it can be restrictive of the types.
> ad: what does client do to display it?
> gh: implies we separate out style into stylesheet info again.
> no one is serving or using, so we can change w/o major impl changes.
> ad: type doc ties a feature to ontology, how to display it, and
> includes this extra source field.
> gh: types doc has all types server contains but tags as to what the
> server allows searching on.
>
> ad: feels weird. can't see why i'd want to do in my server.
> bo: better than limiting the types doc, just have a searchable field.
> ad: easy
> gh: if you don't say no, then it's searchable. this is backwards
> compatible.
> ...
...
>
> ad: range and non-range filters must both be true for a given feature
>
> gh: ok, as long as we can say in types doc that some types are not
> filtered.
> ...
> [A] andrew will add searchable flag to type document
...

The motivation for this addition to the spec is to allow a server to restrict what feature types a client can use for query _filtering_, while still allowing these types of features to be returned from feature queries and their display properties to be described in stylesheets.
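
For illustration only, here is a minimal client-side sketch of the rule proposed above: a type may be used in a feature query filter unless it says searchable="false". Nothing in it comes from the spec text. The sample types document, the type uris, the use of a "uri" attribute as the type identifier, the namespace URI, and the helper name filterable_type_ids are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for DAS/2 documents (not stated in this thread).
DAS2_NS = "http://biodas.org/documents/das2"

# Hypothetical types document; the uri and title values are invented.
SAMPLE_TYPES_DOC = f"""\
<TYPES xmlns="{DAS2_NS}">
  <TYPE uri="type/genscan-transcript" title="genscan-transcript"/>
  <TYPE uri="type/genscan-exon" title="genscan-exon" searchable="false"/>
  <TYPE uri="type/genscan-intron" title="genscan-intron" searchable="false"/>
  <TYPE uri="type/refseq" title="refseq" searchable="true"/>
</TYPES>
"""


def filterable_type_ids(types_xml: str) -> list:
    """Return the uri of every TYPE that may be used in a feature query filter.

    A TYPE counts as searchable unless it carries searchable="false", so a
    document that never mentions the attribute behaves exactly as before.
    """
    root = ET.fromstring(types_xml)
    usable = []
    for type_el in root.iter(f"{{{DAS2_NS}}}TYPE"):
        # Missing attribute defaults to "true" (the backward-compatible case).
        if type_el.get("searchable", "true").strip().lower() != "false":
            usable.append(type_el.get("uri"))
    return usable


if __name__ == "__main__":
    # Only these types would be offered in the client's query options.
    print(filterable_type_ids(SAMPLE_TYPES_DOC))
```

A client written this way needs no ontology knowledge to hide the non-filterable types, and a server that ignores the attribute simply gets every type listed.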
This restriction is important for my server implementation to make full use of ontologies in describing feature types. And in the more general case, I think it will be good for visualization clients. To use a concrete example, in a GUI I don't want to have to make the user choose between requesting "genscan-transcript", "genscan-exon", "genscan-intron", or some combination of these types to make sure they get all the "genscan" annotation information -- this is a recipe for confusion. Now a smart client that fully understands the sequence ontology could automatically simplify this for the user, but I don't expect most client implementations to be so smart -- after all, one of our goals is to have a low threshold for simple client and server implementations. In this example it would be much easier for the server to just specify that "genscan-exon" and "genscan-intron" are not usable in a query filter, and the client just shows "genscan-transcript" in the query options. This change in the spec is backward compatible, since elements without a "searchable" attribute would by default be searchable. It should be easy for clients to implement, and servers can implement it or ignore it. Gregg From Gregg_Helt at affymetrix.com Mon Nov 6 10:58:40 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 07:58:40 -0800 Subject: [DAS2] TYPE[@source] -> TYPE[@method] Message-ID: I agree that multiple uses of "source" makes it confusing, and that for types "method" is a reasonable alternative. On a related note, do we really need both "title" and "source/method" attributes for types? Both are optional and supposed to be short human-readable strings describing the type. For a longer description we also have the optional "description" element. 
gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Andrew Dalke > Sent: Thursday, October 26, 2006 6:46 AM > To: DAS/2 > Subject: [DAS2] TYPE[@source] -> TYPE[@method] > > I would like to change the existing TYPE attribute of "source" > and have it use a different attribute name. Its meaning conflicts > with the other uses of "source" in DAS2. > > The best alternative is "method" because (I believe) it is supposed > to store the same information as the corresponding DAS1 TYPE attribute. > > > Andrew > dalke at dalkescientific.com > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From Gregg_Helt at affymetrix.com Mon Nov 6 11:53:22 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 08:53:22 -0800 Subject: [DAS2] segments and types Message-ID: > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Andrew Dalke > Sent: Friday, October 27, 2006 12:56 PM > To: DAS/2 > Subject: [DAS2] segments and types > > A couple of observations about what I've seen in existing > DAS1 servers. Nothing here concerns format changes. > > There are four different ways to handle segments: > 1) Don't provide segment information > "Our clients know the segment because of the id > so they don't need a segments document" > 2) use "size" (pre-DAS 1.0 spec) > 3) use "start"/"stop" (DAS 1.0 spec) > - with variations, like "0", "0" meaning the length is undefined > (and even "1", "0", with a size="2", for one server!) > 4) use a "version" field > > The last is mostly used for protein sequences, that I've seen. 
> Its an aspect of #1 ("9pti" means "bovine pancreatic trypsin > inhibitor structure from PDB") as an abstract identifier, with > the version used to make it concrete ("with the update because > the first release had a typo") I think it can be encapsulated > in the uri scheme we now use because each version gets it own > identifier, and since the client knows all versions there's no > problem. > > > The folks at EBI/Sanger (what's the correct collective term; > Hinxton? Genome Campus?) know which servers provide which > systems so many servers don't provide coordinates. > > In some cases, like rabbit, the server will generate about > 120,000 segments, one for each scaffold. It takes quite some time > (a minute or more) to generate the output. In theory this is > static and can be precomputed by the server. > > For my own knowledge, when do people want the complete list > of segments? When do they want the length? You, yes, you > there, in front of the computer. When do you you want to > use it? For (nearly) completely sequenced genomes, it is important to provide a complete list of genome segment ids/names. This allows a visualization client to provide this list for a user to select from if they are interested in particular genome locations or simply browsing, rather than having the id/name of a particular feature in mind. Now you could just have the user type in the id of a segment, but unless they are familiar with the vagaries of that particular server, do they request "chr1", or "1", "I", "chrI", "chrom1", etc? Length information for a segment is needed to place an upper bound on range queries to the server. And in a GUI client it is often more convenient for the user to indicate visually what range on the segment they want to retrieve data from, but this doesn't make sense without the client app knowing the length of the segment. 
Furthermore, once the client is displaying located annotations on a segment, it can be important to know where the end of the segment is relative to the locations of annotations. For less complete genomes (like rabbit), it's not so clear what advantage there is to having the list of 120,000 scaffolds to choose from. Same applies to list of proteins or mRNAs. > > Let me stress -- this is not a request to change anything. I > would like to know for my own sake, for writing the documentation, > and for how much emphasis to put on this for the validation. > > As another observation, the Sanger/EBI servers also don't > do much with the types document. Some don't even handle the > request. Eugene said that no one had asked him to add it. > It's there now (thanks Eugene). > > I think this is because most of their servers only had a single > type and the solution was "display everything." They are > running into difficulties with this for a few new servers and > will be need type support, and type filter support soonish. > > Andrew > dalke at dalkescientific.com From ap3 at sanger.ac.uk Mon Nov 6 11:34:05 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 6 Nov 2006 16:34:05 +0000 Subject: [DAS2] move biodas website to a wiki? Message-ID: Hi! Over the last year several of the open-bio websites like BioPerl or BioJava have been moved to a Wiki. Looking at the current state of the biodas website, which is getting out of date and does not look well maintained I thought it might be good to do the same for biodas.org. We have a couple of announcements which would be good to put there - e.g. Ensembl now provides DAS reference and annotation servers for all its genomes, several new DAS-based applications are in the pipeline, the DAS registry now counts 170+ DAS servers, etc... what do you guys think about this idea? 
Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Mon Nov 6 12:13:27 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 09:13:27 -0800 Subject: [DAS2] move biodas website to a wiki? Message-ID: Sounds like a good idea to me. Steve? Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Andreas Prlic > Sent: Monday, November 06, 2006 8:34 AM > To: DAS/2 > Subject: [DAS2] move biodas website to a wiki? > > Hi! > > Over the last year several of the open-bio websites like BioPerl or > BioJava have been moved to a Wiki. > Looking at the current state of the biodas website, which is getting > out of date and does not look well maintained I thought it might be > good to do the same for biodas.org. > > We have a couple of announcements which would be good to put there - > e.g. Ensembl now provides DAS reference and annotation servers for all > its genomes, several new DAS-based applications are in the pipeline, > the DAS registry now counts 170+ DAS servers, etc... > > what do you guys think about this idea? 
>
> Cheers,
> Andreas
>
> -----------------------------------------------------------------------
>
> Andreas Prlic Wellcome Trust Sanger Institute
> Hinxton, Cambridge CB10 1SA, UK
> +44 (0) 1223 49 6891
>
> _______________________________________________
> DAS2 mailing list
> DAS2 at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das2

From dalke at dalkescientific.com Mon Nov 6 13:18:49 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Mon, 6 Nov 2006 19:18:49 +0100
Subject: [DAS2] unified rnc schema
Message-ID: <59b794823b04b0963ed41cbd2b51fb0d@dalkescientific.com>

the unified schema document is in CVS under

  das/das2/das2_schemas.rnc

This is the merge of the existing rnc files, which were developed and distributed in the spring.

There are stubs named

  types.rnc
  features.rnc
  segments.rnc
  sources.rnc

which all look like this

  include "das2_schemas.rnc"
  start = sources

Meaning that they import the main schema and define the root node appropriately for each specific document type.
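
As a rough sketch of the stub layout described above, the following generates the text of each stub from the document type name. It rests on one assumption that the message does not confirm: that each stub's start pattern carries the same name as its document type (only "start = sources" is actually shown). The function names stub_text and all_stubs are made up for the example.

```python
# Document types named in the message above.
DOCUMENT_TYPES = ["types", "features", "segments", "sources"]

# File name of the unified schema, as given in the message above.
UNIFIED_SCHEMA = "das2_schemas.rnc"


def stub_text(document_type: str) -> str:
    """Build the RELAX NG compact-syntax stub for one document type.

    Assumption: the start pattern is named after the document type.
    """
    return f'include "{UNIFIED_SCHEMA}"\nstart = {document_type}\n'


def all_stubs() -> dict:
    """Map each stub file name (for example "types.rnc") to its contents."""
    return {f"{name}.rnc": stub_text(name) for name in DOCUMENT_TYPES}


if __name__ == "__main__":
    for file_name, text in all_stubs().items():
        print(f"--- {file_name}")
        print(text, end="")
```

The point of the layout is that the element definitions live in one place, and a validator that reads the compact syntax (jing with its -c option, for instance) only has to be pointed at the stub matching the document being checked.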
Andrew
dalke at dalkescientific.com

From Gregg_Helt at affymetrix.com Mon Nov 6 13:24:32 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Mon, 6 Nov 2006 10:24:32 -0800
Subject: [DAS2] DAS/2 retrieval spec docs
Message-ID:

Location of DAS/2 get HTML docs:

In the cvs.biodas.org repository
(http://code.open-bio.org/cgi/viewcvs.cgi/das/das2/?cvsroot=biodas)

HTML:
  das2_protocol.html
  das2_get.html

Schema:
  draft3/*.rnc

From Steve_Chervitz at affymetrix.com Mon Nov 6 14:16:49 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Mon, 06 Nov 2006 11:16:49 -0800
Subject: [DAS2] Notes from the weekly DAS/2 teleconference, 6 Nov 2006
Message-ID:

Notes from the weekly DAS/2 teleconference, 6 Nov 2006

$Id: das2-teleconf-2006-11-06.txt,v 1.1 2006/11/06 19:13:26 sac Exp $

Note taker: Steve Chervitz

Attendees:
  Affy: Steve Chervitz, Gregg Helt, Ed Erwin
  CSHL: Lincoln Stein
  Dalke Scientific: Andrew Dalke
  UAB: Ann Loraine
  UCLA: Allen Day, Brian O'Connor

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit.

Agenda
-------
* 2.0 spec freeze discussion

ls: hapmap project repercussion: instability of das/2 spec has led me to recommend against using it for hapmap db. Have had to put hapmap data onto a soap server. Need to have an internal project, would have control over the process. not best possible protocol, but I could promise delivery of working s'ware on a reliable schedule.
ls: I had a real deadline to deliver an adaptor to the NCI by end of Nov, w/o having a spec that is in stone that I can write to, can't deliver by that date, and can't get extension. went on record a year ago saying get spec was stable, good to build on, and it's not. would like to ask that we freeze the spec, remain frozen, the next version be das/3 and we guarantee das/2 is frozen for at least 2 years. gh: ok with that. how do other's feel. ls: if brian gilman can write a das/2 adaptor for cabig by end of nov based on spec now, it's not a crisis. we have two dependent things: (1) is a das/2 adaptor for caCORE that can read das/2 sources, (2) das/2 server for hapmap and das/2 server for vertebrate promoter db. NCI will not accept delivery of das/2 adaptor after end of Nov. If so, then the other two projects (servers) would become irrelevant, and I would withdraw from those two as well. NCI decided the spec was never going to stabilize, so wasn't flexible in giving more time past November. ad: brian wants schema in XSD not rnc. some changes: source -> method ls: two things: (1) is spec changing too much? this conversation was to create enclosing tag to create a group of related feats for streaming purposes. (2) perception issue: killing me because the NCI people read the archives and can see that the das/2 spec is in thrash and is not converging. ad: parent field, adding one single attribute to each feature, not a major alteration. Then we have discussion in ensuing two weeks following teleconf. we can freeze rnc schema now, everything works now. aday: html document is out of date, not sure what's in schema. ad: I sent it around a while back. aday: haven't seen it. gh: I read it. easier to read it than the html doc. ad: html doc doesn't get touched because it's much harder to write. ls: then freeze the rnc, remove html, point people to RNC. gh: nothing wrong with html, but it should say the formal spec is in the RNC. 
ad: rnc it's not complete (e.g., reference genomes are defined on a web page someplace). schema is not going to be the spec. there are somethings that schema definitions can't describe. gh: should have a pointer to the rnc at top of html and say "it is frozen and will stay frozen as das 2.0" [A] place DAS 2.0 frozen notice on html spec doc (after 1-2 day analysis) gh: salvagable situation with NCI? [A] lincoln will notify NCI of DAS/2 schema freeze ad: before people can say yes, i need to check in unified version of the schema, then folks can sign off on a unified document. current doc is in 8 parts. i'll put the unified schema into version control. ls: it's in the 'draft3' subdir. ad: yes. [A] Andrew will consolidate rnc schema document, check into cvs, notify list aday: describes formatting of xml and what each fields do. incomplete to impl a server because it doesn't describe req/response cycle. ad: yes, this is a description of the format. aday: it's an incomplete format. ad: describes the stuff needed to be returned back from the server, so it's complete from a server implementer's point of view. ad: what is in the html that doesn't agree with schema? aday: property response, fasta. timestamp on the html doc has been changed 10/24/06. Need to read this again. ad: I made some minor changes while at the EBI. gh: freeze the schema, freeze the intent of the html to smooth out clarifications, all devs read both schema and html, OK with freezing it in the next day. [A] freeze schema as DAS 2.0 (get), freeze html intent and clarify, by 7 Nov 2006 gh: in light of that, improving the biodas site. andreas suggested turning into wiki doc. need to allow multiple people to edit. steve=biodas.org admin? sc: I tend to do most biodas.org upkeep. bioperl has migrated to a wiki format. can probably borrow their template and set up something similar for biodas.org. 
[A] steve will convert das site into wiki style site ad: typos and xml mistakes in april to the interaction document (writeback) in the last 6 mos. gh: not talking about freezing the writeback portion. ee: is ucla das/2 server working now, top level doc, can't use it via IGB. aday: brian is testing against the affy server, that sources doc is still not responding. [A] allen will fix sources doc on ucla server ad: proxy work (email) accept das/1, interface with das/2. Serving das/2 from a das/1 server. initial result was slow with python's templating lang. new stream based parser with stream based output for doing it. in progress now. gh: what about auto testing of das/2 servers from his registry. talk to andreas about it? ping for alive-ness? ad: should still work easily, can't remember what andreas said about it. tho. [A] gregg will ask andreas about live-ness testing das/2 servers via registry gh: uri's that affy server returns only work for a single version of each genome (latest version). trouble with xml:base that was partially fixed last week. [A] gregg/steve will fix affy server xml:base to support all genome versions gh: stabilize spec, read and sign off the spec, need to address and stabilize. when funding agencies start pulling plugs based on das/2, this is serious. ls: these are management consultants. if promised s'ware product cannot be delivered in working order in time expected, so they make the calc that it's better to cut their losses. ls: need a human readable html doc that's consistent with the rnc document. public declaration on the website that people can rely on it for 2 years. bo: Any developer will want to see this, as well as a reference implementation. gh: read html doc today/tomorrow, with an eye towards agreement with schema. I don't think it's that far off. 
[A] everyone read html doc for agreement with schema, finish by 7 Nov 2006 Other topic: DAS-related projects ---------------------------------- al: NSF plant science cyber infrastructure project: http://www.nsf.gov/pubs/2006/nsf06594/nsf06594.htm ls: univ of georgia, malmberg. Another one I'm doing in collab with myerowitz. al: incorporating anything from das, or viz work? ls: univ of GA plan, my role is in annotating plant pathways. in my project: all community annotation using wiki, kind of a plantopedia. text is not very structured, series of pages which have some constrained fields, genome annotation, genotypes, ontologies, everything else is text annot on top of it. reason: natural language processing has gotten good. people should not start dumbing down their communication with computers, but communicate in english. al: dense and compact abstract text. ls: people are identifying regions of text in an xmly way. al: proposals for centers, akin to ncbi for plant biology... is the plant wiki idea to be a component? ls: three main parts.... al: do people on das want to write something up for the cyber infrastructure? das/2 seems appropriate. ls: should talk about das being part of it. proposing an open source api, basically a bus, that allows you connect the consumers of data with producers of data. a s'ware layer that goes over an opaque transfer protocol. deliberately not be cross platform. A s'ware kit on laptops, prepopulated, autoupdates, spec that s'ware devs can write to. in terms of an xml protocol that people can plug into, people are never going to see that layer. al: like an OS. ls: in fact called the plant OS. al: terrified that nsf will give 1-2 groups all the funding and we'll have a monolithic structure. would like to try many ideas, let free market decide. How can we give the people who do the hard work enough funding to keep them involved, esp if they have 40hr/wk jobs as well. ls: they don't want cyber infrastruct proj to make awards to people. 
that would be taking over NSF's role. ls: only seven page write ups. al: I want to propose viz using das and microarray data. webservices and microarray data. takers? aday: webservices yes. gh: other commitments now. bo: mark carlson has been integrating das client into MeV. al: incorporate into lincoln's proposals? ls: problem now: i've identified who the PIs are on the project. Microarray viz=owen white at tigr. can't change it now. ls: An even bigger project that TIGR is involved in now is the biofuels initiative - $250M over 5 years. very important project, bigger than cyber infrastructure proj. biologist, engineers, nanotech, ecologists collaboration. al: do we want better das servers or cheap fuels? ls: not for cheap fuels, but global warming. Wrapup ------ gh: to wrap up this meeting, html documents can be edited out of the repository, latest get specs and rnc docs. no need for an editable web page. ls: andrew is going to freeze the rnc, then make one pass over the html, then open it up to all devs? could be chaotic via source code control. gh: will confer with andrew over the plan and let folks know. [A] Gregg will inform all how to proceed re: html doc editing. [A] Next DAS/2 conf call next Monday (13 Nov, 9:30am). From Steve_Chervitz at affymetrix.com Mon Nov 6 17:33:09 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Mon, 06 Nov 2006 14:33:09 -0800 Subject: [DAS2] move biodas website to a wiki? In-Reply-To: Message-ID: Distributing the load of maintaining this site sounds great to me (the de facto maintainer). Wikification is also good for consistency within the open-bio.org family. I've initiated the process. Might have something preliminary to show next week. Of course, this means we'll have to come with an icon. Suggestions? How about a armadillo driving a submarine that looks like a gene structure in a sea of DNA? 
Steve

From enwired at gmail.com Tue Nov 7 15:53:14 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 12:53:14 -0800
Subject: [DAS2] Comments on features.rnc
Message-ID: <4aa3a7e70611071253j79aba4b2l44bada5613bff598@mail.gmail.com>

Here are my comments on the features.rnc document.

** Fix these comments: Yes, feature style can either partially or fully
** override the feature-type style. (Clients are free to ignore the style, though.)

    # how to represent the feature; overrides the STYLE in the
    # feature type (but how? completely? or can this override
    # the fgcolor but not the other settings?)
    style*,

** Remove this comment

    # XXX Need use-cases for this
    # I think clients should just figure it out from the location
    region = element REGION {

** I would very much like the xid to have an optional "name" attribute.
** (The client may have more than one URL link for each feature and it needs
** to be easy for the user to tell them apart.)
** Less important, I would like optionally more than one xid per feature.
** (If you are waiting until someone needs it, well, I am ready for it!)

    # Some human-readable external link.
    # XXX This needs some way to describe the kind of link
    # (primary id, accession), and other information (eg,
    # "promotes", "false positive".
    # Fixing this will wait until someone needs it.
    xid = element XID {
      common_attrs,
      attribute href { text }
    }

** No, this does not need anything else
** But, optionally, it might be possible to have more than one note.

    # Does this element need anything else?
    note = element NOTE {
      common_attrs,
      text
    }

Thanks,
Ed Erwin

From enwired at gmail.com Tue Nov 7 15:52:40 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 12:52:40 -0800
Subject: [DAS2] Comments on segments.rnc
Message-ID: <4aa3a7e70611071252j7306e325k5f9c4028b0cfdc48@mail.gmail.com>

Segments.rnc looks good. There is just one typo: This example should say "format=fasta", not "=fast"

    # http://localhost/das/sequence/Chromosome1?format=fast

From enwired at gmail.com Tue Nov 7 14:19:28 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 11:19:28 -0800
Subject: [DAS2] sources.rnc
Message-ID: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>

I have nothing substantial to change in sources.rnc.
Just clean up the notes and comments: ** Can this note be improved? # NOTE: the segments capability has optional 'coordinates' # element to state that it implements the given coordinate # system. I could not figure out how to do that in Relax-NG. ##attribute coordinates { text }, ** Several references to "At present...." should be removed ** Can this note be cleaned-up? Which Andreas? Is this the ** full list of reserved words? # 'Chromosome', 'Clone', 'Contig', 'Scaffold', etc. # This is from a restricted vocabulary maintained by Andreas attribute source { text }, From dalke at dalkescientific.com Tue Nov 7 19:59:04 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Wed, 8 Nov 2006 01:59:04 +0100 Subject: [DAS2] We need your input for DAS/2 spec freeze! In-Reply-To: <4550A37F.9080201@pantherinformatics.com> References: <4550A37F.9080201@pantherinformatics.com> Message-ID: Brian: > 1) Best practices for namespacing? What will I encounter in the wild? > or . It would be good to let people know what the > DAS2 specification writers preferred from the implementing community. > I know if doesn't really matter from the spec writer point of view > but, from and engineers point of view it really does. Anything I can > do to reduce the number of chars I'm using in and XML document to save > overhead we will try and do. I expect most will use the default namespace. From "an engineer's point of view" how does it make a difference? Every XML parser has to understand either case, plus the use of any other valid namespace prefix as an alternative. To reduce the number of characters, use the default namespace and use http request negotiation to compress the data stream. Terseness was not a design goal in DAS or DAS2. > 2) Error codes: You have sprinkled error codes into the document for > what an implementor will send back to the caller. 
Would be wonderful > to have all error codes put into one place or at least put into a > little table so we know what to implement when there is an error > condition. Those are the HTTP error codes. See the HTTP spec for them. Experience with DAS1 strongly suggests that another layer of response codes does not work. You end up needing to handle the HTTP response codes plus the higher layer codes -- and in DAS1 you could even return the error in the response XML. None of the DAS1 clients did anything special with the error codes, other than to report the error to the user. So we decided in a meeting some time ago to leave it at that. > Quick question on Features. Spec says, "Servers may respond with an > error if there are too many matching features to return." What error > shall I return here? Would be great to make all the errors explicit so > clients can either display the appropriate error message or recover > gracefully (latter being the most desirable outcome). Depends on the problem and how it's identified. Looking at RFC 2616 section 10 some of the likely ones are 500 - internal server error (eg, if your backend segfaults) 503 - if the server load is too high 504 - if you have a proxy forwarding to an internal server and the internal server takes too long 413 - Request Entity Too Large DAS clients should follow the HTTP spec. Nothing in DAS ended up needing an addition to the HTTP error codes. > 3) For C/C++/Java programmers - it would be great to have a list of > interfaces to code to that are business/institution agnostic - I'm > planning on doing this so maybe put me on the hook for those? Would > like some help with that though... Since I don't know what that means I can't help. > 4) One more plea for XML Schema! Can you guys spit out an XML schema? I can not. I don't understand XML Schema. When I look at it my brain gets fuzzy. 
None of the tools I regularly use understand XML schema, and my
experience with schema-based (DTD) parser generators is that they
break, badly, when there is a normally forwards-compatible change
to the format.

> Sorry to sound like a jerk but the RelaxNG website was last updated in
> Sept 2003!

Probably because it became ISO/IEC 19757 and is part of the ISO DSDL
effort. More recent work is under the new name; NVDL perhaps?

  """ISO DSDL was developed in part as a reaction against the
  PSVI/Type-Annotation approach adopted by XML Schemas."""
  http://www.stylusstudio.com/xmldev/200605/post90040.html

> I'll try and use Trang to spit out a schema but, again, this piece of
> software is old and crusty.

Aren't there any Relax-NG data binders so you don't need the
conversion step? Since you want JAXB, have you tried its
(experimental) Relax-NG support?

  http://java.sun.com/webservices/docs/1.5/jaxb/relaxng.html
  http://java.sun.com/developer/EJTechTips/2005/tt0524.html

http://www.oxygenxml.com/ says it can convert between grammars,

  http://www.oxygenxml.com/xml_schema_editor.html#converting_between_grammars

> The converter allows one to convert a DTD or Relax NG (full or compact
> syntax) grammar or a set of XML files to an equivalent XML Schema, DTD
> or Relax NG (full or compact syntax) grammar. Where perfect
> equivalence is not possible due to limitations of the target language
> will generate an approximation of the source grammar. The
> conversion functionality is available from Tools -> Trang Converter.

As you can see, it's using Trang, which you've said is crufty.
(Personally I would love it if 5 year old software of mine was
still going strong and didn't need any more TLC from me.)

There are also the following, but they also seem too dusty for you.
https://relax-ng.dev.java.net/ (linked from Wikipedia) lists the following

  isorelax-jaxp-bridge   ISO RELAX JARV API to JAXP 1.3 validation API bridge
  relaxer                XML Schema Compiler
  relaxerstudio          Model editor for Relaxer
  relaxngcc              Application-level XML parser generator / data-binding tool
  rngom                  RELAX NG Object Model / Parser

> Not sure what I'm going to get out of it. Why do I keep asking for
> this? Because I'm LAZY.

And so am I. Why would I want to do this?

> I want to use XML parsing libraries that bind XML Schema to Java
> objects and vice versa. I can also look at a schema and code a SAX
> document handler pretty quickly. Even a DTD would work here because
> it's super easy to convert DTD -> XML Schema. Again, can I entice with
> Beer/Wine? ;-)

The DAS2 schema is not hard. Really. Honestly. We're using a
full-blown, ISO standards based schema definition, and a subset of
that so parsers need only single token lookahead for disambiguation.

It should be as trivially easy to support RNG as to support a DTD,
with the added bonus that DTDs and namespaces don't mix.

Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Tue Nov 7 20:06:14 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Wed, 8 Nov 2006 02:06:14 +0100
Subject: [DAS2] Comments on segments.rnc
In-Reply-To: <4aa3a7e70611071252j7306e325k5f9c4028b0cfdc48@mail.gmail.com>
References: <4aa3a7e70611071252j7306e325k5f9c4028b0cfdc48@mail.gmail.com>
Message-ID:

Ed:
> Segments.rnc looks good. There is just one typo: This example should
> say "format=fasta", not "=fast"
>
> # http://localhost/das/sequence/Chromosome1?format=fast

Got it. Checked in. Thanks.
Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Tue Nov 7 20:36:50 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Wed, 8 Nov 2006 02:36:50 +0100
Subject: [DAS2] sources.rnc
In-Reply-To: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
Message-ID:

Ed:
> I have nothing substantial to change in sources.rnc. Just clean up the
> notes and comments:
>
> ** Can this note be improved?
>
> # NOTE: the segments capability has optional 'coordinates'
> # element to state that it implements the given coordinate
> # system. I could not figure out how to do that in Relax-NG.
> ##attribute coordinates { text },

  # NOTE: the segments capability has an optional 'coordinates'
  # element describing the supported coordinate system. Because
  # of the 'attribute *' a few lines above there is an ambiguity.
  # Any capability element may have a 'coordinates' attribute
  # so there's no need for an explicit schema declaration.
  #attribute coordinates { text }?,

> ** Several references to "At present...." should be removed

At present those have been removed.

> ** Can this note be cleaned-up? Which Andreas? Is this the
> ** full list of reserved words?

I listed Andreas earlier, regarding his use of "das1:types",
etc. in a capability 'type'. I've added those as reserved
fields. As to the one for

> # 'Chromosome', 'Clone', 'Contig', 'Scaffold', etc.

I've updated that to

  # For a full list of the "authority" and "source" values see
  # http://das.sanger.ac.uk/registry/help_coordsys.jsp

  # This refers to the "physical dimension" of the annotated data.
  # The following names are reserved: "Chromosome", "Clone",
  # "Contig", "Gene_ID", "NT_Contig", "Protein Sequence",
  # "Protein Structure", "Scaffold", "Volume Map".
  # The 'source' attribute corresponds to the coordinate
  # system 'type' in the DAS registry.
  attribute source { text },

  # The name of an authority/institution that defines the accession
  # codes of a coordinate system or that provides a gene-build.
  # See the DAS registry help for a full list of reserved names.
  # A partial list is: "BDGP", "EMBL", "Entrez", "KEGG", "MGI", "NCBI",
  # "PDBresnum", "SDG", "UniProt" and "ZFISH".
  attribute authority { text },

Changes made and das2_schemas.rnc has been checked in.

Andrew
dalke at dalkescientific.com

From enwired at gmail.com Tue Nov 7 20:40:22 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 17:40:22 -0800
Subject: [DAS2] sources.rnc
In-Reply-To:
References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
Message-ID: <4aa3a7e70611071740w5bdab602ya61944279a9b78a4@mail.gmail.com>

thanks

2006/11/7, Andrew Dalke :
>
> Changes made and das2_schemas.rnc has been checked in.
>

From ap3 at sanger.ac.uk Wed Nov 8 09:01:40 2006
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Wed, 8 Nov 2006 14:01:40 +0000
Subject: [DAS2] sources.rnc
In-Reply-To:
References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
Message-ID: <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk>

Hi Andrew!

> I listed Andreas earlier, regarding his use of "das1:types",
> etc. in a capability 'type'. I've added those as reserved
> fields.

that is good - we are using this sources command now also as a
back-port to describe the DAS/1 servers in the DAS registry.

It might be good to have a link to the DAS registry in general
somewhere in the sources.rnc. It now has its own domain at
http://www.dasregistry.org/

so the sources command is available via:
http://www.dasregistry.org/registry/das1/sources

can you also add

  das1:stylesheet
  das1:sequence
  das1:dna
  das1:entry_points
  das1:structure
  das1:alignment

which are supported by the registry?
das1:segments is not being used currently, so this could be removed.
> I've updated that to
>
> # For a full list of the "authority" and "source" values see
> # http://das.sanger.ac.uk/registry/help_coordsys.jsp

can you change that to
http://www.dasregistry.org/registry/help_coordsys.jsp
please?

Thanks.
Andreas

-----------------------------------------------------------------------
Andreas Prlic
Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA, UK
+44 (0) 1223 49 6891

From Gregg_Helt at affymetrix.com Wed Nov 8 14:54:14 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Wed, 8 Nov 2006 11:54:14 -0800
Subject: [DAS2] Ontology URIs (was RE: types.rnc)
Message-ID:

I'll talk to Suzi in her role as co-PI at NCBO (National Center for
Biomedical Ontology). We may be able to quickly work out a URI syntax
(even if implementation of what the URIs resolve to comes later).

gregg

> -----Original Message-----
> From: Andrew Dalke [mailto:dalke at dalkescientific.com]
> Sent: Tuesday, November 07, 2006 6:23 PM
> To: Ed
> Cc: Helt,Gregg
> Subject: Re: types.rnc
>
> Ed:
> > What bothers me is "still undecided". That doesn't belong in a
> > "frozen" spec. Though I have no idea what the correct text to put
> > here is.
>
> Take for example
>
> http://genome.cbs.dtu.dk:9000/das/secretomep/types
>
>
> category="protein sorting" description="Ab initio predictions of
> non-classical i.e. not signal peptide triggered protein secretion"
> evidence="IEA"
>
> ontology="http://www.geneontology.org/GO.evidence.shtml">35138
>
> It uses an ontology URI to describe which ontology scheme is
> used to describe the "evidence" value. In this case it means
> "Inferred from Electronic Annotation"
>
> There is no long-term/stable URL scheme for GO. Do we
> make something up? Do we say "use a URL" and leave it
> at that? I'll go for the latter as every reasonable
> scheme should end up as a URL.
>
> Except for those which are annotated from multiple ontologies.
> > > > Andrew > dalke at dalkescientific.com From dalke at dalkescientific.com Wed Nov 8 17:19:53 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Wed, 8 Nov 2006 23:19:53 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <25e4f92b7df100b5e41a829b9dd1737e@dalkescientific.com> On Nov 8, 2006, at 8:54 PM, Helt,Gregg wrote: > I'll talk to Suzi in her role as co-PI at NCBO (National Center for > Biomedical Ontolgoy). We may be able to quickly work out a URI syntax > (even if implementation of what the URIs resolve to comes later). Doesn't saying that it's a URI suffice? Surely we aren't going to restrict it to a single ontology specification? Eg, what about people working on structure feature ontologies? Andrew dalke at dalkescientific.com From cjm at fruitfly.org Wed Nov 8 17:11:09 2006 From: cjm at fruitfly.org (Chris Mungall) Date: Wed, 8 Nov 2006 17:11:09 -0500 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> There absolutely needs to be a stable URI scheme for referencing types defined in ontologies. The details of the scheme aren't clear yet. It will probably be http based (ie not LSID). Do you have specific requirements? Should the URI be a URL dereferenceable in any browser? Should it dereference to html or RDF or use content negotion to decide which? etc On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: > I'll talk to Suzi in her role as co-PI at NCBO (National Center for > Biomedical Ontolgoy). We may be able to quickly work out a URI > syntax (even if implementation of what the URIs resolve to comes > later). > > gregg > >> -----Original Message----- >> From: Andrew Dalke [mailto:dalke at dalkescientific.com] >> Sent: Tuesday, November 07, 2006 6:23 PM >> To: Ed >> Cc: Helt,Gregg >> Subject: Re: types.rnc >> >> Ed: >>> What bothers me is "still undecided". That doesn't belong in a >>> "frozen" spec. 
Though I have no idea what the correct text to put >>> here is. >> >> Take for example >> >> http://genome.cbs.dtu.dk:9000/das/secretomep/types >> >> >> > category="protein sorting" description="Ab initio >> predictions of >> non-classical i.e. not signal peptide triggered protein secretion" >> evidence="IEA" >> >> ontology="http://www.geneontology.org/GO.evidence.shtml">35138 >> >> It uses an ontology URI to describe which ontology scheme is >> used to describe the "evidence" value. In this case it means >> "Inferred from Electronic Annotation" >> >> There is no long-term/stable URL scheme for GO. Do we >> make something up? Do we say "use a URL" and leave it >> at that? I'll go for the latter as every reasonable >> scheme should end up as a URL. >> >> Except for those which are annotated from multiple ontologies. >> >> >> >> Andrew >> dalke at dalkescientific.com > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From enwired at gmail.com Wed Nov 8 17:51:30 2006 From: enwired at gmail.com (Ed) Date: Wed, 8 Nov 2006 14:51:30 -0800 Subject: [DAS2] Fwd: DAS2 unsubscribe notification In-Reply-To: References: Message-ID: <4aa3a7e70611081451lce1ca8dt3bb260ad802065ca@mail.gmail.com> Don't worry, I simply moved my subscription to enwired at gmail.com Ed ---------- Forwarded message ---------- From: mailman-bounces at lists.open-bio.org Date: 8 nov. 2006 14:48 Subject: DAS2 unsubscribe notification To: enwired at gmail.com ed_erwin at affymetrix.com has been removed from DAS2. 
From dalke at dalkescientific.com Wed Nov 8 19:40:57 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 01:40:57 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> References: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> Message-ID: Chris Mungall: > There absolutely needs to be a stable URI scheme for referencing types > defined in ontologies. The details of the scheme aren't clear yet. It > will probably be http based (ie not LSID). > > Do you have specific requirements? Should the URI be a URL > dereferenceable in any browser? Should it dereference to html or RDF > or use content negotion to decide which? etc Browsers can treat them as opaque strings if they don't understand the ontology. Only if they want to do inferencing or interesting visualizations do they need to know about the ontology. As such, for now we expect clients to have a hard-coded list of known ontology identifiers. They do not need to have a default resolver and we have no use case for what that response might look like. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Wed Nov 8 20:01:08 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 02:01:08 +0100 Subject: [DAS2] sources.rnc In-Reply-To: <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk> References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com> <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk> Message-ID: Andreas: > It might be good to have a link to the DAS - registry in general > somewhere in the sources.rnc > It now has its own domain at http://www.dasregistry.org/ Yes. I had forgotten its name last night. > so the sources command is available via: > http://www.dasregistry.org/registry/das1/sources Any chance of making that URL shorter? It seems long. And it no longer includes das1 sources. Also, I can't find anywhere on the HTML which points to that sources document. 
How does someone find it? Without doing like I did and look in the back mailing list archive. ;) > can you also add > das1:stylesheet > das1:sequence > das1:dna > das1:entry_points > das1:structure > das1:alignment > > which are supported by the registry? > das1:segments is not being used currently, so this could be removed. Ahh, had gotten the terminology mixed up. All added. > >> I've updated that to >> >> # For a full list of the "authority" and "source" values see >> # http://das.sanger.ac.uk/registry/help_coordsys.jsp > > can you change that to > http://www.dasregistry.org/registry/help_coordsys.jsp > please? Done. All the above checked in. Andrew dalke at dalkescientific.com From Steve_Chervitz at affymetrix.com Wed Nov 8 20:07:42 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Wed, 08 Nov 2006 17:07:42 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> Message-ID: Seems like we may need to freeze the spec in a state that is fairly non-committal w/r/t how ontology identifiers work. I propose to remove the parts that are still not nailed down, so that we don't engender the creation of mutually incompatible implementations (one of the problems with DAS/1 which DAS/2 is aiming at). The ontology attribute in the type element is currently documented as: # ontology identifier. The naming scheme is still undecided. # This will be a URI. attribute ontology { text }?, I think this is too vague. It's subject to lots of interpretation as to what it could point at and what it might resolve to. It could justifiably be used to identify any of these: - a specific term in an ontology - the ontology as a whole (e.g., homepage of GO) - evidence code (as in the example below) The so_accession attribute gets us most of what we want and should suffice for this freeze. In one fell swoop it identifies the ontology and a particular term within it, and it defers the issue of ontology URIs. 
Some SO things to consider: 1) Should so_accession be restricted to SOFA (only locatable feature types)? If so, call it sofa_accession. (maybe too limiting) 2) What about SO versioning? Maybe a 'so_version' attribute would make sense (so_version="SOFA 2.1"). SO term IDs are stable across releases, but sometimes terms become obsolete and are no longer listed. Steve > From: Chris Mungall > Date: Wed, 8 Nov 2006 17:11:09 -0500 > To: "Helt,Gregg" > Cc: DAS/2 > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) > > > There absolutely needs to be a stable URI scheme for referencing > types defined in ontologies. The details of the scheme aren't clear > yet. It will probably be http based (ie not LSID). > > Do you have specific requirements? Should the URI be a URL > dereferenceable in any browser? Should it dereference to html or RDF > or use content negotion to decide which? etc > > On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: > >> I'll talk to Suzi in her role as co-PI at NCBO (National Center for >> Biomedical Ontolgoy). We may be able to quickly work out a URI >> syntax (even if implementation of what the URIs resolve to comes >> later). >> >> gregg >> >>> -----Original Message----- >>> From: Andrew Dalke [mailto:dalke at dalkescientific.com] >>> Sent: Tuesday, November 07, 2006 6:23 PM >>> To: Ed >>> Cc: Helt,Gregg >>> Subject: Re: types.rnc >>> >>> Ed: >>>> What bothers me is "still undecided". That doesn't belong in a >>>> "frozen" spec. Though I have no idea what the correct text to put >>>> here is. >>> >>> Take for example >>> >>> http://genome.cbs.dtu.dk:9000/das/secretomep/types >>> >>> >>> >> category="protein sorting" description="Ab initio >>> predictions of >>> non-classical i.e. not signal peptide triggered protein secretion" >>> evidence="IEA" >>> >>> ontology="http://www.geneontology.org/GO.evidence.shtml">35138 >>> >>> It uses an ontology URI to describe which ontology scheme is >>> used to describe the "evidence" value. 
In this case it means >>> "Inferred from Electronic Annotation" >>> >>> There is no long-term/stable URL scheme for GO. Do we >>> make something up? Do we say "use a URL" and leave it >>> at that? I'll go for the latter as every reasonable >>> scheme should end up as a URL. >>> >>> Except for those which are annotated from multiple ontologies. >>> >>> >>> >>> Andrew >>> dalke at dalkescientific.com >> >> >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 >> > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From enwired at gmail.com Wed Nov 8 20:13:19 2006 From: enwired at gmail.com (Ed) Date: Wed, 8 Nov 2006 17:13:19 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> Message-ID: <4aa3a7e70611081713w475270b9i3d7b5c3072efc1f4@mail.gmail.com> so_accession sounds fine with me. 2006/11/8, Steve Chervitz : > > Seems like we may need to freeze the spec in a state that is fairly > non-committal w/r/t how ontology identifiers work. I propose to remove the > parts that are still not nailed down, so that we don't engender the > creation > of mutually incompatible implementations (one of the problems with DAS/1 > which DAS/2 is aiming at). > > The ontology attribute in the type element is currently documented as: > > # ontology identifier. The naming scheme is still undecided. > # This will be a URI. > attribute ontology { text }?, > > I think this is too vague. It's subject to lots of interpretation as to > what > it could point at and what it might resolve to. 
> It could justifiably be used to identify any of these:
>
> - a specific term in an ontology
> - the ontology as a whole (e.g., homepage of GO)
> - evidence code (as in the example below)
>
> The so_accession attribute gets us most of what we want and should suffice
> for this freeze. In one fell swoop it identifies the ontology and a
> particular term within it, and it defers the issue of ontology URIs.
>
> Some SO things to consider:
>
> 1) Should so_accession be restricted to SOFA (only locatable feature
> types)? If so, call it sofa_accession. (maybe too limiting)
>
> 2) What about SO versioning? Maybe a 'so_version' attribute would make
> sense (so_version="SOFA 2.1"). SO term IDs are stable across releases, but
> sometimes terms become obsolete and are no longer listed.
>
> Steve
>

From Steve_Chervitz at affymetrix.com Wed Nov 8 20:51:06 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Wed, 08 Nov 2006 17:51:06 -0800
Subject: [DAS2] Identifiers and URIs
Message-ID:

All DAS/2 elements are identified with a uri attribute, but their
documentation isn't consistent. So I'm recommending this be tightened
up a bit.

Some examples from das2_schemas.rnc:

  # URL pointing directly to the given TYPE
  uri,

  # URL pointing directly to the feature
  uri,

  # URL for the actual sequence data. It implements the DAS2
  # sequence request interface.
  uri,

  # A unique identifier for this coordinate.
  # This is an abstract identifier and might not be resolvable.
  # Two coordinates are the same if and only if they have the
  # same URI.
  uri,

  # unique URI for the SOURCE
  # Each source URI must be unique in sources list
  uri,

I propose that all such comments have a consistent wording. How about this:

  # A unique identifier for this [object-type]
  uri

If the entity is resolvable, then add:

  # This URL is resolvable to this [object-type] from a DAS/2 server.

Otherwise:

  # This is an abstract identifier and might not be resolvable.
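
As a rough illustration of the identity rule in the coordinate comment quoted above ("two coordinates are the same if and only if they have the same URI"), here is a minimal Python sketch. It assumes a client first resolves relative uri attributes against the document base and then compares the results as plain strings. The function names and the example.org URLs are hypothetical and not part of the DAS/2 spec.

```python
from urllib.parse import urljoin


def absolute_uri(base, uri):
    """Resolve a possibly relative uri attribute against the document base."""
    return urljoin(base, uri)


def same_object(base_a, uri_a, base_b, uri_b):
    """Compare two identifiers as opaque strings after resolution.

    No network request is made, so this also works for abstract
    identifiers that might not be resolvable.
    """
    return absolute_uri(base_a, uri_a) == absolute_uri(base_b, uri_b)


# Hypothetical base URL, used only for illustration.
BASE = "http://example.org/das2/genome/"
print(same_object(BASE, "coordinates/1",
                  BASE, "http://example.org/das2/genome/coordinates/1"))
```

Treating the identifier as a string keeps the comparison independent of whether the server can actually serve a document at that address.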
In the abbreviations section of the rnc, the uri itself is described as:

  # URI to an object defined by the DAS spec
  uri = attribute uri { text }

I'd change this to:

  # URL used to identify an object defined in a DAS/2 document.

There are some places in the HTML retrieval document that could be updated
to state 'uri' instead of 'id'.

In the sources section:

  "All identifiers and href attributes ... follow the XML Base ..."

Recommended change:

  "All uri and href attributes ... follow the XML Base ..."

Another sentence in sources that could use a s/id/uri/g:

  "Each SOURCE id and VERSION id is fetchable."

In the types section:

  "The 'uri' attribute is a URI ..."

Change to:

  "The 'uri' attribute is a URL ..."

Steve

From boconnor at ucla.edu Wed Nov 8 19:01:01 2006
From: boconnor at ucla.edu (Brian O'Connor)
Date: Wed, 08 Nov 2006 16:01:01 -0800
Subject: [DAS2] DAS/2 Server on biopackages.net
Message-ID: <45526FBD.1010201@ucla.edu>

Hi,

FYI: the DAS/2 server on biopackages.net will be down while I try to fix
the bug Ed reported on empty domain/source/versioned source documents.
I'll email the list when the server is available again, should be a
couple hours.

--Brian

From boconnor at ucla.edu Wed Nov 8 23:05:52 2006
From: boconnor at ucla.edu (Brian O'Connor)
Date: Wed, 08 Nov 2006 20:05:52 -0800
Subject: [DAS2] DAS/2 Server on Biopackages.net
Message-ID: <4552A920.8020001@ucla.edu>

Hi,

I brought the DAS/2 server back online and the bug with empty
domain/source/versioned source documents should now be fixed. See:
http://das.biopackages.net/das/genome. Also, I temporarily turned
server caching off so I can make sure the server is responding
correctly. I'll turn caching back on tomorrow after I've finished
debugging/checking the responses. The server is now using CVS HEAD.

Anyway, I've been looking over the output from our server and comparing
it to the HTML spec and the RNC schema doc from cvs.
I have a few comments/questions/bugs:

Potential bugs on das.biopackages.net:

* FIXED: segments response was missing xmlns
* FIXED: domain/source/versioned source docs are not populated correctly
* Coordinates is missing the source attribute (it's empty) which is
  required in the RNC
* The capability responses look like:
  >>>>
  <<<<
  Whereas the HTML spec and RNC doc use "features", "types", and
  "segments". Should this be changed on the biopackages.net server?

Questions about HTML spec/RNC doc:

* The segments element has a required attribute of "uri" in the RNC doc,
  is this correct? The biopackages.net server only has a uri for a given
  segment and the examples from the HTML are the same.
* It's a little confusing to have the "overview" and "detailed" sections
  separate in the HTML spec. I think it would make more sense to put the
  detailed section right after each overview or at least provide an
  anchor link at the end of each overview.
* Anchor links are broken throughout the html doc.
* the RNC mentions the type attribute under capability with:
  >>>>
  # A term describing the capability. The following are reserved
  # terms: segments, features, locks, writeback, das1:segments,
  # das1:types, das1:features
  attribute type { text },
  <<<<
  Types should be listed here too. Also, could this be defined with:
  attribute type { "segments" | "features" | "types" | "locks" | "..." }
  to make it more clear?

Please let Allen or me know if you have any problems using the
biopackages.net server.
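
The question about the capability 'type' attribute can be sketched from the client side. The following is a minimal Python sketch, assuming a client keeps the reserved terms from the RNC comment quoted above (plus "types", as suggested) in a set and treats any other value as unknown rather than as an error, since the schema declares the attribute as free text. The constant and function names are hypothetical.

```python
# Reserved capability terms copied from the RNC comment quoted above,
# plus "types", which the message suggests adding. This is an
# illustration, not an official or complete list.
RESERVED_CAPABILITY_TYPES = frozenset({
    "segments", "features", "types", "locks", "writeback",
    "das1:segments", "das1:types", "das1:features",
})


def is_reserved_capability(type_value):
    """Return True if type_value is one of the reserved terms above."""
    return type_value in RESERVED_CAPABILITY_TYPES


def unknown_capabilities(type_values):
    """Return the capability types not in the reserved set, in order."""
    return [t for t in type_values if not is_reserved_capability(t)]


print(unknown_capabilities(["features", "my:custom", "locks"]))
```

A closed enumeration in the schema would make the reserved names explicit, while an open text attribute with a documented list, as sketched here, leaves room for server-specific capabilities.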
--Brian From Gregg_Helt at affymetrix.com Thu Nov 9 13:07:32 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 9 Nov 2006 10:07:32 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) Message-ID: > -----Original Message----- > From: Chervitz, Steve > Sent: Wednesday, November 08, 2006 5:08 PM > To: Chris Mungall; Helt,Gregg > Cc: DAS/2 Discussion > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) > > Seems like we may need to freeze the spec in a state that is fairly > non-committal w/r/t how ontology identifiers work. I propose to remove the > parts that are still not nailed down, so that we don't engender the > creation > of mutually incompatible implementations (one of the problems with DAS/1 > which DAS/2 is aiming at). > > The ontology attribute in the type element is currently documented as: > > # ontology identifier. The naming scheme is still undecided. > # This will be a URI. > attribute ontology { text }?, > > I think this is too vague. It's subject to lots of interpretation as to > what > it could point at and what it might resolve to. It could justifiably be > used > to identify any of these: > > - a specific term in an ontology > - the ontology as a whole (e.g., homepage of GO) > - evidence code (as in the example below) > The so_accession attribute gets us most of what we want and should suffice > for this freeze. In one fell swoop it identifies the ontology and a > particular term within it, and it defers the issue of ontology URIs. > > Some SO things to consider: > > 1) Should so_accession be restricted to SOFA (only locatable feature > types)? > If so, call it sofa_accession. (maybe too limiting) > > 2) What about SO versioning? Maybe a 'so_version' attribute would make > sense > (so_version="SOFA 2.1"). SO term IDs are stable across releases, but > sometimes terms become obsolete and are no longer listed. 
> > Steve > The "ontology" attribute of the TYPE element is meant to be an identifier for a specific ontology term in the SO or SOFA. It (and its placeholder, "so_accession") is the only place where any part of DAS/2 depends directly on an ontology. GO terms (or any other ontology) can be used as properties of features -- the biopackages server does this for example. But it is done using a generic property mechanism that makes no mention of ontologies, and the DAS/2 spec does not mention or depend on any ontology other than SO. The reason there is both an "ontology" and "so_accession" attribute is that we didn't have an official SO URI syntax to refer to, so we created a temporary "so_accession" attribute to use until we had something to put in for "ontology". Since the ontology attribute can _only_ be from SO or SOFA, I agree with Steve that we could collapse "so_accession" and "ontology" down to one attribute and use a prefix shorthand for SO/SOFA terms, for example "SO:0000147". This has the nice property that the shorthand is in fact a legal absolute URI, and therefore unaffected by any "xml:base" attributes in the document. I'd instead prefer this URI to be a URL that points to a description at the biomedical ontology center. But specifying that the attribute is a URI allows both the shorthand and later a more official link. Allen Day and Brian O'Connor have implemented an ontology server with an HTTP API that fits in very well with DAS/2, where each ontology term has its own URI. This was discussed back on the DAS/2 mailing list in February and I think Chris had some concerns, here's the start of the thread: http://portal.open-bio.org/pipermail/das2/2006-February/000507.html . To avoid divergence I've been reluctant to devote more resources to this unless it was in collaboration with the ontology center. I don't think we really need SO versioning -- to be useful it places an extra burden on the ontology maintainers. 
And looking at the current SO, when a term becomes obsolete it is still included in the ontology, it just gets flagged with an "is_obsolete:true" tag. Andrew's comment below made me realize we may have another problem -- not annotation with multiple ontologies, but rather annotation with multiple terms from the SO. I had thought each feature type could be based on a single ontology term (maybe using SO composite terms: http://www.bioontology.org/wiki/index.php/SO:Composite_Terms), but looking at the latest SO I don't think we can make this assumption. Which argues that "so_accession" should be a child element of TYPE rather than an attribute, and one or more be allowed. Or am I reading the SO wrong? Lincoln? Chris? As far as Chris' question as to what exactly an ontology URL should dereference to, relative to the DAS/2 spec I don't think it matters too much. An XML response with some structured description like what Allen's server returns would be nice, but I could see the benefits of HTML as the default too. Did I mention I'm a fan of content negotiation? In most of the DAS/2 HTTP GET requests, we have optional "format=" query parameter arguments to allow alternative format requests even in situations where HTTP content negotiation is not straightforward. Gregg > > From: Chris Mungall > > Date: Wed, 8 Nov 2006 17:11:09 -0500 > > To: "Helt,Gregg" > > Cc: DAS/2 > > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) > > > > > > There absolutely needs to be a stable URI scheme for referencing > > types defined in ontologies. The details of the scheme aren't clear > > yet. It will probably be http based (ie not LSID). > > > > Do you have specific requirements? Should the URI be a URL > > dereferenceable in any browser? Should it dereference to html or RDF > > or use content negotion to decide which? etc > > > > On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: > > > >> I'll talk to Suzi in her role as co-PI at NCBO (National Center for > >> Biomedical Ontolgoy). 
We may be able to quickly work out a URI > >> syntax (even if implementation of what the URIs resolve to comes > >> later). > >> > >> gregg > >> > >>> -----Original Message----- > >>> From: Andrew Dalke [mailto:dalke at dalkescientific.com] > >>> Sent: Tuesday, November 07, 2006 6:23 PM > >>> To: Ed > >>> Cc: Helt,Gregg > >>> Subject: Re: types.rnc > >>> > >>> Ed: > >>>> What bothers me is "still undecided". That doesn't belong in a > >>>> "frozen" spec. Though I have no idea what the correct text to put > >>>> here is. > >>> > >>> Take for example > >>> > >>> http://genome.cbs.dtu.dk:9000/das/secretomep/types > >>> > >>> > >>> >>> category="protein sorting" description="Ab initio > >>> predictions of > >>> non-classical i.e. not signal peptide triggered protein secretion" > >>> evidence="IEA" > >>> > >>> ontology="http://www.geneontology.org/GO.evidence.shtml">35138 > >>> > >>> It uses an ontology URI to describe which ontology scheme is > >>> used to describe the "evidence" value. In this case it means > >>> "Inferred from Electronic Annotation" > >>> > >>> There is no long-term/stable URL scheme for GO. Do we > >>> make something up? Do we say "use a URL" and leave it > >>> at that? I'll go for the latter as every reasonable > >>> scheme should end up as a URL. > >>> > >>> Except for those which are annotated from multiple ontologies. 
> >>> > >>> > >>> > >>> Andrew > >>> dalke at dalkescientific.com > >> > >> > >> _______________________________________________ > >> DAS2 mailing list > >> DAS2 at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/das2 > >> > > > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 From dalke at dalkescientific.com Thu Nov 9 17:30:51 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 23:30:51 +0100 Subject: [DAS2] TYPE[@source] -> TYPE[@method] In-Reply-To: References: Message-ID: <63d2531f67ae3890c9fa4aacf7bd0dff@dalkescientific.com> [dup to Gregg; had forgotten change the reply to all of das2] On Nov 6, 2006, at 4:58 PM, Helt,Gregg wrote: > I agree that multiple uses of "source" makes it confusing, and that for > types "method" is a reasonable alternative. On a related note, do we > really need both "title" and "source/method" attributes for types? The "method" attribute is the method used to generate features of the given type. Eg, "Genscan 1.23". The title is a human readable string about the type. I've been thinking of it as Server A: Type1 = "high confidence gene predictions" from "Genscan 1.23" so_accession="0000704" Type2 = "low confidence gene predictions" from "Genscan 1.23" so_accession="0000704" Server B: Type3 = "high confidence gene predictions" from "HMMGene 1.1" so_accession="0000704" Type4 = "low confidence gene predictions" from "HMMGene 1.1" so_accession="0000704" where the types are used to get different styles; perhaps different colors. The example in the RNC was ambiguous on this. It used "binding site" as the sole example. I've added "High confidence Genscan predictions" as a title and changed the genscan method example from "genscan" to "Genscan 1.23" BTW, as a client implementor, how do you lay these on a track? I presume information about track sharing goes in the stylesheet? 
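As a rough illustration of the Server A / Server B example above, the four types can be written down as data. This is only a sketch: the `TypeRecord` class and the `group_by_so_accession` helper are hypothetical names, not part of the DAS/2 spec, and the field values are simply the ones from the example. It shows that all four types share one SO term and differ only in title and method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeRecord:
    """Hypothetical client-side view of one TYPE record (not a DAS/2 API)."""
    server: str
    name: str
    title: str          # human-readable string about the type
    method: str         # method used to generate features of this type
    so_accession: str   # SO accession without the "SO:" prefix


TYPES = [
    TypeRecord("A", "Type1", "high confidence gene predictions", "Genscan 1.23", "0000704"),
    TypeRecord("A", "Type2", "low confidence gene predictions", "Genscan 1.23", "0000704"),
    TypeRecord("B", "Type3", "high confidence gene predictions", "HMMGene 1.1", "0000704"),
    TypeRecord("B", "Type4", "low confidence gene predictions", "HMMGene 1.1", "0000704"),
]


def group_by_so_accession(types):
    """Group type records by SO accession, keeping the accession as an opaque string."""
    groups = defaultdict(list)
    for record in types:
        groups[record.so_accession].append(record)
    return dict(groups)


if __name__ == "__main__":
    for accession, members in group_by_so_accession(TYPES).items():
        for record in members:
            print(accession, record.server, record.name, record.method, "-", record.title)
```

Grouping by accession alone would put all four types together, which is why the title and method are still needed to tell them apart for styling.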
Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 17:55:49 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 23:55:49 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: Steve: > The ontology attribute in the type element is currently documented as: > > # ontology identifier. The naming scheme is still undecided. > # This will be a URI. > attribute ontology { text }?, > > I think this is too vague. It's subject to lots of interpretation as > to what > it could point at and what it might resolve to. It could justifiably > be used > to identify any of these: Indeed it could. There are other parts of DAS which are URI identifiable but which are not guaranteed to be resolvable. Eg, the individual features from a feature search don't need to be resolvable. It could be dealt with as an opaque string. Excepting how it interacts with relative url absolutizing. > The so_accession attribute gets us most of what we want and should > suffice > for this freeze. In one fell swoop it identifies the ontology and a > particular term within it, and it defers the issue of ontology URIs. What about leaving it there as "this is reserved for future use"? > Some SO things to consider: > > 1) Should so_accession be restricted to SOFA (only locatable feature > types)? > If so, call it sofa_accession. (maybe too limiting) I have no experience with this to guide me. I'm a structure guy. ;) > 2) What about SO versioning? Maybe a 'so_version' attribute would make > sense > (so_version="SOFA 2.1"). SO term IDs are stable across releases, but > sometimes terms become obsolete and are no longer listed. No. That does not work, for two reasons. You say the IDs are stable across releases. I assume that includes that obsolete ones are not reused. If the client knows how to interpret "2.1" to get information about an old identifier then it knows how to find the identifier in a list. 
Other reason - you're reinventing the semantics described by LSIDs. Why not just create an lsid naming scheme like urn:lsid:biodas.org:sofa-2.1:0000123 and use the URI? Okay, there's a third. Suppose the client knows nothing about the so term, even with the version information. (Eg, it's a new version, new term, and the client hasn't been updated; or there's a bug in the server code causing all numbers to be twice as large.) What does the client do? I assert that it will treat unknown or missing ontology terms as being identical to a direct descendant of the root node of SO. Hence obsolete, new and erroneous terms are treated the same, so having the extra version field doesn't help the client. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 18:17:04 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 00:17:04 +0100 Subject: [DAS2] Adding an optional "searchable" attribute to element In-Reply-To: References: Message-ID: Gregg: > In the last DAS/2 teleconference I brought up again the idea of an > optional "searchable" or "filter" attribute for the elements > returned from a types query -- if present and "false", then that type > should not be used in a feature query filter. Too tired to work on this. Tomorrow. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 18:32:59 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 00:32:59 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: Gregg: > The reason there is both an "ontology" and "so_accession" attribute is > that we didn't have an official SO URI syntax to refer to, Didn't -- and still don't, right? > I agree with Steve that we could collapse "so_accession" and > "ontology" down to one attribute and use a prefix shorthand for SO/SOFA > terms, for example "SO:0000147". We could.
When making the RNC I left the "SO:" prefix out deliberately in the number, leaving "0000147". The reason was to be very insistent that it was SO and only SO that could go there. Else people would start adding other terms, because after all the format is obviously "namespace" + "version number". > This has the nice property that the > shorthand is in fact a legal absolute URI, and therefore unaffected by > any "xml:base" attributes in the document. I'd instead prefer this URI > to be a URL that points to a description at the biomedical ontology > center. But specifying that the attribute is a URI allows both the > shorthand and later a more official link. But if we go for systems with no default resolver, why not use LSIDs? urn:lsid:biodas.org:go:0000147 > Andrew's comment below made me realize we may have another problem -- > not annotation with multiple ontologies, but rather annotation with > multiple terms from the SO. The type record (like most other records) has a slot at the end for arbitrary non-das2-namespaced XML elements. When this gets to be a problem let people experiment with various ways to do it. Eg, No reason to solve it now, as we've no data which needs this. Andrew dalke at dalkescientific.com From cjm at fruitfly.org Thu Nov 9 18:35:46 2006 From: cjm at fruitfly.org (Chris Mungall) Date: Thu, 9 Nov 2006 15:35:46 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <60DAD5CB-763E-4F65-85FA-3FC2E33B5A9C@fruitfly.org> On Nov 9, 2006, at 10:07 AM, Helt,Gregg wrote: > > >> -----Original Message----- >> From: Chervitz, Steve >> Sent: Wednesday, November 08, 2006 5:08 PM >> To: Chris Mungall; Helt,Gregg >> Cc: DAS/2 Discussion >> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) >> >> Seems like we may need to freeze the spec in a state that is fairly >> non-committal w/r/t how ontology identifiers work.
I propose to >> remove > the >> parts that are still not nailed down, so that we don't engender the >> creation >> of mutually incompatible implementations (one of the problems with > DAS/1 >> which DAS/2 is aiming at). >> >> The ontology attribute in the type element is currently documented >> as: >> >> # ontology identifier. The naming scheme is still undecided. >> # This will be a URI. >> attribute ontology { text }?, >> >> I think this is too vague. It's subject to lots of interpretation as > to >> what >> it could point at and what it might resolve to. It could justifiably > be >> used >> to identify any of these: >> >> - a specific term in an ontology >> - the ontology as a whole (e.g., homepage of GO) >> - evidence code (as in the example below) >> The so_accession attribute gets us most of what we want and should > suffice >> for this freeze. In one fell swoop it identifies the ontology and a >> particular term within it, and it defers the issue of ontology URIs. >> >> Some SO things to consider: >> >> 1) Should so_accession be restricted to SOFA (only locatable feature >> types)? >> If so, call it sofa_accession. (maybe too limiting) >> >> 2) What about SO versioning? Maybe a 'so_version' attribute would >> make >> sense >> (so_version="SOFA 2.1"). SO term IDs are stable across releases, but >> sometimes terms become obsolete and are no longer listed. >> >> Steve >> > > The "ontology" attribute of the TYPE element is meant to be an > identifier for a specific ontology term in the SO or SOFA. It (and > its > placeholder, "so_accession") is the only place where any part of DAS/2 > depends directly on an ontology. GO terms (or any other ontology) can > be used as properties of features -- the biopackages server does this > for example. But it is done using a generic property mechanism that > makes no mention of ontologies, and the DAS/2 spec does not mention or > depend on any ontology other than SO. 
> > The reason there is both an "ontology" and "so_accession" attribute is > that we didn't have an official SO URI syntax to refer to, so we > created > a temporary "so_accession" attribute to use until we had something to > put in for "ontology". Since the ontology attribute can _only_ be > from > SO or SOFA, I agree with Steve that we could collapse > "so_accession" and > "ontology" down to one attribute and use a prefix shorthand for SO/ > SOFA > terms, for example "SO:0000147". This has the nice property that the > shorthand is in fact a legal absolute URI, and therefore unaffected by > any "xml:base" attributes in the document. I'd instead prefer this > URI > to be a URL that points to a description at the biomedical ontology > center. But specifying that the attribute is a URI allows both the > shorthand and later a more official link. > > Allen Day and Brian O'Connor have implemented an ontology server > with an > HTTP API that fits in very well with DAS/2, where each ontology > term has > its own URI. This was discussed back on the DAS/2 mailing list in > February and I think Chris had some concerns, here's the start of the > thread: > http://portal.open-bio.org/pipermail/das2/2006-February/000507.html . > To avoid divergence I've been reluctant to devote more resources to > this > unless it was in collaboration with the ontology center. well I wouldn't like to hold anything up! By december it will be possible to browse all OBO ontologies, but any plans for providing stables URIs and programmatic access will probably wait til next year. If you have an ontology server ready, go with it. It's still unclear what the best approach is for serving up ontologies is, though the future is looking decidedly rdf/owl/sparqly. > I don't think we really need SO versioning -- to be useful it > places an > extra burden on the ontology maintainers. 
And looking at the current > SO, when a term becomes obsolete it is still included in the ontology, > it just gets flagged with an "is_obsolete:true" tag. I agree. This is policy for all good OBO ontologies; any change in the substance of a definition results in a new ID. > Andrew's comment below made me realize we may have another problem -- > not annotation with multiple ontologies, but rather annotation with > multiple terms from the SO. I had thought each feature type could be > based on a single ontology term (maybe using SO composite terms: > http://www.bioontology.org/wiki/index.php/SO:Composite_Terms), but > looking at the latest SO I don't think we can make this assumption. > Which argues that "so_accession" should be a child element of TYPE > rather than an attribute, and one or more be allowed. Or am I reading > the SO wrong? Lincoln? Chris? Any DAS feature F should be associated with a single SO:located_sequence_feature T(I would submit that the formal interpretation of this be: all actual genomic entities that instantiate the pattern represented by F should instantiate the pattern represented by T) However, a feature can be associated with multiple properties - these will be subtypes of SO:atribute. > As far as Chris' question as to what exactly an ontology URL should > dereference to, relative to the DAS/2 spec I don't think it matters > too > much. An XML response with some structured description like what > Allen's server returns would be nice, but I could see the benefits of > HTML as the default too. There is a discussion on public-semweb-lifesci on the relative merits of content negaotiation with URIs right now.. > Did I mention I'm a fan of content > negotiation? In most of the DAS/2 HTTP GET requests, we have optional > "format=" query parameter arguments to allow alternative format > requests > even in situations where HTTP content negotiation is not > straightforward. 
That's fine, on the understanding that appending "format=" creates a different URI. > Gregg > >>> From: Chris Mungall >>> Date: Wed, 8 Nov 2006 17:11:09 -0500 >>> To: "Helt,Gregg" >>> Cc: DAS/2 >>> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) >>> >>> >>> There absolutely needs to be a stable URI scheme for referencing >>> types defined in ontologies. The details of the scheme aren't clear >>> yet. It will probably be http based (ie not LSID). >>> >>> Do you have specific requirements? Should the URI be a URL >>> dereferenceable in any browser? Should it dereference to html or RDF >>> or use content negotiation to decide which? etc >>> >>> On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: >>> >>>> I'll talk to Suzi in her role as co-PI at NCBO (National Center for >>>> Biomedical Ontology). We may be able to quickly work out a URI >>>> syntax (even if implementation of what the URIs resolve to comes >>>> later). >>>> >>>> gregg >>>> >>>>> -----Original Message----- >>>>> From: Andrew Dalke [mailto:dalke at dalkescientific.com] >>>>> Sent: Tuesday, November 07, 2006 6:23 PM >>>>> To: Ed >>>>> Cc: Helt,Gregg >>>>> Subject: Re: types.rnc >>>>> >>>>> Ed: >>>>>> What bothers me is "still undecided". That doesn't belong in a >>>>>> "frozen" spec. Though I have no idea what the correct text to > put >>>>>> here is. >>>>> >>>>> Take for example >>>>> >>>>> http://genome.cbs.dtu.dk:9000/das/secretomep/types >>>>> >>>>> >>>>> >>>> category="protein sorting" description="Ab initio >>>>> predictions of >>>>> non-classical i.e. not signal peptide triggered protein secretion" >>>>> evidence="IEA" >>>>> >>>>> > ontology="http://www.geneontology.org/GO.evidence.shtml">35138 >>>>> >>>>> It uses an ontology URI to describe which ontology scheme is >>>>> used to describe the "evidence" value. In this case it means >>>>> "Inferred from Electronic Annotation" >>>>> >>>>> There is no long-term/stable URL scheme for GO. Do we >>>>> make something up?
Do we say "use a URL" and leave it >>>>> at that? I'll go for the latter as every reasonable >>>>> scheme should end up as a URL. >>>>> >>>>> Except for those which are annotated from multiple ontologies. >>>>> >>>>> >>>>> >>>>> Andrew >>>>> dalke at dalkescientific.com >>>> >>>> >>>> _______________________________________________ >>>> DAS2 mailing list >>>> DAS2 at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/das2 >>>> >>> >>> _______________________________________________ >>> DAS2 mailing list >>> DAS2 at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/das2 > > From dalke at dalkescientific.com Thu Nov 9 18:39:06 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 00:39:06 +0100 Subject: [DAS2] Identifiers and URIs In-Reply-To: References: Message-ID: Steve: > All DAS/2 elements are identified with a uri attribute, but their > documentation isn't consistent. So I'm recommending this be tightened up > a > bit. I did. I said that "URI" refers to DAS2 objects and to external resources (like ontology) treated mostly as an identifier. "URL" and "href" are used for things viewed in more generic browsers. > I propose that all such comments have a consistent wording. How about > this: > > # A unique identifier for this [object-type] > uri Is the spec really so advanced that it's time to do proofing at this level? There are sections in the HTML spec labeled "XXX" because I'm hoping for feedback from people concerning the questions listed therein. In talking with Gregg we finished up one of the biggest ones: the XID. We decided to steal from HTML4's "link" element. I've cleaned up the wording and filled in some more details. All checked in. It's 12:40. g'night.
Andrew dalke at dalkescientific.com From Gregg_Helt at affymetrix.com Thu Nov 9 18:56:42 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 9 Nov 2006 15:56:42 -0800 Subject: [DAS2] DAS/2 teleconference on Monday Message-ID: Just wanted to remind everyone that we're having an extra DAS/2 teleconference next Monday, at 9:30 AM PST. The agenda is to review this week's spec finalization for release of a frozen DAS/2.0 protocol. Dialin (US): 800-531-3250 Dialin (Intl): 303-928-2693 Conference ID: 2879055 Passcode: 1365 From cjm at fruitfly.org Thu Nov 9 18:57:04 2006 From: cjm at fruitfly.org (Chris Mungall) Date: Thu, 9 Nov 2006 15:57:04 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <1FF932B3-EEBD-47C0-8C55-7F9DB1FD937A@fruitfly.org> On Nov 9, 2006, at 3:32 PM, Andrew Dalke wrote: > Gregg: >> The reason there is both an "ontology" and "so_accession" >> attribute is >> that we didn't have an official SO URI syntax to refer to, > > Didn't -- and still don't, right? is this something we should be doing? Karen, is this why you wanted to serve up per-term xml off of sequenceontology.org? It may make more sense to serve up RDF/OWL here I would really like to have a scheme that leaves the OBO ID unviolated; eg http://www.sequenceontology.org/owl#SO:0000001 Unfortunately jena has fatal problems with numbers immediately following the ':'. And Jena is one of the most commonly used RDF tools. Sigh This would work: http://www.sequenceontology.org/owl/SO#SO_0000001 but unfortunately involves string hacking on the ID >> I agree with Steve that we could collapse "so_accession" and >> "ontology" down to one attribute and use a prefix shorthand for SO/ >> SOFA >> terms, for example "SO:0000147". > > We could. When making the RNC I left the "SO:" prefix out > deliberately in the number, leaving "0000147". The reason was to > be very insistent that it was SO and only SO that could to there. This seems overly defensive. 
It would seem cleaner to use the same ID scheme throughout > Else people would start adding other terms, because after all the > format is obviously "namespace" + "version number". > >> This has the nice property that the >> shorthand is in fact a legal absolute URI, and therefore >> unaffected by >> any "xml:base" attributes in the document. I'd instead prefer >> this URI >> to be a URL that points to a description at the biomedical ontology >> center. But specifying that the attribute is a URI allows both the >> shorthand and later a more official link. > > But if we go for systems with no default resolver, why not use > LSIDs? > > url:lsid:biodas.org:go:0000147 LSIDs uniquely identify sequences of bytes. The sequence of bytes in the record GO:00000147 may change although the universal it refers to does not >> Andrew's comment below made me realize we may have another problem -- >> not annotation with multiple ontologies, but rather annotation with >> multiple terms from the SO. > > The type record (like most other records) have a slot at the end > for arbitrary non-das2-namespaced XML elements. When this gets to > be a problem let people experiment with various ways to do it. > > Eg, > > > > No reason to solve it now, as we've no data which needs > this. 
> > Andrew > dalke at dalkescientific.com > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From enwired at gmail.com Thu Nov 9 19:43:20 2006 From: enwired at gmail.com (Ed) Date: Thu, 9 Nov 2006 16:43:20 -0800 Subject: [DAS2] DAS/2 teleconference on Monday In-Reply-To: <4aa3a7e70611091605o25b5555dsa90d6f6b1a8b8f61@mail.gmail.com> References: <4aa3a7e70611091605o25b5555dsa90d6f6b1a8b8f61@mail.gmail.com> Message-ID: <4aa3a7e70611091643j145f409elf50042b9218dfea1@mail.gmail.com> Sorry, reverse that: From France: 08 00 907 839 From UK: 08 00 40 49 467 2006/11/9, Ed : > > Just FYI: There are international toll-free numbers for some countries: > > From UK: 08 00 907 839 > From France: 08 00 40 49 467 > > Some other countries are covered, too, but those are the only 2 I have on > hand. > > > 2006/11/9, Helt,Gregg : > > > > Just wanted to remind everyone that we're having an extra DAS/2 > > teleconference next Monday, at 9:30 AM PST. The agenda is to review > > this week's spec finalization for release of a frozen DAS/2.0 protocol. > > > > Dialin (US): 800-531-3250 > > Dialin (Intl): 303-928-2693 > > Conference ID: 2879055 > > Passcode: 1365 > > > > > > > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > > > From enwired at gmail.com Thu Nov 9 19:05:40 2006 From: enwired at gmail.com (Ed) Date: Thu, 9 Nov 2006 16:05:40 -0800 Subject: [DAS2] DAS/2 teleconference on Monday In-Reply-To: References: Message-ID: <4aa3a7e70611091605o25b5555dsa90d6f6b1a8b8f61@mail.gmail.com> Just FYI: There are international toll-free numbers for some countries: From UK: 08 00 907 839 From France: 08 00 40 49 467 Some other countries are covered, too, but those are the only 2 I have on hand.
2006/11/9, Helt,Gregg : > > Just wanted to remind everyone that we're having an extra DAS/2 > teleconference next Monday, at 9:30 AM PST. The agenda is to review > this week's spec finalization for release of a frozen DAS/2.0 protocol. > > Dialin (US): 800-531-3250 > Dialin (Intl): 303-928-2693 > Conference ID: 2879055 > Passcode: 1365 > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From dalke at dalkescientific.com Fri Nov 10 02:04:31 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 08:04:31 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: <1FF932B3-EEBD-47C0-8C55-7F9DB1FD937A@fruitfly.org> References: <1FF932B3-EEBD-47C0-8C55-7F9DB1FD937A@fruitfly.org> Message-ID: Chris: > Andrew >> We could. When making the RNC I left the "SO:" prefix out >> deliberately in the number, leaving "0000147". The reason was to >> be very insistent that it was SO and only SO that could to there. > > This seems overly defensive. It would seem cleaner to use the same ID > scheme throughout What I wrote was > The sequence ontology (SO) is widely used but its identifiers are not > URIs. The 'so_accession' attribute contains the SO accession number > without the leading "SO:", as in "0000316". Note that the leading > zeros are important. This field should be interpreted as an opaque > string. (XXX should this be "0000316" or "SO:0000316"? I prefer the > latter.) The "XXX" in the spec mark places where I'm hoping for feedback. So far I haven't received any. Chris: > Andrew: >> But if we go for systems with no default resolver, why not use >> LSIDs? >> >> url:lsid:biodas.org:go:0000147 > > LSIDs uniquely identify sequences of bytes. The sequence of bytes in > the record GO:00000147 may change although the universal it refers to > does not LSIDs have concrete objects and abstract objects. 
The abstract object, if resolved, only returns metadata. This would be an LSID for an abstract object. If we have a so_version and so_accession, etc. as attributes then we could identically have an LSID referencing an abstract object. It makes no difference to clients and for the spec it promotes the push towards URIs. Andrew dalke at dalkescientific.com From Gregg_Helt at affymetrix.com Fri Nov 10 10:28:07 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Fri, 10 Nov 2006 07:28:07 -0800 Subject: [DAS2] Progress on freezing the DAS/2.0 genome retrieval specification Message-ID: Thanks everyone for reviewing the DAS/2 genome retrieval documents this week and posting your comments. Andrew has incorporated this feedback into the latest das2_schemas.rnc document (das/das2/das2_schemas.rnc). It looks good to me; there is just one optional attribute to add that Andrew and I discussed yesterday, and then I think the schema can be frozen today. The genome retrieval HTML doc (das/das2/das2_get.html) still needs some editing before it can be frozen. Andrew and I will both be editing the doc this weekend. Anyone else with write access to the biodas CVS repository is welcome to help with the editing. If you plan to edit it in the next three days please let me know which sections so I can focus on other sections. Thanks again everyone, talk to you on Monday. Gregg From Gregg_Helt at affymetrix.com Fri Nov 10 10:40:09 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Fri, 10 Nov 2006 07:40:09 -0800 Subject: [DAS2] Progress on XML Schema version of DAS/2.0 Message-ID: I'm working on an XML Schema version of the DAS/2.0 schema. I used the Trang schema tool to automatically convert the RelaxNG schema to an XSD doc. This has provided a good skeleton to start with, but it looks like there are a number of issues I'll have to fix by hand. There are many places where the XSD specifies ordered sequences of elements where there shouldn't be any ordering restrictions.
Also the way Trang translated the idea of non-DAS extensions is messy. And a lot of comments got lost in translation. None of these issues look too problematic. I expect more problems will come up, but I am making progress. Gregg From Gregg_Helt at affymetrix.com Sat Nov 11 06:11:31 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Sat, 11 Nov 2006 03:11:31 -0800 Subject: [DAS2] Progress on XML Schema version of DAS/2.0 Message-ID: I've checked in an XML-Schema translation of the das2_schemas.rnc doc in the biodas CVS repository as das2_schemas.xsd (http://cvs.biodas.org/cgi-bin/viewcvs/viewcvs.cgi/das/das2/das2_schemas .xsd?rev=HEAD&cvsroot=biodas). I still have some concerns about how it will handle non-DAS extensions, and I also need to add back in some of the comments from the rnc doc. But I have tested that I can generate Java bindings from the XSD using Apache XMLBeans. To get that to work I also had to remove use of "xml:id" for now, it was causing XMLBeans to throw errors. Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Helt,Gregg > Sent: Friday, November 10, 2006 7:40 AM > To: DAS/2 Discussion > Subject: [DAS2] Progress on XML Schema version of DAS/2.0 > > I'm working on an XML Schema version of the DAS/2.0 schema. I used the > Trang schema tool to automatically convert the RelaxNG schema to an XSD > doc. This has provided a good skeleton to start with, but it looks like > there are a number of issues I'll have to fix by hand. There are many > places where the XSD specifies ordered sequences of elements where there > shouldn't be any ordering restrictions. Also the way Trang translated > the idea of non-DAS extensions is messy. And a lot of comments got lost > in translation. None of these issues look too problematic. I expect > more problems will come up, but I am making progress. 
> > Gregg > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From aloraine at uab.edu Sun Nov 12 09:13:14 2006 From: aloraine at uab.edu (Ann Loraine) Date: Sun, 12 Nov 2006 08:13:14 -0600 Subject: [DAS2] Arabidopsis DAS-es? Message-ID: <83722dde0611120613h4b08cf88r3764ed83d4112602@mail.gmail.com> Dear all, I heard that there is at least one working & supported DAS for Arabidopsis at EBI or NASC. I've looked all over the NASC site and although they mention DAS, they don't give the URL for a DAS server, so far as I can tell. Same for EBI & Ensembl, but of course it's very possible I missed it. Maybe some-one on the list from EBI could fill me in on the details? I would need the URL, obviously :-) Yours, Ann -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From aloraine at gmail.com Sun Nov 12 10:24:22 2006 From: aloraine at gmail.com (Ann Loraine) Date: Sun, 12 Nov 2006 09:24:22 -0600 Subject: [DAS2] FYI: Arabidopsis DAS at PlantGDB Message-ID: <83722dde0611120724r641e3dak18a8cc8119baf948@mail.gmail.com> Hi, This is an update on Arabidopsis DAS sites: Iowa State hosts a plant DAS site, but so far I haven't been able to get IGB to talk to it. It fails with this error: [java] DAS request1: http://www.plantgdb.org/cgi-bin/das/ATGDB151_das/features?segment=4:15849536,15854959;type=EST_alignment%3AGeneSeqer_cognate;type=cDNA_alignment%3AGeneSeqer_cognate [java] Attempting to load data from URL: http://www.plantgdb.org/cgi-bin/das/ATGDB151_das/features?segment=4:15849536,15854959;type=EST_alignment%3AGeneSeqer_cognate;type=cDNA_alignment%3AGeneSeqer_cognate [java] [Fatal Error] :14:83: The reference to entity "dbid" must end with the ';' delimiter. [java] Problem parsing DAS XML data: The reference to entity "dbid" must end with the ';' delimiter. 
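For reference, this class of parser error can be reproduced in a few lines. The following is a minimal sketch, assuming a hypothetical link element and URL (the real PlantGDB markup is not shown here); it only demonstrates that a bare "&" inside an attribute value is not well-formed XML, and that escaping it as "&amp;" fixes the parse.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical link target; the "&dbid=..." part mimics the kind of
# query string that produces the 'reference to entity "dbid"' error.
raw_href = "http://example.org/view?id=23308168&dbid=1"

# Bare "&" in the attribute value: not well-formed XML.
broken = '<LINK href="%s">23308168</LINK>' % raw_href

# Same element with "&" escaped as "&amp;" before serialization.
fixed = '<LINK href="%s">23308168</LINK>' % escape(raw_href)


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("broken parses:", is_well_formed(broken))
    print("fixed parses:", is_well_formed(fixed))
    # After parsing, the client sees the original URL again.
    print("href:", ET.fromstring(fixed).get("href"))
```

A server would normally avoid the problem by building responses with an XML library rather than string concatenation, since the library does this escaping itself.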
The problem appears to be lines such as: 23308168 which include "&" symbols that don't signal the start of an entity. I have written to PlantGDB to ask about this...I'll keep you posted! It might be useful to add a few links to trusted on-line XML validators to the upcoming re-done bioDAS Web site to make it easier for DAS providers to check their XML well-formedness. Here's one I just now used: http://validator.aborla.net/ Many people who implement DAS services are likely to be beginning programmers...or programmers like me who don't do it full-time & can use refreshers :-) Yours, Ann PS If this doesn't get posted to the list, could some-one post it for me? -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From ap3 at sanger.ac.uk Mon Nov 13 05:27:52 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 13 Nov 2006 10:27:52 +0000 Subject: [DAS2] sources.rnc In-Reply-To: References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com> <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk> Message-ID: <1ad41038300a35c6dda74dde4f4e2951@sanger.ac.uk> Hi Andrew, >> so the sources command is available via: >> http://www.dasregistry.org/registry/das1/sources > > Any chance of making that URL shorter? done. - thanks to our webteam this is now http://www.dasregistry.org/das1/sources > And it > no longer includes das1 sources. I guess you mean das2 sources? 
- I hope that will change now with the frozen spec ;-) > Also, I can't find anywhere on the HTML which points to that we have a documentation page that explains how scripts (or DAS clients) can talk to the registry and get the list of available DAS servers at: http://www.dasregistry.org/help_scripting.jsp Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From ap3 at sanger.ac.uk Mon Nov 13 05:52:15 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 13 Nov 2006 10:52:15 +0000 Subject: [DAS2] FYI: Arabidopsis DAS at PlantGDB In-Reply-To: <83722dde0611120724r641e3dak18a8cc8119baf948@mail.gmail.com> References: <83722dde0611120724r641e3dak18a8cc8119baf948@mail.gmail.com> Message-ID: <5032842288e655b2d3d2a5f9fb534d5f@sanger.ac.uk> Hi Ann, > Iowa State hosts a plant DAS site, but so far I haven't been able to > get IGB to > talk to it. I was not aware of this DAS site - I will contact them and invite them to get their DAS servers registered in the DAS registry ... Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Mon Nov 13 06:53:22 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 03:53:22 -0800 Subject: [DAS2] URIs for coordinates Message-ID: I'm editing the spec docs to clarify use of coordinates. Where can one find the URIs a server should use for the uri attribute in a COORDINATES element? I've looked at the HTML summary at http://www.dasregistry.org/help_coordsys.jsp , but this doesn't list any of the actual URIs I'm seeing used in the DAS registry. 
For example, http://das.sanger.ac.uk/dasregistry/coordsys/CS_SPICEDS5 for NCBI human assembly v35 from the DAS/2 registry sources doc: or http://das.sanger.ac.uk/dasregistry/coordsys/CS_DS5 for the same assembly from the DAS/1 registry sources doc: Also, shouldn't these be the same URI? thanks, Gregg From Gregg_Helt at affymetrix.com Mon Nov 13 08:37:25 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 05:37:25 -0800 Subject: [DAS2] Agenda for DAS/2 teleconference today Message-ID: DAS/2 Teleconference today at 9:30 AM PST Dialin (US): 800-531-3250 Dialin (Intl): 303-928-2693 Conference ID: 2879055 Passcode: 1365 Agenda: Specification Status of schema (das2_schemas.rnc) Ratification of schema freeze Status of XML Schema translation (das2_schemas.xsd) Formalizing query syntax? Status of genome retrieval specification doc (das2_get.html) Review of remaining issues in genome retrieval spec. Coordinates URIs Segment reference URIs Ontology URIs Revising example queries / responses Timeline for DAS/2 genome retrieval spec freeze. Other docs? Implementation status Validator Genome retrieval servers NetAffx queries responses biopackages queries responses DAS/1 --> DAS/2 conversion server cgi.biodas.org test server Sanger registry others? Example queries Biopackages ontology server Genome retrieval clients IGB queries responses others? From ap3 at sanger.ac.uk Mon Nov 13 09:05:55 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 13 Nov 2006 14:05:55 +0000 Subject: [DAS2] URIs for coordinates In-Reply-To: References: Message-ID: <267c26ea87f6262d2b551af71655c4b6@sanger.ac.uk> Hi Gregg! > I'm editing the spec docs to clarify use of coordinates. Where can > one find the URIs a server should use for the uri attribute in a > COORDINATES element? Hm. The das registry does not provide a list of uris so far, I can provide such a listing.
I believe the correct uri for the NCBI assembly version 35 for human should be something like http://www.dasregistry.org/coordsys/CS_DS5 > > Also, shouldn't these be the same URI? yes they should. Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Mon Nov 13 10:21:39 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 07:21:39 -0800 Subject: [DAS2] Agenda for DAS/2 teleconference today Message-ID: The version on biodas.org gets auto-updated from the CVS repository every night. However, for today and probably the rest of the week I'd recommend looking directly at the head of the CVS repository to make sure you've got the most recent version. Thanks, Gregg > -----Original Message----- > From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] > Sent: Monday, November 13, 2006 5:42 AM > To: Helt,Gregg > Subject: Re: [DAS2] Agenda for DAS/2 teleconference today > > Hey Greg, > > Is the latest version of the get spec up at biodas.org? Or should > I also look in cvs? > > Best, > > -B From gilmanb at pantherinformatics.com Mon Nov 13 10:17:52 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Mon, 13 Nov 2006 10:17:52 -0500 Subject: [DAS2] Agenda for DAS/2 teleconference today In-Reply-To: References: Message-ID: <415FFF30-CDF6-4EB4-A8FF-2AFF203F595E@pantherinformatics.com> I'm going to be a little late to the call. I have a meeting from 11:30 - 1 today. Will that pose a problem? -B -- Brian Gilman President Panther Informatics Inc.
E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Nov 13, 2006, at 8:37 AM, Helt,Gregg wrote: > DAS/2 Teleconference today at 9:30 AM PST > Dialin (US): 800-531-3250 > Dialin (Intl): 303-928-2693 > Conference ID: 2879055 > Passcode: 1365 > > Agenda: > > Specification > Status of schema (das2_schemas.rnc) > Ratification of schema freeze > Status of XML Schema translation (das2_schemas.xsd) > Formalizing query syntax? > > Status of genome retrieval specification doc (das2_get.html) > Review of remaining issues in genome retrieval spec. > Coordinates URIs > Segment reference URIs > Ontology URIs > Revising example queries / responses > Timeline for DAS/2 genome retrieval spec freeze. > Other docs? > Implementation status > Validator > Genome retrieval servers > NetAffx > queries > responses > biopackages > queries > responses > DAS/1 --> DAS/2 conversion server > cgi.biodas.org test server > Sanger registry > others? > Example queries > Biopackages ontology server > Genome retrieval clients > IGB > queries > responses > others? > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From Gregg_Helt at affymetrix.com Mon Nov 13 12:10:05 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 09:10:05 -0800 Subject: [DAS2] URIs for coordinates Message-ID: I think having a URI for each coordinate system is important. We could use a simple syntax that constructs the URI from the coordinate system's authority, organism, and type. If it resolves to something informative that would be nice, but not necessary. 
Gregg > -----Original Message----- > From: Andreas Prlic [mailto:ap3 at sanger.ac.uk] > Sent: Monday, November 13, 2006 6:06 AM > To: Helt,Gregg > Cc: DAS/2 Discussion > Subject: Re: URIs for coordinates > > Hi Gregg! > > > I'm editing the spec docs to clarify use of coordinates. Where can > > one find the URIs a server should use for the uri attribute in a > > COORDINATES element? > > > Hm. The das registry does not provide a list of uris so far, I can > provide such a listing. > > I believe the correct uri for the NCBI assembly version 35 for human > should be something like > > http://www.dasregistry.org/coordsys/CS_DS5 > > > > > > Also, shouldn't these be the same URI? > > yes they should. > > Cheers, > Andreas > > > ----------------------------------------------------------------------- > > Andreas Prlic Wellcome Trust Sanger Institute > Hinxton, Cambridge CB10 1SA, UK > +44 (0) 1223 49 6891 From gilmanb at pantherinformatics.com Mon Nov 13 15:33:12 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Mon, 13 Nov 2006 15:33:12 -0500 Subject: [DAS2] XML Instance documents generate valid XML from XML Spy Message-ID: <4558D688.4000404@pantherinformatics.com> Hey Guys, I had XMLSpy output some instance documents based off the xsd and things look good. I've also bound the document to xmlbeans and will dump some documents and run them through the validator to make sure everything's working on that end. I did experience issues when trying to hit current DAS2 servers and understand that everyone is working to make those compliant. Thanks very, very much for outputting the xsd.
Client writing is now much, much easier and can be automated :-) Best, -B From gilmanb at pantherinformatics.com Mon Nov 13 15:50:41 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Mon, 13 Nov 2006 15:50:41 -0500 Subject: [DAS2] XML Instance documents generate valid XML from XML Spy In-Reply-To: <4558D688.4000404@pantherinformatics.com> References: <4558D688.4000404@pantherinformatics.com> Message-ID: <4558DAA1.40302@pantherinformatics.com> I think I forgot to attach the XML instance documents! Sorry! Here they are... -B Brian Gilman wrote: > Hey Guys, > > I had XMLSpy output some instance documents based off the xsd and > things look good. I've also bound the document to xmlbeans and will > dump some documents and run them through the validator to make sure > everything's working on that end. I did experience issues when trying > to hit current DAS2 servers and understand that everyone is working to > make those compliant. Thanks very, very much for outputting the xsd. > Client writing is now much, much easier and can be automated :-) > > Best, > > -B > -------------- next part -------------- A non-text attachment was scrubbed... Name: features_from_xsd.xml Type: text/xml Size: 1324 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: segments_from_xsd.xml Type: text/xml Size: 751 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sources_from_xsd.xml Type: text/xml Size: 1755 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: types_from_xsd.xml Type: text/xml Size: 912 bytes Desc: not available URL: From Steve_Chervitz at affymetrix.com Mon Nov 13 16:25:20 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Mon, 13 Nov 2006 13:25:20 -0800 Subject: [DAS2] HTML document re-org complete Message-ID: I just committed my large, re-organizational changes to das2_get.html, so others can now feel free to edit at will. Summary of what I did: * Re-organized into three main sections for consistency, readability: - General - Overview - Detailed * Simplified top summary table and fixed in-page navigation links. * Added TOC and subsection TOCs. * Added section numbers. * Added global sequence id section. * Misc typo fixes and wording improvements. * Noted a bad sentence in the third paragraph. Not sure the intent here ("some fetching some of the documents"?) The biodas.org viewable version of this document does not yet have these changes as I write: http://biodas.org/documents/das2/das2_get.html . It operates off of the anonymous CVS server which hasn't yet sync'd with the dev CVS server. Not sure how often this sync happens. I updated the biodas.org site to sync with CVS hourly, so the docs viewable from there will stay more current, but still may be out of date during this time of frequent updates. Steve From boconnor at ucla.edu Tue Nov 14 19:46:01 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Tue, 14 Nov 2006 16:46:01 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: <455A6349.40301@ucla.edu> Hi, I finished validating the DAS/2 server at biopackages.net using Andrew's validator. After making a few small tweaks all document types pass. 
Here are the URLs I validated with: * http://das.biopackages.net/das/genome * http://das.biopackages.net/das/genome/human/17/segment * http://das.biopackages.net/das/genome/human/17/type * http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000 I also fixed the bug with the "type" attribute in the CAPABILITY elements. They now are "features", "types", or "segments" to be compliant with the spec. --Brian From gilmanb at pantherinformatics.com Tue Nov 14 21:28:01 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Tue, 14 Nov 2006 21:28:01 -0500 Subject: [DAS2] biopackages DAS/2 server passed validation In-Reply-To: <455A6349.40301@ucla.edu> References: <455A6349.40301@ucla.edu> Message-ID: <71C4FB6F-4621-4A72-8B74-64AC105C3ECE@pantherinformatics.com> Hey Guys, Is the source posted for the BioPackages and Affy server code posted on biodas? I'd like to utilize it to start on my other scientific projects. Best, -B -- Brian Gilman President Panther Informatics Inc. E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: > Hi, > > I finished validating the DAS/2 server at biopackages.net using > Andrew's > validator. After making a few small tweaks all document types pass. > Here are the URLs I validated with: > > * http://das.biopackages.net/das/genome > * http://das.biopackages.net/das/genome/human/17/segment > * http://das.biopackages.net/das/genome/human/17/type > * > http://das.biopackages.net/das/genome/human/17/feature?segment=http% > 3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna% > 2Fchr1;overlaps=1:1000 > > I also fixed the bug with the "type" attribute in the CAPABILITY > elements. 
They now are "features", "types", or "segments" to be > compliant with the spec. > > --Brian > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From Gregg_Helt at affymetrix.com Tue Nov 14 21:45:00 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Tue, 14 Nov 2006 18:45:00 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: Thanks! Looks like most of the problems IGB was having with the biopackages server were due to the truncated 'type' attributes in CAPABILITY. Using IGB I'm still not getting features back from the biopackages server from a features query with overlaps and type filters, but I think that's a bug in IGB's request. Hope to fix tonight. Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open-bio.org] On Behalf Of Brian O'Connor > Sent: Tuesday, November 14, 2006 4:46 PM > To: das2 at lists.open-bio.org > Subject: [DAS2] biopackages DAS/2 server passed validation > > Hi, > > I finished validating the DAS/2 server at biopackages.net using Andrew's > validator. After making a few small tweaks all document types pass. > Here are the URLs I validated with: > > * http://das.biopackages.net/das/genome > * http://das.biopackages.net/das/genome/human/17/segment > * http://das.biopackages.net/das/genome/human/17/type > * http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000 > > I also fixed the bug with the "type" attribute in the CAPABILITY > elements. They now are "features", "types", or "segments" to be > compliant with the spec.
> > --Brian > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From Gregg_Helt at affymetrix.com Wed Nov 15 00:43:31 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Tue, 14 Nov 2006 21:43:31 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: The Affy Genometry DAS/2 server code is in the Genoviz CVS repository on sourceforge (http://sourceforge.net/projects/genoviz/), under the das2_server directory. The core of it is a servlet, com.affymetrix.genometry.servlets.GenometryDas2Servlet. There is also a main class com.affymetrix.genometry.servlets.GenometryDas2Server that wraps the servlet inside a Jetty server and initializes server and servlet (though with enough configuration tinkering the servlet could probably be run in any servlet-supporting HTTP server). The servlet depends heavily on code in the genometry and igb directories. Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Brian Gilman > Sent: Tuesday, November 14, 2006 6:28 PM > To: Brian O'Connor > Cc: das2 at lists.open-bio.org > Subject: Re: [DAS2] biopackages DAS/2 server passed validation > > Hey Guys, > > Is the source posted for the BioPackages and Affy server code posted > on biodas? I'd like to utilize it to start on my other scientific > projects. > > Best, > > -B > -- > Brian Gilman > President Panther Informatics Inc. > E-Mail: gilmanb at pantherinformatics.com > gilmanb at jforge.net > AIM: gilmanb1 > > 01000010 01101001 01101111 > 01001001 01101110 01100110 > 01101111 01110010 01101101 > 01100001 01110100 01101001 > 01100011 01101001 01100001 > 01101110 > > > > On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: > > > Hi, > > > > I finished validating the DAS/2 server at biopackages.net using > > Andrew's > > validator. After making a few small tweaks all document types pass. 
> > Here are the URLs I validated with: > > > > * http://das.biopackages.net/das/genome > > * http://das.biopackages.net/das/genome/human/17/segment > > * http://das.biopackages.net/das/genome/human/17/type > > * > > http://das.biopackages.net/das/genome/human/17/feature?segment=http% > > 3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna% > > 2Fchr1;overlaps=1:1000 > > > > I also fixed the bug with the "type" attribute in the CAPABILITY > > elements. They now are "features", "types", or "segments" to be > > compliant with the spec. > > > > --Brian > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From boconnor at ucla.edu Tue Nov 14 23:02:05 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Tue, 14 Nov 2006 20:02:05 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation In-Reply-To: <71C4FB6F-4621-4A72-8B74-64AC105C3ECE@pantherinformatics.com> References: <455A6349.40301@ucla.edu> <71C4FB6F-4621-4A72-8B74-64AC105C3ECE@pantherinformatics.com> Message-ID: <455A913D.3060504@ucla.edu> Hi Brian, The biopackages DAS/2 server code is stored under the GMOD project on SourceForge (http://sourceforge.net/projects/gmod/). It's under "das2" in the cvs repository. It's written using the mod_perl Apache interface. Hope that helps. --Brian Brian Gilman wrote: > Hey Guys, > > Is the source posted for the BioPackages and Affy server code > posted on biodas? I'd like to utilize it to start on my other > scientific projects. > > Best, > > -B > -- > Brian Gilman > President Panther Informatics Inc. 
> E-Mail: gilmanb at pantherinformatics.com > gilmanb at jforge.net > AIM: gilmanb1 > > 01000010 01101001 01101111 > 01001001 01101110 01100110 > 01101111 01110010 01101101 > 01100001 01110100 01101001 > 01100011 01101001 01100001 > 01101110 > > > > On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: > >> Hi, >> >> I finished validating the DAS/2 server at biopackages.net using >> Andrew's >> validator. After making a few small tweaks all document types pass. >> Here are the URLs I validated with: >> >> * http://das.biopackages.net/das/genome >> * http://das.biopackages.net/das/genome/human/17/segment >> * http://das.biopackages.net/das/genome/human/17/type >> * >> http://das.biopackages.net/das/genome/human/17/feature?segment=http% >> 3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna% >> 2Fchr1;overlaps=1:1000 >> >> I also fixed the bug with the "type" attribute in the CAPABILITY >> elements. They now are "features", "types", or "segments" to be >> compliant with the spec. >> >> --Brian >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 >> > From gilmanb at pantherinformatics.com Wed Nov 15 08:50:41 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Wed, 15 Nov 2006 08:50:41 -0500 Subject: [DAS2] biopackages DAS/2 server passed validation In-Reply-To: References: Message-ID: <617B1C5E-1BBD-414B-A864-AB93F11E080B@pantherinformatics.com> Great Guys, Thanks very much. -B -- Brian Gilman President Panther Informatics Inc. 
E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Nov 15, 2006, at 12:43 AM, Helt,Gregg wrote: > The Affy Genometry DAS/2 server code is in the Genoviz CVS > repository on > sourceforge (http://sourceforge.net/projects/genoviz/), under the > das2_server directory. The core of it is a servlet, > com.affymetrix.genometry.servlets.GenometryDas2Servlet. There is > also a > main class com.affymetrix.genometry.servlets.GenometryDas2Server that > wraps the servlet inside a Jetty server and initializes server and > servlet (though with enough configuration tinkering the servlet could > probably be run in any servlet-supporting HTTP server). The servlet > depends heavily on code in the genometry and igb directories. > > Gregg > >> -----Original Message----- >> From: das2-bounces at lists.open-bio.org [mailto:das2- >> bounces at lists.open- >> bio.org] On Behalf Of Brian Gilman >> Sent: Tuesday, November 14, 2006 6:28 PM >> To: Brian O'Connor >> Cc: das2 at lists.open-bio.org >> Subject: Re: [DAS2] biopackages DAS/2 server passed validation >> >> Hey Guys, >> >> Is the source posted for the BioPackages and Affy server code > posted >> on biodas? I'd like to utilize it to start on my other scientific >> projects. >> >> Best, >> >> -B >> -- >> Brian Gilman >> President Panther Informatics Inc. >> E-Mail: gilmanb at pantherinformatics.com >> gilmanb at jforge.net >> AIM: gilmanb1 >> >> 01000010 01101001 01101111 >> 01001001 01101110 01100110 >> 01101111 01110010 01101101 >> 01100001 01110100 01101001 >> 01100011 01101001 01100001 >> 01101110 >> >> >> >> On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: >> >>> Hi, >>> >>> I finished validating the DAS/2 server at biopackages.net using >>> Andrew's >>> validator. After making a few small tweaks all document types pass. 
>>> Here are the URLs I validated with: >>> >>> * http://das.biopackages.net/das/genome >>> * http://das.biopackages.net/das/genome/human/17/segment >>> * http://das.biopackages.net/das/genome/human/17/type >>> * http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000 >>> >>> I also fixed the bug with the "type" attribute in the CAPABILITY >>> elements. They now are "features", "types", or "segments" to be >>> compliant with the spec. >>> >>> --Brian >>> _______________________________________________ >>> DAS2 mailing list >>> DAS2 at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/das2 >>> >> >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 > From Gregg_Helt at affymetrix.com Wed Nov 15 12:25:06 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Wed, 15 Nov 2006 09:25:06 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: I've fixed some bugs in IGB and now it is able to retrieve some genome features from the biopackages server and visualize them. For example, this feature query works: http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fdas.biopackages.net%2Fdas%2Fgenome%2Fhuman%2F17%2Fsegment%2Fchr21;overlaps=26040000%3A26060000;type=SO%3AmRNA with URL-decoded query params: segment=http://das.biopackages.net/das/genome/human/17/segment/chr21 overlaps=26040000:26060000 type=SO:mRNA However, not all feature queries work.
For example, another query, exactly the same as the above except for a different type filter: http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fdas.biopackages.net%2Fdas%2Fgenome%2Fhuman%2F17%2Fsegment%2Fchr21;overlaps=26040000%3A26060000;type=SO%3ACDS with URL-decoded query params: segment=http://das.biopackages.net/das/genome/human/17/segment/chr21 overlaps=26040000:26060000 type=SO:CDS returns this error message: 500 Died at /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425. Should "SO:CDS" not be a searchable type? Also, is the full URI for the type supposed to be A) "SO:CDS" or B) "http://das.biopackages.net/das/genome/human/17/type/SO:CDS" ? According to XML Base resolution rules, with "SO:CDS" as the value of the TYPE uri attribute, the ":" before any "/" means "SO" is read as a URI scheme, so the value is already absolute and the full URI is (A). If the full URI is supposed to be (B), then the uri attribute should be "./SO:CDS" (given that xml:base is "http://das.biopackages.net/das/genome/human/17/type/"). Thanks, Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open-bio.org] On Behalf Of Helt,Gregg > Sent: Tuesday, November 14, 2006 6:45 PM > To: Brian O'Connor; das2 at lists.open-bio.org > Subject: Re: [DAS2] biopackages DAS/2 server passed validation > > Thanks! Looks like most of the problems IGB was having with the > biopackages server were due to the truncated 'type' attributes in > CAPABILITY. > > Using IGB I'm still not getting features back from the biopackages > server from a features query with overlaps and type filters, but I think > that's a bug in IGB's request. Hope to fix tonight.
> > Gregg > > > -----Original Message----- > > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > > bio.org] On Behalf Of Brian O'Connor > > Sent: Tuesday, November 14, 2006 4:46 PM > > To: das2 at lists.open-bio.org > > Subject: [DAS2] biopackages DAS/2 server passed validation > > > > Hi, > > > > I finished validating the DAS/2 server at biopackages.net using > Andrew's > > validator. After making a few small tweaks all document types pass. > > Here are the URLs I validated with: > > > > * http://das.biopackages.net/das/genome > > * http://das.biopackages.net/das/genome/human/17/segment > > * http://das.biopackages.net/das/genome/human/17/type > > * > > > http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2 > F% > > > 2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlap > s= > > 1:1000 > > > > I also fixed the bug with the "type" attribute in the CAPABILITY > > elements. They now are "features", "types", or "segments" to be > > compliant with the spec. > > > > --Brian > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From Gregg_Helt at affymetrix.com Wed Nov 15 15:22:22 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Wed, 15 Nov 2006 12:22:22 -0800 Subject: [DAS2] New Test Affy DAS/2 server Message-ID: Steve and I have a new test version of the Affy Genometry DAS/2 server up and running, at http://netaffxdas.affymetrix.com/das2/test/sources. For compatibility with the current release of IGB we are keeping the older version of the server at http://netaffxdas.affymetrix.com/das2/sources, until we can synchronize a server upgrade with a new IGB release. 
Sample test server requests:

Sources: http://netaffxdas.affymetrix.com/das2/test/sources
Segments: http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_May_2004/segments
Types: http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_May_2004/types
Features with query filters: http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006/features?segment=http%3A%2F%2Fnetaffxdas.affymetrix.com%2Fdas2%2Ftest%2Fsources%2FH_sapiens_Mar_2006%2Fchr21;overlaps=26040000%3A26070000;type=http%3A%2F%2Fnetaffxdas.affymetrix.com%2Fdas2%2Ftest%2Fsources%2FH_sapiens_Mar_2006%2FknownGene

URL-decoded query params:
segment=http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006/chr21
overlaps=26040000:26070000
type=http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006/knownGene

Responses to these queries all pass the DAS/2 validator.

This latest version of the Genometry DAS/2 server does not yet support the full range of DAS/2 feature queries and feature filters required by the DAS/2 specification. For the server to send a useful response containing features, the feature query string must currently contain:
1 type filter
1 segment filter
1 overlaps filter
0 or 1 inside filter
0 or 1 format parameter
0 other filters/parameters

To comply with the spec, when the server receives queries it doesn't support it tries in most cases to return allowable error messages. But at the moment we are having a problem with getting these error messages passed unaltered through our proxy server -- the errors end up being generic 502 'Bad Gateway' messages. We plan to fix this problem and also add fuller feature query filter support as soon as possible.

If you compile IGB from the head of the Genoviz CVS repository (http://sourceforge.net/cvs/?group_id=129420), you can access the new server in the DAS/2 tab as "Affy Test Server". Please let me know if you find any problems!
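A minimal sketch of how a client might assemble a feature query like the sample request above, assuming Python's standard urllib.parse module. The helper name build_feature_query is hypothetical and is not part of IGB, the Genometry server, or any DAS/2 library. It percent-encodes each filter value and joins the filters with ";", matching the one type, one segment, one overlaps combination that the test server currently requires.

```python
from urllib.parse import quote, unquote


def build_feature_query(features_url, segment, overlaps, type_uri):
    """Hypothetical helper: build a DAS/2 feature query URL.

    Every filter value is percent-encoded (including '/' and ':'),
    and the filters are joined with ';' as in the sample requests.
    """
    filters = [
        ("segment", segment),
        ("overlaps", overlaps),
        ("type", type_uri),
    ]
    query = ";".join(
        f"{name}={quote(value, safe='')}" for name, value in filters
    )
    return f"{features_url}?{query}"


# Values taken from the sample request against the Affy test server.
base = "http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006"
url = build_feature_query(
    base + "/features",
    segment=base + "/chr21",
    overlaps="26040000:26070000",
    type_uri=base + "/knownGene",
)
print(url)

# Decoding the query string recovers the human-readable filter values.
decoded = [unquote(part) for part in url.split("?", 1)[1].split(";")]
print("\n".join(decoded))
```

Encoding with safe='' is a deliberate choice here: the segment and type values are themselves URIs, so their "/" and ":" characters have to be escaped to keep them from being confused with the structure of the enclosing query URL.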
Thanks, Gregg From dalke at dalkescientific.com Mon Nov 20 14:17:42 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 20 Nov 2006 20:17:42 +0100 Subject: [DAS2] DAS1 TYPE attribute "category" is what in DAS2? Message-ID: Several questions listed here. DAS1 TYPE elements had an attribute "category". Here are some of the categories listed in DAS1 servers ALPHA-BETA-MOTIF, ASX-MOTIF, ASX-TURN, BETA-BULGE, BETA-BULGE-LOOP, BETA-TURN, CATMAT-3, CATMAT-4, GAMMA-TURN, HELIX-L, NEST, SCHELLMANN-LOOP, ST-MOTIF, ST-STAPLE, ST-TURN, enzyme, miscellaneous motif, pathway, rRNA, repeat, rfam, structural, tRNA transcription, transmembrane prediction DAS1 says: category (optional, recommended) attribute, which provides functional grouping to related types. *TOPIC*: What should I do for automated conversion in my proxy system? Currently I have: DAS1 "id" used to make DAS2 "uri" (via url encoding) DAS1 "method" copied into DAS2 "method" DAS1 (non-standard extension) "description" copied into DAS2 "description" DAS1 (non-standard extensions) "ontology" and "evidence" used to fake a DAS2 "ontology" uri *Q1*: If there is a DAS1 "category" should I use it to make a DAS2 "title"? Gregg's viewer merges types into a single track based on the title, so that feels correct to me. *Q2*: If the title is not given, should I use the DAS1 "id" as the DAS2 "title"? I think that is correct. *Q3*: If there's no DAS1 "description" extension to use for DAS2's "description" should I copy DAS1's "title" instead (which in turn might come from the "category" and/or the "id" fields). My feeling is no, that is not appropriate. *Q4*: I fake an ontology if I can.
Does anyone know examples of DAS1 extensions to support ontologies other than TMHMM, which has 1766831 For now I convert that into http://www.geneontology.org/GO.evidence.shtml#IEA Andrew dalke at dalkescientific.com From lstein at cshl.edu Mon Nov 20 11:20:16 2006 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 20 Nov 2006 11:20:16 -0500 Subject: [DAS2] Cannot attend today Message-ID: <6dce9a0b0611200820v25d158au5581938a841ba559@mail.gmail.com> Hi All, I can't attend the conference call today because of a conflict with the CSHL retreat. Best, Lincoln -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From ap3 at sanger.ac.uk Tue Nov 21 11:26:36 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Tue, 21 Nov 2006 16:26:36 +0000 Subject: [DAS2] DAS1 TYPE attribute "category" is what in DAS2? In-Reply-To: References: Message-ID: <8766a85a180e098ee4b9e17ac359d4c7@sanger.ac.uk> Hi Andrew, > DAS1 TYPE elements had an attribute "category". Here are some > of the categories listed in DAS1 servers Currently the DAS/1 types are not used in a consistent way and so far have not been used much... One of the things being done as part of the BioSapiens project is to come up with a more consistent definition of which annotation types to use. > *Q1*: If there is a DAS1 "category" should I use it to make a DAS2 > "title"? > > Gregg's viewer merges types into a single track based on the title, so > that feels correct to me. In DAS/1 the annotation types are used to merge features into a single track, therefore I think the DAS/1 type would be the equivalent of the DAS/2 title then. > *Q2*: If the title is not given, should I use the DAS1 "id" as the DAS2 > "title"? I think that is correct.
> *Q3*: If there's no DAS1 "description" extension to use for DAS2's > "description" > should I copy DAS1's "title" instead (which in turn might come from the > "category" and/or the "id" fields). My feeling is no, that is not > appropriate. err - which DAS/2 request do you talk about ? still about types? > *Q4*: I fake an ontology if I can. Does anyone know examples of > DAS1 extensions with to support ontologies other than TMHMM, which has So far this is not used in a consistent way ... BioSapiens will come up with a convention, but it is still work in progress... Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From aloraine at gmail.com Fri Nov 24 09:40:41 2006 From: aloraine at gmail.com (Ann Loraine) Date: Fri, 24 Nov 2006 08:40:41 -0600 Subject: [DAS2] cvs and code examples? Message-ID: <83722dde0611240640m1b3344d0x874b39fd1f31768c@mail.gmail.com> Hi, Can some-one send me URLs for viewcvs & directions for cvs access of biodas code? Thank you! -Ann From aloraine at gmail.com Fri Nov 24 20:00:03 2006 From: aloraine at gmail.com (Ann Loraine) Date: Fri, 24 Nov 2006 19:00:03 -0600 Subject: [DAS2] cvs and code examples? In-Reply-To: References: <83722dde0611240640m1b3344d0x874b39fd1f31768c@mail.gmail.com> Message-ID: <83722dde0611241700s2e7c832n22b86ca9ccee8a9c@mail.gmail.com> Thanks Brian! What code would you recommend I use for setting up a DAS? I am doing a project where the volume of annotations is so great that I can't keeping loading them all at once into IGB via Quickload or File->Open. -Ann Since I'm already looking at a genome browser (IGB) I don't need to see their. On 11/24/06, Brian Osborne wrote: > Ann, > > It's here: > > http://www.open-bio.org/wiki/SourceCode > > > Brian O. 
> > > On 11/24/06 9:40 AM, "Ann Loraine" wrote: > > > Hi, > > > > Can some-one send me URLs for viewcvs & directions for cvs access of > > biodas code? > > > > Thank you! > > > > -Ann > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > > -- Ann Loraine Assistant Professor Departments of Genetics, Biostatistics, and Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From Steve_Chervitz at affymetrix.com Sun Nov 26 23:42:01 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Sun, 26 Nov 2006 20:42:01 -0800 Subject: [DAS2] Notes from the weekly DAS/2 teleconference, 20 Nov 2006 Message-ID: [Note: No DAS teleconference on 27 Nov. Next one is on 4 Dec] Notes from the weekly DAS/2 teleconference, 20 Nov 2006 $Id: das2-teleconf-2006-11-20.txt,v 1.1 2006/11/27 04:31:31 sac Exp $ Note taker: Steve Chervitz Attendees: Affy: Steve Chervitz, Gregg Helt, Ed Erwin Dalke Scientific: Andrew Dalke UCLA: Brian O'connor Action items are flagged with '[A]'. These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org DISCLAIMER: The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit. Agenda ------- * Spec discussion * Status reports Topic: Spec Discussion ----------------------- gh: has everyone reviewed steve's re-orged html doc of the retrieval spec? 
[To summarize the re-org: everything has been organized into three main sections: general, overview, and detailed. There's a table of contents and all sections and subsections are numbered a la W3C specs. The summary table at the top has been simplified, and in-page navigation has been fixed and improved.] consensus: no, haven't looked at in detail yet. [A] all give the new html retrieval spec doc a read through again. gh: need to add examples of alignments. I plan to announce in the next two weeks that it's ready. When reviewing, pay special attention to comments marked 'XXX' sc: esp in third para from the top, ambiguousness. ee: need commit privileges sc: sent email to support at open-bio.org. They are quick. Topic: Status reports --------------------- gh: First half of last week, getting affy das2 server up to snuff with spec changes and compliance with spec, had spotty compliance. correctly handle errors. email message about test server, different url than the public server. [A] gregg/steve to update public affy das2 server with new IGB release gh: lots of testing in last few days, ready to replace the existing server. ee: some zooming issues when switching servers. gh: worked on fixing bugs in das/2 client, problems w/ biopackages server due to problems in IGB, also w/r/t spec changes. also working on using das/2 in another context in igb, retrieving genomic locations. Expression console, recommended for processing affy chips (all expr), generates chp files. big request esp for whole exon folks, need igb to load these, but the chp files have no genomic location info, had to pre-load these in past. Now, when you load a chip file, igb automatically goes out via das2 and retrieves it. hardwired is what server to go to, based on genome + types from server, figures out what das request to make to load it. per-sequence basis, and lazy, doesn't load all locations for whole genome. big files: 1M probe sets, + 4 probes w diff locs. 
Also, igb gets it back in compact binary format (using alt format) -- new use of das in igb. not committed yet, but will be cool. ee: data? gh: yes. we need more bp2 files there. Will try and have igb prompt user for file if it can't look it up automatically. [A] gregg send ed a write up so he can get it in the release notes. gh: bottom line: efficient way to look at expr data in igb gh: Third thing: prepping for this release of igb. targetting wed. server change over tues, igb on wed. may cause a day of hassles. ee: I'll be working in Paris on Wed, will be Tues day for US.. gh: so I'll be done by noon wed or earlier. gh: happy with progress on affy client server now. ad: cleaning up code I did on validator while in EBI. will check into dasypus CVS on sf. ee: on vacation, but working now. sc: worked on affy das2 server set up for testing ( http://netaffxdas.affymetrix.com/das2/test ) and worked on the igb keystore update (digital signature for affy jars). Ran into issue with the error codes sent by the affy das2 server getting altered by apache into another error (502, I believe). Need to figure out how to get apache to not alter these error responses. [A] steve figure out how to prevent apache from changing das/2 error responses gh: would assume not doing redirection through apache. but plan b is to not use apache. [affy das2 server uses jetty servlet engine, and apache forwards request to it via a rewrite rule.] sc: quick load access now requires apache. gh: should be able to load and serve these through jetty running on port 80. need to get apache to stop mucking with the http headers. bo: no new progress, but will look into filtering issue this week. 'so:' stuff. either allen of I will look into it this week. gh: i get right feat response for some but not all request. wonder if the 'so:' is involved. that's the only remaining issue I know of. both servers are passing validation. this was a high priority. 
for brian gilman and lincoln to make use of the das/2 spec.

ad: any more feedback from bg?
gh: no. he was going to start working on a server as well.

[A] gregg contact brian gilman to see how things are going.

ee: good news, bug I reported about zooming out was not a bug, but caused by me pressing the wrong button. related to changing genome version.

[A] steve set up jar signing cert today so Ed can use tomorrow.

Wrap up:
--------
[A] review and modify das/2 html retrieval docs over the next few days.
[A] Next meeting in two weeks (4 Dec 06)

From bosborne11 at verizon.net Mon Nov 27 09:29:32 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 27 Nov 2006 09:29:32 -0500
Subject: [DAS2] DAS and DAS2
Message-ID:

das2,

My name is Brian Osborne, I'm working on documentation for GMOD and GMOD-related packages as part of the newly created GMOD Help Desk position. Some of my colleagues here in the GMOD community are recommending that we consider DAS, 1 or 2, as important GMOD-related software so I'm joining your list in order to learn more about DAS. I have some initial questions, I was wondering if someone could help me out with them (I did read the DAS Overview and browsed most of the specs at biodas.org).

1. Are DAS1 and DAS2 designed to inter-operate? For example, will I be able to use a DAS2 client and a DAS1 server?

2. Do you think DAS2 is going to replace DAS1 or co-exist with it? Yes, this may not be easy to answer.

3. Is there a DAS2 release date?

Thanks again,

Brian O.

From Steve_Chervitz at affymetrix.com Thu Nov 30 17:59:22 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Thu, 30 Nov 2006 14:59:22 -0800
Subject: [DAS2] DAS and DAS2
In-Reply-To:
Message-ID:

Hi Brian,

Brian Osborne wrote on Mon, 27 Nov 2006:

> My name is Brian Osborne, I'm working on documentation for GMOD and
> GMOD-related packages as part of the newly created GMOD Help Desk position.

Great.
Looking forward to more quality documentation for GMOD, a la your excellent contributions to Bioperl documentation.

> Some of my colleagues here in the GMOD community are recommending that we
> consider DAS, 1 or 2, as important GMOD-related software so I'm joining your
> list in order to learn more about DAS.

DAS is definitely appropriate for GMOD. Providing a DAS-compatible interface to MOD data would help write software tools and perform data analyses that integrate data from different sources.

In fact, a DAS/2 server reference implementation is being developed within the GMOD sourceforge CVS, though it's not officially been released as part of GMOD. Here are the CVS commit logs for it.

http://sourceforge.net/mailarchive/forum.php?forum_id=42210

Other DAS/2 software is also being developed under open source licenses. See links on http://biodas.org in the About section, look for "The DAS/2 code base".

> I have some initial questions, I was
> wondering if someone could help me out with them (I did read the DAS
> Overview and browsed most of the specs at biodas.org).
>
> 1. Are DAS1 and DAS2 designed to inter-operate? For example, will I be able
> to use a DAS2 client and a DAS1 server?

DAS/2 is a complete redesign of the spec, so direct interoperation is not possible. However, DAS/2 has all of the capabilities of the DAS/1 spec (and more!). As proof of this, Andrew Dalke is developing a proxy adapter that will allow you to put a DAS/2 interface around an existing DAS/1 server, allowing DAS/2 clients to interact with existing DAS/1 servers:

http://lists.open-bio.org/pipermail/das2/2006-October/000867.html

To fully realize 1 <-> 2 interoperation, one would also need to write a DAS/1 proxy adapter for DAS/2 servers, to permit DAS/1 clients to interact with DAS/2 servers. I don't know of any plans for that yet.

> 2. Do you think DAS2 is going to replace DAS1 or co-exist with it? Yes, this
> may not be easy to answer.
The proxy adapter approach should enable some degree of peaceful co-existence between DAS/1 and DAS/2 systems, and should facilitate the transition to DAS/2, which has many niceties not present in DAS/1. As far as replacing DAS/1, the proof will be in the pudding.

> 3. Is there a DAS2 release date?

The DAS/2 schema for retrieval of genomic annotations has been officially frozen since mid-November (das2_schemas.rnc and das2_schemas.xsd in the biodas/das/das2 CVS repository). The corresponding html version of this spec, viewable from biodas.org, is soon to be finalized as well (probably by end of next week). When that happens, DAS/2 for genome retrieval will be considered released. Stay tuned to this list for an announcement.

The DAS/2 writeback spec is still under development and I don't believe a timeframe for its release has been set.

Steve

From Steve_Chervitz at affymetrix.com Thu Nov 2 23:40:15 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Thu, 02 Nov 2006 15:40:15 -0800
Subject: [DAS2] community annotation bee genome sequence
In-Reply-To: <83722dde0611020704h610239tab7acdbe5a961b99@mail.gmail.com>
Message-ID:

> http://www.genome.org/cgi/content/full/16/11/1329

Thanks for the link, Ann. They give a nice review of different annotation models. Interesting to see how they made use of centralized resources to enable their decentralized annotation effort.

They say: "... the DAS system does not yet involve incorporating the community annotation data into an official set of gene models." Note the optimistic "yet". We're working on it!

So presumably, they didn't use a DAS-based genome browser largely because of lack of editing support. They did use Apollo, but it's not clear how much they relied on its editing vs read-only viewing functionality.

They cite a need for annotation mapping between different assembly versions. UCSC provides liftOver for this (but curiously, they don't provide a apiMel1 to apiMel2 chain file).
Gregg has genometry-based tools for doing this, but they're not part of Genoviz/IGB at present. Steve From Gregg_Helt at affymetrix.com Mon Nov 6 15:18:59 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 07:18:59 -0800 Subject: [DAS2] Adding an optional "searchable" attribute to element Message-ID: In the last DAS/2 teleconference I brought up again the idea of an optional "searchable" or "filter" attribute for the elements returned from a types query -- if present and "false", then that type should not be used in a feature query filter. Here are snippets about this from discussion during the last code sprint (I've tried to strip it down to just the relevant parts): > -----Original Message----- > Sent: Monday, August 21, 2006 2:43 PM > To: DAS/2 > Subject: [DAS2] Notes from the weekly DAS/2 teleconference, 21 Aug 2006 > > Notes from the weekly DAS/2 teleconference, 21 Aug 2006 > Note taker: Steve Chervitz > ... > [The note taker apologizes for attending late (~30min)] > > gh: could a server in the types doc restrict the types. just say > 'transcripts'? > ls: yes. if not going to allow for searching for feature, only via > parent, then types doc should only include parent. > > gh: types doc specifies which types you can query on. > ls: ontology gives you access to all types that might come back > ad: and how to depict them. > gh: yes, but it can be restrictive of the types. > ad: what does client do to display it? > gh: implies we separate out style into stylesheet info again. > no one is serving or using, so we can change w/o major impl changes. > ad: type doc ties a feature to ontology, how to display it, and > includes this extra source field. > gh: types doc has all types server contains but tags as to what the > server allows searching on. > > ad: feels weird. can't see why i'd want to do in my server. > bo: better than limiting the types doc, just have a searchable field. > ad: easy > gh: if you don't say no, then it's searchable. 
this is backwards > compatible. > ... ... > > ad: range and non-range filters must both be true for a given feature > > gh: ok, as long as we can say in types doc that some types are not > filtered. > ... > [A] andrew will add searchable flag to type document ... The motivation for this addition to the spec is to allow a server to restrict what feature types a client can use for query _filtering_, while still allowing these types of features to be returned from feature queries and their display properties to be described in stylesheets. This restriction is important for my server implementation to make full use of ontologies in describing feature types. And in the more general case, I think it will be good for visualization clients. To use a concrete example, in a GUI I don't want to have to make the user choose between requesting "genscan-transcript", "genscan-exon", "genscan-intron", or some combination of these types to make sure they get all the "genscan" annotation information -- this is a recipe for confusion. Now a smart client that fully understands the sequence ontology could automatically simplify this for the user, but I don't expect most client implementations to be so smart -- after all, one of our goals is to have a low threshold for simple client and server implementations. In this example it would be much easier for the server to just specify that "genscan-exon" and "genscan-intron" are not usable in a query filter, and the client just shows "genscan-transcript" in the query options. This change in the spec is backward compatible, since elements without a "searchable" attribute would by default be searchable. It should be easy for clients to implement, and servers can implement it or ignore it. 
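
A minimal sketch of the client side of this proposal, for illustration only. The attribute name "searchable" and its default come from the proposal above; the element name, the other attribute names, the sample document, and the function name are assumptions, and no XML namespace is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical types document. Only the "searchable" attribute comes from the
# proposal; the element and other attribute names are illustrative.
TYPES_XML = """
<TYPES>
  <TYPE uri="type/genscan-transcript" title="genscan-transcript"/>
  <TYPE uri="type/genscan-exon" title="genscan-exon" searchable="false"/>
  <TYPE uri="type/genscan-intron" title="genscan-intron" searchable="false"/>
</TYPES>
"""


def filterable_types(types_xml):
    """Return the titles of types a client may offer in a feature query filter."""
    root = ET.fromstring(types_xml)
    titles = []
    for type_el in root.iter("TYPE"):
        # A missing attribute means "searchable", so documents from servers
        # that never emit the attribute behave exactly as before.
        if type_el.get("searchable", "true").strip().lower() != "false":
            titles.append(type_el.get("title"))
    return titles


print(filterable_types(TYPES_XML))
```

With the sample document above, only "genscan-transcript" would be offered in the query options, while the exon and intron types could still come back in feature responses and be styled as usual.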
Gregg From Gregg_Helt at affymetrix.com Mon Nov 6 15:58:40 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 07:58:40 -0800 Subject: [DAS2] TYPE[@source] -> TYPE[@method] Message-ID: I agree that multiple uses of "source" makes it confusing, and that for types "method" is a reasonable alternative. On a related note, do we really need both "title" and "source/method" attributes for types? Both are optional and supposed to be short human-readable strings describing the type. For a longer description we also have the optional "description" element. gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Andrew Dalke > Sent: Thursday, October 26, 2006 6:46 AM > To: DAS/2 > Subject: [DAS2] TYPE[@source] -> TYPE[@method] > > I would like to change the existing TYPE attribute of "source" > and have it use a different attribute name. Its meaning conflicts > with the other uses of "source" in DAS2. > > The best alternative is "method" because (I believe) it is supposed > to store the same information as the corresponding DAS1 TYPE attribute. > > > Andrew > dalke at dalkescientific.com > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From Gregg_Helt at affymetrix.com Mon Nov 6 16:53:22 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 08:53:22 -0800 Subject: [DAS2] segments and types Message-ID: > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Andrew Dalke > Sent: Friday, October 27, 2006 12:56 PM > To: DAS/2 > Subject: [DAS2] segments and types > > A couple of observations about what I've seen in existing > DAS1 servers. Nothing here concerns format changes. 
> > There are four different ways to handle segments: > 1) Don't provide segment information > "Our clients know the segment because of the id > so they don't need a segments document" > 2) use "size" (pre-DAS 1.0 spec) > 3) use "start"/"stop" (DAS 1.0 spec) > - with variations, like "0", "0" meaning the length is undefined > (and even "1", "0", with a size="2", for one server!) > 4) use a "version" field > > The last is mostly used for protein sequences, that I've seen. > Its an aspect of #1 ("9pti" means "bovine pancreatic trypsin > inhibitor structure from PDB") as an abstract identifier, with > the version used to make it concrete ("with the update because > the first release had a typo") I think it can be encapsulated > in the uri scheme we now use because each version gets it own > identifier, and since the client knows all versions there's no > problem. > > > The folks at EBI/Sanger (what's the correct collective term; > Hinxton? Genome Campus?) know which servers provide which > systems so many servers don't provide coordinates. > > In some cases, like rabbit, the server will generate about > 120,000 segments, one for each scaffold. It takes quite some time > (a minute or more) to generate the output. In theory this is > static and can be precomputed by the server. > > For my own knowledge, when do people want the complete list > of segments? When do they want the length? You, yes, you > there, in front of the computer. When do you you want to > use it? For (nearly) completely sequenced genomes, it is important to provide a complete list of genome segment ids/names. This allows a visualization client to provide this list for a user to select from if they are interested in particular genome locations or simply browsing, rather than having the id/name of a particular feature in mind. 
Now you could just have the user type in the id of a segment, but unless they are familiar with the vagaries of that particular server, do they request "chr1", or "1", "I", "chrI", "chrom1", etc? Length information for a segment is needed to place an upper bound on range queries to the server. And in a GUI client it is often more convenient for the user to indicate visually what range on the segment they want to retrieve data from, but this doesn't make sense without the client app knowing the length of the segment. Furthermore, once the client is displaying located annotations on a segment, it can be important to know where the end of the segment is relative to the locations of annotations. For less complete genomes (like rabbit), it's not so clear what advantage there is to having the list of 120,000 scaffolds to choose from. Same applies to list of proteins or mRNAs. > > Let me stress -- this is not a request to change anything. I > would like to know for my own sake, for writing the documentation, > and for how much emphasis to put on this for the validation. > > As another observation, the Sanger/EBI servers also don't > do much with the types document. Some don't even handle the > request. Eugene said that no one had asked him to add it. > It's there now (thanks Eugene). > > I think this is because most of their servers only had a single > type and the solution was "display everything." They are > running into difficulties with this for a few new servers and > will be need type support, and type filter support soonish. > > Andrew > dalke at dalkescientific.com From ap3 at sanger.ac.uk Mon Nov 6 16:34:05 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 6 Nov 2006 16:34:05 +0000 Subject: [DAS2] move biodas website to a wiki? Message-ID: Hi! Over the last year several of the open-bio websites like BioPerl or BioJava have been moved to a Wiki. 
Looking at the current state of the biodas website, which is getting out of date and does not look well maintained I thought it might be good to do the same for biodas.org. We have a couple of announcements which would be good to put there - e.g. Ensembl now provides DAS reference and annotation servers for all its genomes, several new DAS-based applications are in the pipeline, the DAS registry now counts 170+ DAS servers, etc... what do you guys think about this idea? Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Mon Nov 6 17:13:27 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 6 Nov 2006 09:13:27 -0800 Subject: [DAS2] move biodas website to a wiki? Message-ID: Sounds like a good idea to me. Steve? Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Andreas Prlic > Sent: Monday, November 06, 2006 8:34 AM > To: DAS/2 > Subject: [DAS2] move biodas website to a wiki? > > Hi! > > Over the last year several of the open-bio websites like BioPerl or > BioJava have been moved to a Wiki. > Looking at the current state of the biodas website, which is getting > out of date and does not look well maintained I thought it might be > good to do the same for biodas.org. > > We have a couple of announcements which would be good to put there - > e.g. Ensembl now provides DAS reference and annotation servers for all > its genomes, several new DAS-based applications are in the pipeline, > the DAS registry now counts 170+ DAS servers, etc... > > what do you guys think about this idea? 
> > Cheers, > Andreas > > ----------------------------------------------------------------------- > > Andreas Prlic Wellcome Trust Sanger Institute > Hinxton, Cambridge CB10 1SA, UK > +44 (0) 1223 49 6891 > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From dalke at dalkescientific.com Mon Nov 6 18:18:49 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 6 Nov 2006 19:18:49 +0100 Subject: [DAS2] unified rnc schema Message-ID: <59b794823b04b0963ed41cbd2b51fb0d@dalkescientific.com> the unified schema document is in CVS under das/das2/das2_schemas.rnc This is the merge of the existing rnc files, which were developed and distributed in the spring. There are stubs named types.rnc features.rnc segments.rnc sources.rnc which all look like this include "das2_schemas.rnc" start = sources Meaning that they import the main schema and define the root node appropriately for each specific document type. 
Andrew
dalke at dalkescientific.com

From Gregg_Helt at affymetrix.com Mon Nov 6 18:24:32 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Mon, 6 Nov 2006 10:24:32 -0800
Subject: [DAS2] DAS/2 retrieval spec docs
Message-ID:

Location of DAS/2 get HTML docs:

In the cvs.biodas.org repository (http://code.open-bio.org/cgi/viewcvs.cgi/das/das2/?cvsroot=biodas)

HTML:
das2_protocol.html
das2_get.html

Schema:
draft3/*.rnc

From Steve_Chervitz at affymetrix.com Mon Nov 6 19:16:49 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Mon, 06 Nov 2006 11:16:49 -0800
Subject: [DAS2] Notes from the weekly DAS/2 teleconference, 6 Nov 2006
Message-ID:

Notes from the weekly DAS/2 teleconference, 6 Nov 2006

$Id: das2-teleconf-2006-11-06.txt,v 1.1 2006/11/06 19:13:26 sac Exp $

Note taker: Steve Chervitz

Attendees:
Affy: Steve Chervitz, Gregg Helt, Ed Erwin
CSHL: Lincoln Stein
Dalke Scientific: Andrew Dalke
UAB: Ann Loraine
UCLA: Allen Day, Brian O'connor

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org

DISCLAIMER: The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit.

Agenda
-------
* 2.0 spec freeze discussion

ls: hapmap project repercussion: instability of das/2 spec has led me to recommend against using it for hapmap db. Have had to put hapmap data onto a soap server. Need to have an internal project, would have control over the process. not best possible protocol, but I could promise delivery of working s'ware on a reliable schedule.
ls: I had a real deadline to deliver an adaptor to the NCI by end of Nov, w/o having a spec that is in stone that I can write to, can't deliver by that date, and can't get extension. went on record a year ago saying get spec was stable, good to build on, and it's not. would like to ask that we freeze the spec, remain frozen, the next version be das/3 and we guarantee das/2 is frozen for at least 2 years. gh: ok with that. how do other's feel. ls: if brian gilman can write a das/2 adaptor for cabig by end of nov based on spec now, it's not a crisis. we have two dependent things: (1) is a das/2 adaptor for caCORE that can read das/2 sources, (2) das/2 server for hapmap and das/2 server for vertebrate promoter db. NCI will not accept delivery of das/2 adaptor after end of Nov. If so, then the other two projects (servers) would become irrelevant, and I would withdraw from those two as well. NCI decided the spec was never going to stabilize, so wasn't flexible in giving more time past November. ad: brian wants schema in XSD not rnc. some changes: source -> method ls: two things: (1) is spec changing too much? this conversation was to create enclosing tag to create a group of related feats for streaming purposes. (2) perception issue: killing me because the NCI people read the archives and can see that the das/2 spec is in thrash and is not converging. ad: parent field, adding one single attribute to each feature, not a major alteration. Then we have discussion in ensuing two weeks following teleconf. we can freeze rnc schema now, everything works now. aday: html document is out of date, not sure what's in schema. ad: I sent it around a while back. aday: haven't seen it. gh: I read it. easier to read it than the html doc. ad: html doc doesn't get touched because it's much harder to write. ls: then freeze the rnc, remove html, point people to RNC. gh: nothing wrong with html, but it should say the formal spec is in the RNC. 
ad: rnc it's not complete (e.g., reference genomes are defined on a web page someplace). schema is not going to be the spec. there are somethings that schema definitions can't describe. gh: should have a pointer to the rnc at top of html and say "it is frozen and will stay frozen as das 2.0" [A] place DAS 2.0 frozen notice on html spec doc (after 1-2 day analysis) gh: salvagable situation with NCI? [A] lincoln will notify NCI of DAS/2 schema freeze ad: before people can say yes, i need to check in unified version of the schema, then folks can sign off on a unified document. current doc is in 8 parts. i'll put the unified schema into version control. ls: it's in the 'draft3' subdir. ad: yes. [A] Andrew will consolidate rnc schema document, check into cvs, notify list aday: describes formatting of xml and what each fields do. incomplete to impl a server because it doesn't describe req/response cycle. ad: yes, this is a description of the format. aday: it's an incomplete format. ad: describes the stuff needed to be returned back from the server, so it's complete from a server implementer's point of view. ad: what is in the html that doesn't agree with schema? aday: property response, fasta. timestamp on the html doc has been changed 10/24/06. Need to read this again. ad: I made some minor changes while at the EBI. gh: freeze the schema, freeze the intent of the html to smooth out clarifications, all devs read both schema and html, OK with freezing it in the next day. [A] freeze schema as DAS 2.0 (get), freeze html intent and clarify, by 7 Nov 2006 gh: in light of that, improving the biodas site. andreas suggested turning into wiki doc. need to allow multiple people to edit. steve=biodas.org admin? sc: I tend to do most biodas.org upkeep. bioperl has migrated to a wiki format. can probably borrow their template and set up something similar for biodas.org. 
[A] steve will convert das site into wiki style site ad: typos and xml mistakes in april to the interaction document (writeback) in the last 6 mos. gh: not talking about freezing the writeback portion. ee: is ucla das/2 server working now, top level doc, can't use it via IGB. aday: brian is testing against the affy server, that sources doc is still not responding. [A] allen will fix sources doc on ucla server ad: proxy work (email) accept das/1, interface with das/2. Serving das/2 from a das/1 server. initial result was slow with python's templating lang. new stream based parser with stream based output for doing it. in progress now. gh: what about auto testing of das/2 servers from his registry. talk to andreas about it? ping for alive-ness? ad: should still work easily, can't remember what andreas said about it. tho. [A] gregg will ask andreas about live-ness testing das/2 servers via registry gh: uri's that affy server returns only work for a single version of each genome (latest version). trouble with xml:base that was partially fixed last week. [A] gregg/steve will fix affy server xml:base to support all genome versions gh: stabilize spec, read and sign off the spec, need to address and stabilize. when funding agencies start pulling plugs based on das/2, this is serious. ls: these are management consultants. if promised s'ware product cannot be delivered in working order in time expected, so they make the calc that it's better to cut their losses. ls: need a human readable html doc that's consistent with the rnc document. public declaration on the website that people can rely on it for 2 years. bo: Any developer will want to see this, as well as a reference implementation. gh: read html doc today/tomorrow, with an eye towards agreement with schema. I don't think it's that far off. 

[A] everyone read html doc for agreement with schema, finish by 7 Nov 2006


Other topic: DAS-related projects
----------------------------------
al: NSF plant science cyber infrastructure project: http://www.nsf.gov/pubs/2006/nsf06594/nsf06594.htm
ls: univ of georgia, malmberg. Another one I'm doing in collab with myerowitz.
al: incorporating anything from das, or viz work?
ls: univ of GA plan, my role is in annotating plant pathways. in my project: all community annotation using wiki, kind of a plantopedia. text is not very structured, series of pages which have some constrained fields, genome annotation, genotypes, ontologies, everything else is text annot on top of it. reason: natural language processing has gotten good. people should not start dumbing down their communication with computers, but communicate in english.
al: dense and compact abstract text.
ls: people are identifying regions of text in an xmly way.
al: proposals for centers, akin to ncbi for plant biology... is the plant wiki idea to be a component?
ls: three main parts....
al: do people on das want to write something up for the cyber infrastructure? das/2 seems appropriate.
ls: should talk about das being part of it. proposing an open source api, basically a bus, that allows you to connect the consumers of data with producers of data. a s'ware layer that goes over an opaque transfer protocol. deliberately not be cross platform. A s'ware kit on laptops, prepopulated, autoupdates, spec that s'ware devs can write to. in terms of an xml protocol that people can plug into, people are never going to see that layer.
al: like an OS.
ls: in fact called the plant OS.
al: terrified that nsf will give 1-2 groups all the funding and we'll have a monolithic structure. would like to try many ideas, let free market decide. How can we give the people who do the hard work enough funding to keep them involved, esp if they have 40hr/wk jobs as well.
ls: they don't want cyber infrastruct proj to make awards to people.
that would be taking over NSF's role.
ls: only seven page write ups.
al: I want to propose viz using das and microarray data. webservices and microarray data. takers?
aday: webservices yes.
gh: other commitments now.
bo: mark carlson has been integrating das client into MeV.
al: incorporate into lincoln's proposals?
ls: problem now: i've identified who the PIs are on the project. Microarray viz=owen white at tigr. can't change it now.
ls: An even bigger project that TIGR is involved in now is the biofuels initiative - $250M over 5 years. very important project, bigger than cyber infrastructure proj. biologist, engineers, nanotech, ecologists collaboration.
al: do we want better das servers or cheap fuels?
ls: not for cheap fuels, but global warming.


Wrapup
------
gh: to wrap up this meeting, html documents can be edited out of the repository, latest get specs and rnc docs. no need for an editable web page.
ls: andrew is going to freeze the rnc, then make one pass over the html, then open it up to all devs? could be chaotic via source code control.
gh: will confer with andrew over the plan and let folks know.

[A] Gregg will inform all how to proceed re: html doc editing.

[A] Next DAS/2 conf call next Monday (13 Nov, 9:30am).


From Steve_Chervitz at affymetrix.com Mon Nov 6 22:33:09 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Mon, 06 Nov 2006 14:33:09 -0800
Subject: [DAS2] move biodas website to a wiki?
In-Reply-To:
Message-ID:

Distributing the load of maintaining this site sounds great to me (the de facto maintainer). Wikification is also good for consistency within the open-bio.org family.

I've initiated the process. Might have something preliminary to show next week.

Of course, this means we'll have to come up with an icon. Suggestions? How about an armadillo driving a submarine that looks like a gene structure in a sea of DNA?
Steve > From: "Helt,Gregg" > Date: Mon, 6 Nov 2006 09:13:27 -0800 > To: Andreas Prlic , DAS/2 > Conversation: [DAS2] move biodas website to a wiki? > Subject: Re: [DAS2] move biodas website to a wiki? > > Sounds like a good idea to me. Steve? > > Gregg > >> -----Original Message----- >> From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- >> bio.org] On Behalf Of Andreas Prlic >> Sent: Monday, November 06, 2006 8:34 AM >> To: DAS/2 >> Subject: [DAS2] move biodas website to a wiki? >> >> Hi! >> >> Over the last year several of the open-bio websites like BioPerl or >> BioJava have been moved to a Wiki. >> Looking at the current state of the biodas website, which is getting >> out of date and does not look well maintained I thought it might be >> good to do the same for biodas.org. >> >> We have a couple of announcements which would be good to put there - >> e.g. Ensembl now provides DAS reference and annotation servers for > all >> its genomes, several new DAS-based applications are in the pipeline, >> the DAS registry now counts 170+ DAS servers, etc... >> >> what do you guys think about this idea? >> >> Cheers, >> Andreas >> >> > ----------------------------------------------------------------------- >> >> Andreas Prlic Wellcome Trust Sanger Institute >> Hinxton, Cambridge CB10 1SA, UK >> +44 (0) 1223 49 6891 >> >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From enwired at gmail.com Tue Nov 7 20:53:14 2006 From: enwired at gmail.com (Ed) Date: Tue, 7 Nov 2006 12:53:14 -0800 Subject: [DAS2] Comments on features.rnc Message-ID: <4aa3a7e70611071253j79aba4b2l44bada5613bff598@mail.gmail.com> Here are my comments on the features.rnc document. 

** Fix these comments: Yes, feature style can either partially or fully
** override the feature-type style. (Clients are free to ignore the style, though.)

# how to represent the feature; overrides the STYLE in the
# feature type (but how? completely? or can this override
# the fgcolor but not the other settings?)
style*,

** Remove this comment

# XXX Need use-cases for this
# I think clients should just figure it out from the location
region = element REGION {

** I would very much like the xid to have an optional "name" attribute.
** (The client may have more than one URL link for each feature and it needs
** to be easy for the user to tell them apart.)
** Less important, I would like optionally more than one xid per feature.
** (If you are waiting until someone needs it, well, I am ready for it!)

# Some human-readable external link.
# XXX This needs some way to describe the kind of link
# (primary id, accession), and other information (eg,
# "promotes", "false positive".
# Fixing this will wait until someone needs it.
xid = element XID {
common_attrs,
attribute href { text }
}

** No, this does not need anything else
** But, optionally, it might be possible to have more than one note.

# Does this element need anything else?
note = element NOTE {
common_attrs,
text
}

Thanks,
Ed Erwin

From enwired at gmail.com Tue Nov 7 20:52:40 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 12:52:40 -0800
Subject: [DAS2] Comments on segments.rnc
Message-ID: <4aa3a7e70611071252j7306e325k5f9c4028b0cfdc48@mail.gmail.com>

Segments.rnc looks good. There is just one typo: This example should say
"format=fasta", not "=fast"

# http://localhost/das/sequence/Chromosome1?format=fast

From enwired at gmail.com Tue Nov 7 19:19:28 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 11:19:28 -0800
Subject: [DAS2] sources.rnc
Message-ID: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>

I have nothing substantial to change in sources.rnc.
Just clean up the notes and comments: ** Can this note be improved? # NOTE: the segments capability has optional 'coordinates' # element to state that it implements the given coordinate # system. I could not figure out how to do that in Relax-NG. ##attribute coordinates { text }, ** Several references to "At present...." should be removed ** Can this note be cleaned-up? Which Andreas? Is this the ** full list of reserved words? # 'Chromosome', 'Clone', 'Contig', 'Scaffold', etc. # This is from a restricted vocabulary maintained by Andreas attribute source { text }, From dalke at dalkescientific.com Wed Nov 8 00:59:04 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Wed, 8 Nov 2006 01:59:04 +0100 Subject: [DAS2] We need your input for DAS/2 spec freeze! In-Reply-To: <4550A37F.9080201@pantherinformatics.com> References: <4550A37F.9080201@pantherinformatics.com> Message-ID: Brian: > 1) Best practices for namespacing? What will I encounter in the wild? > or . It would be good to let people know what the > DAS2 specification writers preferred from the implementing community. > I know if doesn't really matter from the spec writer point of view > but, from and engineers point of view it really does. Anything I can > do to reduce the number of chars I'm using in and XML document to save > overhead we will try and do. I expect most will use the default namespace. From "an engineer's point of view" how does it make a difference? Every XML parser has to understand either case, plus the use of any other valid namespace prefix as an alternative. To reduce the number of characters, use the default namespace and use http request negotiation to compress the data stream. Terseness was not a design goal in DAS or DAS2. > 2) Error codes: You have sprinkled error codes into the document for > what an implementor will send back to the caller. 
Would be wonderful > to have all error codes put into one place or at least put into a > little table so we know what to implement when there is an error > condition. Those are the HTTP error codes. See the HTTP spec for them. Experience with DAS1 strongly suggests that another layer of response codes does not work. You end up needing to handle the HTTP response codes plus the higher layer codes -- and in DAS1 you could even return the error in the response XML. None of the DAS1 clients did anything special with the error codes, other than to report the error to the user. So we decided in a meeting some time ago to leave it at that. > Quick question on Features. Spec says, "Servers may respond with an > error if there are too many matching features to return." What error > shall I return here? Would be great to make all the errors explicit so > clients can either display the appropriate error message or recover > gracefully (latter being the most desirable outcome). Depends on the problem and how it's identified. Looking at RFC 2616 section 10 some of the likely ones are 500 - internal server error (eg, if your backend segfaults) 503 - if the server load is too high 504 - if you have a proxy forwarding to an internal server and the internal server takes too long 413 - Request Entity Too Large DAS clients should follow the HTTP spec. Nothing in DAS ended up needing an addition to the HTTP error codes. > 3) For C/C++/Java programmers - it would be great to have a list of > interfaces to code to that are business/institution agnostic - I'm > planning on doing this so maybe put me on the hook for those? Would > like some help with that though... Since I don't know what that means I can't help. > 4) One more plea for XML Schema! Can you guys spit out an XML schema? I can not. I don't understand XML Schema. When I look at it my brain gets fuzzy. 
None of the tools I regularly use understand XML schema, and my experience with schema-based (DTD) parser generators is that they break, badly, when there is a normally forwards-compatible change to the format. > Sorry to sound like a jerk but the RelaxNG website was last update in > Sept 2003! Probably because it because ISO/IEC 19757 and is part of the ISO DSDL effort. More recent work is under the new name; NVDL perhaps? """ISO DSDL was developed in part as a reaction against the PSVI/Type-Annotation approach adopted by XML Schemas.""" http://www.stylusstudio.com/xmldev/200605/post90040.html > I'll try and use Trang to spit out a schema but, again, this piece of > software is old and crusty. Aren't there any Relax-NG data binders so you don't need the conversion step? Since you want JAXB, have you tried its (experimental) Relax-NG support? http://java.sun.com/webservices/docs/1.5/jaxb/relaxng.html http://java.sun.com/developer/EJTechTips/2005/tt0524.html http://www.oxygenxml.com/ says it can convert between grammars, http://www.oxygenxml.com/ xml_schema_editor.html#converting_between_grammars > The converter allows one to convert a DTD or Relax NG (full or compact > syntax) grammar or a set of XML files to an equivalent XML Schema, DTD > or Relax NG (full or compact syntax) grammar. Where perfect > equivalence is not possible due to limitations of the target language > will generate an approximation of the source grammar. The > conversion functionality is available from Tools -> Trang Converter . As you can see, it's using Trang, which you've said is crufty. (Personally I would love it if 5 year old software of mine was still going strong and didn't need any more TLC from me.) There are also the following, but they also seem too dusty for you. 

https://relax-ng.dev.java.net/ (linked from Wikipedia) lists the following

isorelax-jaxp-bridge   ISO RELAX JARV API to JAXP 1.3 validation API bridge
relaxer                XML Schema Compiler
relaxerstudio          Model editor for Relaxer
relaxngcc              Application-level XML parser generator / data-binding tool
rngom                  RELAX NG Object Model / Parser

> Not sure what I'm going to get out of it. Why do I keep asking for
> this? Because I'm LAZY.

And so am I. Why would I want to do this?

> I want to use XML parsing libraries that bind XML Schema to Java
> objects and vice versa. I can also look at a schema and code a SAX
> document handler pretty quickly. Even a DTD would work here because
> it's super easy to convert DTD -> XML Schema. Again, can I entice with
> Beer/Wine? ;-)

The DAS2 schema is not hard. Really. Honestly. We're using a
full-blown, ISO standards based schema definition, and a subset
of that so parsers need only single token lookahead for
disambiguation. It should be as trivially easy to support RNG
as to support a DTD, with the added bonus that DTDs and
namespaces don't mix.

Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Wed Nov 8 01:06:14 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Wed, 8 Nov 2006 02:06:14 +0100
Subject: [DAS2] Comments on segments.rnc
In-Reply-To: <4aa3a7e70611071252j7306e325k5f9c4028b0cfdc48@mail.gmail.com>
References: <4aa3a7e70611071252j7306e325k5f9c4028b0cfdc48@mail.gmail.com>
Message-ID:

Ed:
> Segments.rnc looks good. There is just one typo: This example should say
> "format=fasta", not "=fast"
>
> # http://localhost/das/sequence/Chromosome1?format=fast

Got it. Checked in. Thanks.

Andrew
dalke at dalkescientific.com

From dalke at dalkescientific.com Wed Nov 8 01:36:50 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Wed, 8 Nov 2006 02:36:50 +0100
Subject: [DAS2] sources.rnc
In-Reply-To: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
Message-ID:

Ed:
> I have nothing substantial to change in sources.rnc. Just clean up the
> notes and comments:
>
> ** Can this note be improved?
>
> # NOTE: the segments capability has optional 'coordinates'
> # element to state that it implements the given coordinate
> # system. I could not figure out how to do that in Relax-NG.
> ##attribute coordinates { text },

# NOTE: the segments capability has an optional 'coordinates'
# element describing the supported coordinate system. Because
# of the 'attribute *' a few lines above there is an ambiguity.
# Any capability element may have a 'coordinates' attribute
# so there's no need for an explicit schema declaration.
#attribute coordinates { text }?,

> ** Several references to "At present...." should be removed

At present those have been removed.

> ** Can this note be cleaned-up? Which Andreas? Is this the
> ** full list of reserved words?

I listed Andreas earlier, regarding his use of "das1:types",
etc. in a capability 'type'. I've added those as reserved
fields.

As to the one for

> # 'Chromosome', 'Clone', 'Contig', 'Scaffold', etc.

I've updated that to

# For a full list of the "authority" and "source" values see
# http://das.sanger.ac.uk/registry/help_coordsys.jsp
# This refers to the "physical dimension" of the annotated data.
# The following names are reserved: "Chromosome", "Clone",
# "Contig", "Gene_ID", "NT_Contig", "Protein Sequence",
# "Protein Structure", "Scaffold", "Volume Map".
# The 'source' attribute corresponds to the coordinate
# system 'type' in the DAS registry.
attribute source { text },

# The name of an authority/institution that defines the accession
# codes of a coordinate system or that provides a gene-build.
# See the DAS registry help for a full list of reserved names.
# A partial list is: "BDGP", "EMBL", "Entrez", "KEGG", "MGI", "NCBI",
# "PDBresnum", "SDG", "UniProt" and "ZFISH".
attribute authority { text },

Changes made and das2_schemas.rnc has been checked in.

Andrew
dalke at dalkescientific.com

From enwired at gmail.com Wed Nov 8 01:40:22 2006
From: enwired at gmail.com (Ed)
Date: Tue, 7 Nov 2006 17:40:22 -0800
Subject: [DAS2] sources.rnc
In-Reply-To:
References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
Message-ID: <4aa3a7e70611071740w5bdab602ya61944279a9b78a4@mail.gmail.com>

thanks

2006/11/7, Andrew Dalke :
>
> Changes made and das2_schemas.rnc has been checked in.
>

From ap3 at sanger.ac.uk Wed Nov 8 14:01:40 2006
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Wed, 8 Nov 2006 14:01:40 +0000
Subject: [DAS2] sources.rnc
In-Reply-To:
References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com>
Message-ID: <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk>

Hi Andrew!

> I listed Andreas earlier, regarding his use of "das1:types",
> etc. in a capability 'type'. I've added those as reserved
> fields.

that is good - we are using this sources command now also as a back-port
to describe the DAS/1 servers in the DAS registry.

It might be good to have a link to the DAS - registry in general
somewhere in the sources.rnc
It now has its own domain at http://www.dasregistry.org/

so the sources command is available via:
http://www.dasregistry.org/registry/das1/sources

can you also add
das1:stylesheet
das1:sequence
das1:dna
das1:entry_points
das1:structure
das1:alignment

which are supported by the registry?
das1:segments is not being used currently, so this could be removed.
> I've updated that to > > # For a full list of the "authority" and "source" values see > # http://das.sanger.ac.uk/registry/help_coordsys.jsp can you change that to http://www.dasregistry.org/registry/help_coordsys.jsp please? Thanks. Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Wed Nov 8 19:54:14 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Wed, 8 Nov 2006 11:54:14 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) Message-ID: I'll talk to Suzi in her role as co-PI at NCBO (National Center for Biomedical Ontolgoy). We may be able to quickly work out a URI syntax (even if implementation of what the URIs resolve to comes later). gregg > -----Original Message----- > From: Andrew Dalke [mailto:dalke at dalkescientific.com] > Sent: Tuesday, November 07, 2006 6:23 PM > To: Ed > Cc: Helt,Gregg > Subject: Re: types.rnc > > Ed: > > What bothers me is "still undecided".? That doesn't belong in a > > "frozen" spec.? Though I have no idea what the correct text to put > > here is. > > Take for example > > http://genome.cbs.dtu.dk:9000/das/secretomep/types > > > category="protein sorting" description="Ab initio predictions of > non-classical i.e. not signal peptide triggered protein secretion" > evidence="IEA" > > ontology="http://www.geneontology.org/GO.evidence.shtml">35138 > > It uses an ontology URI to describe which ontology scheme is > used to describe the "evidence" value. In this case it means > "Inferred from Electronic Annotation" > > There is no long-term/stable URL scheme for GO. Do we > make something up? Do we say "use a URL" and leave it > at that? I'll go for the latter as every reasonable > scheme should end up as a URL. > > Except for those which are annotated from multiple ontologies. 
> > > > Andrew > dalke at dalkescientific.com From dalke at dalkescientific.com Wed Nov 8 22:19:53 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Wed, 8 Nov 2006 23:19:53 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <25e4f92b7df100b5e41a829b9dd1737e@dalkescientific.com> On Nov 8, 2006, at 8:54 PM, Helt,Gregg wrote: > I'll talk to Suzi in her role as co-PI at NCBO (National Center for > Biomedical Ontolgoy). We may be able to quickly work out a URI syntax > (even if implementation of what the URIs resolve to comes later). Doesn't saying that it's a URI suffice? Surely we aren't going to restrict it to a single ontology specification? Eg, what about people working on structure feature ontologies? Andrew dalke at dalkescientific.com From cjm at fruitfly.org Wed Nov 8 22:11:09 2006 From: cjm at fruitfly.org (Chris Mungall) Date: Wed, 8 Nov 2006 17:11:09 -0500 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> There absolutely needs to be a stable URI scheme for referencing types defined in ontologies. The details of the scheme aren't clear yet. It will probably be http based (ie not LSID). Do you have specific requirements? Should the URI be a URL dereferenceable in any browser? Should it dereference to html or RDF or use content negotion to decide which? etc On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: > I'll talk to Suzi in her role as co-PI at NCBO (National Center for > Biomedical Ontolgoy). We may be able to quickly work out a URI > syntax (even if implementation of what the URIs resolve to comes > later). > > gregg > >> -----Original Message----- >> From: Andrew Dalke [mailto:dalke at dalkescientific.com] >> Sent: Tuesday, November 07, 2006 6:23 PM >> To: Ed >> Cc: Helt,Gregg >> Subject: Re: types.rnc >> >> Ed: >>> What bothers me is "still undecided". That doesn't belong in a >>> "frozen" spec. 
Though I have no idea what the correct text to put >>> here is. >> >> Take for example >> >> http://genome.cbs.dtu.dk:9000/das/secretomep/types >> >> >> > category="protein sorting" description="Ab initio >> predictions of >> non-classical i.e. not signal peptide triggered protein secretion" >> evidence="IEA" >> >> ontology="http://www.geneontology.org/GO.evidence.shtml">35138 >> >> It uses an ontology URI to describe which ontology scheme is >> used to describe the "evidence" value. In this case it means >> "Inferred from Electronic Annotation" >> >> There is no long-term/stable URL scheme for GO. Do we >> make something up? Do we say "use a URL" and leave it >> at that? I'll go for the latter as every reasonable >> scheme should end up as a URL. >> >> Except for those which are annotated from multiple ontologies. >> >> >> >> Andrew >> dalke at dalkescientific.com > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From enwired at gmail.com Wed Nov 8 22:51:30 2006 From: enwired at gmail.com (Ed) Date: Wed, 8 Nov 2006 14:51:30 -0800 Subject: [DAS2] Fwd: DAS2 unsubscribe notification In-Reply-To: References: Message-ID: <4aa3a7e70611081451lce1ca8dt3bb260ad802065ca@mail.gmail.com> Don't worry, I simply moved my subscription to enwired at gmail.com Ed ---------- Forwarded message ---------- From: mailman-bounces at lists.open-bio.org Date: 8 nov. 2006 14:48 Subject: DAS2 unsubscribe notification To: enwired at gmail.com ed_erwin at affymetrix.com has been removed from DAS2. 
From dalke at dalkescientific.com Thu Nov 9 00:40:57 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 01:40:57 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> References: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> Message-ID: Chris Mungall: > There absolutely needs to be a stable URI scheme for referencing types > defined in ontologies. The details of the scheme aren't clear yet. It > will probably be http based (ie not LSID). > > Do you have specific requirements? Should the URI be a URL > dereferenceable in any browser? Should it dereference to html or RDF > or use content negotion to decide which? etc Browsers can treat them as opaque strings if they don't understand the ontology. Only if they want to do inferencing or interesting visualizations do they need to know about the ontology. As such, for now we expect clients to have a hard-coded list of known ontology identifiers. They do not need to have a default resolver and we have no use case for what that response might look like. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 01:01:08 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 02:01:08 +0100 Subject: [DAS2] sources.rnc In-Reply-To: <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk> References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com> <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk> Message-ID: Andreas: > It might be good to have a link to the DAS - registry in general > somewhere in the sources.rnc > It now has its own domain at http://www.dasregistry.org/ Yes. I had forgotten its name last night. > so the sources command is available via: > http://www.dasregistry.org/registry/das1/sources Any chance of making that URL shorter? It seems long. And it no longer includes das1 sources. Also, I can't find anywhere on the HTML which points to that sources document. 
How does someone find it? Without doing what I did and looking
through the back mailing list archive. ;)

> can you also add
> das1:stylesheet
> das1:sequence
> das1:dna
> das1:entry_points
> das1:structure
> das1:alignment
>
> which are supported by the registry?
> das1:segments is not being used currently, so this could be removed.

Ahh, had gotten the terminology mixed up. All added.

>
>> I've updated that to
>>
>> # For a full list of the "authority" and "source" values see
>> # http://das.sanger.ac.uk/registry/help_coordsys.jsp
>
> can you change that to
> http://www.dasregistry.org/registry/help_coordsys.jsp
> please?

Done.

All the above checked in.

Andrew
dalke at dalkescientific.com

From Steve_Chervitz at affymetrix.com Thu Nov 9 01:07:42 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Wed, 08 Nov 2006 17:07:42 -0800
Subject: [DAS2] Ontology URIs (was RE: types.rnc)
In-Reply-To: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org>
Message-ID:

Seems like we may need to freeze the spec in a state that is fairly
non-committal w/r/t how ontology identifiers work. I propose to remove the
parts that are still not nailed down, so that we don't engender the creation
of mutually incompatible implementations (one of the problems with DAS/1
which DAS/2 is aiming to fix).

The ontology attribute in the type element is currently documented as:

# ontology identifier. The naming scheme is still undecided.
# This will be a URI.
attribute ontology { text }?,

I think this is too vague. It's subject to lots of interpretation as to what
it could point at and what it might resolve to. It could justifiably be used
to identify any of these:

- a specific term in an ontology
- the ontology as a whole (e.g., homepage of GO)
- evidence code (as in the example below)

The so_accession attribute gets us most of what we want and should suffice
for this freeze. In one fell swoop it identifies the ontology and a
particular term within it, and it defers the issue of ontology URIs.
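
To make the so_accession point concrete, here is a minimal Python sketch of how a client might split such a value into ontology and term. It assumes accessions follow the usual Sequence Ontology form, "SO:" followed by seven digits; the thread does not spell out that format, and the helper name parse_so_accession is hypothetical.

```python
import re

# Assumption: SO accessions look like "SO:0000704"
# ("SO:" followed by exactly seven digits).
SO_ACCESSION = re.compile(r"SO:\d{7}")


def parse_so_accession(value):
    """Return (ontology, term_number) for a well-formed accession, else None."""
    if SO_ACCESSION.fullmatch(value) is None:
        return None
    ontology, term = value.split(":")
    return ontology, int(term)
```

A value that is really an ontology URL or an evidence code fails the pattern and comes back as None, so a client can tell the two apart without resolving anything.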
Some SO things to consider: 1) Should so_accession be restricted to SOFA (only locatable feature types)? If so, call it sofa_accession. (maybe too limiting) 2) What about SO versioning? Maybe a 'so_version' attribute would make sense (so_version="SOFA 2.1"). SO term IDs are stable across releases, but sometimes terms become obsolete and are no longer listed. Steve > From: Chris Mungall > Date: Wed, 8 Nov 2006 17:11:09 -0500 > To: "Helt,Gregg" > Cc: DAS/2 > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) > > > There absolutely needs to be a stable URI scheme for referencing > types defined in ontologies. The details of the scheme aren't clear > yet. It will probably be http based (ie not LSID). > > Do you have specific requirements? Should the URI be a URL > dereferenceable in any browser? Should it dereference to html or RDF > or use content negotion to decide which? etc > > On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: > >> I'll talk to Suzi in her role as co-PI at NCBO (National Center for >> Biomedical Ontolgoy). We may be able to quickly work out a URI >> syntax (even if implementation of what the URIs resolve to comes >> later). >> >> gregg >> >>> -----Original Message----- >>> From: Andrew Dalke [mailto:dalke at dalkescientific.com] >>> Sent: Tuesday, November 07, 2006 6:23 PM >>> To: Ed >>> Cc: Helt,Gregg >>> Subject: Re: types.rnc >>> >>> Ed: >>>> What bothers me is "still undecided". That doesn't belong in a >>>> "frozen" spec. Though I have no idea what the correct text to put >>>> here is. >>> >>> Take for example >>> >>> http://genome.cbs.dtu.dk:9000/das/secretomep/types >>> >>> >>> >> category="protein sorting" description="Ab initio >>> predictions of >>> non-classical i.e. not signal peptide triggered protein secretion" >>> evidence="IEA" >>> >>> ontology="http://www.geneontology.org/GO.evidence.shtml">35138 >>> >>> It uses an ontology URI to describe which ontology scheme is >>> used to describe the "evidence" value. 
In this case it means >>> "Inferred from Electronic Annotation" >>> >>> There is no long-term/stable URL scheme for GO. Do we >>> make something up? Do we say "use a URL" and leave it >>> at that? I'll go for the latter as every reasonable >>> scheme should end up as a URL. >>> >>> Except for those which are annotated from multiple ontologies. >>> >>> >>> >>> Andrew >>> dalke at dalkescientific.com >> >> >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 >> > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From enwired at gmail.com Thu Nov 9 01:13:19 2006 From: enwired at gmail.com (Ed) Date: Wed, 8 Nov 2006 17:13:19 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: <8D4E0D97-D083-4A85-AED3-EFD369F9390A@fruitfly.org> Message-ID: <4aa3a7e70611081713w475270b9i3d7b5c3072efc1f4@mail.gmail.com> so_accession sounds fine with me. 2006/11/8, Steve Chervitz : > > Seems like we may need to freeze the spec in a state that is fairly > non-committal w/r/t how ontology identifiers work. I propose to remove the > parts that are still not nailed down, so that we don't engender the > creation > of mutually incompatible implementations (one of the problems with DAS/1 > which DAS/2 is aiming at). > > The ontology attribute in the type element is currently documented as: > > # ontology identifier. The naming scheme is still undecided. > # This will be a URI. > attribute ontology { text }?, > > I think this is too vague. It's subject to lots of interpretation as to > what > it could point at and what it might resolve to. 
It could justifiably be used to identify any of these:
>
> - a specific term in an ontology
> - the ontology as a whole (e.g., homepage of GO)
> - evidence code (as in the example below)
>
> The so_accession attribute gets us most of what we want and should suffice
> for this freeze. In one fell swoop it identifies the ontology and a
> particular term within it, and it defers the issue of ontology URIs.
>
> Some SO things to consider:
>
> 1) Should so_accession be restricted to SOFA (only locatable feature types)?
> If so, call it sofa_accession. (maybe too limiting)
>
> 2) What about SO versioning? Maybe a 'so_version' attribute would make sense
> (so_version="SOFA 2.1"). SO term IDs are stable across releases, but
> sometimes terms become obsolete and are no longer listed.
>
> Steve
>

From Steve_Chervitz at affymetrix.com Thu Nov 9 01:51:06 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Wed, 08 Nov 2006 17:51:06 -0800
Subject: [DAS2] Identifiers and URIs
Message-ID:

All DAS/2 elements are identified with a uri attribute, but their
documentation isn't consistent. So I'm recommending this be tightened up a bit.

Some examples from das2_schemas.rnc:

# URL pointing directly to the given TYPE
uri,

# URL pointing directly to the feature
uri,

# URL for the actual sequence data. It implements the DAS2
# sequence request interface.
uri,

# A unique identifier for this coordinate.
# This is an abstract identifier and might not be resolvable.
# Two coordinates are the same if and only if they have the
# same URI.
uri,

# unique URI for the SOURCE
# Each source URI must be unique in sources list
uri,

I propose that all such comments have a consistent wording. How about this:

# A unique identifier for this [object-type]
uri

If the entity is resolvable, then add:

# This URL is resolvable to this [object-type] from a DAS/2 server.

Otherwise:

# This is an abstract identifier and might not be resolvable.
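
The difference between a resolvable URL and an abstract identifier shows up mostly when a client compares uri values. A minimal Python sketch, assuming relative uri attributes are first made absolute against the document's xml:base before comparison; the base URL, the relative path and the function name below are made up for illustration and do not come from any real server.

```python
from urllib.parse import urljoin


def absolute_uri(xml_base, uri):
    """Resolve a possibly relative uri attribute against an xml:base value."""
    return urljoin(xml_base, uri)


# Hypothetical values for illustration only.
base = "http://example.org/das/genome/human/1/"
a = absolute_uri(base, "segment/chr1")
b = absolute_uri(base, "http://example.org/das/genome/human/1/segment/chr1")

# Two identifiers are treated as the same only if the absolute URIs match.
same = (a == b)
```

Whether the resulting URI can then be fetched is a separate question; the comparison works the same way for abstract identifiers.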
In the abbreviations section of the rnc, the uri itself is described as: # URI to an object defined by the DAS spec uri = attribute uri { text } I'd change this to: # URL used to identify an object defined in a DAS/2 document. There are some places in the HTML retrieval document that could be updated to state 'uri' instead of 'id'. In the sources section: "All identifiers and href attributes ... follow the XML Base ..." Recommended change: "All uri and href attributes ... follow the XML Base ..." Another sentence in sources that could use a s/id/uri/g: "Each SOURCE id and VERSION id is fetchable." In the types section: "The 'uri' attribute is a URI ..." Change to: "The 'uri' attribute is a URL ..." Steve From boconnor at ucla.edu Thu Nov 9 00:01:01 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Wed, 08 Nov 2006 16:01:01 -0800 Subject: [DAS2] DAS/2 Server on biopackages.net Message-ID: <45526FBD.1010201@ucla.edu> Hi, FYI: the DAS/2 server on biopackages.net will be down while I try to fix the bug Ed reported on empty domain/source/versioned source documents. I'll email the list when the server is available again, which should be in a couple of hours. --Brian From boconnor at ucla.edu Thu Nov 9 04:05:52 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Wed, 08 Nov 2006 20:05:52 -0800 Subject: [DAS2] DAS/2 Server on Biopackages.net Message-ID: <4552A920.8020001@ucla.edu> Hi, I brought the DAS/2 server back online and the bug with empty domain/source/versioned source documents should now be fixed. See: http://das.biopackages.net/das/genome. Also, I temporarily turned server caching off so I can make sure the server is responding correctly. I'll turn caching back on tomorrow after I've finished debugging/checking the responses. The server is now using CVS HEAD. Anyway, I've been looking over the output from our server and comparing it to the HTML spec and the RNC schema doc from cvs.
I have a few comments/questions/bugs:

Potential bugs on das.biopackages.net:

* FIXED: segments response was missing xmlns
* FIXED: domain/source/versioned source docs are not populated correctly
* Coordinates is missing the source attribute (it's empty) which is required in the RNC
* The capability responses look like: >>>> <<<< Whereas the HTML spec and RNC doc use "features", "types", and "segments". Should this be changed on the biopackages.net server?

Questions about HTML spec/RNC doc:

* The segments element has a required attribute of "uri" in the RNC doc, is this correct? The biopackages.net server only has a uri for a given segment and the examples from the HTML are the same.
* It's a little confusing to have the "overview" and "detailed" sections separate in the HTML spec. I think it would make more sense to put the detailed section right after each overview or at least provide an anchor link at the end of each overview.
* Anchor links are broken throughout the html doc.
* The RNC mentions the type attribute under capability with:
>>>>
# A term describing the capability. The following are reserved
# terms: segments, features, locks, writeback, das1:segments,
# das1:types, das1:features
attribute type { text },
<<<<
Types should be listed here too. Also, could this be defined with: attribute type { "segments" | "features" | "types" | "locks" | "..." } to make it clearer?

Please let Allen or me know if you have any problems using the biopackages.net server.
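To make the enumeration question concrete, here is a minimal Python sketch of how a client might treat the capability type attribute. The term list is copied from the RNC comment quoted above, with "types" added as suggested; the helper name `classify_capability_type` and the "reserved"/"extension" labels are hypothetical, and treating unknown terms as extensions (rather than errors) is an assumption that follows from the attribute currently being declared as free text.

```python
# Reserved capability terms, copied from the RNC comment quoted above,
# plus "types", which the message above suggests should also be listed.
RESERVED_CAPABILITY_TYPES = frozenset({
    "segments", "features", "types", "locks", "writeback",
    "das1:segments", "das1:types", "das1:features",
})


def classify_capability_type(term: str) -> str:
    """Return "reserved" for a known term and "extension" for anything else.

    Hypothetical helper: the RNC declares the attribute as { text }, so an
    unknown term is treated here as an extension rather than rejected.
    """
    return "reserved" if term in RESERVED_CAPABILITY_TYPES else "extension"


if __name__ == "__main__":
    for term in ("features", "types", "feature"):
        print(term, "->", classify_capability_type(term))
```

If the schema switched to a closed enumeration as asked above, a term such as the singular "feature" would fail validation instead of being silently accepted as an extension.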
--Brian From Gregg_Helt at affymetrix.com Thu Nov 9 18:07:32 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 9 Nov 2006 10:07:32 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) Message-ID: > -----Original Message----- > From: Chervitz, Steve > Sent: Wednesday, November 08, 2006 5:08 PM > To: Chris Mungall; Helt,Gregg > Cc: DAS/2 Discussion > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) > > Seems like we may need to freeze the spec in a state that is fairly > non-committal w/r/t how ontology identifiers work. I propose to remove the > parts that are still not nailed down, so that we don't engender the > creation > of mutually incompatible implementations (one of the problems with DAS/1 > which DAS/2 is aiming at). > > The ontology attribute in the type element is currently documented as: > > # ontology identifier. The naming scheme is still undecided. > # This will be a URI. > attribute ontology { text }?, > > I think this is too vague. It's subject to lots of interpretation as to > what > it could point at and what it might resolve to. It could justifiably be > used > to identify any of these: > > - a specific term in an ontology > - the ontology as a whole (e.g., homepage of GO) > - evidence code (as in the example below) > The so_accession attribute gets us most of what we want and should suffice > for this freeze. In one fell swoop it identifies the ontology and a > particular term within it, and it defers the issue of ontology URIs. > > Some SO things to consider: > > 1) Should so_accession be restricted to SOFA (only locatable feature > types)? > If so, call it sofa_accession. (maybe too limiting) > > 2) What about SO versioning? Maybe a 'so_version' attribute would make > sense > (so_version="SOFA 2.1"). SO term IDs are stable across releases, but > sometimes terms become obsolete and are no longer listed. 
> > Steve > The "ontology" attribute of the TYPE element is meant to be an identifier for a specific ontology term in the SO or SOFA. It (and its placeholder, "so_accession") is the only place where any part of DAS/2 depends directly on an ontology. GO terms (or any other ontology) can be used as properties of features -- the biopackages server does this for example. But it is done using a generic property mechanism that makes no mention of ontologies, and the DAS/2 spec does not mention or depend on any ontology other than SO. The reason there is both an "ontology" and "so_accession" attribute is that we didn't have an official SO URI syntax to refer to, so we created a temporary "so_accession" attribute to use until we had something to put in for "ontology". Since the ontology attribute can _only_ be from SO or SOFA, I agree with Steve that we could collapse "so_accession" and "ontology" down to one attribute and use a prefix shorthand for SO/SOFA terms, for example "SO:0000147". This has the nice property that the shorthand is in fact a legal absolute URI, and therefore unaffected by any "xml:base" attributes in the document. I'd instead prefer this URI to be a URL that points to a description at the biomedical ontology center. But specifying that the attribute is a URI allows both the shorthand and later a more official link. Allen Day and Brian O'Connor have implemented an ontology server with an HTTP API that fits in very well with DAS/2, where each ontology term has its own URI. This was discussed back on the DAS/2 mailing list in February and I think Chris had some concerns, here's the start of the thread: http://portal.open-bio.org/pipermail/das2/2006-February/000507.html . To avoid divergence I've been reluctant to devote more resources to this unless it was in collaboration with the ontology center. I don't think we really need SO versioning -- to be useful it places an extra burden on the ontology maintainers. 
And looking at the current SO, when a term becomes obsolete it is still included in the ontology, it just gets flagged with an "is_obsolete:true" tag. Andrew's comment below made me realize we may have another problem -- not annotation with multiple ontologies, but rather annotation with multiple terms from the SO. I had thought each feature type could be based on a single ontology term (maybe using SO composite terms: http://www.bioontology.org/wiki/index.php/SO:Composite_Terms), but looking at the latest SO I don't think we can make this assumption. Which argues that "so_accession" should be a child element of TYPE rather than an attribute, and one or more be allowed. Or am I reading the SO wrong? Lincoln? Chris? As far as Chris' question as to what exactly an ontology URL should dereference to, relative to the DAS/2 spec I don't think it matters too much. An XML response with some structured description like what Allen's server returns would be nice, but I could see the benefits of HTML as the default too. Did I mention I'm a fan of content negotiation? In most of the DAS/2 HTTP GET requests, we have optional "format=" query parameter arguments to allow alternative format requests even in situations where HTTP content negotiation is not straightforward. Gregg > > From: Chris Mungall > > Date: Wed, 8 Nov 2006 17:11:09 -0500 > > To: "Helt,Gregg" > > Cc: DAS/2 > > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) > > > > > > There absolutely needs to be a stable URI scheme for referencing > > types defined in ontologies. The details of the scheme aren't clear > > yet. It will probably be http based (ie not LSID). > > > > Do you have specific requirements? Should the URI be a URL > > dereferenceable in any browser? Should it dereference to html or RDF > > or use content negotion to decide which? etc > > > > On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: > > > >> I'll talk to Suzi in her role as co-PI at NCBO (National Center for > >> Biomedical Ontolgoy). 
We may be able to quickly work out a URI > >> syntax (even if implementation of what the URIs resolve to comes > >> later). > >> > >> gregg > >> > >>> -----Original Message----- > >>> From: Andrew Dalke [mailto:dalke at dalkescientific.com] > >>> Sent: Tuesday, November 07, 2006 6:23 PM > >>> To: Ed > >>> Cc: Helt,Gregg > >>> Subject: Re: types.rnc > >>> > >>> Ed: > >>>> What bothers me is "still undecided". That doesn't belong in a > >>>> "frozen" spec. Though I have no idea what the correct text to put > >>>> here is. > >>> > >>> Take for example > >>> > >>> http://genome.cbs.dtu.dk:9000/das/secretomep/types > >>> > >>> > >>> >>> category="protein sorting" description="Ab initio > >>> predictions of > >>> non-classical i.e. not signal peptide triggered protein secretion" > >>> evidence="IEA" > >>> > >>> ontology="http://www.geneontology.org/GO.evidence.shtml">35138 > >>> > >>> It uses an ontology URI to describe which ontology scheme is > >>> used to describe the "evidence" value. In this case it means > >>> "Inferred from Electronic Annotation" > >>> > >>> There is no long-term/stable URL scheme for GO. Do we > >>> make something up? Do we say "use a URL" and leave it > >>> at that? I'll go for the latter as every reasonable > >>> scheme should end up as a URL. > >>> > >>> Except for those which are annotated from multiple ontologies. 
> >>> > >>> > >>> > >>> Andrew > >>> dalke at dalkescientific.com > >> > >> > >> _______________________________________________ > >> DAS2 mailing list > >> DAS2 at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/das2 > >> > > > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 From dalke at dalkescientific.com Thu Nov 9 22:30:51 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 23:30:51 +0100 Subject: [DAS2] TYPE[@source] -> TYPE[@method] In-Reply-To: References: Message-ID: <63d2531f67ae3890c9fa4aacf7bd0dff@dalkescientific.com> [dup to Gregg; had forgotten change the reply to all of das2] On Nov 6, 2006, at 4:58 PM, Helt,Gregg wrote: > I agree that multiple uses of "source" makes it confusing, and that for > types "method" is a reasonable alternative. On a related note, do we > really need both "title" and "source/method" attributes for types? The "method" attribute is the method used to generate features of the given type. Eg, "Genscan 1.23". The title is a human readable string about the type. I've been thinking of it as Server A: Type1 = "high confidence gene predictions" from "Genscan 1.23" so_accession="0000704" Type2 = "low confidence gene predictions" from "Genscan 1.23" so_accession="0000704" Server B: Type3 = "high confidence gene predictions" from "HMMGene 1.1" so_accession="0000704" Type4 = "low confidence gene predictions" from "HMMGene 1.1" so_accession="0000704" where the types are used to get different styles; perhaps different colors. The example in the RNC was ambiguous on this. It used "binding site" as the sole example. I've added "High confidence Genscan predictions" as a title and changed the genscan method example from "genscan" to "Genscan 1.23" BTW, as a client implementor, how do you lay these on a track? I presume information about track sharing goes in the stylesheet? 
Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 22:55:49 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 9 Nov 2006 23:55:49 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: Steve: > The ontology attribute in the type element is currently documented as: > > # ontology identifier. The naming scheme is still undecided. > # This will be a URI. > attribute ontology { text }?, > > I think this is too vague. It's subject to lots of interpretation as > to what > it could point at and what it might resolve to. It could justifiably > be used > to identify any of these: Indeed it could. There are other parts of DAS which are URI identifiable but which are not guaranteed to be resolvable. Eg, the individual features from a feature search don't need to be resolvable. It could be dealt with as an opaque string. Excepting how it interacts with relative url absolutizing. > The so_accession attribute gets us most of what we want and should > suffice > for this freeze. In one fell swoop it identifies the ontology and a > particular term within it, and it defers the issue of ontology URIs. What about leaving it there as "this is reserved for future use"? > Some SO things to consider: > > 1) Should so_accession be restricted to SOFA (only locatable feature > types)? > If so, call it sofa_accession. (maybe too limiting) I have no experience with this to guide me. I'm a structure guy. ;) > 2) What about SO versioning? Maybe a 'so_version' attribute would make > sense > (so_version="SOFA 2.1"). SO term IDs are stable across releases, but > sometimes terms become obsolete and are no longer listed. No. That does not work, for two reasons. You say the IDs are stable across releases. I assume that includes that obsolete ones are not reused. If the client knows how to interpret "2.1" to get information about an old identifier then it knows how to find the identifier in a list. 
Other reason - you're reinventing the semantics described by LSIDs. Why not just create an lsid naming scheme like urn:lsid:biodas.org:sofa-2.1:0000123 and use the URI? Okay, there's a third. Suppose the client knows nothing about the SO term, even with the version information. (Eg, it's a new version, new term, and the client hasn't been updated; or there's a bug in the server code causing all numbers to be twice as large.) What does the client do? I assert that it will treat unknown or missing ontology terms as being identical to a direct descendant of the root node of SO. Hence obsolete, new and erroneous terms are treated the same, so having the extra version field doesn't help the client. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 23:17:04 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 00:17:04 +0100 Subject: [DAS2] Adding an optional "searchable" attribute to element In-Reply-To: References: Message-ID: Gregg: > In the last DAS/2 teleconference I brought up again the idea of an > optional "searchable" or "filter" attribute for the elements > returned from a types query -- if present and "false", then that type > should not be used in a feature query filter. Too tired to work on this. Tomorrow. Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Nov 9 23:32:59 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 00:32:59 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: Gregg: > The reason there is both an "ontology" and "so_accession" attribute is > that we didn't have an official SO URI syntax to refer to, Didn't -- and still don't, right? > I agree with Steve that we could collapse "so_accession" and > "ontology" down to one attribute and use a prefix shorthand for SO/SOFA > terms, for example "SO:0000147". We could.
When making the RNC I left the "SO:" prefix out deliberately in the number, leaving "0000147". The reason was to be very insistent that it was SO and only SO that could go there. Else people would start adding other terms, because after all the format is obviously "namespace" + "version number". > This has the nice property that the > shorthand is in fact a legal absolute URI, and therefore unaffected by > any "xml:base" attributes in the document. I'd instead prefer this URI > to be a URL that points to a description at the biomedical ontology > center. But specifying that the attribute is a URI allows both the > shorthand and later a more official link. But if we go for systems with no default resolver, why not use LSIDs? urn:lsid:biodas.org:go:0000147 > Andrew's comment below made me realize we may have another problem -- > not annotation with multiple ontologies, but rather annotation with > multiple terms from the SO. The type record (like most other records) has a slot at the end for arbitrary non-das2-namespaced XML elements. When this gets to be a problem let people experiment with various ways to do it. Eg, No reason to solve it now, as we've no data which needs this. Andrew dalke at dalkescientific.com From cjm at fruitfly.org Thu Nov 9 23:35:46 2006 From: cjm at fruitfly.org (Chris Mungall) Date: Thu, 9 Nov 2006 15:35:46 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <60DAD5CB-763E-4F65-85FA-3FC2E33B5A9C@fruitfly.org> On Nov 9, 2006, at 10:07 AM, Helt,Gregg wrote: > > >> -----Original Message----- >> From: Chervitz, Steve >> Sent: Wednesday, November 08, 2006 5:08 PM >> To: Chris Mungall; Helt,Gregg >> Cc: DAS/2 Discussion >> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) >> >> Seems like we may need to freeze the spec in a state that is fairly >> non-committal w/r/t how ontology identifiers work.
I propose to >> remove > the >> parts that are still not nailed down, so that we don't engender the >> creation >> of mutually incompatible implementations (one of the problems with > DAS/1 >> which DAS/2 is aiming at). >> >> The ontology attribute in the type element is currently documented >> as: >> >> # ontology identifier. The naming scheme is still undecided. >> # This will be a URI. >> attribute ontology { text }?, >> >> I think this is too vague. It's subject to lots of interpretation as > to >> what >> it could point at and what it might resolve to. It could justifiably > be >> used >> to identify any of these: >> >> - a specific term in an ontology >> - the ontology as a whole (e.g., homepage of GO) >> - evidence code (as in the example below) >> The so_accession attribute gets us most of what we want and should > suffice >> for this freeze. In one fell swoop it identifies the ontology and a >> particular term within it, and it defers the issue of ontology URIs. >> >> Some SO things to consider: >> >> 1) Should so_accession be restricted to SOFA (only locatable feature >> types)? >> If so, call it sofa_accession. (maybe too limiting) >> >> 2) What about SO versioning? Maybe a 'so_version' attribute would >> make >> sense >> (so_version="SOFA 2.1"). SO term IDs are stable across releases, but >> sometimes terms become obsolete and are no longer listed. >> >> Steve >> > > The "ontology" attribute of the TYPE element is meant to be an > identifier for a specific ontology term in the SO or SOFA. It (and > its > placeholder, "so_accession") is the only place where any part of DAS/2 > depends directly on an ontology. GO terms (or any other ontology) can > be used as properties of features -- the biopackages server does this > for example. But it is done using a generic property mechanism that > makes no mention of ontologies, and the DAS/2 spec does not mention or > depend on any ontology other than SO. 
> > The reason there is both an "ontology" and "so_accession" attribute is > that we didn't have an official SO URI syntax to refer to, so we > created > a temporary "so_accession" attribute to use until we had something to > put in for "ontology". Since the ontology attribute can _only_ be > from > SO or SOFA, I agree with Steve that we could collapse > "so_accession" and > "ontology" down to one attribute and use a prefix shorthand for SO/ > SOFA > terms, for example "SO:0000147". This has the nice property that the > shorthand is in fact a legal absolute URI, and therefore unaffected by > any "xml:base" attributes in the document. I'd instead prefer this > URI > to be a URL that points to a description at the biomedical ontology > center. But specifying that the attribute is a URI allows both the > shorthand and later a more official link. > > Allen Day and Brian O'Connor have implemented an ontology server > with an > HTTP API that fits in very well with DAS/2, where each ontology > term has > its own URI. This was discussed back on the DAS/2 mailing list in > February and I think Chris had some concerns, here's the start of the > thread: > http://portal.open-bio.org/pipermail/das2/2006-February/000507.html . > To avoid divergence I've been reluctant to devote more resources to > this > unless it was in collaboration with the ontology center. well I wouldn't like to hold anything up! By december it will be possible to browse all OBO ontologies, but any plans for providing stables URIs and programmatic access will probably wait til next year. If you have an ontology server ready, go with it. It's still unclear what the best approach is for serving up ontologies is, though the future is looking decidedly rdf/owl/sparqly. > I don't think we really need SO versioning -- to be useful it > places an > extra burden on the ontology maintainers. 
And looking at the current > SO, when a term becomes obsolete it is still included in the ontology, > it just gets flagged with an "is_obsolete:true" tag. I agree. This is policy for all good OBO ontologies; any change in the substance of a definition results in a new ID. > Andrew's comment below made me realize we may have another problem -- > not annotation with multiple ontologies, but rather annotation with > multiple terms from the SO. I had thought each feature type could be > based on a single ontology term (maybe using SO composite terms: > http://www.bioontology.org/wiki/index.php/SO:Composite_Terms), but > looking at the latest SO I don't think we can make this assumption. > Which argues that "so_accession" should be a child element of TYPE > rather than an attribute, and one or more be allowed. Or am I reading > the SO wrong? Lincoln? Chris? Any DAS feature F should be associated with a single SO:located_sequence_feature T. (I would submit that the formal interpretation of this be: all actual genomic entities that instantiate the pattern represented by F should instantiate the pattern represented by T.) However, a feature can be associated with multiple properties - these will be subtypes of SO:attribute. > As far as Chris' question as to what exactly an ontology URL should > dereference to, relative to the DAS/2 spec I don't think it matters > too > much. An XML response with some structured description like what > Allen's server returns would be nice, but I could see the benefits of > HTML as the default too. There is a discussion on public-semweb-lifesci on the relative merits of content negotiation with URIs right now. > Did I mention I'm a fan of content > negotiation? In most of the DAS/2 HTTP GET requests, we have optional > "format=" query parameter arguments to allow alternative format > requests > even in situations where HTTP content negotiation is not > straightforward.
That's fine, on the understanding that suffixing the "format=" creates a different URI. > Gregg > >>> From: Chris Mungall >>> Date: Wed, 8 Nov 2006 17:11:09 -0500 >>> To: "Helt,Gregg" >>> Cc: DAS/2 >>> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc) >>> >>> >>> There absolutely needs to be a stable URI scheme for referencing >>> types defined in ontologies. The details of the scheme aren't clear >>> yet. It will probably be http based (ie not LSID). >>> >>> Do you have specific requirements? Should the URI be a URL >>> dereferenceable in any browser? Should it dereference to html or RDF >>> or use content negotion to decide which? etc >>> >>> On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote: >>> >>>> I'll talk to Suzi in her role as co-PI at NCBO (National Center for >>>> Biomedical Ontolgoy). We may be able to quickly work out a URI >>>> syntax (even if implementation of what the URIs resolve to comes >>>> later). >>>> >>>> gregg >>>> >>>>> -----Original Message----- >>>>> From: Andrew Dalke [mailto:dalke at dalkescientific.com] >>>>> Sent: Tuesday, November 07, 2006 6:23 PM >>>>> To: Ed >>>>> Cc: Helt,Gregg >>>>> Subject: Re: types.rnc >>>>> >>>>> Ed: >>>>>> What bothers me is "still undecided". That doesn't belong in a >>>>>> "frozen" spec. Though I have no idea what the correct text to > put >>>>>> here is. >>>>> >>>>> Take for example >>>>> >>>>> http://genome.cbs.dtu.dk:9000/das/secretomep/types >>>>> >>>>> >>>>> >>>> category="protein sorting" description="Ab initio >>>>> predictions of >>>>> non-classical i.e. not signal peptide triggered protein secretion" >>>>> evidence="IEA" >>>>> >>>>> > ontology="http://www.geneontology.org/GO.evidence.shtml">35138 >>>>> >>>>> It uses an ontology URI to describe which ontology scheme is >>>>> used to describe the "evidence" value. In this case it means >>>>> "Inferred from Electronic Annotation" >>>>> >>>>> There is no long-term/stable URL scheme for GO. Do we >>>>> make something up?
Do we say "use a URL" and leave it >>>>> at that? I'll go for the latter as every reasonable >>>>> scheme should end up as a URL. >>>>> >>>>> Except for those which are annotated from multiple ontologies. >>>>> >>>>> >>>>> >>>>> Andrew >>>>> dalke at dalkescientific.com >>>> >>>> >>>> _______________________________________________ >>>> DAS2 mailing list >>>> DAS2 at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/das2 >>>> >>> >>> _______________________________________________ >>> DAS2 mailing list >>> DAS2 at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/das2 > > From dalke at dalkescientific.com Thu Nov 9 23:39:06 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 00:39:06 +0100 Subject: [DAS2] Identifiers and URIs In-Reply-To: References: Message-ID: Steve: > All DAS/2 elements are identified with a uri attribute, but their > documentation isn't consistent. So I'm recommending this be tighted up > a > bit. I did. I said that "URI" refers to DAS2 objects and to external resources (like ontology) treated mostly as an identifier. "URL" and "href" are used for things viewed in more generic browsers. > I propose that all such comments have a consistent wording. How about > this: > > # A unique identifier for this [object-type] > uri Is the spec really so advanced that it's time to do proofing at this level? There are sections in the HTML spec labeled "XXX" because I'm hoping for feedback people concerning the questions listed therein. In talking with Gregg we finished up one of the biggest ones; the XID. We decided to steal from HTML4' "link" element. I've cleaned up the wording and filled in some more details. All checked in. It's 12:40. g'night. 
Andrew dalke at dalkescientific.com From Gregg_Helt at affymetrix.com Thu Nov 9 23:56:42 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Thu, 9 Nov 2006 15:56:42 -0800 Subject: [DAS2] DAS/2 teleconference on Monday Message-ID: Just wanted to remind everyone that we're having an extra DAS/2 teleconference next Monday, at 9:30 AM PST. The agenda is to review this week's spec finalization for release of a frozen DAS/2.0 protocol. Dialin (US): 800-531-3250 Dialin (Intl): 303-928-2693 Conference ID: 2879055 Passcode: 1365 From cjm at fruitfly.org Thu Nov 9 23:57:04 2006 From: cjm at fruitfly.org (Chris Mungall) Date: Thu, 9 Nov 2006 15:57:04 -0800 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: References: Message-ID: <1FF932B3-EEBD-47C0-8C55-7F9DB1FD937A@fruitfly.org> On Nov 9, 2006, at 3:32 PM, Andrew Dalke wrote: > Gregg: >> The reason there is both an "ontology" and "so_accession" >> attribute is >> that we didn't have an official SO URI syntax to refer to, > > Didn't -- and still don't, right? is this something we should be doing? Karen, is this why you wanted to serve up per-term xml off of sequenceontology.org? It may make more sense to serve up RDF/OWL here I would really like to have a scheme that leaves the OBO ID unviolated; eg http://www.sequenceontology.org/owl#SO:0000001 Unfortunately jena has fatal problems with numbers immediately following the ':'. And Jena is one of the most commonly used RDF tools. Sigh This would work: http://www.sequenceontology.org/owl/SO#SO_0000001 but unfortunately involves string hacking on the ID >> I agree with Steve that we could collapse "so_accession" and >> "ontology" down to one attribute and use a prefix shorthand for SO/ >> SOFA >> terms, for example "SO:0000147". > > We could. When making the RNC I left the "SO:" prefix out > deliberately in the number, leaving "0000147". The reason was to > be very insistent that it was SO and only SO that could to there. This seems overly defensive. 
It would seem cleaner to use the same ID scheme throughout > Else people would start adding other terms, because after all the > format is obviously "namespace" + "version number". > >> This has the nice property that the >> shorthand is in fact a legal absolute URI, and therefore >> unaffected by >> any "xml:base" attributes in the document. I'd instead prefer >> this URI >> to be a URL that points to a description at the biomedical ontology >> center. But specifying that the attribute is a URI allows both the >> shorthand and later a more official link. > > But if we go for systems with no default resolver, why not use > LSIDs? > > url:lsid:biodas.org:go:0000147 LSIDs uniquely identify sequences of bytes. The sequence of bytes in the record GO:00000147 may change although the universal it refers to does not >> Andrew's comment below made me realize we may have another problem -- >> not annotation with multiple ontologies, but rather annotation with >> multiple terms from the SO. > > The type record (like most other records) have a slot at the end > for arbitrary non-das2-namespaced XML elements. When this gets to > be a problem let people experiment with various ways to do it. > > Eg, > > > > No reason to solve it now, as we've no data which needs > this. 
> > Andrew > dalke at dalkescientific.com > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From enwired at gmail.com Fri Nov 10 00:43:20 2006 From: enwired at gmail.com (Ed) Date: Thu, 9 Nov 2006 16:43:20 -0800 Subject: [DAS2] DAS/2 teleconference on Monday In-Reply-To: <4aa3a7e70611091605o25b5555dsa90d6f6b1a8b8f61@mail.gmail.com> References: <4aa3a7e70611091605o25b5555dsa90d6f6b1a8b8f61@mail.gmail.com> Message-ID: <4aa3a7e70611091643j145f409elf50042b9218dfea1@mail.gmail.com> Sorry, reverse that: >From France: 08 00 907 839 >From UK: 08 00 40 49 467 2006/11/9, Ed : > > Just FYI: There international toll-free numbers for some countries: > > From UK: 08 00 907 839 > From France: 08 00 40 49 467 > > Some other countries are covered, too, but those are the only 2 I have on > hand. > > > 2006/11/9, Helt,Gregg : > > > > Just wanted to remind everyone that we're having an extra DAS/2 > > teleconference next Monday, at 9:30 AM PST. The agenda is to review > > this week's spec finalization for release of a frozen DAS/2.0 protocol. > > > > Dialin (US): 800-531-3250 > > Dialin (Intl): 303-928-2693 > > Conference ID: 2879055 > > Passcode: 1365 > > > > > > > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > > > From enwired at gmail.com Fri Nov 10 00:05:40 2006 From: enwired at gmail.com (Ed) Date: Thu, 9 Nov 2006 16:05:40 -0800 Subject: [DAS2] DAS/2 teleconference on Monday In-Reply-To: References: Message-ID: <4aa3a7e70611091605o25b5555dsa90d6f6b1a8b8f61@mail.gmail.com> Just FYI: There international toll-free numbers for some countries: >From UK: 08 00 907 839 >From France: 08 00 40 49 467 Some other countries are covered, too, but those are the only 2 I have on hand. 
2006/11/9, Helt,Gregg : > > Just wanted to remind everyone that we're having an extra DAS/2 > teleconference next Monday, at 9:30 AM PST. The agenda is to review > this week's spec finalization for release of a frozen DAS/2.0 protocol. > > Dialin (US): 800-531-3250 > Dialin (Intl): 303-928-2693 > Conference ID: 2879055 > Passcode: 1365 > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From dalke at dalkescientific.com Fri Nov 10 07:04:31 2006 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 10 Nov 2006 08:04:31 +0100 Subject: [DAS2] Ontology URIs (was RE: types.rnc) In-Reply-To: <1FF932B3-EEBD-47C0-8C55-7F9DB1FD937A@fruitfly.org> References: <1FF932B3-EEBD-47C0-8C55-7F9DB1FD937A@fruitfly.org> Message-ID: Chris: > Andrew >> We could. When making the RNC I left the "SO:" prefix out >> deliberately in the number, leaving "0000147". The reason was to >> be very insistent that it was SO and only SO that could go there. > > This seems overly defensive. It would seem cleaner to use the same ID > scheme throughout. What I wrote was: > The sequence ontology (SO) is widely used but its identifiers are not > URIs. The 'so_accession' attribute contains the SO accession number > without the leading "SO:", as in "0000316". Note that the leading > zeros are important. This field should be interpreted as an opaque > string. (XXX should this be "0000316" or "SO:0000316"? I prefer the > latter.) The "XXX" in the spec marks places where I'm hoping for feedback. So far I haven't received any. Chris: > Andrew: >> But if we go for systems with no default resolver, why not use >> LSIDs? >> >> urn:lsid:biodas.org:go:0000147 > > LSIDs uniquely identify sequences of bytes. The sequence of bytes in > the record GO:0000147 may change although the universal it refers to > does not. LSIDs have concrete objects and abstract objects.
The abstract object, if resolved, only returns metadata. This would be an LSID for an abstract object. If we have a so_version and so_accession, etc. as attributes then we could identically have an LSID referencing an abstract object. It makes no difference to clients and for the spec it promotes the push towards URIs. Andrew dalke at dalkescientific.com From Gregg_Helt at affymetrix.com Fri Nov 10 15:28:07 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Fri, 10 Nov 2006 07:28:07 -0800 Subject: [DAS2] Progress on freezing the DAS/2.0 genome retrieval specification Message-ID: Thanks everyone for reviewing the DAS/2 genome retrieval documents this week and posting your comments. Andrew has incorporated this feedback into the latest das2_schemas.rnc document (das/das2/das2_schemas.rnc). It looks good to me; there is just one optional attribute to add that Andrew and I discussed yesterday, and then I think the schema can be frozen today. The genome retrieval HTML doc (das/das2/das2_get.html) still needs some editing before it can be frozen. Andrew and I will both be editing the doc this weekend. Anyone else with write access to the biodas CVS repository is welcome to help with the editing. If you plan to edit it in the next three days please let me know which sections so I can focus on other sections. Thanks again everyone, talk to you on Monday. Gregg From Gregg_Helt at affymetrix.com Fri Nov 10 15:40:09 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Fri, 10 Nov 2006 07:40:09 -0800 Subject: [DAS2] Progress on XML Schema version of DAS/2.0 Message-ID: I'm working on an XML Schema version of the DAS/2.0 schema. I used the Trang schema tool to automatically convert the RelaxNG schema to an XSD doc. This has provided a good skeleton to start with, but it looks like there are a number of issues I'll have to fix by hand. There are many places where the XSD specifies ordered sequences of elements where there shouldn't be any ordering restrictions.
Also the way Trang translated the idea of non-DAS extensions is messy. And a lot of comments got lost in translation. None of these issues look too problematic. I expect more problems will come up, but I am making progress. Gregg From Gregg_Helt at affymetrix.com Sat Nov 11 11:11:31 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Sat, 11 Nov 2006 03:11:31 -0800 Subject: [DAS2] Progress on XML Schema version of DAS/2.0 Message-ID: I've checked in an XML-Schema translation of the das2_schemas.rnc doc in the biodas CVS repository as das2_schemas.xsd (http://cvs.biodas.org/cgi-bin/viewcvs/viewcvs.cgi/das/das2/das2_schemas.xsd?rev=HEAD&cvsroot=biodas). I still have some concerns about how it will handle non-DAS extensions, and I also need to add back in some of the comments from the rnc doc. But I have tested that I can generate Java bindings from the XSD using Apache XMLBeans. To get that to work I also had to remove use of "xml:id" for now, because it was causing XMLBeans to throw errors. Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Helt,Gregg > Sent: Friday, November 10, 2006 7:40 AM > To: DAS/2 Discussion > Subject: [DAS2] Progress on XML Schema version of DAS/2.0 > > I'm working on an XML Schema version of the DAS/2.0 schema. I used the > Trang schema tool to automatically convert the RelaxNG schema to an XSD > doc. This has provided a good skeleton to start with, but it looks like > there are a number of issues I'll have to fix by hand. There are many > places where the XSD specifies ordered sequences of elements where there > shouldn't be any ordering restrictions. Also the way Trang translated > the idea of non-DAS extensions is messy. And a lot of comments got lost > in translation. None of these issues look too problematic. I expect > more problems will come up, but I am making progress.
> > Gregg > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From aloraine at uab.edu Sun Nov 12 14:13:14 2006 From: aloraine at uab.edu (Ann Loraine) Date: Sun, 12 Nov 2006 08:13:14 -0600 Subject: [DAS2] Arabidopsis DAS-es? Message-ID: <83722dde0611120613h4b08cf88r3764ed83d4112602@mail.gmail.com> Dear all, I heard that there is at least one working & supported DAS for Arabidopsis at EBI or NASC. I've looked all over the NASC site and although they mention DAS, they don't give the URL for a DAS server, so far as I can tell. Same for EBI & Ensembl, but of course it's very possible I missed it. Maybe some-one on the list from EBI could fill me in on the details? I would need the URL, obviously :-) Yours, Ann -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From aloraine at gmail.com Sun Nov 12 15:24:22 2006 From: aloraine at gmail.com (Ann Loraine) Date: Sun, 12 Nov 2006 09:24:22 -0600 Subject: [DAS2] FYI: Arabidopsis DAS at PlantGDB Message-ID: <83722dde0611120724r641e3dak18a8cc8119baf948@mail.gmail.com> Hi, This is an update on Arabidopsis DAS sites: Iowa State hosts a plant DAS site, but so far I haven't been able to get IGB to talk to it. It fails with this error: [java] DAS request1: http://www.plantgdb.org/cgi-bin/das/ATGDB151_das/features?segment=4:15849536,15854959;type=EST_alignment%3AGeneSeqer_cognate;type=cDNA_alignment%3AGeneSeqer_cognate [java] Attempting to load data from URL: http://www.plantgdb.org/cgi-bin/das/ATGDB151_das/features?segment=4:15849536,15854959;type=EST_alignment%3AGeneSeqer_cognate;type=cDNA_alignment%3AGeneSeqer_cognate [java] [Fatal Error] :14:83: The reference to entity "dbid" must end with the ';' delimiter. [java] Problem parsing DAS XML data: The reference to entity "dbid" must end with the ';' delimiter. 
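The parser error above is the kind an XML parser reports when an attribute value or text node contains a bare "&" that does not start an entity reference. Below is a minimal sketch of the failure and of the fix, using only the Python standard library. The LINK element and the example.org URL are made-up stand-ins for illustration; they are not taken from the actual PlantGDB response.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical link of the kind a DAS feature might carry (not from PlantGDB).
RAW_URL = "http://example.org/cgi-bin/record?id=23308168&dbid=1"


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def link_element(url):
    """Build a LINK element, escaping '&', '<', '>' and '"' in the attribute."""
    return '<LINK href="%s">record</LINK>' % escape(url, {'"': "&quot;"})


# Pasting the URL in unescaped leaves a bare '&', so the document is rejected.
broken = '<LINK href="%s">record</LINK>' % RAW_URL
# Escaping writes the '&' as '&amp;', which the parser turns back into '&'.
fixed = link_element(RAW_URL)

print(is_well_formed(broken))
print(is_well_formed(fixed))
print(ET.fromstring(fixed).get("href") == RAW_URL)
```

A server that builds its XML by string concatenation would need this escaping step wherever a URL with query parameters is written out; generating the document through an XML library usually takes care of it.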
The problem appears to be lines such as: 23308168 which include "&" symbols that don't signal the start of an entity. I have written to PlantGDB to ask about this...I'll keep you posted! It might be useful to add a few links to trusted on-line XML validators to the upcoming re-done bioDAS Web site to make it easier for DAS providers to check their XML well-formedness. Here's one I just now used: http://validator.aborla.net/ Many people who implement DAS services are likely to be beginning programmers...or programmers like me who don't do it full-time & can use refreshers :-) Yours, Ann PS If this doesn't get posted to the list, could some-one post it for me? -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From ap3 at sanger.ac.uk Mon Nov 13 10:27:52 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 13 Nov 2006 10:27:52 +0000 Subject: [DAS2] sources.rnc In-Reply-To: References: <4aa3a7e70611071119j69487b14x680367c174696d5b@mail.gmail.com> <76b3521b73785f6cbb2540cdde62ed03@sanger.ac.uk> Message-ID: <1ad41038300a35c6dda74dde4f4e2951@sanger.ac.uk> Hi Andrew, >> so the sources command is available via: >> http://www.dasregistry.org/registry/das1/sources > > Any chance of making that URL shorter? done. - thanks to our webteam this is now http://www.dasregistry.org/das1/sources > And it > no longer includes das1 sources. I guess you mean das2 sources? 
- I hope that will change now with the frozen spec ;-) > Also, I can't find anywhere on the HTML which points to that we have a documentation page that explains how scripts (or DAS clients) can talk to the registry and get the list of available DAS servers at: http://www.dasregistry.org/help_scripting.jsp Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From ap3 at sanger.ac.uk Mon Nov 13 10:52:15 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 13 Nov 2006 10:52:15 +0000 Subject: [DAS2] FYI: Arabidopsis DAS at PlantGDB In-Reply-To: <83722dde0611120724r641e3dak18a8cc8119baf948@mail.gmail.com> References: <83722dde0611120724r641e3dak18a8cc8119baf948@mail.gmail.com> Message-ID: <5032842288e655b2d3d2a5f9fb534d5f@sanger.ac.uk> Hi Ann, > Iowa State hosts a plant DAS site, but so far I haven't been able to > get IGB to > talk to it. I was not aware of this DAS site - I will contact them and invite them to get their DAS servers registered in the DAS registry ... Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Mon Nov 13 11:53:22 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 03:53:22 -0800 Subject: [DAS2] URIs for coordinates Message-ID: I'm editing the spec docs to clarify use of coordinates. Where can one find the URIs a server should use for the uri attribute in a COORDINATES element? I've looked at the HTML summary at http://www.dasregistry.org/help_coordsys.jsp , but this doesn't list any of the actual URIs I'm seeing used in the DAS registry. 
For example, http://das.sanger.ac.uk/dasregistry/coordsys/CS_SPICEDS5 for NCBI human assembly v35 from the DAS/2 registry sources doc: or http://das.sanger.ac.uk/dasregistry/coordsys/CS_DS5 for the same assembly from the DAS/1 registry sources doc: Also, shouldn't these be the same URI? thanks, Gregg From Gregg_Helt at affymetrix.com Mon Nov 13 13:37:25 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 05:37:25 -0800 Subject: [DAS2] Agenda for DAS/2 teleconference today Message-ID: DAS/2 Teleconference today at 9:30 AM PST Dialin (US): 800-531-3250 Dialin (Intl): 303-928-2693 Conference ID: 2879055 Passcode: 1365 Agenda: Specification Status of schema (das2_schemas.rnc) Ratification of schema freeze Status of XML Schema translation (das2_schemas.xsd) Formalizing query syntax? Status of genome retrieval specification doc (das2_get.html) Review of remaining issues in genome retrieval spec. Coordinates URIs Segment reference URIs Ontology URIs Revising example queries / responses Timeline for DAS/2 genome retrieval spec freeze. Other docs? Implementation status Validator Genome retrieval servers NetAffx queries responses biopackages queries responses DAS/1 --> DAS/2 conversion server cgi.biodas.org test server Sanger registry others? Example queries Biopackages ontology server Genome retrieval clients IGB queries responses others? From ap3 at sanger.ac.uk Mon Nov 13 14:05:55 2006 From: ap3 at sanger.ac.uk (Andreas Prlic) Date: Mon, 13 Nov 2006 14:05:55 +0000 Subject: [DAS2] URIs for coordinates In-Reply-To: References: Message-ID: <267c26ea87f6262d2b551af71655c4b6@sanger.ac.uk> Hi Gregg! > I'm editing the spec docs to clarify use of coordinates. Where can > one find the URIs a server should use for the uri attribute in a > COORDINATES element? Hm. The das registry does not provide a list of uris so far, I can provide such a listing.
I believe the correct uri for the NCBI assembly version 35 for human should be something like http://www.dasregistry.org/coordsys/CS_DS5 > > Also, shouldn't these be the same URI? Yes, they should. Cheers, Andreas ----------------------------------------------------------------------- Andreas Prlic Wellcome Trust Sanger Institute Hinxton, Cambridge CB10 1SA, UK +44 (0) 1223 49 6891 From Gregg_Helt at affymetrix.com Mon Nov 13 15:21:39 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 07:21:39 -0800 Subject: [DAS2] Agenda for DAS/2 teleconference today Message-ID: The version on biodas.org gets auto-updated from the cvs repository every night. However, for today and probably the rest of the week I'd recommend looking directly at the head of the CVS repository to make sure you've got the most recent version. Thanks, Gregg > -----Original Message----- > From: Brian Gilman [mailto:gilmanb at pantherinformatics.com] > Sent: Monday, November 13, 2006 5:42 AM > To: Helt,Gregg > Subject: Re: [DAS2] Agenda for DAS/2 teleconference today > > Hey Greg, > > Is the latest version of the get spec up at biodas.org? Or should > I also look in cvs? > > Best, > > -B From gilmanb at pantherinformatics.com Mon Nov 13 15:17:52 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Mon, 13 Nov 2006 10:17:52 -0500 Subject: [DAS2] Agenda for DAS/2 teleconference today In-Reply-To: References: Message-ID: <415FFF30-CDF6-4EB4-A8FF-2AFF203F595E@pantherinformatics.com> I'm going to be a little late to the call. I have a meeting from 11:30 - 1 today. Will that pose a problem? -B -- Brian Gilman President Panther Informatics Inc.
E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Nov 13, 2006, at 8:37 AM, Helt,Gregg wrote: > DAS/2 Teleconference today at 9:30 AM PST > Dialin (US): 800-531-3250 > Dialin (Intl): 303-928-2693 > Conference ID: 2879055 > Passcode: 1365 > > Agenda: > > Specification > Status of schema (das2_schemas.rnc) > Ratification of schema freeze > Status of XML Schema translation (das2_schemas.xsd) > Formalizing query syntax? > > Status of genome retrieval specification doc (das2_get.html) > Review of remaining issues in genome retrieval spec. > Coordinates URIs > Segment reference URIs > Ontology URIs > Revising example queries / responses > Timeline for DAS/2 genome retrieval spec freeze. > Other docs? > Implementation status > Validator > Genome retrieval servers > NetAffx > queries > responses > biopackages > queries > responses > DAS/1 --> DAS/2 conversion server > cgi.biodas.org test server > Sanger registry > others? > Example queries > Biopackages ontology server > Genome retrieval clients > IGB > queries > responses > others? > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From Gregg_Helt at affymetrix.com Mon Nov 13 17:10:05 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Mon, 13 Nov 2006 09:10:05 -0800 Subject: [DAS2] URIs for coordinates Message-ID: I think having a URI for each coordinate system is important. We could use a simple syntax that constructs the URI from the coordinate system's authority, organism, and type. If it resolves to something informative that would be nice, but not necessary. 
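As a rough illustration of the "simple syntax" idea above, the sketch below joins the three named fields into a single URI. Everything about the layout is an assumption made for this example: the example.org base, the field order, and the use of percent-encoding are hypothetical and are not part of the DAS/2 spec or of the DAS registry. An assembly version would have to be carried inside one of the three fields, since the message names only authority, organism, and type.

```python
from urllib.parse import quote

# Hypothetical base; the thread does not settle on where such URIs would live.
COORDSYS_BASE = "http://example.org/coordsys"


def coordinate_uri(authority, organism, coord_type, base=COORDSYS_BASE):
    """Join the three fields into one URI, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (authority, organism, coord_type)]
    return base + "/" + "/".join(parts)


print(coordinate_uri("NCBI", "Homo sapiens", "Chromosome"))
```

Two servers that agree on the same three field values would arrive at the same URI without consulting a central list, which is the property the proposal seems to be after; whether the URI resolves to anything is a separate matter.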
Gregg > -----Original Message----- > From: Andreas Prlic [mailto:ap3 at sanger.ac.uk] > Sent: Monday, November 13, 2006 6:06 AM > To: Helt,Gregg > Cc: DAS/2 Discussion > Subject: Re: URIs for coordinates > > Hi Gregg! > > > I'm editing the spec docs to clarify use of coordinates. Where can > > one find the URIs a server should use for the uri attribute in a > > COORDINATES element? > > > Hm. The das registry does not provide a list of uris so far, I can > provide such a listing. > > I believe the correct uri for the NCBI assembly version 35 for human > should be something like > > http://www.dasregistry.org/coordsys/CS_DS5 > > > > > > Also, shouldn't these be the same URI? > > yes they should. > > Cheers, > Andreas > > > ----------------------------------------------------------------------- > > Andreas Prlic Wellcome Trust Sanger Institute > Hinxton, Cambridge CB10 1SA, UK > +44 (0) 1223 49 6891 From gilmanb at pantherinformatics.com Mon Nov 13 20:33:12 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Mon, 13 Nov 2006 15:33:12 -0500 Subject: [DAS2] XML Instance documents generate valid XML from XML Spy Message-ID: <4558D688.4000404@pantherinformatics.com> Hey Guys, I had XMLSpy output some instance documents based off the xsd and things look good. I've also bound the document to xmlbeans and will dump some documents and run them through the validator to make sure everything's working on that end. I did experience issues when trying to hit current DAS2 servers and understand that everyone is working to make those compliant. Thanks very, very much for outputting the xsd.
Client writing is now much, much easier and can be automated :-) Best, -B From gilmanb at pantherinformatics.com Mon Nov 13 20:50:41 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Mon, 13 Nov 2006 15:50:41 -0500 Subject: [DAS2] XML Instance documents generate valid XML from XML Spy In-Reply-To: <4558D688.4000404@pantherinformatics.com> References: <4558D688.4000404@pantherinformatics.com> Message-ID: <4558DAA1.40302@pantherinformatics.com> I think I forgot to attach the XML instance documents!! Sorry! Here they are... -B Brian Gilman wrote: > Hey Guys, > > I had XMLSpy output some instance documents based off the xsd and > things look good. I've also bound the document to xmlbeans and will > dump some documents and run them through the validator to make sure > everything's working on that end. I did experience issues when trying > to hit current DAS2 servers and understand that everyone is working to > make those compliant. Thanks very, very much for outputting the xsd. > Client writing is now much, much easier and can be automated :-) > > Best, > > -B > -------------- next part -------------- A non-text attachment was scrubbed... Name: features_from_xsd.xml Type: text/xml Size: 1324 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: segments_from_xsd.xml Type: text/xml Size: 751 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sources_from_xsd.xml Type: text/xml Size: 1755 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: types_from_xsd.xml Type: text/xml Size: 912 bytes Desc: not available URL: From Steve_Chervitz at affymetrix.com Mon Nov 13 21:25:20 2006 From: Steve_Chervitz at affymetrix.com (Steve Chervitz) Date: Mon, 13 Nov 2006 13:25:20 -0800 Subject: [DAS2] HTML document re-org complete Message-ID: I just committed my large, re-organizational changes to das2_get.html, so others can now feel free to edit at will. Summary of what I did: * Re-organized into three main sections for consistency, readability: - General - Overview - Detailed * Simplified top summary table and fixed in-page navigation links. * Added TOC and subsection TOCs. * Added section numbers. * Added global sequence id section. * Misc typo fixes and wording improvements. * Noted a bad sentence in the third paragraph. Not sure the intent here ("some fetching some of the documents"?) The biodas.org viewable version of this document does not yet have these changes as I write: http://biodas.org/documents/das2/das2_get.html . It operates off of the anonymous CVS server which hasn't yet sync'd with the dev CVS server. Not sure how often this sync happens. I updated the biodas.org site to sync with CVS hourly, so the docs viewable from there will stay more current, but still may be out of date during this time of frequent updates. Steve From boconnor at ucla.edu Wed Nov 15 00:46:01 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Tue, 14 Nov 2006 16:46:01 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: <455A6349.40301@ucla.edu> Hi, I finished validating the DAS/2 server at biopackages.net using Andrew's validator. After making a few small tweaks all document types pass. 
Here are the URLs I validated with: * http://das.biopackages.net/das/genome * http://das.biopackages.net/das/genome/human/17/segment * http://das.biopackages.net/das/genome/human/17/type * http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000 I also fixed the bug with the "type" attribute in the CAPABILITY elements. They now are "features", "types", or "segments" to be compliant with the spec. --Brian From gilmanb at pantherinformatics.com Wed Nov 15 02:28:01 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Tue, 14 Nov 2006 21:28:01 -0500 Subject: [DAS2] biopackages DAS/2 server passed validation In-Reply-To: <455A6349.40301@ucla.edu> References: <455A6349.40301@ucla.edu> Message-ID: <71C4FB6F-4621-4A72-8B74-64AC105C3ECE@pantherinformatics.com> Hey Guys, Is the source posted for the BioPackages and Affy server code posted on biodas? I'd like to utilize it to start on my other scientific projects. Best, -B -- Brian Gilman President Panther Informatics Inc. E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: > Hi, > > I finished validating the DAS/2 server at biopackages.net using > Andrew's > validator. After making a few small tweaks all document types pass. > Here are the URLs I validated with: > > * http://das.biopackages.net/das/genome > * http://das.biopackages.net/das/genome/human/17/segment > * http://das.biopackages.net/das/genome/human/17/type > * > http://das.biopackages.net/das/genome/human/17/feature?segment=http% > 3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna% > 2Fchr1;overlaps=1:1000 > > I also fixed the bug with the "type" attribute in the CAPABILITY > elements. 
They now are "features", "types", or "segments" to be > compliant with the spec. > > --Brian > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 > From Gregg_Helt at affymetrix.com Wed Nov 15 02:45:00 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Tue, 14 Nov 2006 18:45:00 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: Thanks! Looks like most of the problems IGB was having with the biopackages server were due to the truncated 'type' attributes in CAPABILITY. Using IGB I'm still not getting features back from the biopackages server from a features query with overlaps and type filters, but I think that's a bug in IGB's request. Hope to fix tonight. Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Brian O'Connor > Sent: Tuesday, November 14, 2006 4:46 PM > To: das2 at lists.open-bio.org > Subject: [DAS2] biopackages DAS/2 server passed validation > > Hi, > > I finished validating the DAS/2 server at biopackages.net using Andrew's > validator. After making a few small tweaks all document types pass. > Here are the URLs I validated with: > > * http://das.biopackages.net/das/genome > * http://das.biopackages.net/das/genome/human/17/segment > * http://das.biopackages.net/das/genome/human/17/type > * > http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000 > > I also fixed the bug with the "type" attribute in the CAPABILITY > elements. They now are "features", "types", or "segments" to be > compliant with the spec.
> > --Brian > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From Gregg_Helt at affymetrix.com Wed Nov 15 05:43:31 2006 From: Gregg_Helt at affymetrix.com (Helt,Gregg) Date: Tue, 14 Nov 2006 21:43:31 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation Message-ID: The Affy Genometry DAS/2 server code is in the Genoviz CVS repository on sourceforge (http://sourceforge.net/projects/genoviz/), under the das2_server directory. The core of it is a servlet, com.affymetrix.genometry.servlets.GenometryDas2Servlet. There is also a main class com.affymetrix.genometry.servlets.GenometryDas2Server that wraps the servlet inside a Jetty server and initializes server and servlet (though with enough configuration tinkering the servlet could probably be run in any servlet-supporting HTTP server). The servlet depends heavily on code in the genometry and igb directories. Gregg > -----Original Message----- > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open- > bio.org] On Behalf Of Brian Gilman > Sent: Tuesday, November 14, 2006 6:28 PM > To: Brian O'Connor > Cc: das2 at lists.open-bio.org > Subject: Re: [DAS2] biopackages DAS/2 server passed validation > > Hey Guys, > > Is the source posted for the BioPackages and Affy server code posted > on biodas? I'd like to utilize it to start on my other scientific > projects. > > Best, > > -B > -- > Brian Gilman > President Panther Informatics Inc. > E-Mail: gilmanb at pantherinformatics.com > gilmanb at jforge.net > AIM: gilmanb1 > > 01000010 01101001 01101111 > 01001001 01101110 01100110 > 01101111 01110010 01101101 > 01100001 01110100 01101001 > 01100011 01101001 01100001 > 01101110 > > > > On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: > > > Hi, > > > > I finished validating the DAS/2 server at biopackages.net using > > Andrew's > > validator. After making a few small tweaks all document types pass. 
> > Here are the URLs I validated with: > > > > * http://das.biopackages.net/das/genome > > * http://das.biopackages.net/das/genome/human/17/segment > > * http://das.biopackages.net/das/genome/human/17/type > > * > > http://das.biopackages.net/das/genome/human/17/feature?segment=http% > > 3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna% > > 2Fchr1;overlaps=1:1000 > > > > I also fixed the bug with the "type" attribute in the CAPABILITY > > elements. They now are "features", "types", or "segments" to be > > compliant with the spec. > > > > --Brian > > _______________________________________________ > > DAS2 mailing list > > DAS2 at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/das2 > > > > _______________________________________________ > DAS2 mailing list > DAS2 at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/das2 From boconnor at ucla.edu Wed Nov 15 04:02:05 2006 From: boconnor at ucla.edu (Brian O'Connor) Date: Tue, 14 Nov 2006 20:02:05 -0800 Subject: [DAS2] biopackages DAS/2 server passed validation In-Reply-To: <71C4FB6F-4621-4A72-8B74-64AC105C3ECE@pantherinformatics.com> References: <455A6349.40301@ucla.edu> <71C4FB6F-4621-4A72-8B74-64AC105C3ECE@pantherinformatics.com> Message-ID: <455A913D.3060504@ucla.edu> Hi Brian, The biopackages DAS/2 server code is stored under the GMOD project on SourceForge (http://sourceforge.net/projects/gmod/). It's under "das2" in the cvs repository. It's written using the mod_perl Apache interface. Hope that helps. --Brian Brian Gilman wrote: > Hey Guys, > > Is the source posted for the BioPackages and Affy server code > posted on biodas? I'd like to utilize it to start on my other > scientific projects. > > Best, > > -B > -- > Brian Gilman > President Panther Informatics Inc. 
> E-Mail: gilmanb at pantherinformatics.com > gilmanb at jforge.net > AIM: gilmanb1 > > 01000010 01101001 01101111 > 01001001 01101110 01100110 > 01101111 01110010 01101101 > 01100001 01110100 01101001 > 01100011 01101001 01100001 > 01101110 > > > > On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: > >> Hi, >> >> I finished validating the DAS/2 server at biopackages.net using >> Andrew's >> validator. After making a few small tweaks all document types pass. >> Here are the URLs I validated with: >> >> * http://das.biopackages.net/das/genome >> * http://das.biopackages.net/das/genome/human/17/segment >> * http://das.biopackages.net/das/genome/human/17/type >> * >> http://das.biopackages.net/das/genome/human/17/feature?segment=http% >> 3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna% >> 2Fchr1;overlaps=1:1000 >> >> I also fixed the bug with the "type" attribute in the CAPABILITY >> elements. They now are "features", "types", or "segments" to be >> compliant with the spec. >> >> --Brian >> _______________________________________________ >> DAS2 mailing list >> DAS2 at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/das2 >> > From gilmanb at pantherinformatics.com Wed Nov 15 13:50:41 2006 From: gilmanb at pantherinformatics.com (Brian Gilman) Date: Wed, 15 Nov 2006 08:50:41 -0500 Subject: [DAS2] biopackages DAS/2 server passed validation In-Reply-To: References: Message-ID: <617B1C5E-1BBD-414B-A864-AB93F11E080B@pantherinformatics.com> Great Guys, Thanks very much. -B -- Brian Gilman President Panther Informatics Inc. 
E-Mail: gilmanb at pantherinformatics.com gilmanb at jforge.net AIM: gilmanb1 01000010 01101001 01101111 01001001 01101110 01100110 01101111 01110010 01101101 01100001 01110100 01101001 01100011 01101001 01100001 01101110 On Nov 15, 2006, at 12:43 AM, Helt,Gregg wrote: > The Affy Genometry DAS/2 server code is in the Genoviz CVS > repository on > sourceforge (http://sourceforge.net/projects/genoviz/), under the > das2_server directory. The core of it is a servlet, > com.affymetrix.genometry.servlets.GenometryDas2Servlet. There is > also a > main class com.affymetrix.genometry.servlets.GenometryDas2Server that > wraps the servlet inside a Jetty server and initializes server and > servlet (though with enough configuration tinkering the servlet could > probably be run in any servlet-supporting HTTP server). The servlet > depends heavily on code in the genometry and igb directories. > > Gregg > >> -----Original Message----- >> From: das2-bounces at lists.open-bio.org [mailto:das2- >> bounces at lists.open- >> bio.org] On Behalf Of Brian Gilman >> Sent: Tuesday, November 14, 2006 6:28 PM >> To: Brian O'Connor >> Cc: das2 at lists.open-bio.org >> Subject: Re: [DAS2] biopackages DAS/2 server passed validation >> >> Hey Guys, >> >> Is the source posted for the BioPackages and Affy server code > posted >> on biodas? I'd like to utilize it to start on my other scientific >> projects. >> >> Best, >> >> -B >> -- >> Brian Gilman >> President Panther Informatics Inc. >> E-Mail: gilmanb at pantherinformatics.com >> gilmanb at jforge.net >> AIM: gilmanb1 >> >> 01000010 01101001 01101111 >> 01001001 01101110 01100110 >> 01101111 01110010 01101101 >> 01100001 01110100 01101001 >> 01100011 01101001 01100001 >> 01101110 >> >> >> >> On Nov 14, 2006, at 7:46 PM, Brian O'Connor wrote: >> >>> Hi, >>> >>> I finished validating the DAS/2 server at biopackages.net using >>> Andrew's >>> validator. After making a few small tweaks all document types pass. 
>>> Here are the URLs I validated with:
>>>
>>> * http://das.biopackages.net/das/genome
>>> * http://das.biopackages.net/das/genome/human/17/segment
>>> * http://das.biopackages.net/das/genome/human/17/type
>>> * http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000
>>>
>>> I also fixed the bug with the "type" attribute in the CAPABILITY
>>> elements. They now are "features", "types", or "segments" to be
>>> compliant with the spec.
>>>
>>> --Brian
>>> _______________________________________________
>>> DAS2 mailing list
>>> DAS2 at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/das2
>>>
>>
>> _______________________________________________
>> DAS2 mailing list
>> DAS2 at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/das2
>

From Gregg_Helt at affymetrix.com Wed Nov 15 17:25:06 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Wed, 15 Nov 2006 09:25:06 -0800
Subject: [DAS2] biopackages DAS/2 server passed validation
Message-ID:

I've fixed some bugs in IGB and now it is able to retrieve some genome
features from the biopackages server and visualize them. For example
this feature query works:

http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fdas.biopackages.net%2Fdas%2Fgenome%2Fhuman%2F17%2Fsegment%2Fchr21;overlaps=26040000%3A26060000;type=SO%3AmRNA

with URL-decoded query params:
segment=http://das.biopackages.net/das/genome/human/17/segment/chr21
overlaps=26040000:26060000
type=SO:mRNA

However, not all feature queries work.
For example, another query, exactly the same as the above except for a
different type filter:

http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fdas.biopackages.net%2Fdas%2Fgenome%2Fhuman%2F17%2Fsegment%2Fchr21;overlaps=26040000%3A26060000;type=SO%3ACDS

with URL-decoded query params:
segment=http://das.biopackages.net/das/genome/human/17/segment/chr21
overlaps=26040000:26060000
type=SO:CDS

returns this error message:
500 Died at /usr/lib/perl5/site_perl/5.8.3/Package/Base/Devel.pm line 425.

Should "SO:CDS" not be a searchable type?

Also, is the full URI for the type supposed to be
A) "SO:CDS" or
B) "http://das.biopackages.net/das/genome/human/17/type/SO:CDS" ?

According to XML Base resolution rules, if "SO:CDS" is the value of the
TYPE uri attribute, then because there is a ":" before any "/", the
value is treated as an absolute URI and the full URI is (A). If the full
URI is supposed to be (B), then the uri attribute should be "./SO:CDS"
(given that xml:base is
"http://das.biopackages.net/das/genome/human/17/type/").

Thanks,
Gregg

> -----Original Message-----
> From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open-bio.org] On Behalf Of Helt,Gregg
> Sent: Tuesday, November 14, 2006 6:45 PM
> To: Brian O'Connor; das2 at lists.open-bio.org
> Subject: Re: [DAS2] biopackages DAS/2 server passed validation
>
> Thanks! Looks like most of the problem IGB was having with the
> biopackages server were due to the truncated 'type' attributes in
> CAPABILITY.
>
> Using IGB I'm still not getting features back from the biopackages
> server from a features query with overlaps and type filters, but I think
> that's a bug in IGB's request. Hope to fix tonight.
>
> Gregg
>
> > -----Original Message-----
> > From: das2-bounces at lists.open-bio.org [mailto:das2-bounces at lists.open-bio.org] On Behalf Of Brian O'Connor
> > Sent: Tuesday, November 14, 2006 4:46 PM
> > To: das2 at lists.open-bio.org
> > Subject: [DAS2] biopackages DAS/2 server passed validation
> >
> > Hi,
> >
> > I finished validating the DAS/2 server at biopackages.net using Andrew's
> > validator. After making a few small tweaks all document types pass.
> > Here are the URLs I validated with:
> >
> > * http://das.biopackages.net/das/genome
> > * http://das.biopackages.net/das/genome/human/17/segment
> > * http://das.biopackages.net/das/genome/human/17/type
> > * http://das.biopackages.net/das/genome/human/17/feature?segment=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgenome%2FH_sapiens%2FB36.1%2Fdna%2Fchr1;overlaps=1:1000
> >
> > I also fixed the bug with the "type" attribute in the CAPABILITY
> > elements. They now are "features", "types", or "segments" to be
> > compliant with the spec.
> >
> > --Brian
> > _______________________________________________
> > DAS2 mailing list
> > DAS2 at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/das2
>
> _______________________________________________
> DAS2 mailing list
> DAS2 at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das2

From Gregg_Helt at affymetrix.com Wed Nov 15 20:22:22 2006
From: Gregg_Helt at affymetrix.com (Helt,Gregg)
Date: Wed, 15 Nov 2006 12:22:22 -0800
Subject: [DAS2] New Test Affy DAS/2 server
Message-ID:

Steve and I have a new test version of the Affy Genometry DAS/2 server
up and running, at http://netaffxdas.affymetrix.com/das2/test/sources.
For compatibility with the current release of IGB we are keeping the
older version of the server at
http://netaffxdas.affymetrix.com/das2/sources, until we can synchronize
a server upgrade with a new IGB release.

Sample test server requests:

Sources:
http://netaffxdas.affymetrix.com/das2/test/sources

Segments:
http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_May_2004/segments

Types:
http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_May_2004/types

Features with query filters:
http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006/features?segment=http%3A%2F%2Fnetaffxdas.affymetrix.com%2Fdas2%2Ftest%2Fsources%2FH_sapiens_Mar_2006%2Fchr21;overlaps=26040000%3A26070000;type=http%3A%2F%2Fnetaffxdas.affymetrix.com%2Fdas2%2Ftest%2Fsources%2FH_sapiens_Mar_2006%2FknownGene

URL-decoded query params:
segment=http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006/chr21
overlaps=26040000:26070000
type=http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006/knownGene

Responses to these queries all pass the DAS/2 validator.

This latest version of the Genometry DAS/2 server does not yet support
the full range of DAS/2 feature queries and feature filters required by
the DAS/2 specification. For the server to send a useful response
containing features, the feature query string must currently contain:

   1 type filter
   1 segment filter
   1 overlaps filter
   0 or 1 inside filter
   0 or 1 format parameter
   0 other filters/parameters

To comply with the spec, when the server receives queries it doesn't
support it tries in most cases to return allowable error messages. But
at the moment we are having a problem with getting these error messages
passed unaltered through our proxy server -- the errors end up being
generic 502 'Bad Gateway' messages. We plan to fix this problem and also
add fuller feature query filter support as soon as possible.

If you compile IGB from the head of the Genoviz CVS repository
(http://sourceforge.net/cvs/?group_id=129420), you can access the new
server in the DAS/2 tab as "Affy Test Server". Please let me know if you
find any problems!
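
For anyone scripting against the test server, here is a minimal Python
sketch of the restrictions listed above: it builds a feature query URL
with ";"-separated, percent-encoded filters and checks whether a given
URL stays within the currently supported filter set. The function names
(build_feature_query, parse_filters, is_supported) are made up for
illustration and are not part of IGB or the Genometry server; the
encoding choice (percent-encode everything except unreserved characters)
is an assumption based on the sample URLs in this message.

```python
from collections import Counter
from urllib.parse import quote, unquote, urlsplit

# Filter counts taken from the list in this message.
REQUIRED = ("type", "segment", "overlaps")  # exactly one of each
OPTIONAL = ("inside", "format")             # at most one of each


def build_feature_query(features_url, segment, overlaps, type_uri,
                        inside=None, fmt=None):
    """Build a feature query URL with ';'-separated, percent-encoded filters."""
    params = [("segment", segment), ("overlaps", overlaps), ("type", type_uri)]
    if inside is not None:
        params.append(("inside", inside))
    if fmt is not None:
        params.append(("format", fmt))
    # safe="" also encodes ':' and '/', as in the sample URLs.
    query = ";".join(f"{name}={quote(value, safe='')}" for name, value in params)
    return f"{features_url}?{query}"


def parse_filters(url):
    """Split the query string on ';' and return decoded (name, value) pairs."""
    pairs = []
    for part in urlsplit(url).query.split(";"):
        if part:
            name, _, value = part.partition("=")
            pairs.append((name, unquote(value)))
    return pairs


def is_supported(url):
    """True if the query stays within the filter counts listed above."""
    counts = Counter(name for name, _ in parse_filters(url))
    if any(counts[name] != 1 for name in REQUIRED):
        return False
    if any(counts[name] > 1 for name in OPTIONAL):
        return False
    # No filters or parameters other than the ones named above.
    return all(name in REQUIRED + OPTIONAL for name in counts)


if __name__ == "__main__":
    base = "http://netaffxdas.affymetrix.com/das2/test/sources/H_sapiens_Mar_2006"
    url = build_feature_query(base + "/features",
                              segment=base + "/chr21",
                              overlaps="26040000:26070000",
                              type_uri=base + "/knownGene")
    print(url)
    print(is_supported(url))
```

With the segment, overlaps and type values from the sample above, the
builder should reproduce the sample "Features with query filters" URL.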

Thanks,
Gregg

From dalke at dalkescientific.com Mon Nov 20 19:17:42 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Mon, 20 Nov 2006 20:17:42 +0100
Subject: [DAS2] DAS1 TYPE attribute "category" is what in DAS2?
Message-ID:

Several questions listed here.

DAS1 TYPE elements had an attribute "category". Here are some
of the categories listed in DAS1 servers

  ALPHA-BETA-MOTIF, ASX-MOTIF, ASX-TURN, BETA-BULGE, BETA-BULGE-LOOP,
  BETA-TURN, CATMAT-3, CATMAT-4, GAMMA-TURN, HELIX-L, NEST,
  SCHELLMANN-LOOP, ST-MOTIF, ST-STAPLE, ST-TURN, enzyme,
  miscellaneous motif, pathway, rRNA, repeat, rfam, structural, tRNA,
  transcription, transmembrane prediction

DAS1 says:

  category (optional, recommended) attribute, which provides
  functional grouping to related types.

*TOPIC*: What should I do for automated conversion in my proxy system?

Currently I have:

  DAS1 "id" used to make DAS2 "uri" (via url encoding)
  DAS1 "method" copied into DAS2 "method"
  DAS1 (non-standard extension) "description" copied into DAS2 "description"
  DAS1 (non-standard extensions) "ontology" and "evidence" used to
     fake a DAS2 "ontology" uri

*Q1*: If there is a DAS1 "category" should I use it to make a DAS2 "title"?

Gregg's viewer merges types into a single track based on the title, so
that feels correct to me.

*Q2*: If the title is not given, should I use the DAS1 "id" as the DAS2
"title"?

I think that is correct.

*Q3*: If there's no DAS1 "description" extension to use for DAS2's
"description" should I copy DAS1's "title" instead (which in turn might
come from the "category" and/or the "id" fields).

My feeling is no, that is not appropriate.

*Q4*: I fake an ontology if I can.
Does anyone know examples of DAS1 extensions to support ontologies
other than TMHMM, which has

  1766831

For now I convert that into
  http://www.geneontology.org/GO.evidence.shtml#IEA

					Andrew
					dalke at dalkescientific.com

From lstein at cshl.edu Mon Nov 20 16:20:16 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon, 20 Nov 2006 11:20:16 -0500
Subject: [DAS2] Cannot attend today
Message-ID: <6dce9a0b0611200820v25d158au5581938a841ba559@mail.gmail.com>

Hi All,

I can't attend the conference call today because of a conflict with the
CSHL retreat.

Best,

Lincoln

--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From ap3 at sanger.ac.uk Tue Nov 21 16:26:36 2006
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Tue, 21 Nov 2006 16:26:36 +0000
Subject: [DAS2] DAS1 TYPE attribute "category" is what in DAS2?
In-Reply-To:
References:
Message-ID: <8766a85a180e098ee4b9e17ac359d4c7@sanger.ac.uk>

Hi Andrew,

> DAS1 TYPE elements had an attribute "category". Here are some
> of the categories listed in DAS1 servers

Currently the DAS/1 types are not used in a consistent way and so far
have not been used much... One of the things that is done as part of
the BioSapiens project is to come up with a more consistent definition
of which annotation types to use.

> *Q1*: If there is a DAS1 "category" should I use it to make a DAS2
> "title"?
>
> Gregg's viewer merges types into a single track based on the title, so
> I that feels correct to me.

In DAS/1 the annotation types are used to merge features into a single
track, therefore I think the DAS/1 type would be the equivalent of the
DAS/2 title then.

> *Q2*: If the title is not given, should I use the DAS1 "id" as the DAS2
> "title"?

I think that is correct.

> *Q3*: If there's no DAS1 "description" extension to use for DAS2's
> "description"
> should I copy DAS1's "title" instead (which in turn might come from the
> "category" and/or the "id" fields). My feeling is no, that is not
> appropriate.

err - which DAS/2 request do you talk about? still about types?

> *Q4*: I fake an ontology if I can. Does anyone know examples of
> DAS1 extensions with to support ontologies other than TMHMM, which has

So far this is not used in a consistent way ...
BioSapiens will come up with a convention, but it is still work in
progress...

Andreas

-----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                   Hinxton, Cambridge CB10 1SA, UK
                   +44 (0) 1223 49 6891

From aloraine at gmail.com Fri Nov 24 14:40:41 2006
From: aloraine at gmail.com (Ann Loraine)
Date: Fri, 24 Nov 2006 08:40:41 -0600
Subject: [DAS2] cvs and code examples?
Message-ID: <83722dde0611240640m1b3344d0x874b39fd1f31768c@mail.gmail.com>

Hi,

Can someone send me URLs for viewcvs & directions for cvs access of
biodas code?

Thank you!

-Ann

From aloraine at gmail.com Sat Nov 25 01:00:03 2006
From: aloraine at gmail.com (Ann Loraine)
Date: Fri, 24 Nov 2006 19:00:03 -0600
Subject: [DAS2] cvs and code examples?
In-Reply-To:
References: <83722dde0611240640m1b3344d0x874b39fd1f31768c@mail.gmail.com>
Message-ID: <83722dde0611241700s2e7c832n22b86ca9ccee8a9c@mail.gmail.com>

Thanks Brian!

What code would you recommend I use for setting up a DAS?

I am doing a project where the volume of annotations is so great that
I can't keep loading them all at once into IGB via Quickload or
File->Open.

-Ann

Since I'm already looking at a genome browser (IGB) I don't need to see their.

On 11/24/06, Brian Osborne wrote:
> Ann,
>
> It's here:
>
> http://www.open-bio.org/wiki/SourceCode
>
>
> Brian O.
>
>
> On 11/24/06 9:40 AM, "Ann Loraine" wrote:
>
> > Hi,
> >
> > Can some-one send me URLs for viewcvs & directions for cvs access of
> > biodas code?
> >
> > Thank you!
> >
> > -Ann
> > _______________________________________________
> > DAS2 mailing list
> > DAS2 at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/das2
>
>

--
Ann Loraine
Assistant Professor
Departments of Genetics, Biostatistics, and Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org

From Steve_Chervitz at affymetrix.com Mon Nov 27 04:42:01 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Sun, 26 Nov 2006 20:42:01 -0800
Subject: [DAS2] Notes from the weekly DAS/2 teleconference, 20 Nov 2006
Message-ID:

[Note: No DAS teleconference on 27 Nov. Next one is on 4 Dec]

Notes from the weekly DAS/2 teleconference, 20 Nov 2006

$Id: das2-teleconf-2006-11-20.txt,v 1.1 2006/11/27 04:31:31 sac Exp $

Note taker: Steve Chervitz

Attendees:
  Affy: Steve Chervitz, Gregg Helt, Ed Erwin
  Dalke Scientific: Andrew Dalke
  UCLA: Brian O'Connor

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit.

Agenda
-------
  * Spec discussion
  * Status reports

Topic: Spec Discussion
-----------------------

gh: has everyone reviewed steve's re-orged html doc of the retrieval spec?

[To summarize the re-org: everything has been organized into three
main sections: general, overview, and detailed. There's a table of
contents and all sections and subsections are numbered a la W3C
specs. The summary table at the top has been simplified, and in-page
navigation has been fixed and improved.]

consensus: no, haven't looked at it in detail yet.

[A] all give the new html retrieval spec doc a read through again.

gh: need to add examples of alignments. I plan to announce in the next
two weeks that it's ready. When reviewing, pay special attention to
comments marked 'XXX'
sc: esp in third para from the top, ambiguousness.

ee: need commit privileges
sc: sent email to support at open-bio.org. They are quick.

Topic: Status reports
---------------------

gh: First half of last week, getting affy das2 server up to snuff with
spec changes and compliance with spec, had spotty compliance.
correctly handle errors. email message about test server, different
url than the public server.

[A] gregg/steve to update public affy das2 server with new IGB release

gh: lots of testing in last few days, ready to replace the existing
server.
ee: some zooming issues when switching servers.
gh: worked on fixing bugs in das/2 client, problems w/ biopackages
server due to problems in IGB, also w/r/t spec changes.
also working on using das/2 in another context in igb, retrieving
genomic locations. Expression console, recommended for processing affy
chips (all expr), generates chp files. big request esp for whole exon
folks, need igb to load these, but the chp files have no genomic
location info, had to pre-load these in past.
Now, when you load a chip file, igb automatically goes out via das2
and retrieves it. hardwired is what server to go to, based on genome +
types from server, figures out what das request to make to load
it. per-sequence basis, and lazy, doesn't load all locations for whole
genome. big files: 1M probe sets, + 4 probes w diff locs.
Also, igb gets it back in compact binary format (using alt format) --
new use of das in igb. not committed yet, but will be cool.
ee: data?
gh: yes. we need more bp2 files there. Will try and have igb prompt
user for file if it can't look it up automatically.

[A] gregg send ed a write up so he can get it in the release notes.

gh: bottom line: efficient way to look at expr data in igb

gh: Third thing: prepping for this release of igb. targeting wed.
server change over tues, igb on wed. may cause a day of hassles.
ee: I'll be working in Paris on Wed, will be Tues day for US..
gh: so I'll be done by noon wed or earlier.

gh: happy with progress on affy client server now.

ad: cleaning up code I did on validator while in EBI. will check into
dasypus CVS on sf.

ee: on vacation, but working now.

sc: worked on affy das2 server set up for testing (
http://netaffxdas.affymetrix.com/das2/test ) and worked on the igb
keystore update (digital signature for affy jars).
Ran into issue with the error codes sent by the affy das2 server
getting altered by apache into another error (502, I believe). Need to
figure out how to get apache to not alter these error responses.

[A] steve figure out how to prevent apache from changing das/2 error responses

gh: would assume not doing redirection through apache. but plan b is to
not use apache.
[affy das2 server uses jetty servlet engine, and apache forwards
request to it via a rewrite rule.]
sc: quick load access now requires apache.
gh: should be able to load and serve these through jetty running on
port 80. need to get apache to stop mucking with the http headers.

bo: no new progress, but will look into filtering issue this
week. 'so:' stuff. either allen or I will look into it this week.

gh: i get right feat response for some but not all request. wonder if
the 'so:' is involved. that's the only remaining issue I know of. both
servers are passing validation. this was a high priority.
for brian gilman and lincoln to make use of the das/2 spec.

ad: any more feedback from bg?
gh: no. he was going to start working on a server as well.

[A] gregg contact brian gilman to see how things are going.

ee: good news, bug I reported about zooming out was not a bug, but
caused by me pressing the wrong button. related to changing genome
version.

[A] steve set up jar signing cert today so Ed can use tomorrow.

Wrap up:
--------
[A] review and modify das/2 html retrieval docs over the next few days.
[A] Next meeting in two weeks (4 Dec 06)

From bosborne11 at verizon.net Mon Nov 27 14:29:32 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 27 Nov 2006 09:29:32 -0500
Subject: [DAS2] DAS and DAS2
Message-ID:

das2,

My name is Brian Osborne, I'm working on documentation for GMOD and
GMOD-related packages as part of the newly created GMOD Help Desk position.
Some of my colleagues here in the GMOD community are recommending that we
consider DAS, 1 or 2, as important GMOD-related software so I'm joining your
list in order to learn more about DAS. I have some initial questions, I was
wondering if someone could help me out with them (I did read the DAS
Overview and browsed most of the specs at biodas.org).

1. Are DAS1 and DAS2 designed to inter-operate? For example, will I be able
to use a DAS2 client and a DAS1 server?

2. Do you think DAS2 is going to replace DAS1 or co-exist with it? Yes, this
may not be easy to answer.

3. Is there a DAS2 release date?

Thanks again,

Brian O.

From Steve_Chervitz at affymetrix.com Thu Nov 30 22:59:22 2006
From: Steve_Chervitz at affymetrix.com (Steve Chervitz)
Date: Thu, 30 Nov 2006 14:59:22 -0800
Subject: [DAS2] DAS and DAS2
In-Reply-To:
Message-ID:

Hi Brian,

Brian Osborne wrote on Mon, 27 Nov 2006:

> My name is Brian Osborne, I'm working on documentation for GMOD and
> GMOD-related packages as part of the newly created GMOD Help Desk position.

Great.
Looking forward to more quality documentation for GMOD, a la your
excellent contributions to Bioperl documentation.

> Some of my colleagues here in the GMOD community are recommending that we
> consider DAS, 1 or 2, as important GMOD-related software so I'm joining your
> list in order to learn more about DAS.

DAS is definitely appropriate for GMOD. Providing a DAS-compatible
interface to MOD data would make it easier to write software tools and
perform data analyses that integrate data from different sources.

In fact, a DAS/2 server reference implementation is being developed within
the GMOD sourceforge CVS, though it has not officially been released as
part of GMOD. Here are the CVS commit logs for it.
http://sourceforge.net/mailarchive/forum.php?forum_id=42210

Other DAS/2 software is also being developed under open source licenses. See
links on http://biodas.org in the About section, look for "The DAS/2 code
base".

> I have some initial questions, I was
> wondering if someone could help me out with them (I did read the DAS
> Overview and browsed most of the specs at biodas.org).
>
> 1. Are DAS1 and DAS2 designed to inter-operate? For example, will I be able
> to use a DAS2 client and a DAS1 server?

DAS/2 is a complete redesign of the spec, so direct interoperation is not
possible. However, DAS/2 has all of the capabilities of the DAS/1 spec (and
more!). As proof of this, Andrew Dalke is developing a proxy adapter that
will allow you to put a DAS/2 interface around an existing DAS/1 server,
allowing DAS/2 clients to interact with existing DAS/1 servers:
http://lists.open-bio.org/pipermail/das2/2006-October/000867.html

To fully realize 1 <-> 2 interoperation, one would also need to write a
DAS/1 proxy adapter for DAS/2 servers, to permit DAS/1 clients to interact
with DAS/2 servers. I don't know of any plans for that yet.

> 2. Do you think DAS2 is going to replace DAS1 or co-exist with it? Yes, this
> may not be easy to answer.

The proxy adapter approach should enable some degree of peaceful
co-existence between DAS/1 and DAS/2 systems, and should facilitate the
transition to DAS/2, which has many niceties not present in DAS/1. As far as
replacing DAS/1, the proof will be in the pudding.

> 3. Is there a DAS2 release date?

The DAS/2 schema for retrieval of genomic annotations has been officially
frozen since mid-November (das2_schemas.rnc and das2_schemas.xsd in the
biodas/das/das2 CVS repository). The corresponding html version of this
spec, viewable from biodas.org, is soon to be finalized as well (probably by
end of next week). When that happens, DAS/2 for genome retrieval will be
considered released. Stay tuned to this list for an announcement.

The DAS/2 writeback spec is still under development and I don't believe a
timeframe for its release has been set.

Steve