[DAS2] Ontology URIs (was RE: types.rnc)
Helt,Gregg
Gregg_Helt at affymetrix.com
Thu Nov 9 18:07:32 UTC 2006
> -----Original Message-----
> From: Chervitz, Steve
> Sent: Wednesday, November 08, 2006 5:08 PM
> To: Chris Mungall; Helt,Gregg
> Cc: DAS/2 Discussion
> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc)
>
> Seems like we may need to freeze the spec in a state that is fairly
> non-committal w/r/t how ontology identifiers work. I propose to remove
> the parts that are still not nailed down, so that we don't engender the
> creation of mutually incompatible implementations (one of the problems
> with DAS/1 which DAS/2 is aiming at).
>
> The ontology attribute in the type element is currently documented as:
>
> # ontology identifier. The naming scheme is still undecided.
> # This will be a URI.
> attribute ontology { text }?,
>
> I think this is too vague. It's subject to lots of interpretation as to
> what it could point at and what it might resolve to. It could
> justifiably be used to identify any of these:
>
> - a specific term in an ontology
> - the ontology as a whole (e.g., homepage of GO)
> - evidence code (as in the example below)
>
> The so_accession attribute gets us most of what we want and should
> suffice for this freeze. In one fell swoop it identifies the ontology
> and a particular term within it, and it defers the issue of ontology
> URIs.
>
> Some SO things to consider:
>
> 1) Should so_accession be restricted to SOFA (only locatable feature
> types)? If so, call it sofa_accession. (maybe too limiting)
>
> 2) What about SO versioning? Maybe a 'so_version' attribute would make
> sense (so_version="SOFA 2.1"). SO term IDs are stable across releases,
> but sometimes terms become obsolete and are no longer listed.
>
> Steve
>
The "ontology" attribute of the TYPE element is meant to be an
identifier for a specific ontology term in the SO or SOFA. It (and its
placeholder, "so_accession") is the only place where any part of DAS/2
depends directly on an ontology. GO terms (or terms from any other
ontology) can be used as properties of features -- the biopackages
server does this, for example. But that is done using a generic property
mechanism that makes no mention of ontologies, and the DAS/2 spec does
not mention or depend on any ontology other than SO.

The reason there is both an "ontology" and a "so_accession" attribute is
that we didn't have an official SO URI syntax to refer to, so we created
a temporary "so_accession" attribute to use until we had something to
put in for "ontology". Since the ontology attribute can _only_ be from
SO or SOFA, I agree with Steve that we could collapse "so_accession" and
"ontology" down to one attribute and use a prefix shorthand for SO/SOFA
terms, for example "SO:0000147". This has the nice property that the
shorthand is in fact a legal absolute URI (the "SO:" prefix parses as a
URI scheme), and is therefore unaffected by any "xml:base" attributes in
the document. In the long run I'd prefer this URI to be a URL that
points to a description at the biomedical ontology center. But
specifying that the attribute is a URI allows both the shorthand now and
a more official link later.
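
To make the xml:base point concrete, here is a rough Python sketch, not
anything from the spec. The base URL, the relative "type/exon"
reference, and the assumption that the shorthand is always "SO:" plus
seven digits are all made up for illustration.

```python
import re
from urllib.parse import urljoin

# RFC 3986 scheme syntax: a letter, then letters, digits, "+", "-" or ".",
# terminated by ":".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")

# Assumed shape of the shorthand: "SO:" followed by seven digits.
SO_SHORTHAND_RE = re.compile(r"^SO:\d{7}$")


def has_scheme(ref):
    """Return True if ref starts with a URI scheme, i.e. is absolute."""
    return SCHEME_RE.match(ref) is not None


def resolve(base, ref):
    """Resolve ref against an xml:base value.

    References that carry their own scheme are returned untouched;
    only relative references are combined with the base.
    """
    if has_scheme(ref):
        return ref
    return urljoin(base, ref)


base = "http://example.org/das2/genome/"   # hypothetical xml:base
print(resolve(base, "SO:0000147"))          # shorthand, left unchanged
print(resolve(base, "type/exon"))           # relative, joined to the base
print(bool(SO_SHORTHAND_RE.match("SO:0000147")))
```

The scheme check is spelled out explicitly so that the behavior does not
depend on how a particular URL library treats a scheme followed only by
digits.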

Allen Day and Brian O'Connor have implemented an ontology server with an
HTTP API that fits in very well with DAS/2, where each ontology term has
its own URI. This was discussed on the DAS/2 mailing list back in
February, and I think Chris had some concerns. Here's the start of the
thread:
http://portal.open-bio.org/pipermail/das2/2006-February/000507.html
To avoid divergence I've been reluctant to devote more resources to this
unless it is in collaboration with the ontology center.

I don't think we really need SO versioning -- to be useful, it would
place an extra burden on the ontology maintainers. And looking at the
current SO, when a term becomes obsolete it is still included in the
ontology; it just gets flagged with an "is_obsolete: true" tag.
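
As a rough illustration of why the obsolete flag can stand in for
versioning, a client could check the flag directly. The Python below is
only a sketch: the sample text is hand-written in OBO flat-file style,
"SO:9999999" is a made-up term, and the parser is deliberately naive (it
keeps only the last value of a repeated tag).

```python
# Hand-written sample in OBO flat-file style. SO:0000147 is the example
# accession used in this message; SO:9999999 is a made-up obsolete term.
SAMPLE_OBO = """\
[Term]
id: SO:0000147
name: exon

[Term]
id: SO:9999999
name: made_up_retired_term
is_obsolete: true
"""


def parse_terms(text):
    """Split OBO-style text into one {tag: value} dict per [Term] stanza."""
    terms = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == "[Term]":
            current = {}
            terms.append(current)
        elif line.startswith("["):
            current = None          # some other stanza type; skip it
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current[tag] = value
    return terms


def obsolete_ids(terms):
    """Return the ids of terms flagged is_obsolete: true."""
    return {t["id"] for t in terms if t.get("is_obsolete") == "true"}


terms = parse_terms(SAMPLE_OBO)
retired = obsolete_ids(terms)
for accession in ("SO:0000147", "SO:9999999"):
    status = "obsolete" if accession in retired else "current"
    print(accession, status)
```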

Andrew's comment below made me realize we may have another problem --
not annotation with multiple ontologies, but rather annotation with
multiple terms from the SO. I had thought each feature type could be
based on a single ontology term (maybe using SO composite terms:
http://www.bioontology.org/wiki/index.php/SO:Composite_Terms), but
looking at the latest SO I don't think we can make this assumption.
That argues that "so_accession" should be a child element of TYPE
rather than an attribute, with one or more allowed. Or am I reading
the SO wrong? Lincoln? Chris?
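
To show what the child-element variant might look like to a client,
here is a Python sketch. The markup is hypothetical: the SO_ACCESSION
element name, the TYPE id, and the second accession are invented for
illustration and are not in the spec, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: SO_ACCESSION as a repeatable child of TYPE.
# Neither the element name nor the second accession comes from the spec.
SAMPLE = """
<TYPES>
  <TYPE id="type/example">
    <SO_ACCESSION>SO:0000147</SO_ACCESSION>
    <SO_ACCESSION>SO:0000316</SO_ACCESSION>
  </TYPE>
</TYPES>
"""


def so_accessions(type_elem):
    """Collect every SO accession listed under one TYPE element."""
    return [child.text.strip() for child in type_elem.findall("SO_ACCESSION")]


root = ET.fromstring(SAMPLE)
for type_elem in root.findall("TYPE"):
    print(type_elem.get("id"), so_accessions(type_elem))
```

An attribute can hold only one value per TYPE, which is why repeated
child elements are the natural fit if multiple terms are allowed.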

As for Chris' question about what exactly an ontology URL should
dereference to, as far as the DAS/2 spec is concerned I don't think it
matters too much. An XML response with some structured description,
like what Allen's server returns, would be nice, but I could see the
benefits of HTML as the default too. Did I mention I'm a fan of content
negotiation? In most of the DAS/2 HTTP GET requests, we have an optional
"format=" query parameter to allow alternative format requests even in
situations where HTTP content negotiation is not straightforward.
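
The two ways of asking for a representation can be sketched in Python
as follows. Nothing is sent over the network; the term URL, the media
type, and the "html" format value are placeholders I made up, not
anything an existing server is known to accept.

```python
from urllib.parse import urlencode
from urllib.request import Request


def via_accept_header(url, media_type):
    """Ask for a representation through HTTP content negotiation."""
    return Request(url, headers={"Accept": media_type})


def via_format_param(url, format_name):
    """Ask for a representation through a format= query parameter."""
    separator = "&" if "?" in url else "?"
    return Request(url + separator + urlencode({"format": format_name}))


TERM_URL = "http://example.org/ontology/SO/0000147"  # hypothetical

negotiated = via_accept_header(TERM_URL, "application/xml")
explicit = via_format_param(TERM_URL, "html")

# The requests are only built and inspected here, never opened.
print(negotiated.full_url, negotiated.get_header("Accept"))
print(explicit.full_url)
```

The query-parameter form is the one that still works from a plain
browser link, where setting an Accept header is awkward.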
Gregg
> > From: Chris Mungall <cjm at fruitfly.org>
> > Date: Wed, 8 Nov 2006 17:11:09 -0500
> > To: "Helt,Gregg" <Gregg_Helt at affymetrix.com>
> > Cc: DAS/2 <das2 at lists.open-bio.org>
> > Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc)
> >
> >
> > There absolutely needs to be a stable URI scheme for referencing
> > types defined in ontologies. The details of the scheme aren't clear
> > yet. It will probably be http based (ie not LSID).
> >
> > Do you have specific requirements? Should the URI be a URL
> > dereferenceable in any browser? Should it dereference to html or RDF
> > or use content negotiation to decide which? etc
> >
> > On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote:
> >
> >> I'll talk to Suzi in her role as co-PI at NCBO (National Center for
> >> Biomedical Ontology). We may be able to quickly work out a URI
> >> syntax (even if implementation of what the URIs resolve to comes
> >> later).
> >>
> >> gregg
> >>
> >>> -----Original Message-----
> >>> From: Andrew Dalke [mailto:dalke at dalkescientific.com]
> >>> Sent: Tuesday, November 07, 2006 6:23 PM
> >>> To: Ed
> >>> Cc: Helt,Gregg
> >>> Subject: Re: types.rnc
> >>>
> >>> Ed:
> >>>> What bothers me is "still undecided". That doesn't belong in a
> >>>> "frozen" spec. Though I have no idea what the correct text to put
> >>>> here is.
> >>>
> >>> Take for example
> >>>
> >>> http://genome.cbs.dtu.dk:9000/das/secretomep/types
> >>>
> >>>
> >>> <TYPE id="NC-SECRETORY" method="SecretomeP-1.0"
> >>>   category="protein sorting"
> >>>   description="Ab initio predictions of non-classical i.e. not signal peptide triggered protein secretion"
> >>>   evidence="IEA"
> >>>   ontology="http://www.geneontology.org/GO.evidence.shtml">35138</TYPE>
> >>>
> >>> It uses an ontology URI to describe which ontology scheme is
> >>> used to describe the "evidence" value. In this case it means
> >>> "Inferred from Electronic Annotation"
> >>>
> >>> There is no long-term/stable URL scheme for GO. Do we
> >>> make something up? Do we say "use a URL" and leave it
> >>> at that? I'll go for the latter as every reasonable
> >>> scheme should end up as a URL.
> >>>
> >>> Except for those which are annotated from multiple ontologies.
> >>>
> >>>
> >>>
> >>> Andrew
> >>> dalke at dalkescientific.com
> >>
> >>
> >> _______________________________________________
> >> DAS2 mailing list
> >> DAS2 at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/das2
> >>