[DAS2] Ontology URIs (was RE: types.rnc)

Chris Mungall cjm at fruitfly.org
Thu Nov 9 23:35:46 UTC 2006


On Nov 9, 2006, at 10:07 AM, Helt,Gregg wrote:

>
>
>> -----Original Message-----
>> From: Chervitz, Steve
>> Sent: Wednesday, November 08, 2006 5:08 PM
>> To: Chris Mungall; Helt,Gregg
>> Cc: DAS/2 Discussion
>> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc)
>>
>> Seems like we may need to freeze the spec in a state that is fairly
>> non-committal w/r/t how ontology identifiers work. I propose to remove
>> the parts that are still not nailed down, so that we don't engender the
>> creation of mutually incompatible implementations (one of the problems
>> with DAS/1 which DAS/2 is aiming at).
>>
>> The ontology attribute in the type element is currently documented as:
>>
>>   # ontology identifier.  The naming scheme is still undecided.
>>   # This will be a URI.
>>   attribute ontology { text }?,
>>
>> I think this is too vague. It's subject to lots of interpretation as to
>> what it could point at and what it might resolve to. It could
>> justifiably be used to identify any of these:
>>
>>   - a specific term in an ontology
>>   - the ontology as a whole (e.g., homepage of GO)
>>   - evidence code (as in the example below)
>>
>> The so_accession attribute gets us most of what we want and should
>> suffice for this freeze. In one fell swoop it identifies the ontology and
>> a particular term within it, and it defers the issue of ontology URIs.
>>
>> Some SO things to consider:
>>
>> 1) Should so_accession be restricted to SOFA (only locatable feature
>> types)? If so, call it sofa_accession. (maybe too limiting)
>>
>> 2) What about SO versioning? Maybe a 'so_version' attribute would make
>> sense (so_version="SOFA 2.1"). SO term IDs are stable across releases,
>> but sometimes terms become obsolete and are no longer listed.
>>
>> Steve
>>
>
> The "ontology" attribute of the TYPE element is meant to be an
> identifier for a specific ontology term in the SO or SOFA.  It (and its
> placeholder, "so_accession") is the only place where any part of DAS/2
> depends directly on an ontology.  GO terms (or any other ontology) can
> be used as properties of features -- the biopackages server does this
> for example.  But it is done using a generic property mechanism that
> makes no mention of ontologies, and the DAS/2 spec does not mention or
> depend on any ontology other than SO.
>
> The reason there is both an "ontology" and "so_accession" attribute is
> that we didn't have an official SO URI syntax to refer to, so we created
> a temporary "so_accession" attribute to use until we had something to
> put in for "ontology".  Since the ontology attribute can _only_ be from
> SO or SOFA, I agree with Steve that we could collapse "so_accession" and
> "ontology" down to one attribute and use a prefix shorthand for SO/SOFA
> terms, for example "SO:0000147".  This has the nice property that the
> shorthand is in fact a legal absolute URI, and therefore unaffected by
> any "xml:base" attributes in the document.  I'd instead prefer this URI
> to be a URL that points to a description at the biomedical ontology
> center.  But specifying that the attribute is a URI allows both the
> shorthand and later a more official link.
>
> Allen Day and Brian O'Connor have implemented an ontology server with an
> HTTP API that fits in very well with DAS/2, where each ontology term has
> its own URI.  This was discussed back on the DAS/2 mailing list in
> February and I think Chris had some concerns, here's the start of the
> thread:
> http://portal.open-bio.org/pipermail/das2/2006-February/000507.html .
> To avoid divergence I've been reluctant to devote more resources to this
> unless it was in collaboration with the ontology center.

well I wouldn't like to hold anything up!

By December it will be possible to browse all OBO ontologies, but any
plans for providing stable URIs and programmatic access will
probably wait until next year.

If you have an ontology server ready, go with it. It's still unclear
what the best approach for serving up ontologies is, though the
future is looking decidedly rdf/owl/sparqly.
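
On the "SO:0000147" shorthand quoted above: here is a minimal sketch of why such an identifier is untouched by xml:base. It assumes the RFC 3986 rule that a reference starting with a scheme is absolute; the helper names and the example.org base are made up for illustration and are not part of DAS/2.

```python
import re
from urllib.parse import urljoin

# RFC 3986 scheme: a letter, then letters, digits, "+", "-" or ".", then ":".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def is_absolute(ref: str) -> bool:
    """True if the reference starts with a URI scheme such as 'http:' or 'SO:'."""
    return _SCHEME.match(ref) is not None


def resolve(base: str, ref: str) -> str:
    """Resolve ref against an xml:base value; absolute references pass through."""
    return ref if is_absolute(ref) else urljoin(base, ref)


# Hypothetical xml:base value, chosen only for this example.
BASE = "http://example.org/das2/genome/"

shorthand = resolve(BASE, "SO:0000147")  # has a scheme, so left unchanged
relative = resolve(BASE, "type/exon")    # no scheme, so joined onto BASE
```

The scheme check is done by hand rather than left to urljoin so that the result does not depend on how a particular library version treats an all-digit part after the colon.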

> I don't think we really need SO versioning -- to be useful it places an
> extra burden on the ontology maintainers.  And looking at the current
> SO, when a term becomes obsolete it is still included in the ontology,
> it just gets flagged with an "is_obsolete:true" tag.

I agree. This is policy for all good OBO ontologies; any change in  
the substance of a definition results in a new ID.
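
A rough sketch of how a client could check the obsolete flag instead of relying on a version number. It assumes the usual OBO flat-file layout of "[Term]" stanzas with "tag: value" lines; the second term in the sample is invented purely for the example, and the function names are my own.

```python
def parse_obo_terms(text):
    """Collect [Term] stanzas from OBO-format text into a dict keyed by term id."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts here; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current[tag] = value
            if tag == "id":
                terms[value] = current
    return terms


def is_obsolete(terms, term_id):
    """True if the term is still listed but carries the is_obsolete flag."""
    return terms[term_id].get("is_obsolete") == "true"


# Sample data: SO:9999999 is a made-up identifier, not a real SO term.
SAMPLE = """\
[Term]
id: SO:0000147
name: exon

[Term]
id: SO:9999999
name: made_up_retired_term
is_obsolete: true
"""

terms = parse_obo_terms(SAMPLE)
```

Because the obsolete term stays in the file, a lookup by ID still succeeds and the client can decide what to do with the flag.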

> Andrew's comment below made me realize we may have another problem --
> not annotation with multiple ontologies, but rather annotation with
> multiple terms from the SO.  I had thought each feature type could be
> based on a single ontology term (maybe using SO composite terms:
> http://www.bioontology.org/wiki/index.php/SO:Composite_Terms), but
> looking at the latest SO I don't think we can make this assumption.
> Which argues that "so_accession" should be a child element of TYPE
> rather than an attribute, and one or more be allowed.  Or am I reading
> the SO wrong?  Lincoln?  Chris?

Any DAS feature F should be associated with a single
SO:located_sequence_feature T. (I would submit that the formal
interpretation of this be: all actual genomic entities that
instantiate the pattern represented by F should instantiate the
pattern represented by T.)

However, a feature can be associated with multiple properties - these
will be subtypes of SO:attribute.
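
As a data structure, the single-type-plus-properties idea might look roughly like this. It is only a sketch: the class and field names are invented here, nothing below comes from the DAS/2 schema, and SO:9999998 is a placeholder ID rather than a real attribute term.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeatureType:
    """One DAS feature type: exactly one SO type, any number of attribute terms."""

    name: str
    so_type: str  # the single SO located_sequence_feature term
    attributes: List[str] = field(default_factory=list)  # SO attribute terms

    def __post_init__(self):
        # Only SO identifiers are accepted, matching the "SO only" rule in the thread.
        for term in [self.so_type, *self.attributes]:
            if not term.startswith("SO:"):
                raise ValueError(f"not an SO identifier: {term!r}")


# Placeholder attribute ID, used only to show the shape of the data.
exon_type = FeatureType("exon", "SO:0000147", ["SO:9999998"])
```

The type is a single field while the properties are a list, so "one T per F" holds by construction.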

> As far as Chris' question as to what exactly an ontology URL should
> dereference to, relative to the DAS/2 spec I don't think it matters too
> much.  An XML response with some structured description like what
> Allen's server returns would be nice, but I could see the benefits of
> HTML as the default too.

There is a discussion on public-semweb-lifesci on the relative merits
of content negotiation with URIs right now.

> Did I mention I'm a fan of content
> negotiation?  In most of the DAS/2 HTTP GET requests, we have optional
> "format=" query parameter arguments to allow alternative format requests
> even in situations where HTTP content negotiation is not
> straightforward.

That's fine, on the understanding that suffixing the "format=<X>"
creates a different URI.
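
To make the distinction concrete, here is a small sketch that builds both kinds of request without sending anything. The term URL is hypothetical, and the "format" values are placeholders, not ones defined by DAS/2.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical ontology term URL, for illustration only.
TERM_URL = "http://example.org/ontology/SO/0000147"


def negotiated_request(url, media_type):
    """Content negotiation: same URI, representation named in the Accept header."""
    return Request(url, headers={"Accept": media_type})


def format_url(url, fmt):
    """Query-parameter style: the format is part of the URI itself."""
    return url + "?" + urlencode({"format": fmt})


html_req = negotiated_request(TERM_URL, "text/html")
rdf_req = negotiated_request(TERM_URL, "application/rdf+xml")

html_url = format_url(TERM_URL, "html")
rdf_url = format_url(TERM_URL, "rdf")
```

With negotiation both requests name one resource; with "format=" there are two URIs, and anything that compares or caches by URI will treat them as distinct.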

> 	Gregg
>
>>> From: Chris Mungall <cjm at fruitfly.org>
>>> Date: Wed, 8 Nov 2006 17:11:09 -0500
>>> To: "Helt,Gregg" <Gregg_Helt at affymetrix.com>
>>> Cc: DAS/2 <das2 at lists.open-bio.org>
>>> Subject: Re: [DAS2] Ontology URIs (was RE: types.rnc)
>>>
>>>
>>> There absolutely needs to be a stable URI scheme for referencing
>>> types defined in ontologies. The details of the scheme aren't clear
>>> yet. It will probably be http based (ie not LSID).
>>>
>>> Do you have specific requirements? Should the URI be a URL
>>> dereferenceable in any browser? Should it dereference to html or RDF
>>> or use content negotiation to decide which? etc
>>>
>>> On Nov 8, 2006, at 2:54 PM, Helt,Gregg wrote:
>>>
>>>> I'll talk to Suzi in her role as co-PI at NCBO (National Center for
>>>> Biomedical Ontology).  We may be able to quickly work out a URI
>>>> syntax (even if implementation of what the URIs resolve to comes
>>>> later).
>>>>
>>>> gregg
>>>>
>>>>> -----Original Message-----
>>>>> From: Andrew Dalke [mailto:dalke at dalkescientific.com]
>>>>> Sent: Tuesday, November 07, 2006 6:23 PM
>>>>> To: Ed
>>>>> Cc: Helt,Gregg
>>>>> Subject: Re: types.rnc
>>>>>
>>>>> Ed:
>>>>>> What bothers me is "still undecided".  That doesn't belong in a
>>>>>> "frozen" spec.  Though I have no idea what the correct text to put
>>>>>> here is.
>>>>>
>>>>> Take for example
>>>>>
>>>>> http://genome.cbs.dtu.dk:9000/das/secretomep/types
>>>>>
>>>>>
>>>>>      <TYPE id="NC-SECRETORY" method="SecretomeP-1.0"
>>>>>        category="protein sorting" description="Ab initio predictions of non-classical i.e. not signal peptide triggered protein secretion"
>>>>>        evidence="IEA"
>>>>>        ontology="http://www.geneontology.org/GO.evidence.shtml">35138</TYPE>
>>>>>
>>>>> It uses an ontology URI to describe which ontology scheme is
>>>>> used to describe the "evidence" value.  In this case it means
>>>>> "Inferred from Electronic Annotation"
>>>>>
>>>>> There is no long-term/stable URL scheme for GO.  Do we
>>>>> make something up?  Do we say "use a URL" and leave it
>>>>> at that?  I'll go for the latter as every reasonable
>>>>> scheme should end up as a URL.
>>>>>
>>>>> Except for those which are annotated from multiple ontologies.
>>>>>
>>>>>
>>>>>
>>>>> Andrew
>>>>> dalke at dalkescientific.com
>>>>
>>>>
>>>> _______________________________________________
>>>> DAS2 mailing list
>>>> DAS2 at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/das2
>>>>
>>>
>
>



