[DAS2] Ontologies in DAS/2

chris mungall cjm at fruitfly.org
Tue Feb 7 21:59:12 UTC 2006


On Feb 7, 2006, at 1:20 PM, Allen Day wrote:

> Hi Chris,
>
> On Tue, 7 Feb 2006, Chris Mungall wrote:
>
>>
>> Hi all
>>
>> I'm concerned that the XML in the URL below isn't quite Obo-XML; it's
>> Allen's modified version of it. In particular, it adds an "id"
>> attribute, which is redundant with the id element, and it modifies
>> the ID scheme to use slashes instead of colons.
>>
>> I believe the latter may have been to make the ID scheme more DAS-y?
>
> The slash was introduced to take advantage of xml:base and the
> hierarchical relationship between namespaces and terms, e.g.
>
>   xml:base="/das/ontology/obo/1/ontology" + id="SO/0000001"
>
> is equivalent to:
>
>   /das/ontology/obo/1/ontology/SO/0000001

it's actually equivalent to:
/das/ontology/obo/1/ontologySO/0000001
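
A quick way to check how a generic resolver treats this case: under RFC 3986 rules, which xml:base follows, the final path segment of the base is dropped before the relative reference is appended, unless the base ends in a slash. So the trailing slash on xml:base matters. A minimal sketch using Python's urllib.parse.urljoin; the host example.org is an assumption added only to make the URLs absolute, and only the paths come from this thread:

```python
from urllib.parse import urljoin

# Hypothetical host; only the path portion is taken from the thread.
BASE = "http://example.org/das/ontology/obo/1/ontology"


def resolve(base: str, ref: str) -> str:
    """Resolve a relative reference against a base URI (RFC 3986 rules)."""
    return urljoin(base, ref)


# No trailing slash: the last base segment ("ontology") is replaced.
no_slash = resolve(BASE, "SO/0000001")

# Trailing slash: the reference is appended below "ontology/".
with_slash = resolve(BASE + "/", "SO/0000001")

print(no_slash)
print(with_slash)
```

If the intended result is .../ontology/SO/0000001, the base would need to end in "/" under these rules.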

> If we want the identifier to be SO:0000001, it means that we have to make
> xml:base="/das/ontology/obo/1/ontology/SO".  This is problematic for two
> reasons:
>
>   1) multiple xml:base cannot be defined for the entire document, meaning
>      that URIs for other records referenced become very long.

Why not just define a qname for every idspace? This is the standard way 
of doing this in XML.

Using xml:base is not a big gain for brevity, since fairly soon some 
obo ontologies will reference other obo ontologies.

In fact, is this even an issue if you get rid of the id attribute to 
conform to obo-xml? IDs in obo-xml are encoded as elements, so xml:base 
rules are not applied. Obo has its own rules for ID generation. This 
has the arguable disadvantage that we can't directly use xml:base and 
the whole xml namespace system for OBO IDs; we layer our own system on 
top. This is actually preferable for us.
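
A minimal sketch of that layering, going by the rule quoted further down (prefix and local ID joined with ":", and the long form of the prefix combined with the local ID to give a URI). The idspace table and its example.org URI prefixes are hypothetical, and plain concatenation is my assumption about what "combined" means:

```python
from typing import Dict, Tuple

# Hypothetical idspace table: short prefix -> URI prefix.
# These are placeholder URIs, not the real OBO ones.
IDSPACES: Dict[str, str] = {
    "SO": "http://example.org/obo/SO#",
    "GO": "http://example.org/obo/GO#",
}


def split_obo_id(obo_id: str) -> Tuple[str, str]:
    """Split 'PREFIX:LOCAL' into (prefix, local_id); reject anything else."""
    prefix, sep, local_id = obo_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefix:local OBO ID: {obo_id!r}")
    return prefix, local_id


def obo_id_to_uri(obo_id: str, idspaces: Dict[str, str] = IDSPACES) -> str:
    """Expand a short-form OBO ID to a URI by simple concatenation (assumed)."""
    prefix, local_id = split_obo_id(obo_id)
    if prefix not in idspaces:
        raise ValueError(f"unknown idspace: {prefix!r}")
    return idspaces[prefix] + local_id


print(obo_id_to_uri("SO:0000001"))
```

Nothing here depends on xml:base, which is the point: the ID stays SO:0000001 in the document and the URI is derived from it separately.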

>   2) different ontologies cannot use the same xml:base
>
> The only way I see out of this ATM is to treat : as a / internal to the
> Ontology-DAS service.

I'm still not sure what the problem is, and I think you may be stuck 
anyway when it comes to RDF/OWL ontologies.

>
>> OBO IDs are composed of a prefix and a local ID. These are always joined
>> with a :. The prefix can be specified as shortform (eg GO) or a URI
>> prefix. When the long form is combined with the local ID you get your URI.
>>
>> If DAS wants to use a modified version of Obo-XML, that's fine, but please
>> don't call it Obo-XML, it will cause huge confusion!
>>
>> I would much prefer if you used Obo-XML as it is - if there are things
>> you'd like to see changed about the format we can perhaps work that out.
>> I'm concerned by changing the ID to use / instead of :. This is wrong,
>> and if it's something that's required for DAS, how will you interoperate
>> with RDF etc?
>>
>> In fact there are other parts where the xml is definitely not Obo-XML - it
>> looks like Allen has coded these by hand rather than taking existing XML.
>> That's fine, but it should be marked as such. For example, there is no
>> develops_from element in Obo-XML; all relations bar is_a are encoded as
>> relationship elements.
>
> The XML provided by the Ontology-DAS server is using templates to mark up
> ontology records that have been loaded to a chado database using
> perl-go-perl.  The develops_from node, IIRC, was created because there is
> a section in a perl-go-perl .xslt that creates elements for all
> relationship types.

hmmm, I don't think so, but the point is moot anyway, just so long as 
the final version uses xml that validates, either against obo-xml or 
your own documented variant
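
Short of real DTD validation, the specific deviations raised in this thread can be checked with a few lines. This is only a rough lint, not a substitute for validating against the DTD. The sample document, the term wrapper element, the type/to children of relationship, and the allowed-children set are my assumptions for illustration; only the id element, is_a, and relationship elements are taken from the discussion above, and the IDs are placeholders:

```python
import xml.etree.ElementTree as ET

# Illustrative document; structure beyond id/is_a/relationship is assumed.
SAMPLE = """<obo>
  <term>
    <id>X:0000002</id>
    <is_a>X:0000001</is_a>
    <relationship>
      <type>develops_from</type>
      <to>X:0000003</to>
    </relationship>
  </term>
</obo>"""

# Partial, assumed list of children a term may carry in this sketch.
ALLOWED_TERM_CHILDREN = {"id", "name", "is_a", "relationship"}


def lint_terms(xml_text: str) -> list:
    """Return a list of human-readable problems found in term elements."""
    problems = []
    root = ET.fromstring(xml_text)
    for term in root.iter("term"):
        # The ID belongs in an <id> element, not in an attribute.
        if "id" in term.attrib:
            problems.append("term carries an id attribute")
        if term.find("id") is None:
            problems.append("term has no id element")
        # Relations other than is_a should be <relationship> elements.
        for child in term:
            if child.tag not in ALLOWED_TERM_CHILDREN:
                problems.append(f"unexpected element <{child.tag}>")
    return problems


print(lint_terms(SAMPLE))
```

A term written as a bare develops_from element, or with an id attribute, would show up in the returned list.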

>
>>
>> There is a DTD at the moment
>> http://www.godatabase.org/dev/xml/dtd
>
> This didn't exist at the time I wrote my templates (4-6 months ago), or I
> would have validated.

it did, it's just not well signposted! sorry about that

look forward to seeing a demo. I do think you have to work out the 
semantics of retrieval by ontology term though.
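
To make the retrieval question concrete, here is a rough sketch of one possible semantics: a query for a term returns features typed with that term or with anything below it via is_a. The toy graph, the feature list, and the function names are all hypothetical, and whether other relations (part_of, develops_from) should also be followed is exactly the open question; a real service would more likely call an existing reasoner than hand-roll this:

```python
from collections import deque

# Toy is_a graph, child -> parents. Labels are illustrative only.
IS_A = {
    "coding_exon": ["exon"],
    "noncoding_exon": ["exon"],
    "exon": ["region"],
    "intron": ["region"],
}

# Toy annotations: (feature id, ontology term it is typed with).
FEATURES = [("f1", "coding_exon"), ("f2", "intron"), ("f3", "exon")]


def descendants(term, is_a=IS_A):
    """Return the term plus everything reachable below it via is_a."""
    # Invert child -> parents into parent -> children.
    children = {}
    for child, parents in is_a.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def features_of_type(term, features=FEATURES):
    """Return ids of features typed with the term or any is_a descendant."""
    wanted = descendants(term)
    return [fid for fid, ftype in features if ftype in wanted]


print(features_of_type("exon"))
```

Under this reading, asking for "everything which is an exon" also returns features annotated only as coding_exon.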

cheers
chris

>
> -Allen
>
>
>
>>
>> The docs are minimal as the explanation of all the fields is in the docs
>> for the obo text file format
>> http://www.godatabase.org/dev/doc/obo_format_spec.{html,txt,pdf}
>>
>> We'll be converting to RNG+XSD soon
>>
>> You can get Obo-XML examples from
>> http://www.fruitfly.org/~cjm/obo-download
>>
>> You can see the default rule for creating a URI in the OWL files; these
>> currently all get the geneontology.org URI prefix by default, but this
>> will change (we were going to use LSIDs but the majority of OWL tools
>> don't seem to handle URNs very well)
>>
>> As far as DAS/2 supporting different file formats, Obo-XML and RDFS/OWL
>> would seem to be the natural contenders. We currently go from the former
>> to the latter via a simple XSLT; the reverse transformation is a little
>> more difficult.
>>
>> Allen has inlined some comments from an email exchange with me in the
>> document. I agree about keeping the API minimal. On the other hand you
>> will need at least some inferencing machinery - I'd encourage you to reuse
>> existing reasoning services here.
>>
>> Cheers
>> Chris
>>
>> On Tue, 7 Feb 2006, Helt,Gregg wrote:
>>
>>> I talked to Suzi, she's planning to join our teleconference today to
>>> discuss ontologies, wearing her hat as co-PI of the National Center for
>>> Biomedical Ontology.  Hopefully Lincoln can join too.
>>>
>>> I took a closer look at the DAS/2 ontology work Allen has done (see
>>> http://biodas.org/documents/das2/das2_ontology.html).  I urge anyone who
>>> wants to contribute to the ontology discussion to read this doc.  It
>>> specifies a way to retrieve ontologies in OBOXML format.  In this format
>>> each ontology term gets an absolute URI through the same mechanism that
>>> the rest of DAS/2 uses (URIs for ids, which can be either absolute or
>>> relative but resolvable).  As Allen pointed out yesterday this would
>>> solve our problem of how to uniquely specify ontology terms in the DAS/2
>>> TYPES XML.
>>>
>>> I couldn't find any documentation for the OBOXML format, other than the
>>> code that generates it from OBO files.  But I'm using OBOXML as an
>>> example here because it clearly has resolvable URIs for each ontology
>>> term.  In Allen's spec, ontologies can also be returned in other
>>> formats, but it's unclear to me whether terms in these other formats
>>> would resolve to similar URIs.
>>>
>>> 	gregg
>>>
>>>> -----Original Message-----
>>>> From: das2-bounces at portal.open-bio.org
>>>> [mailto:das2-bounces at portal.open-bio.org] On Behalf Of Andrew Dalke
>>>> Sent: Tuesday, February 07, 2006 1:32 AM
>>>> To: DAS/2
>>>> Subject: Re: [DAS2] Notes from the DAS/2 teleconference for the code
>>>> sprint, 6 Feb 2006
>>>>
>>>>> gh: would like a re-cast as xml document, hosted at so/sofa
>>>>> website. that xml would be like a std ontology representation so you
>>>>> could extend it. so someone could point to an extension of it.
>>>>
>>>> I asked as an action item if Gregg would look into the solution
>>>> for this.  Do we refer to the ontology by a "GO:0123456" identifier
>>>> or by some URL scheme?  If so, what's the mapping from URL scheme
>>>> to something that clients and people can understand, eg, to
>>>> ask for everything which is an exon?
>>>>
>>>> Does this mapping need a version number - does it change over time?
>>>>
>>>> 					Andrew
>>>> 					dalke at dalkescientific.com
>>>>
>>>> _______________________________________________
>>>> DAS2 mailing list
>>>> DAS2 at portal.open-bio.org
>>>
>>>
>>> _______________________________________________
>>> DAS2 mailing list
>>> DAS2 at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/das2
>>>



