[MOBY-l] thinking about provision info and authoritative service providers
Mark Wilkinson
mwilkinson at gene.pbi.nrc.ca
Fri Oct 25 17:51:56 UTC 2002
Hi everyone,
There are a bunch of thoughts crystallizing in my mind at the moment
around issues of provision information and versioning and such. It has
come up repeatedly as I give the MOBY presentation in various places,
and is clearly something that scientists are very concerned about right
from the get go. In addition, the good folks at myGrid have seen it as
sufficiently important that it is a major part of their entire platform,
whereas we have largely "put off" the problem.
This was all brought to the surface yesterday during a telephone
conversation with Lincoln, Damian, Andrew, et al. The issue came up
of "who gets to serve", for example, GO_Terms. It wasn't phrased that
way, of course, but that was in some ways the essence of the issue.
Given that we put no restrictions on who may provide whatever service,
it is possible (likely!) that many services will be providing outdated
information. Lincoln suggested that we perhaps flag certain bodies as
being "authoritative". A few tweaks to the MOBY Central database and API
would be sufficient to store this information and also limit requests
to, for example, "authoritative services only".
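
To make the idea concrete, here is a minimal sketch of what such a flag and
filter might look like. It is not the MOBY Central API: the ServiceRecord
fields, the find_services function, and the service names are all
hypothetical, invented only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    """Hypothetical registry row; not the real MOBY Central schema."""
    name: str
    authority: str              # URI-like name of the providing organization
    output_type: str            # what the service returns, e.g. "GO_Terms"
    authoritative: bool = False  # the proposed new flag


def find_services(registry, output_type, authoritative_only=False):
    """Return services producing output_type, optionally only flagged ones."""
    return [
        s for s in registry
        if s.output_type == output_type
        and (s.authoritative or not authoritative_only)
    ]


# Illustrative data only: two providers of the same output type.
REGISTRY = [
    ServiceRecord("GetGOTerms", "geneontology.org", "GO_Terms", authoritative=True),
    ServiceRecord("MirrorGOTerms", "example.org", "GO_Terms"),
]

if __name__ == "__main__":
    for s in find_services(REGISTRY, "GO_Terms", authoritative_only=True):
        print(s.name, s.authority)
```

By default nothing changes for existing clients; only a request that asks
for "authoritative services only" gets the restricted list.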
...but this opens up some questions in my mind... apart from the
political jockeying about who gets to *be* authoritative and on whose
head that decision is going to fall ;-)
I was thinking about the use of the word "authority" in the MOBY versus
LSID world. In both cases, we represent the "authority" as a URI
effectively naming an organization. However, in LSID the authority is
(in my interpretation of the spec) tightly linked to a namespace, while
in the MOBY world the authority is tightly linked to a service. So LSID
folks might intuitively believe that the authoritative server for the
Genbank namespace is NCBI, while in MOBY this is not always going to be
the case... for example, if the Gene Ontology consortium decides to set
up a service that takes in GO_Terms and spits out GenBank records of
sequences in GO that are annotated to that term... then they are,
clearly, the authoritative providers of GenBank in that circumstance,
since they have the most up-to-date information on the
**transformation** that they are providing. This leads us to a somewhat
peculiar situation where we could ask "who is the authoritative provider
of GenBank records" and both NCBI and GO appear as the answer. Of
course, the truth is that NCBI is the authoritative transformer of
GenBank GI's or Accessions into GenBank records, while GO is the
authoritative transformer of GO/ID's into GenBank records... but it is
still a bit peculiar. I don't know if it is going to be problematic in
the end, but it is something to keep in mind as we move towards adopting
authoritative services....
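
One way to picture the distinction is to key authority on the
transformation (input namespace plus output type) rather than on the
output alone. The sketch below is only an illustration under that
assumption; the table, the function names, and the NCBI authority string
are made up for the example.

```python
# Hypothetical table: (input namespace, output type) -> authority name.
# The entries mirror the NCBI / GO example above; the strings are illustrative.
AUTHORITATIVE = {
    ("GenBank/GI", "GenBank record"): "ncbi.nlm.nih.gov",
    ("GO/ID", "GenBank record"): "geneontology.org",
}


def authorities_for_output(output_type):
    """Ask only about the output: several authorities can come back."""
    return sorted(
        {auth for (_, out), auth in AUTHORITATIVE.items() if out == output_type}
    )


def authority_for(input_namespace, output_type):
    """Ask about the whole transformation: at most one authority comes back."""
    return AUTHORITATIVE.get((input_namespace, output_type))


if __name__ == "__main__":
    print(authorities_for_output("GenBank record"))
    print(authority_for("GO/ID", "GenBank record"))
```

Asking about the output alone returns both organizations, which is the
peculiar situation described above; asking about the full transformation
returns exactly one.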
The other thing that directly relates to this is versioning information.
Whether you are using the authoritative service or not, you still want
to know the version of the resources (both data and software) that are
being provided to you. Again, because our services (under the current
MOBY architecture) are generally unrestricted, this might be confusing,
but at least we seem to have a place to fit this information. The
underused <MOBY/> envelope is a natural place to stick this
information... perhaps as attributes of the MOBY envelope:
<MOBY authority="geneontology.org" GenBank="version xx"
      GeneOntology="version yyy" authoritative="yes" Date="12-12-02 13:05:02">
  <Sequence namespace="Genbank/GI" id="123456">
    ...
    ...
  </Sequence>
  <Sequence namespace="Genbank/GI" id="789012">
    ...
    ...
  </Sequence>
</MOBY>
I realize that the vocabularies for the resources aren't controlled, but
we can hopefully find a way to deal with that... nevertheless, is that
just too ugly, or do you think that is sufficient to accomplish what is
needed? It certainly is a simple solution...
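
For what it's worth, a client could pull this information back out with
very little code. The following is a rough sketch using Python's standard
xml.etree.ElementTree; it assumes the attribute layout proposed above, and
the rule "any attribute that is not authority, authoritative, or Date is a
resource version" is my own assumption for the example, not part of any
MOBY specification. The Sequence elements are left empty here.

```python
import xml.etree.ElementTree as ET

# Sample envelope following the attribute layout proposed above.
ENVELOPE = """<MOBY authority="geneontology.org" GenBank="version xx"
      GeneOntology="version yyy" authoritative="yes" Date="12-12-02 13:05:02">
  <Sequence namespace="Genbank/GI" id="123456"/>
  <Sequence namespace="Genbank/GI" id="789012"/>
</MOBY>"""

# Assumption: these attribute names are fixed; everything else names a resource.
RESERVED = {"authority", "authoritative", "Date"}


def read_provision_info(xml_text):
    """Extract authority, flag, date, and resource versions from the envelope."""
    root = ET.fromstring(xml_text)
    attrs = dict(root.attrib)
    return {
        "authority": attrs.get("authority"),
        "authoritative": attrs.get("authoritative") == "yes",
        "date": attrs.get("Date"),
        "resource_versions": {
            name: value for name, value in attrs.items() if name not in RESERVED
        },
    }


if __name__ == "__main__":
    print(read_provision_info(ENVELOPE))
```

The uncontrolled vocabulary shows up right away: the client has no way to
know in advance that the resources will be called "GenBank" and
"GeneOntology" rather than something else.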
Philip/Carol/Robert - how do you represent this data in myGrid? Have
you found that there is a need to be more complicated than that?
M
--
--------------------------------
"Speed is subsittute fo accurancy."
________________________________
Dr. Mark Wilkinson, RA Bioinformatics
National Research Council, Plant Biotechnology Institute
110 Gymnasium Place, Saskatoon, SK, Canada
phone : (306) 975 5279
pager : (306) 934 2322
mobile: markw_mobile at illuminae dot com