[MOBY-l] thinking about provision info and authoritative service providers

Mark Wilkinson mwilkinson at gene.pbi.nrc.ca
Fri Oct 25 17:51:56 UTC 2002


Hi everyone,

There are a bunch of thoughts crystallizing in my mind at the moment 
around issues of provision information and versioning and such.  It has 
come up repeatedly as I give the MOBY presentation in various places, 
and is clearly something that scientists are very concerned about right 
from the get go.  In addition, the good folks at myGrid have seen it as 
sufficiently important that it is a major part of their entire platform, 
whereas we have largely "put off" the problem.

This was all brought to the surface yesterday during a telephone 
conversation with Lincoln, Damian, Andrew, et al.  The issue came up 
of "who gets to serve", for example, GO_Terms.  It wasn't phrased that 
way, of course, but that was in some ways the essence of the issue.

Given that we put no restrictions on who may provide whatever service, 
it is possible (likely!) that many services will be providing outdated 
information. Lincoln suggested that we perhaps flag certain bodies as 
being "authoritative". A few tweaks to the MOBY Central database and API 
would be sufficient to store this information and also limit requests 
to, for example, "authoritative services only".
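
To make that concrete, here is a rough sketch of what an "authoritative 
services only" lookup might look like.  The record layout, field names, 
and service names are all made up for illustration; this is not the 
actual MOBY Central schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str            # hypothetical service name
    authority: str       # URI-like name of the providing organization
    output_type: str     # kind of object the service returns
    authoritative: bool  # the proposed new flag


# Hypothetical registry contents, for illustration only.
REGISTRY = [
    Service("getGOTerm", "geneontology.org", "GO_Term", True),
    Service("getGOTermMirror", "example.org", "GO_Term", False),
]


def find_services(registry, output_type, authoritative_only=False):
    """Return services producing output_type, optionally only flagged ones."""
    return [
        s for s in registry
        if s.output_type == output_type
        and (s.authoritative or not authoritative_only)
    ]


everyone = find_services(REGISTRY, "GO_Term")
trusted = find_services(REGISTRY, "GO_Term", authoritative_only=True)
```

Here `everyone` holds both services, while `trusted` holds only the one 
flagged as authoritative.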

...but this opens up some questions in my mind... apart from the 
political jockeying about who gets to *be* authoritative and on whose 
head that decision is going to fall ;-)

I was thinking about the use of the word "authority" in the MOBY versus 
LSID world.  In both cases, we represent the "authority" as a URI 
effectively naming an organization.  However, in LSID the authority is 
(in my interpretation of the spec) tightly linked to a namespace, while 
in the MOBY world the authority is tightly linked to a service.  So LSID 
folks might intuitively believe that the authoritative server for the 
Genbank namespace is NCBI, while in MOBY this is not always going to be 
the case... for example, if the Gene Ontology consortium decides to set 
up a service that takes in GO_Terms and spits out GenBank records of 
sequences in GO that are annotated to that term... then they are, 
clearly, the authoritative providers of GenBank in that circumstance, 
since they have the most up-to-date information on the 
**transformation** that they are providing.  This leads us to a somewhat 
peculiar situation where we could ask "who is the authoritative provider 
of GenBank records" and both NCBI and GO appear as the answer.  Of 
course, the truth is that NCBI is the authoritative transformer of 
GenBank GIs or Accessions into GenBank records, while GO is the 
authoritative transformer of GO/IDs into GenBank records... but it is 
still a bit peculiar.  I don't know if it is going to be problematic in 
the end, but it is something to keep in mind as we move towards adopting 
authoritative services....
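
One way to picture the distinction is to record authority per 
transformation (input namespace plus output type) rather than per output 
type alone.  The sketch below is only an illustration: the table, the 
key strings, and the authority names are assumptions, not anything 
MOBY Central currently stores.

```python
# Hypothetical table: authority is keyed on the whole transformation
# (input namespace, output type), not on the output type by itself.
AUTHORITATIVE = {
    ("Genbank/GI", "GenBank record"): "ncbi.nlm.nih.gov",
    ("GO/ID", "GenBank record"): "geneontology.org",
}


def authorities_for_output(output_type):
    """Everyone who is authoritative for *some* route to output_type."""
    return sorted(
        {auth for (_, out), auth in AUTHORITATIVE.items() if out == output_type}
    )


def authority_for(input_namespace, output_type):
    """The single authority for one specific transformation, or None."""
    return AUTHORITATIVE.get((input_namespace, output_type))


by_output = authorities_for_output("GenBank record")
by_transformation = authority_for("GO/ID", "GenBank record")
```

Asking by output type alone gives back both organizations, which is the 
peculiar answer described above; asking by transformation gives back 
exactly one.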

The other thing that directly relates to this is versioning information.  
Whether you are using the authoritative service or not, you still want 
to know the version of the resources (both data and software) that are 
being provided to you.  Again, because our services (under the current 
MOBY architecture) are generally unrestricted, this might be confusing, 
but at least we seem to have a place to fit this information.  The 
underused <MOBY/> envelope is a natural place to stick this 
information... perhaps as attributes of the MOBY envelope:

<MOBY authority="geneontology.org" GenBank="version xx" 
GeneOntology="version yyy" authoritative="yes" Date="12-12-02 13:05:02">

	<Sequence namespace="Genbank/GI" id="123456">
		...
		...
	</Sequence>
	<Sequence namespace="Genbank/GI" id="789012">
		...
		...
	</Sequence>		
</MOBY>


I realize that the vocabularies for the resources aren't controlled, but 
we can hopefully find a way to deal with that...  nevertheless, is that 
just too ugly, or do you think that is sufficient to accomplish what is 
needed?  It certainly is a simple solution...
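
For what it is worth, reading such attributes back out on the client 
side would be cheap.  The following is a minimal sketch using Python's 
standard XML parser; the function name and the rule "any attribute that 
is not authority, authoritative, or Date is a resource version" are 
assumptions made for the example, and the Sequence elements are left 
empty here.

```python
import xml.etree.ElementTree as ET

# Envelope modelled on the example above, with empty Sequence bodies.
ENVELOPE = """\
<MOBY authority="geneontology.org" GenBank="version xx"
      GeneOntology="version yyy" authoritative="yes" Date="12-12-02 13:05:02">
    <Sequence namespace="Genbank/GI" id="123456"/>
    <Sequence namespace="Genbank/GI" id="789012"/>
</MOBY>
"""

# Attributes with a fixed meaning; everything else is assumed to be
# "resource name -> version string" (an assumption for this sketch).
RESERVED = {"authority", "authoritative", "Date"}


def read_provenance(xml_text):
    """Extract provider, authoritative flag, date, and resource versions."""
    root = ET.fromstring(xml_text)
    attrs = dict(root.attrib)
    return {
        "authority": attrs.get("authority"),
        "authoritative": attrs.get("authoritative") == "yes",
        "date": attrs.get("Date"),
        "versions": {k: v for k, v in attrs.items() if k not in RESERVED},
    }


info = read_provenance(ENVELOPE)
```

Because the resource names double as attribute names, the client has no 
way to tell a resource from a misspelled reserved attribute, which is 
the uncontrolled-vocabulary problem in miniature.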

Philip/Carol/Robert - how do you represent this data in myGrid?  Have 
you found that there is a need to be more complicated than that?

M



-- 
--------------------------------
"Speed is subsittute fo accurancy."
________________________________

Dr. Mark Wilkinson, RA Bioinformatics
National Research Council, Plant Biotechnology Institute
110 Gymnasium Place, Saskatoon, SK, Canada

phone : (306) 975 5279
pager : (306) 934 2322
mobile: markw_mobile at illuminae dot com




