[MOBY-dev] Lengthy commentary on registry policies and my own responsibilities

Mark Wilkinson markw at illuminae.com
Wed Mar 1 16:30:16 UTC 2006


On Wed, 2006-03-01 at 10:24 +0000, Martin Senger wrote:
> You might have noticed that Mark (silently? :-)) created a new page 


I quote from my message of Feb 21st:

"The policy of the MOBY Central registry at iCAPTURE is currently
being drawn-up.  A first draft is available (click on Project Docs on
the right side of the moby homepage) for comment"



>    But it may be done slightly too silently to let me feel comfortable
> (sorry, Mark, this is just my feeling, I cannot help it).

Public discussion is welcome;  your opinion, in particular, is
**always** valued!


>  The strength of
> Biomoby was always in sharing - and if we suddenly start to have many
> registries this strength may decrease.

I **strongly** support this statement!!!!  The reality is,
unfortunately, not in line with this vision.  I host a registry, you
host a registry at IRRI, MIPS hosts a registry, INB hosts a registry,
there is a (dead?) registry at the University of Queensland, I host a
second registry here specifically for TAIR, a third for our local
services here at iCAPTURE, and yet a fourth here that allows me to
continue doing interoperability research, since the new RFC policies
around MOBY development are not conducive to a research environment.


>    On the other hand, I feel that we actually *need* more registries,
> perhaps with different purposes, and with the different level of curation.
> But before starting to list them we should - here my 2cs go in - think
> first about federation of the registries, about an API how to access more
> registries etc.


Absolutely!  For the past year I have been slowly (yes, silently ;-) )
moving the MOBY Central Perl code to a point where we might be in a
position to support multiple registries more easily once we decide how
to do so.  The codebase was **strongly** tied to the concept of a single
registry that was aware of a single object ontology, a single service
ontology, and a single namespace ontology.  In many ways it still is -
the ontology maintenance routines are hard-coded to prevent the
modification or removal of Objects, Services, and Namespaces that are
being used; however, that is only true if the registry itself KNOWS
about the Service Instance that is using it - this is where the idea of
multiple registries breaks down right now. 
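
As a rough illustration of that blind spot, here is a toy Python sketch.
It is not the MOBY Central Perl code; the Registry class, its methods,
and the service name are all made up for the example. The removal guard
only consults Service Instances registered locally, so a node that is in
use by a service in another registry can still be removed:

```python
class Registry:
    """Toy registry; names and structure are hypothetical, not MOBY Central's API."""

    def __init__(self, name):
        self.name = name
        self.objects = set()   # ontology nodes this registry knows about
        self.services = {}     # service name -> object type it consumes

    def register_object(self, obj):
        self.objects.add(obj)

    def register_service(self, service, consumes):
        self.services[service] = consumes

    def remove_object(self, obj):
        # The guard can only see Service Instances registered *here*.
        users = [s for s, o in self.services.items() if o == obj]
        if users:
            raise ValueError(f"{obj} is in use by {users}")
        self.objects.discard(obj)


central = Registry("central")
secondary = Registry("secondary")

central.register_object("DNASequence")
# The only consumer of the node lives in a different registry.
secondary.register_service("getBlastHits", consumes="DNASequence")

# Succeeds, because central has no local service using the node,
# even though the secondary registry's service now dangles.
central.remove_object("DNASequence")
```

Had getBlastHits been registered in "central" instead, remove_object
would have raised an error and left the node in place.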

This will require much more community discussion, but I think there is
one possible solution that minimizes the "pain" - this is just an idea
for discussion, not an RFC :-)

Many years ago, largely with an eye to better supporting the PlaNet
registry at MIPS - they were registering their own objects in their own
ontologies - I started moving the codebase toward an LSID-based naming
system for the ontology nodes; this allows secondary registries to
register new ontology nodes under their own LSID authorities.  This
doesn't entirely solve the problem, since any registry can use any other
registry's Objects, Namespaces, and Services regardless of the LSID
authority under which they were registered, but it does give the
registry a way of detecting which Ontology nodes "belong to it", and
possibly querying (through LSID resolution) other registries to retrieve
information about foreign Ontology nodes.  
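
The "belongs to it" test can be sketched in a few lines of Python. This
is only a sketch under assumptions: it relies on the general LSID layout
(urn:lsid:authority:namespace:object, with an optional revision), the
function names are invented for the example, and the LSIDs and the
"example-registry.org" authority are illustrative rather than taken from
any real registry. Actual LSID resolution of the foreign nodes is not
shown.

```python
def parse_lsid(lsid):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = lsid.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2].lower(),  # authority compared case-insensitively here
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) > 5 else None,
    }


def belongs_to(lsid, local_authority):
    """True if the node was registered under this registry's LSID authority."""
    return parse_lsid(lsid)["authority"] == local_authority.lower()


def partition_nodes(lsids, local_authority):
    """Separate a registry's own ontology nodes from foreign ones."""
    local, foreign = [], []
    for lsid in lsids:
        (local if belongs_to(lsid, local_authority) else foreign).append(lsid)
    return local, foreign


# Illustrative LSIDs only; the second authority is hypothetical.
nodes = [
    "urn:lsid:biomoby.org:objectclass:DNASequence",
    "urn:lsid:example-registry.org:objectclass:LocalAnnotation",
]
local_nodes, foreign_nodes = partition_nodes(nodes, "biomoby.org")
```

The foreign list is where a registry would have to go and ask someone
else (by resolving the LSID) before deciding whether a node is safe to
modify or remove.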

I'm not convinced that this is a "good" solution...in fact (using
Phillip Lord's Name in vain - hehehhehe... funny pun in there!)  Phillip
would say that an Ontology is only useful if it is *shared*, and so the
idea of having multiple ontologies is dangerous from the get-go.

So... yes... there's a lot of discussion required around how to support
multiple registries, and in particular, how to support multiple
ontologies.  The ontology literature would suggest that we have built,
for ourselves, an unsustainable architecture as soon as we allowed the
ontologies to fork...  that may or may not be true, but the solution
certainly isn't obvious...


> (which was, btw and so far, really the last one, at least regarding the
> funds available from Vancouver, as I understand it)

Correct for the moment.  MOBY has just received a sizable award from
Genome Canada for the next 3+ years, but this is specifically for code
maintenance and tooling.  There is no money for meetings in that pot,
though there is no reason why we can't put together a meeting that is
self-funded.  Traditionally I have paid for much of the expense of
hosting meetings here in Canada through the MOBY Genome Canada award,
but that doesn't have to be the case if there is the need for a meeting.
In any case, in the next couple of months there will be another
competition announcement from Genome Canada, and I will be submitting an
application to that competition that includes another pot of money to
support developer conferences.  Fingers crossed!


> Can we just postpone implementation of the
> individual (public) registry policies and first discuss how to make the
> various registries federated?

I am not opposed to this.  Though the agent will start running in a few
days, Eddie has agreed not to activate the service deregistration
function for at least three months so that everyone can get used to the
feeling of "being crawled".  If the community feeling remains strong
against the policies I have documented for "my" MOBY Central instance,
the objections should be raised either to me personally, or publicly,
and discussed.

I do, however, have several comments with regard to my responsibilities
to the MOBY community, to my granting agency, and to my own research
endeavours (which is where MOBY started, and it continues to be a
primary domain of *research* in my laboratory).  

Genome Canada funded MOBY to be developed as a platform of
interoperability for other Genome Canada projects.  Period.  This is a
responsibility that I cannot fail in, since it is the primary
responsibility I undertook by accepting the award, so I am somewhat
limited in my freedom.  One possibility would be to support a "public"
registry that was a free-for-all and largely uncurated, and to support a
second, curated registry to support only the Genome Canada projects,
much like the INB has done.  Unfortunately, I was only awarded funds for
a single server - that is the server that is currently running "my" MOBY
Central - so this option makes me a bit uncomfortable.  Moreover, as
Paul, among others, has pointed out: "Right now in the registry if I
send a generic object I get over a hundred services back, almost none of
which will actually consume the object without dying a horrible
death." (Gordon, Feb 17th). MOBY Central has become increasingly useless
over the past couple of years as it got filled with junk, badly
registered services, dead services, "localhost" services, test services,
and all manner of other registration artifacts.  This week, for the
first time in over a year, I started up Gbrowse Moby and was (more or
less) ACTUALLY ABLE TO SURF MOBY DATA!!  It was heavenly!  Because of
the curatorial policy that I enforced - after public warnings that I was
about to do so - "my" registry suddenly became a useful resource not
only to Genome Canada, but also to the wider community.  Some might
argue about what "useful" means, but I think that "doing what it was
built to do" is a pretty good measure... and, frankly, I cannot
interpret the word any other way, since I have a responsibility to my
funding agency to generate something useful.

The curation policy I have posted does not change the API.  It is only a
policy, and though policies can be debated, I don't think that any of
the other registry hosts would want to be told how to curate "their"
registries.  Granted, the registry that I host is a unique resource in
that it is the only (??) public registry; I take that responsibility
seriously as well!  However, that responsibility (surely?!) also
requires that I try to make the registry useful to the majority of the
community users; we, as a community, can discuss whether the policy I
proposed goes too far - though I reserve the right to make the final
decision on "my" registry - however, my experience of using MOBY Central
this week suggests to me that this policy only made the public resource
better.

That's my open-kimono.  

Opinions always welcome, 

M


--
Mark Wilkinson
Asst. Professor, Dept. of Medical Genetics
University of British Columbia
PI in Bioinformatics, iCAPTURE Centre
St. Paul's Hospital, Rm. 166, 1081 Burrard St.
Vancouver, BC, V6Z 1Y6
tel: 604 682 2344 x62129
fax: 604 806 9274

"For most of this century we have viewed communications as a conduit, 
       a pipe between physical locations on the planet. 
What's happened now is that the conduit has become so big and interesting 
      that communication has become more than a conduit, 
       it has become a destination in its own right..."

                Paul Saffo - Director, Institute for the Future



