[MOBY-dev] Lengthy commentary on registry policies and my own responsibilities

Pieter Neerincx Pieter.Neerincx at wur.nl
Fri Mar 3 14:02:23 UTC 2006


Hi,

On 1-Mar-2006, at 5:30 PM, Mark Wilkinson wrote:

> On Wed, 2006-03-01 at 10:24 +0000, Martin Senger wrote:
>> You might have noticed that Mark (silently? :-)) created a new page

Well it's a publicly available page and not something hidden,  
encoded, deeply buried inside a lot of subfolders of the CVS. I  
spotted the page and was actually pretty happy with it. The official  
registry is only useful if most of the services in it actually work.  
So in my opinion it should be either manually curated or the services  
should be frequently checked in an automated fashion to make sure  
they work as registered / advertised.
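
To make the automated option concrete, here is a minimal sketch in Python. It is an illustration only: the Service record, the probe callback and the max_failures threshold are hypothetical names and not part of the MOBY Central code. The idea is to flag a service only after it has failed several checks in a row, so that a short outage does not count against it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Service:
    # Hypothetical record; the real registry schema is not shown in this thread.
    name: str
    url: str


def check_services(
    services: Iterable[Service],
    probe: Callable[[Service], bool],
    failures: Dict[str, int],
    max_failures: int = 3,
) -> List[str]:
    """Probe every service once and update its consecutive-failure count.

    Returns the names of services that have now failed at least
    max_failures checks in a row. The failures dict is updated in place
    so it can be kept between runs.
    """
    flagged = []
    for svc in services:
        try:
            ok = probe(svc)
        except Exception:
            # A probe that raises is treated the same as a failed check.
            ok = False
        if ok:
            failures[svc.name] = 0
        else:
            failures[svc.name] = failures.get(svc.name, 0) + 1
            if failures[svc.name] >= max_failures:
                flagged.append(svc.name)
    return flagged
```

The probe is passed in as a function so the bookkeeping can be tried out without any network access. What a real probe should send to a service, and what counts as "works as advertised", is left open here.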

>
>
> I quote from my message of Feb 21st:
>
> "The policy of the MOBY Central registry at iCAPTURE is currently
> being drawn-up.  A first draft is available (click on Project Docs on
> the right side of the moby homepage) for comment"
>
>
>
>>    But it may be done slightly too silently to let me feel  
>> comfortable
>> (sorry, Mark, this is just my feeling, I cannot help it).
>
> Public discussion is welcome;  your opinion, in particular, is
> **always** valued!
>
>
>>  The strength of
>> Biomoby was always in sharing - and if we suddenly start to have
>> many
>> registries this strength may decrease.
>
> I **strongly** support this statement!!!!

I second that. So after a lot of testing I registered my first  
publicly available services in the official Central 2 days ago :). On  
a side note: how often is that graph with the "World wide  
distribution of BioMOBY services" on the website updated? I've been  
refreshing that page many times a day like a little kid waiting for  
birthday presents, but so far nothing has popped up :(.

> The reality is,
> unfortunately, not in line with this vision.  I host a registry, you
> host a registry at IRRI, MIPS hosts a registry, INB hosts a registry,
> there is a (dead?) registry at the University of Queensland, I host a
> second registry here specifically for TAIR, a third for our local
> services here at iCAPTURE, and yet a fourth here that allows me to
> continue doing interoperability research, since the new RFC policies
> around MOBY development are not conducive to a research environment.

I have another 2 registries here at the WUR, but exclusively for  
testing and debugging. I didn't want to mess up Mark's public Central  
with my experimental stuff. Multiple registries are fine if they have  
different specific functions. Having services that were meant to be  
publicly available from multiple registries would be good too, for  
example for redundancy. In case one of those registries is down you  
can simply use another. But the tricky part is that multiple  
registries will easily get out of sync, both with regard to the  
registered services and with regard to the version of the API they are  
running. If that happens nobody will trust the registries already  
available and everyone will create their own. In the end we'll all be  
licking our own lollipops and the idea of interoperability will be gone.
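
A rough sketch of how two registries could be compared for drift, again in Python and purely as an illustration. RegistrySnapshot, sync_report and in_sync are made-up names, and how a snapshot would actually be obtained from a Central is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class RegistrySnapshot:
    # Hypothetical summary of one registry at one moment in time.
    name: str
    api_version: str
    services: FrozenSet[str]


def sync_report(first: RegistrySnapshot, second: RegistrySnapshot) -> Dict[str, object]:
    """Describe how two registries differ in API version and registered services."""
    return {
        "api_version_matches": first.api_version == second.api_version,
        "only_in_first": sorted(first.services - second.services),
        "only_in_second": sorted(second.services - first.services),
    }


def in_sync(first: RegistrySnapshot, second: RegistrySnapshot) -> bool:
    """True only if both the API version and the set of services are identical."""
    return first.api_version == second.api_version and first.services == second.services
```

A mirror would only count as usable for fail-over while in_sync() holds; anything listed under only_in_first or only_in_second is exactly the kind of drift that makes people stop trusting it.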

>
>
>>    On the other hand, I feel that we actually *need* more registries,
>> perhaps with different purposes, and with the different level of  
>> curation.
>> But before starting to list them we should - here my 2cs go in -  
>> think
>> first about federation of the registries, about an API how to  
>> access more
>> registries etc.
>
>
> Absolutely!  For the past year I have been slowly (yes, silently ;-) )
> moving the MOBY Central Perl code to a point where we might be in a
> position to support multiple registries more easily once we decide how
> to do so.  The codebase was **strongly** tied to the concept of a  
> single
> registry that was aware of a single object ontology, a single service
> ontology, and a single namespace ontology.  In many ways it still is -
> the ontology maintenance routines are hard-coded to prevent the
> modification or removal of Objects, Services, and Namespaces that are
> being used; however, that is only true if the registry itself KNOWS
> about the Service Instance that is using it - this is where the  
> idea of
> multiple registries breaks down right now.
>
> This will require much more community discussion, but I think there is
> one possible solution that minimizes the "pain" - this is just an idea
> for discussion, not an RFC :-)
>
> Many years ago, largely with an eye to better supporting the PlaNet
> registry at MIPS - they were registering their own objects in their  
> own
> ontologies - I started moving the codebase toward an LSID-based naming
> system for the ontology nodes; this allows secondary registries to
> register new ontology nodes under their own LSID authorities.  This
> doesn't entirely solve the problem, since any registry can use any  
> other
> registry's Objects, Namespaces, and Services regardless of the LSID
> authority under which they were registered, but it does give the
> registry a way of detecting which Ontology nodes "belong to it", and
> possibly querying (through LSID resolution) other registries to  
> retrieve
> information about foreign Ontology nodes.
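
As an illustration of the "belongs to it" test described above, a small Python sketch. It assumes the general LSID layout urn:lsid:authority:namespace:object with an optional revision, and it assumes authorities can be compared case-insensitively. The function names are hypothetical and this is not how the MOBY Central Perl code does it.

```python
from typing import Iterable, List, Tuple


def lsid_authority(lsid: str) -> str:
    """Return the authority part of an LSID, lower-cased for comparison."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    return parts[2].lower()


def belongs_to(lsid: str, own_authority: str) -> bool:
    """True if the ontology node was registered under this registry's authority."""
    return lsid_authority(lsid) == own_authority.lower()


def partition_nodes(lsids: Iterable[str], own_authority: str) -> Tuple[List[str], List[str]]:
    """Split node LSIDs into (own, foreign) relative to one authority."""
    own: List[str] = []
    foreign: List[str] = []
    for lsid in lsids:
        (own if belongs_to(lsid, own_authority) else foreign).append(lsid)
    return own, foreign
```

The foreign list is what a registry would have to look up elsewhere, for example through LSID resolution, before it could tell whether a node is still in use.
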
>
> I'm not convinced that this is a "good" solution...in fact (using
> Phillip Lord's Name in vain - hehehhehe... funny pun in there!)   
> Phillip
> would say that an Ontology is only useful if it is *shared*, and so  
> the
> idea of having multiple ontologies is dangerous from the get-go.
>
> So... yes... there's a lot of discussion required around how to  
> support
> multiple registries, and in particular, how to support multiple
> ontologies.  The ontology literature would suggest that we have built,
> for ourselves, an unsustainable architecture as soon as we allowed the
> ontologies to fork...  that may or may not be true, but the solution
> certainly isn't obvious...
>
>
>> (which was, btw and so far, really the last one, at least  
>> regarding the
>> funds available from Vancouver, as I understand it)
>
> Correct for the moment.  MOBY has just received a sizable award from
> Genome Canada for the next 3+ years, but this is specifically for code
> maintenance and tooling.  There is no money for meetings in that pot,
> though there is no reason why we can't put together a meeting that is
> self-funded.  Traditionally I have paid for much of the expense of
> hosting meetings here in Canada through the MOBY Genome Canada award,
> but that doesn't have to be the case if there is the need for a  
> meeting.
> In any case, in the next couple of months there will be another
> competition announcement from Genome Canada, and I will be  
> submitting an
> application to that competition that includes another pot of money to
> support developer conferences.  Fingers crossed!
>
>
>> Can we just postpone implementation of the
>> individual (public) registry policies and first to discuss how to  
>> make the
>> various registries federated?
>
> I am not opposed to this.  Though the agent will start running in a  
> few
> days, Eddie has agreed not to activate the service deregistration
> function for at least three months so that everyone can get used to  
> the
> feeling of "being crawled".  If the community feeling remains strong
> against the policies I have documented for "my" MOBY Central instance,
> the objections should be raised either to me personally, or publicly,
> and discussed.
>
> I do, however, have several comments with regard to my  
> responsibilities
> to the MOBY community, to my granting agency, and to my own research
> endeavours (which is where MOBY started, and it continues to be a
> primary domain of *research* in my laboratory).
>
> Genome Canada funded MOBY to be developed as a platform of
> interoperability for other Genome Canada projects.  Period.

Sure! But it's in the interest of the people in Genome Canada  
projects to be able to use services from other service providers as  
well. Look at it from this side: if Genome Canada funds a good  
BioMOBY Central they get to use, for free, lots of services whose  
development they didn't have to fund!

> This is a
> responsibility that I cannot fail in, since it is the primary
> responsibility I undertook by accepting the award, so I am somewhat
> limited in my freedom.  One possibility would be to support a "public"
> registry that was a free-for-all and largely uncurated, and to  
> support a
> second, curated registry to support only the Genome Canada projects,
> much like the INB has done.

Bad idea, in my opinion. The free-for-all registry will be a  
mess in no time, so in the end no one will use it. Then if the people  
in Genome Canada want to benefit from services in the INB registry,  
those registries will have to run the same version of the interface.  
Currently the registration process already differs, because at the  
INB they request additional info. I think it's just a matter of time  
before multiple Centrals will be out of sync. What we need is one big  
public free curated Central. Maybe it could be mirrored for load  
balancing and fail-over, but that will only work if those Centrals  
are *perfect* mirrors. If people start to host "partial" mirrors or  
outdated mirrors it won't work. In that case everyone will go  
straight to the official server, because they don't trust the mirrors.

> Unfortunately, I was only awarded funds for
> a single server

If hardware is an issue, I think we could easily run a Central on  
dedicated hardware over here at the WUR. I would have to ask my boss  
of course, but I think we would be happy to help out with either a  
perfect mirror of the official public BioMOBY Central or with a  
public 'feel free to experiment' test Central or both. But I will  
never run my own 'only available to WUR users' Central or one with a  
custom patched interface.

> - that is the server that is currently running "my" MOBY
> Central - so this option makes me a bit uncomfortable.
> Moreover, as
> Paul, among others, has pointed out: "Right now in the registry if I
> send a generic object I get over a hundred services back, almost  
> none of
> which will actually consume the object without dying a horrible
> death." (Gordon, Feb 17th). MOBY Central has become increasingly  
> useless
> over the past couple of years as it got filled with junk, badly
> registered services, dead services, "localhost" services, test  
> services,
> and all manner of other registration artifacts.  This week, for the
> first time in over a year, I started up Gbrowse Moby and was (more or
> less) ACTUALLY ABLE TO SURF MOBY DATA!!  It was heavenly!  Because of
> the curatorial policy that I enforced - after public warnings that  
> I was
> about to do so - "my" registry suddenly became a useful resource not
> only to Genome Canada, but also to the wider community.  Some might
> argue about what "useful" means, but I think that "doing what it was
> built to do" is a pretty good measure... and, frankly, I cannot
> interpret the word any other way, since I have a responsibility to my
> funding agency to generate something useful.
>
> The curation policy I have posted does not change the API.  It is  
> only a
> policy, and though policies can be debated, I don't think that any of
> the other registry hosts would want to be told how to curate "their"
> registries.  Granted, the registry that I host is a unique resource in
> that it is the only (??) public registry; I take that responsibility
> seriously as well!  However, that responsibility (surely?!) also
> requires that I try to make the registry useful to the majority of the
> community users; we, as a community, can discuss whether the policy I
> proposed goes too far

I think it's a fair policy and it should have been like this from the  
start!

> - though I reserve the right to make the final
> decision on "my" registry - however, my experience of using MOBY  
> Central
> this week suggests to me that this policy only made the public  
> resource
> better.

Definitely! I'm glad the policy is there and I think Mark is doing a  
fine job as the guardian of the only public Central we have.

Cheers,

Pi

> That's my open-kimono.
>
> Opinions always welcome,
>
> M
>
>
> -- 
>
> --
> Mark Wilkinson
> Asst. Professor, Dept. of Medical Genetics
> University of British Columbia
> PI in Bioinformatics, iCAPTURE Centre
> St. Paul's Hospital, Rm. 166, 1081 Burrard St.
> Vancouver, BC, V6Z 1Y6
> tel: 604 682 2344 x62129
> fax: 604 806 9274
>
> "For most of this century we have viewed communications as a conduit,
>        a pipe between physical locations on the planet.
> What's happened now is that the conduit has become so big and  
> interesting
>       that communication has become more than a conduit,
>        it has become a destination in its own right..."
>
>                 Paul Saffo - Director, Institute for the Future
>
> _______________________________________________
> MOBY-dev mailing list
> MOBY-dev at biomoby.org
> http://biomoby.org/mailman/listinfo/moby-dev


Wageningen University and Research centre (WUR)
Laboratory of Bioinformatics
Transitorium (building 312) room 1034
Dreijenlaan 3
6703 HA Wageningen
The Netherlands
phone: 0317-483 060
fax: 0317-483 584
mobile: 06-143 66 783
pieter.neerincx at wur.nl





