[Biojava-l] Re: Biojava-l digest, Vol 1 #597 - 2 msgs

Tom Oinn tmo@ebi.ac.uk
Wed, 20 Mar 2002 12:55:49 +0000


Chris Mungall wrote:
> 
> On Tue, 19 Mar 2002 biojava-l-request@biojava.org wrote:
> 
> > Message: 2
> > From: "Matthew Pocock" <matthew_pocock@yahoo.co.uk>
> > To: "Tom Oinn" <tmo@ebi.ac.uk>, <biojava-l@biojava.org>
> > Subject: Re: start using biojava Was: [Biojava-l] NetBlast for java?
> > Date: Tue, 19 Mar 2002 12:12:12 -0000
> >
> > > I'd be happy to put our GO browser and client API into biojava, what
> > > would you need for this to work? I'm guessing that a standard schema
> > > (probably the one used by the stanford people) and the appropriate ego
>                                   ^^^^^^^^
> 
> ahem... Berkeley (and Colorado)

Gah! Sorry :) 
 
> > > adapter for it, but then you have everything there already.
> > >
> > > See http://www.ebi.ac.uk/ego for more information.
> > >
> > > Tom
> >
> > This would be great. I presume from the docs that the interfaces are
> > de-coupled from the database schema? The only stumbling block currently is
> > that ego seems to be GPL which would kill our lGPL licensing. I'm sure we
> > can come to some arrangement about that, though.

Okay, the way the system works is that you have factory objects that
produce implementations of the data object interfaces, so yes, it's
decoupled completely from the schema. We have implementations that talk
to our Oracle database, but it wouldn't be too hard to create ones that
use other data sources.
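
As a rough illustration of that factory arrangement (a sketch only, not
the actual ego API): every name below (Term, TermFactory,
InMemoryTermFactory, describe) is hypothetical, and the in-memory map
simply stands in for a database-backed implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class FactorySketch {
    /** Data object interface; client code only ever sees this. */
    interface Term {
        String accession();
        String name();
    }

    /** Factory interface; each data source supplies its own implementation. */
    interface TermFactory {
        /** Returns the term for the accession, or null if it is unknown. */
        Term getTerm(String accession);
    }

    /** Hypothetical in-memory source standing in for a database-backed one. */
    static final class InMemoryTermFactory implements TermFactory {
        private final Map<String, String> names = new HashMap<>();

        void add(String accession, String name) {
            names.put(accession, name);
        }

        @Override
        public Term getTerm(String accession) {
            String name = names.get(accession);
            if (name == null) {
                return null;
            }
            return new Term() {
                public String accession() { return accession; }
                public String name() { return name; }
            };
        }
    }

    /** Client code depends on the interfaces only, never on a schema. */
    static String describe(TermFactory factory, String accession) {
        Term term = factory.getTerm(accession);
        if (term == null) {
            return "unknown term " + accession;
        }
        return term.accession() + " " + term.name();
    }

    public static void main(String[] args) {
        InMemoryTermFactory factory = new InMemoryTermFactory();
        factory.add("GO:0008150", "biological_process"); // sample data
        System.out.println(describe(factory, "GO:0008150"));
    }
}
```

Swapping in another data source would then mean writing another
TermFactory, with describe() left untouched.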

I don't think the licensing is too much of an issue, but there is one
development point that I'm curious about. We're currently developing the
next version of this, which will include similar constructs for InterPro
entries (amongst other things), such that you actually get a 'bag' of
factories. They federate where possible, so that you can pretend you
have all the data in memory while the graph traversals happen in the
background. We could put the current ego code into biojava now, but how
would you suggest further development on it? I suspect it would be best
if we moved completely to using the biojava codebase and simply
incorporated all our live code into it for this project, but that will
take some work, both technical and, *shudder*, political.
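
To make the 'bag' idea a little more concrete, here is a minimal sketch
of one way a set of factories could federate. It is an assumption about
the general shape, not the design we are actually building: the names
(EntryFactory, MapFactory, FactoryBag) and the placeholder identifiers
are all hypothetical, and lookups here are plain first-match delegation
rather than real background graph traversal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class FederationSketch {
    /** Anything that can resolve an identifier to a label. */
    interface EntryFactory {
        Optional<String> lookup(String id);
    }

    /** Hypothetical single data source backed by a map. */
    static final class MapFactory implements EntryFactory {
        private final Map<String, String> data = new HashMap<>();

        MapFactory put(String id, String label) {
            data.put(id, label);
            return this;
        }

        @Override
        public Optional<String> lookup(String id) {
            return Optional.ofNullable(data.get(id));
        }
    }

    /** The 'bag': looks like one factory, delegates to its members in order. */
    static final class FactoryBag implements EntryFactory {
        private final List<EntryFactory> members = new ArrayList<>();

        FactoryBag add(EntryFactory member) {
            members.add(member);
            return this;
        }

        @Override
        public Optional<String> lookup(String id) {
            for (EntryFactory member : members) {
                Optional<String> hit = member.lookup(id);
                if (hit.isPresent()) {
                    return hit; // first source that knows the id wins
                }
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Placeholder identifiers, not real accessions.
        FactoryBag bag = new FactoryBag()
            .add(new MapFactory().put("GO:example", "a GO term"))
            .add(new MapFactory().put("IPR:example", "an InterPro entry"));
        System.out.println(bag.lookup("IPR:example").orElse("not found"));
    }
}
```

The caller only ever talks to the bag, so it doesn't need to know which
underlying source actually holds a given entry.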

> > Looking forward to it,
> >
> > Matthew
> 
> You might also want to check out John Richter's DAG Edit code, see
> geneontology.sf.net
> 
> AFAIK, Ego is wicked for handling the associations between the ontology
> terms and gene products whilst John's code is more geared towards the
> ontologies themselves. John's code is also moving in the direction of a
> full DAML OIL model. John & Tom - don't suppose you've ever discussed
> common interfaces etc?

I think the main difference is that ego was always intended to be a
read-only data model. However, I think there's no reason why the
interfaces couldn't be made to converge. It would probably be a good
idea really, wouldn't it?

Using a generic ontology model would actually probably make more sense,
especially as we need the ability to 'enrich' GO with our additional
linkages both into external data sets (InterPro) and between GO terms by
data mining from the protein mappings.
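
One way to picture that kind of enrichment is a generic graph of typed
links, where the curated ontology edges and the extra linkages sit side
by side. This is only a sketch under that assumption: OntologyGraph,
Link, the link type names and the identifiers are all made up for
illustration and don't correspond to ego or DAG Edit classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EnrichmentSketch {
    /** A typed, directed link between two identifiers. */
    static final class Link {
        final String from;
        final String type;
        final String to;

        Link(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    /** Generic graph: the link type is just data, so new kinds can be added. */
    static final class OntologyGraph {
        private final Map<String, List<Link>> outgoing = new HashMap<>();

        void addLink(String from, String type, String to) {
            outgoing.computeIfAbsent(from, key -> new ArrayList<>())
                    .add(new Link(from, type, to));
        }

        /** Targets reachable from 'from' by one link of the given type. */
        List<String> targets(String from, String type) {
            List<String> result = new ArrayList<>();
            for (Link link : outgoing.getOrDefault(from, Collections.emptyList())) {
                if (link.type.equals(type)) {
                    result.add(link.to);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        OntologyGraph graph = new OntologyGraph();
        // Placeholder identifiers and link types, not real data.
        graph.addLink("TERM:child", "is_a", "TERM:parent");       // curated edge
        graph.addLink("TERM:child", "xref", "IPR:example");       // external link
        graph.addLink("TERM:child", "co_occurs", "TERM:other");   // mined link
        System.out.println(graph.targets("TERM:child", "xref"));
    }
}
```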

Tom