[Bioperl-l] Bio::Ontology overhaul

Chris Mungall cjm at fruitfly.org
Wed Feb 26 19:07:13 EST 2003


On Wed, 26 Feb 2003, Hilmar Lapp wrote:

> 3) Add a method ontology() to TermI that accepts and returns an object
> implementing Bio::Ontology::OntologyI. Remove method category() (I
> added an implementation to Term.pm that ensures backward compatibility).
>
> This is a controversial thing to do because it almost inevitably
> creates memory cycles (the term points to the ontology which points to
> its terms). Calling OntologyI::close() is required to break the cycle.
> I thought about this for the last couple days and finally decided that
> for usability's sake this is probably the right thing to do
> nevertheless. Here are my reasons.

eek! scary

> 	- the only way to copy the PrimarySeq/Seq/SeqFeatureI pattern loses
> half of its usefulness (you want name *and* query engine accessible for
> it to be really useful)
>
>      - most if not all people are going to use only a few ontologies
> during any given runtime, not hundreds of thousands like for features
> and sequences, and those few ontologies you will want in memory anyway
>
>      - you can break all the cycles by a single call to one designated
> method on an ontology, which IMHO is not asking for that much

are you absolutely sure this will work?

I think this will force a lot of work onto the API user - they will have
to make sure that they no longer have other objects in memory with a
reference path to the ontology being closed.

it also means that someone who mistakenly does this:

foreach (@entities) {
  my $ont = $factory->create_ontology();
  ...
}

instead of this:

my $ont = $factory->create_ontology();
foreach (@entities) {
  ...
}

could easily be hosed
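
To make the worry concrete, here is a minimal sketch with toy packages. ToyOntology and ToyTerm are hypothetical stand-ins I made up for illustration, not the Bio::Ontology classes, and the close() here is only my guess at what breaking the cycle would involve. Each ontology holds its terms and each term holds its ontology, so under Perl's reference counting nothing is freed at the end of an iteration unless close() is called.

```perl
use strict;
use warnings;

my $destroyed = 0;    # counts toy ontologies that were actually freed

package ToyTerm {
    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name}, ontology => $args{ontology} }, $class;
    }
}

package ToyOntology {
    sub new { return bless { terms => [] }, shift }

    sub add_term {
        my ($self, $name) = @_;
        # term -> ontology and ontology -> term: a reference cycle
        my $term = ToyTerm->new(name => $name, ontology => $self);
        push @{ $self->{terms} }, $term;
        return $term;
    }

    # break the cycle by dropping the back-references (assumed behaviour)
    sub close {
        my $self = shift;
        delete $_->{ontology} for @{ $self->{terms} };
        $self->{terms} = [];
        return;
    }

    sub DESTROY { $destroyed++ }
}

# returns how many of the 3 per-iteration ontologies were freed
sub freed_after_loop {
    my ($call_close) = @_;
    my $before = $destroyed;
    for (1 .. 3) {
        my $ont = ToyOntology->new;
        $ont->add_term('some term');
        $ont->close if $call_close;
    }
    return $destroyed - $before;
}

printf "without close(): %d of 3 freed\n", freed_after_loop(0);
printf "with close():    %d of 3 freed\n", freed_after_loop(1);
```

In this toy setup the first call should report 0 freed and the second 3. The leaked ones only go away at interpreter exit.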

we should also think ahead to the future - will your plan work if we
decide to make our ontologies more OWL/OIL/Protege like and add
instances? what about applications that do things such as generating cross
products between terms, creating a potentially huge number of terms? It may
well be doable in a scalable way but I just think adding circular
references makes it harder, and makes it easier for us to get into a
nasty situation

>      - having the ontology() method just return a plain string or a dumb
> namespace object is clunky and has very limited if any usability
> outside of e.g. bioperl-db; in contrast, being able to get at the full
> featured ontology with all the query methods by calling a method on any
> given term is potentially very useful

i disagree - i say make the term and relationship objects dumb and put all
the methods in the ontology class. makes it way simpler.
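
Roughly what I have in mind, as a sketch only. DumbOntology, its method names and the T:1/T:2 ids are all hypothetical, chosen for illustration rather than taken from the existing interfaces. Terms are plain data that carry the ontology *name* as a string, and every query is a method on the ontology, so there is no back-reference and no cycle.

```perl
use strict;
use warnings;

package DumbOntology {
    sub new {
        my ($class, $name) = @_;
        return bless { name => $name, terms => {}, parents => {} }, $class;
    }

    # terms are plain hashes: they know the ontology's name, not the object
    sub add_term {
        my ($self, $id, $label) = @_;
        $self->{terms}{$id} =
            { id => $id, label => $label, ontology => $self->{name} };
        return $self->{terms}{$id};
    }

    # relationships are stored as child id -> list of parent ids
    sub add_relationship {
        my ($self, $child_id, $parent_id) = @_;
        push @{ $self->{parents}{$child_id} }, $parent_id;
        return;
    }

    # queries live on the ontology and accept a term hash or a bare id
    sub get_parent_terms {
        my ($self, $term) = @_;
        my $id = ref $term ? $term->{id} : $term;
        return map { $self->{terms}{$_} } @{ $self->{parents}{$id} || [] };
    }
}

my $ont  = DumbOntology->new('toy');
my $root = $ont->add_term('T:1', 'root');
my $leaf = $ont->add_term('T:2', 'leaf');
$ont->add_relationship('T:2', 'T:1');

my @parents = $ont->get_parent_terms($leaf);
print "parent of $leaf->{label}: $_->{label}\n" for @parents;
```

The cost is that the caller has to keep hold of the ontology to ask questions about a term, which is exactly the usability trade-off being argued about here.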

> 	- as Matt pointed out to me correctly, it is in fact possible to come
> up with a query engine implementation that avoids the cycles altogether
> by constructing term and relationship objects on the fly from raw hash
> or array refs when such objects are requested. Given the design that
> I'm proposing, it is very easy to plug that in once somebody writes it
> (call $ontology->engine($my_engine_without_term_objects)).

so if I ask for the same term 1000 times, there will actually be 1000
objects in memory rather than 1?
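
A sketch of what I mean, under my own assumptions about how such an engine might look. OnTheFlyEngine and both of its methods are hypothetical, and the weak-reference cache is my addition, not something proposed in this thread. get_term() builds a new object from the raw array ref on every call; get_term_cached() hands back the same live object by remembering it through a weakened reference (Scalar::Util::weaken), so the cache itself does not keep terms alive.

```perl
use strict;
use warnings;
use Scalar::Util ();

package OnTheFlyEngine {
    sub new {
        my $class = shift;
        # raw storage: id => [id, label], no term objects held
        return bless { raw => { 'T:1' => [ 'T:1', 'root' ] }, cache => {} }, $class;
    }

    # constructs a fresh term hash on every request
    sub get_term {
        my ($self, $id) = @_;
        my $row = $self->{raw}{$id} or return;
        return { id => $row->[0], label => $row->[1] };
    }

    # returns the live object if one exists, else builds and weakly caches it
    sub get_term_cached {
        my ($self, $id) = @_;
        return $self->{cache}{$id} if $self->{cache}{$id};
        my $term = $self->get_term($id) or return;
        $self->{cache}{$id} = $term;
        Scalar::Util::weaken($self->{cache}{$id});
        return $term;
    }
}

# number of distinct objects among the references passed in
sub count_distinct {
    my %seen;
    $seen{ Scalar::Util::refaddr($_) }++ for @_;
    return scalar keys %seen;
}

my $engine = OnTheFlyEngine->new;
my @fresh  = map { $engine->get_term('T:1') } 1 .. 1000;
my @shared = map { $engine->get_term_cached('T:1') } 1 .. 1000;

printf "on the fly: %d objects\n", count_distinct(@fresh);
printf "weak cache: %d objects\n", count_distinct(@shared);
```

With all 1000 results held at once, the first count should be 1000 and the second 1.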

> 4) Add a method ontology() to RelationshipI. See 3) for the caveats and
> considerations.
>
> 5) Change the OntologyIO parsers to adapt to these changes, and
> implement a method next_ontology(), returning a
> Bio::Ontology::OntologyI instance, or undef at EOI.
>
> </proposal>
>
> I also added a class Bio::Ontology::OntologyStore that acts as a
> singleton and is able to resolve names to ontology objects. I
> originally thought this was going to take care of the cyclic reference
> problem, but it really doesn't. It might still be of use to someone ...
>
> Please share your comments, concerns, suggestions, criticisms :-)

other than the circular ref thing, it all sounds great, go for it!

> Cheers,
>
> 	-hilmar
>


