[Bioperl-l] Bio::Ontology::Ontology

Sohel Merchant s-merchant at northwestern.edu
Tue Feb 21 18:47:54 UTC 2006


Hi Hilmar and Chris,
  I have played around a bit using Bio::Annotation::Collection to
capture the headers of an ontology file. It behaves pretty well and
avoids the cycle issue which might arise by using an ontology to describe
the ontology. I have an initial version of a working parser for the OBO
flat file format.

Chris, I was able to model any kind of relationship by using some of the
functionality in Bio::Ontology::SimpleGoEngine, which I had initially
overlooked.

I would like to commit this code to the Bioperl CVS, but I don't believe
I have write access to it. Can I send the code to either of you?

Hilmar, I would like your feedback on the code base and would be happy
to make any changes required before we commit it to the CVS.

Thanks,
Sohel Merchant.
dictyBase

-----Original Message-----
From: drycafe at gmail.com [mailto:drycafe at gmail.com] On Behalf Of Hilmar Lapp
Sent: Monday, February 20, 2006 8:53 PM
To: chris mungall
Cc: Bioperl; Sohel Merchant
Subject: Re: [Bioperl-l] Bio::Ontology::Ontology

On 2/20/06, chris mungall <cjm at fruitfly.org> wrote:
>
> I like the idea of using an ontology to describe the ontology.
>
> Note that the proposed structure:
> OntologyI HAS_A Annotation::OntologyTerm IS_A TermI HAS_A OntologyI
>
> will lead to cycles in the object graph when the metadata ontology
> describes itself.

Yes I know, that's why I didn't want to be too vocal about it ...

>
> actually, I think the ontology module already has object reference
> cycles. TermI->OntologyI->TermI
>
> When I brought this up originally people didn't seem to care much - so
> long as you're only parsing GO then it's not a big issue, people have
> enough memory they won't notice a big chunk of memory that refuses to
> be garbage collected way after it's used.

There is a method that destroys the cycle: $ontology->close()
(this is also an interface method)

Essentially, the cycle is not in OntologyI itself but in OntologyI
HAS-A OntologyEngineI; i.e., the latter holds (may hold) references to
terms which (may) hold a reference to an OntologyI which holds a
reference to the OntologyEngineI.

I say 'may' in parentheses because an implementation may use tricks
like late instantiation, stringified references (handles), and weak
references. It's possible to avoid the cycle altogether using such
tricks but it remains questionable how much this then affects
performance, and how ugly and incomprehensible the code would become.
Since there is the close() method I haven't bothered yet trying a
fully de-cycled implementation.
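
To illustrate the cycle and the two remedies mentioned above (an explicit
close() and weak references), here is a minimal sketch in plain Perl. It
does not use the Bio::Ontology modules; the Demo::* packages, the
add_term() signature, and the run_case() helper are all hypothetical names
made up for this example, and the real implementation may well differ.

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical stand-ins for a term, an engine, and an ontology.
package Demo::Term {
    sub new {
        my ($class, %args) = @_;
        my $self = bless { name => $args{name}, ontology => $args{ontology} }, $class;
        # Remedy 1: a weak back-reference does not keep the ontology alive.
        Scalar::Util::weaken($self->{ontology}) if $args{weak};
        return $self;
    }
}

package Demo::Engine {
    sub new      { return bless { terms => [] }, shift }
    sub add_term { my ($self, $term) = @_; push @{ $self->{terms} }, $term; return }
}

package Demo::Ontology {
    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name}, engine => Demo::Engine->new }, $class;
    }
    sub add_term {
        my ($self, $name, %opt) = @_;
        # ontology -> engine -> term -> ontology: this closes the cycle.
        $self->{engine}->add_term(
            Demo::Term->new(name => $name, ontology => $self, weak => $opt{weak}));
        return;
    }
    # Remedy 2: dropping the engine explicitly breaks the cycle.
    sub close { my $self = shift; delete $self->{engine}; return }
    # Count destructions so the effect is observable.
    sub DESTROY { $main::destroyed++ }
}

# Build an ontology in an inner scope and report how many ontology
# objects were destroyed once that scope has ended.
sub run_case {
    my (%opt) = @_;
    $main::destroyed = 0;
    {
        my $ont = Demo::Ontology->new(name => 'demo');
        $ont->add_term("term$_", weak => $opt{weak}) for 1 .. 3;
        $ont->close if $opt{close};
    }   # $ont goes out of scope here
    return $main::destroyed;
}

printf "strong refs, no close(): %d destroyed\n", run_case();            # 0 (leaked)
printf "strong refs, close():    %d destroyed\n", run_case(close => 1);  # 1
printf "weak back-references:    %d destroyed\n", run_case(weak => 1);   # 1
```

In the first case the ontology stays in memory until the interpreter exits,
because each term still holds a strong reference to it. The sketch does not
measure the performance or readability cost of the weak-reference variant.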

> Of course, if you want to use
> bioperl to cycle though all of OBO + SnoMed + UMLS then it's a
> different story.

Well, if you want to keep all three in memory for some kind of
cross-reasoning then yes, you are in trouble. But if you do one
ontology after another, you'd just have to make sure to call close() on
an ontology once you're done with it.

>
> I think it's best if Sohel concentrates on getting obo.pm working, then
> we can start thinking as a group about the best way to capture ontology
> metadata. This includes metadata on the whole ontology, and metadata on
> the terms (e.g., synonyms).
>
> To what extent are the current modules already in use?

I don't know about others but I use them often.

> I think the object cycle is a serious flaw, will it be possible to fix
> this without a major overhaul?

If I recall correctly, go-perl circumvents this by storing the ontology
of a term as a flat attribute. This also means that when you have a term
alone, you cannot ask for its connected terms. It's been a while, so
Chris, set me straight where this is not true.

It should be possible to come up with an implementation of OntologyI
that for all intents and purposes behaves like a flat scalar giving
the name until you call one of its graph traversal methods. At that
point it would instantiate the engine from persistent storage (file,
or a database connection), or retrieve one from a 'store'. The latter
is I believe what Allen started with the OntologyStore, but again I
would need to check the details.
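
The idea of an ontology object that acts like a plain name until a graph
traversal method is called could look roughly like the sketch below. This
is plain Perl under stated assumptions, not the actual OntologyStore or
OntologyI API: Demo::OntologyStore, Demo::OntologyHandle, child_names(),
and the loader callback are hypothetical names invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical registry that owns the engines, keyed by ontology name.
package Demo::OntologyStore {
    my %loaders;   # ontology name => code ref that builds the engine
    my %engines;   # ontology name => engine, filled in on first use

    sub register {
        my ($class, $name, $loader) = @_;
        $loaders{$name} = $loader;
        return;
    }
    sub engine_for {
        my ($class, $name) = @_;
        my $loader = $loaders{$name}
            or die "no loader registered for '$name'\n";
        $engines{$name} //= $loader->();   # late instantiation
        return $engines{$name};
    }
    sub release { my ($class, $name) = @_; delete $engines{$name}; return }
}

# Hypothetical handle: it stores only the name, so a term holding one
# never references the engine, and no reference cycle is formed.
package Demo::OntologyHandle {
    use overload '""' => sub { $_[0]{name} }, fallback => 1;

    sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub name { return $_[0]{name} }

    # A graph traversal method: the engine is fetched from the store on
    # each call and deliberately not cached inside the handle.
    sub child_names {
        my ($self, $term_name) = @_;
        my $engine = Demo::OntologyStore->engine_for($self->{name});
        my @kids = sort @{ $engine->{children}{$term_name} // [] };
        return @kids;
    }
}

my $loads = 0;
Demo::OntologyStore->register(demo => sub {
    $loads++;   # stands in for reading a file or querying a database
    return { children => { root => [ 'leaf_b', 'leaf_a' ] } };
});

my $ont = Demo::OntologyHandle->new('demo');
print "ontology: $ont\n";                 # stringifies to the name; no load yet
my @kids = $ont->child_names('root');     # first traversal triggers the load
print "children of root: @kids\n";
```

Looking the engine up on every call costs a hash access, but caching it
inside the handle would reintroduce the engine -> term -> ontology ->
engine cycle discussed earlier in the thread.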

    -hilmar

>
>
> On Feb 11, 2006, at 9:10 PM, Hilmar Lapp wrote:
>
> > Sohel, please do keep the discussion on the list, in your own interest
> > as there's a multitude of people who can respond to you.
> >
> > SimpleValue would probably be what I'd use too. As Heikki hinted you
> > might even create an ontology for annotating ontologies, which would
> > allow you to use Annotation::OntologyTerm for annotation, but then
> > there's no qualifier value ...
> >
> > Bioperl 1.5.1 was released last year; please check the website.
> >
> >       -hilmar
> >
> > On Feb 10, 2006, at 3:32 PM, Sohel Merchant wrote:
> >
> >> Hi Hilmar,
> >>   I really like your suggestion of implementing the Bio::AnnotatableI
> >> interface in the Bio::Ontology::Ontology class. I am going to implement
> >> this and play around a little with it. I am planning to use
> >> Bio::Annotation::SimpleValue for annotating the header as it provides a
> >> good way of specifying the tag/value pair. What are your thoughts on
> >> using this?
> >>
> >>   Also, I was wondering if you have any idea about the scheduled date
> >> for the Bioperl 1.5.1 release. I would like to contribute some stuff in
> >> the next release.
> >>
> >> Thanks,
> >> Sohel.
> >>
> >> -----Original Message-----
> >> From: Hilmar Lapp [mailto:hlapp at gmx.net]
> >> Sent: Friday, February 10, 2006 3:40 PM
> >> To: Sohel Merchant
> >> Cc: Bioperl
> >> Subject: Re: Bio::Ontology::Ontology
> >>
> >> Sohel,
> >>
> >> please allow me to copy the list in my response. There are many good
> >> and insightful people on the list who may have something to add or
> >> different ideas.
> >>
> >> I've come across that problem myself, for instance with InterPro.
> >> What I've done so far is simply to stick it unstructured into the
> >> definition slot, which is not helpful if your purpose goes further
> >> than just displaying it in an unstructured fashion.
> >>
> >> I'm not sure you would want to create another class for this (like
> >> AnnotatedOntology). One could make Bio::Ontology::Ontology (i.e., the
> >> implementation, probably not the interface) annotatable (i.e.,
> >> implement Bio::AnnotatableI), which supposedly would be simple to do
> >> (AnnotationCollection is already implemented, you'd just return an
> >> instance of it).
> >>
> >> Even though tag/value pairs sound like a quick and fast way to go, I'm
> >> leaning against it; in essence we're moving away from that elsewhere
> >> (SeqFeatureI) and hence I don't think we should restart it here.
> >>
> >> I'm not giving a definitive answer here, just my (initial) thoughts.
> >> Hope that helps nonetheless. Can you fancy yourself trying the
> >> Annotatable approach and letting us know how it goes?
> >>
> >>      -hilmar
> >>
> >>
> >> On Feb 10, 2006, at 8:39 AM, Sohel Merchant wrote:
> >>
> >>> Hi Hilmar,
> >>> How are you doing? I am Sohel Merchant, a programmer at dictyBase,
> >>> Northwestern University. I am working on a parser for an ontology
> >>> file. I really like the ontology object model which you have
> >>> contributed to Bioperl. I think it's just awesome! One of the things
> >>> which I thought would be great to capture is the ontology headers.
> >>> Right now one can specify only the name and authority information. I
> >>> was wondering if there is any way I could also capture other ontology
> >>> file headers, like the version of the file and the date when that
> >>> ontology file was made. I was thinking of making a header class, or
> >>> alternatively it could go as a hash of values in the
> >>> Bio::Ontology::Ontology class itself. I wanted to know what your
> >>> thoughts are on this.
> >>>
> >>> Thanks,
> >>> Sohel Merchant
> >>> dictyBase
> >>>
> >> --
> >> -------------------------------------------------------------
> >> Hilmar Lapp                            email: lapp at gnf.org
> >> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> >> -------------------------------------------------------------


--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------