[Bioperl-l] OntologyTermI

Lincoln Stein lstein@cshl.org
Wed, 28 Aug 2002 14:49:45 -0400


I believe that the idea of the term should be separated from the idea of a 
graph of terms.  I also think that the idea of an association between a term 
and a set of biological objects should be kept separate.  

I'd propose something like this:

	Bio::Ontology::TermI;   # why repeat the word Ontology?

	
		Encapsulates the term ID, the term name, the term definition, the term
		aliases, and possibly the "obsolete" tag.

	Bio::Ontology::Graph::NodeI;

		A node in a graph.

	Bio::Ontology::Graph::DAGI;

		Encapsulates the graph traversal methods.

	Bio::Ontology::Term::AssociationI;
	
		Maps a term to a set of biological objects.

The nice thing about separating the Term from the DAG is that you can then 
reuse the same term in different types of graphs, or chuck the graph entirely 
without scrambling the meaning of the term.
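
To make the separation concrete, here is a rough sketch, not a worked-out 
design.  The My::* packages, their method names, and the X:000n IDs are all 
made up for illustration; they only stand in for implementations of the 
interfaces proposed above.  The term knows nothing about the graph, the graph 
holds only parent/child links between term IDs, and the association points at 
a term plus a list of objects.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an implementation of Bio::Ontology::TermI.
# It holds only what the term itself means; no graph information.
package My::Term;
sub new {
    my ($class, %args) = @_;
    my $self = {
        id         => $args{id},
        name       => $args{name},
        definition => $args{definition},
        aliases    => $args{aliases} || [],
        obsolete   => $args{obsolete} ? 1 : 0,
    };
    return bless $self, $class;
}
sub id          { $_[0]{id} }
sub name        { $_[0]{name} }
sub is_obsolete { $_[0]{obsolete} }

# Hypothetical stand-in for Bio::Ontology::Graph::DAGI.
# It stores child-to-parent links by term ID and does the traversal.
package My::DAG;
sub new {
    my $class = shift;
    return bless { parents => {}, terms => {} }, $class;
}
sub add_edge {
    my ($self, $child, $parent) = @_;
    $self->{terms}{ $_->id } = $_ for $child, $parent;
    push @{ $self->{parents}{ $child->id } }, $parent->id;
    return;
}
sub ancestors {
    my ($self, $term) = @_;
    my %seen;
    my @queue = @{ $self->{parents}{ $term->id } || [] };
    while (defined(my $id = shift @queue)) {
        next if $seen{$id}++;    # a node can be reached by several paths
        push @queue, @{ $self->{parents}{$id} || [] };
    }
    return map { $self->{terms}{$_} } sort keys %seen;
}

# Hypothetical stand-in for Bio::Ontology::Term::AssociationI.
# It maps one term to a set of biological objects (plain strings here).
package My::Association;
sub new {
    my ($class, %args) = @_;
    return bless { term => $args{term}, objects => $args{objects} || [] }, $class;
}
sub term    { $_[0]{term} }
sub objects { @{ $_[0]{objects} } }

package main;

my $root = My::Term->new(id => 'X:0001', name => 'root term');
my $mid  = My::Term->new(id => 'X:0002', name => 'middle term');
my $leaf = My::Term->new(id => 'X:0003', name => 'leaf term',
                         aliases => ['leaf synonym']);

my $dag = My::DAG->new;
$dag->add_edge($mid,  $root);
$dag->add_edge($leaf, $mid);
$dag->add_edge($leaf, $root);    # two parents: a DAG, not a tree

my @ancestor_ids = map { $_->id } $dag->ancestors($leaf);
my $assoc = My::Association->new(term => $leaf, objects => ['geneA', 'geneB']);

print join(',', @ancestor_ids), "\n";
```

Throwing away $dag, or putting the same three terms into a second graph with 
different edges, leaves $leaf and $assoc untouched.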

Lincoln

On Wednesday 28 August 2002 12:53 pm, Hilmar Lapp wrote:
> (Sorry if you get this twice. Somewhere in the chain of smtp exchanges,
> this disappeared.)
>
> Hi all,
>
> we're going to need an Ontology interface and parsers for different
> formats pretty soon as we want to bring GO and other ontologies into
> Biosql. Ewan even put Ontology support on the road map for 1.2, so
> it may be the right time to join forces here.
>
> Our preliminary picture here so far is that we are going to need a
> basic interface describing an ontology entry conceptually, which is
> then realized by different implementations. To give it a name, say
> Bio::OntologyTermI, with implementations living in Bio::Ontology:
>
> 	Bio::Ontology::OntologyTerm  # base implementation,
>                                   # is-a Bio::OntologyTermI
> 	Bio::Ontology::GOTerm        # is-a Bio::Ontology::OntologyTerm
>      ... etc
>
> We are looking at InterPro as in fact being another ontology, so in
> this scheme there would also be
>
> 	Bio::Ontology::InterPro
>
> Now this sketchy picture doesn't pay a lot of attention to
> ontologies being graphs, and looks at them from the use-case point
> of view rather than the computer science abstraction view point.
>
> The GO perl API in GO::Model::* in contrast lays out and implements
> the graph model. (Cool!)
>
> Does the simple sketch above make any sense? Is it going to be
> useful and appropriate? Would copying all methods from
> GO::Model::Term into OntologyTermI provide for a good start?
>
> To me it seems porting over GO::Model to Bioperl should be a pretty
> straightforward process. Or should we prefer not to port it over
> but instead keep an external dependency on the GO perl API?
>
> We'll also need a streaming IO. Again, the GO parser already exists
> (for the XML version of the dump too?). Peter on our end is going to
> add one for InterPro unless someone can point us at something we can
> steal for that purpose (which would be great). The interface I'd
> suggest should resemble the other streaming interfaces in Bioperl,
> e.g.
>
> 	package Bio::OntologyIO;
>
> 	# returns a Bio::OntologyTermI object
> 	sub next_term {
> 	}
>
> 	# serializes one or more Bio::OntologyTermI objects
> 	sub write_term {
> 	}
>
> and drivers in Bio::OntologyIO::*.
>
> Again, does this make any sense? I'm unsure how compatible the input
> being a graph is with a streaming next_XXX() kind of thing. I'm also
> wondering how a streaming interface can be plugged into the current
> GO::Parser/GO::Builder framework, without reading the entire file
> up-front. Would flatfile parsing need the XS extension in C as
> stated in GO::AppHandle? Any advice from the experts much
> appreciated ...
>
> Ideally the groundwork for this can be steered by someone other than
> us, as we are clearly only beginners in this field. Chris? We'll
> just need something working pretty soon ...
>
> 	-hilmar