[Bioperl-l] Bio::Ontology overhaul

Hilmar Lapp hlapp at gnf.org
Thu Feb 27 19:07:11 EST 2003


On Thursday, February 27, 2003, at 03:09  PM, Aaron J Mackey wrote:

>
> Fantastic, code examples ... examining in more detail:
>
> I. obtain an ontology term (and its parent ontology) from a seqfeature:
>
>>   my $seqin = Bio::SeqIO->new( -format => 'embl');
>>   while (my $seq = $seqin->next_seq()) {
>>     foreach my $sf ( $seq->get_all_SeqFeatures() ) {
>>       # unclear whether OntologyTermI's have an as_string method
>>       print "Ooooh. It has ontology term ", $sf->type->as_string,
>>             " from ontology ", $sf->type->ontology->name, "\n";
>>     }
>>   }

Ontology::TermIs have a name (which would be your "as_string"):
$sf->type->name()

>
> Great.  From this perspective, clearly an OntologyTermI has to somehow
> be able to get to its parent OntologyI; Lincoln, how would an
> OntologyHandleI help get around a "backref" to OntologyI from
> OntologyTermI in this case?

I thought about this possibility too but decided against it because it 
creates yet another class. Maybe that's not a problem to anyone ... The 
way this would help is *not* by obviating the memory cycle, but by 
breaking the cycle automatically if the ontology (as represented by the 
"handle") goes out of scope. Since you write it such that the "handle" 
is not part of the cycle, it can be garbage collected. You hook into 
that with a DESTROY on the "handle" and break the cycle in the real 
ontology. $term->ontology() creates the "handle" on the fly (it can't 
keep a pointer because otherwise the handle is part of the cycle).

So, there still is a memory cycle, but instead of requiring the user to 
call $ontology->close(), you trigger that automatically in the handle's 
DESTROY.

What just occurred to me: what if someone never asks for the ontology? 
Then the cycle will not be broken because no handle ever goes out of 
scope. Also, what if a handle goes out of scope but you're not done 
with the ontology yet, i.e., you still hold references to one or more 
of its terms? I'm confused now. Lincoln, what did I miss?
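
To make the handle scheme and both concerns concrete, here is a minimal, self-contained sketch. ToyOntology, ToyTerm and ToyHandle are hypothetical stand-ins invented for illustration; they are not Bio::Ontology classes, and the real design may well differ.

```perl
use strict;
use warnings;

# Hypothetical toy classes; none of this is the Bio::Ontology API.
package ToyOntology {
    sub new {
        my ($class, $name) = @_;
        return bless { name => $name, terms => {} }, $class;
    }
    sub name { $_[0]{name} }
    sub add_term {
        my ($self, $term_name) = @_;
        # The term keeps a strong reference back to the ontology: a cycle.
        my $term = ToyTerm->new($term_name, $self);
        $self->{terms}{$term_name} = $term;
        return $term;
    }
    # Break the cycle by cutting every term's back-reference.
    sub close {
        my $self = shift;
        delete $_->{ontology} for values %{ $self->{terms} };
        $self->{terms} = {};
        return;
    }
    # Record destruction so the script can observe when memory is released.
    sub DESTROY { $main::destroyed{ $_[0]{name} } = 1 }
}

package ToyTerm {
    sub new {
        my ($class, $name, $ontology) = @_;
        return bless { name => $name, ontology => $ontology }, $class;
    }
    sub name { $_[0]{name} }
    # Create the handle on the fly; the term never stores it.
    sub ontology {
        my $self = shift;
        return $self->{ontology} ? ToyHandle->new($self->{ontology}) : undef;
    }
}

package ToyHandle {
    sub new { my ($class, $real) = @_; return bless { real => $real }, $class }
    sub name { $_[0]{real}->name }
    # The handle sits outside the cycle, so Perl destroys it as soon as it
    # goes out of scope; that is the hook for closing the real ontology.
    sub DESTROY {
        my $real = $_[0]{real};
        $real->close if $real;
    }
}

our %destroyed;

# Intended case: a handle is requested and later goes out of scope.
my $term;
{
    my $ont = ToyOntology->new('SO');
    $term = $ont->add_term('gene');
}
my $alive_before = $destroyed{SO} ? 0 : 1;   # the cycle keeps 'SO' alive
{
    my $handle = $term->ontology;
    print "term '", $term->name, "' belongs to ", $handle->name, "\n";
}                                            # handle DESTROY runs here
my $freed_after = $destroyed{SO} ? 1 : 0;

# Second concern: we still hold $term, but its ontology is gone.
my $orphaned = defined $term->ontology ? 0 : 1;

# First concern: nobody ever asks for a handle, so nothing breaks the cycle.
{
    my $ont = ToyOntology->new('GO');
    $ont->add_term('binding');
}
my $leaked = $destroyed{GO} ? 0 : 1;

printf "alive before handle: %d, freed after handle: %d\n",
    $alive_before, $freed_after;
printf "term orphaned: %d, unqueried ontology leaked: %d\n",
    $orphaned, $leaked;
```

In this toy version a chained call such as $term->ontology->name would create a temporary handle that is destroyed at the end of the statement, closing the ontology immediately. That is the second concern in its sharpest form.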

>
> II. Determine the "inheritance" of a term from another term (both
> obtained via seqfeatures)
>
>>   if( $sf->type->is_child_of($anothersf->type) ) {
>>      # do something
>>   }
>
> Presumably, this gets passed off to the OntologyEngineI:
> sub OntologyTermI::is_child_of {
>    my $self = shift;
>    return $self->ontology->engine->is_child_of($self, @_)
> }
>

You can't issue ontology-based queries directly on the term. We made 
the decision a while back to keep terms as lean as possible. 
$term->ontology() is already quite a stretch from that.

So, you'd do $term->ontology->is_child_of($term, $reltype) (which would 
indeed internally delegate to the engine), if that method existed. It 
doesn't :)

What you'd do is

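	# Returns 1 if $subject is a direct child of $query (optionally
	# restricted to $reltype), 0 otherwise. Terms are compared by name.
	sub is_child_of {
		my ($ontology, $subject, $query, $reltype) = @_;
		my @match = grep { $_->name() eq $subject->name() }
		            $ontology->get_child_terms($query, $reltype);
		return @match ? 1 : 0;
	}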

Now that people want to get more serious about ontologies, they may 
want to revisit the present query capabilities defined in 
OntologyEngineI (OntologyI just inherits those). We intentionally kept 
them minimal to allow specialized light-weight implementations.

> Q. Does is_child_of do path traversal?
>

With depth 1, yes. If you're asking for any possible path, then 
substitute get_child_terms() with get_descendant_terms() in the above 
code snippet.
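
The difference between a depth-1 lookup and a full traversal can be sketched with a toy edge list. The functions child_terms() and descendant_terms() below are hypothetical helpers that only mimic the roles of get_child_terms() and get_descendant_terms(); the term names are made up, and the real engine may behave differently.

```perl
use strict;
use warnings;

# Hypothetical toy data: [child, predicate, parent].
my @relationships = (
    [ 'mRNA',       'isa', 'transcript' ],
    [ 'transcript', 'isa', 'feature'    ],
);

# Direct children of $parent (depth 1), in the role of get_child_terms().
sub child_terms {
    my ($parent) = @_;
    return map { $_->[0] } grep { $_->[2] eq $parent } @relationships;
}

# Every term below $parent at any depth, in the role of
# get_descendant_terms(). Breadth-first, each term visited once.
sub descendant_terms {
    my ($parent) = @_;
    my %seen;
    my @queue = child_terms($parent);
    while (defined(my $term = shift @queue)) {
        next if $seen{$term}++;
        push @queue, child_terms($term);
    }
    return sort keys %seen;
}

my @children    = child_terms('feature');
my @descendants = descendant_terms('feature');

print "children of feature:    @children\n";
print "descendants of feature: @descendants\n";
```

Here 'mRNA' shows up only in the descendant list, because it is two steps away from 'feature'.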

> Q. Can is_child_of be given the "predicate" type [ i.e.
> $termA->is_child_of($termB, "isa") ]

Yes.

>
> Q. If "predicate" isn't given, what is assumed?
>

Wildcard.

> Q. What about two terms that are related through differing "predicates"
> (i.e A isa B, B partof C, what is the relationship between A and C?)
>

Depends on the relationship between "isa" and "partof". If they are 
disjoint, then there is no path satisfying an AND of the two 
relationship types.
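
One possible reading of this, sketched with toy data: a traversal restricted to a single predicate finds no path from A to C, whereas a wildcard traversal (or one that allows both predicates) does. The function is_descendant_of() and the triple list are hypothetical and not part of OntologyEngineI.

```perl
use strict;
use warnings;

# Hypothetical toy triples: [subject, predicate, object].
my @triples = (
    [ 'A', 'isa',    'B' ],
    [ 'B', 'partof', 'C' ],
);

# True if $subject reaches $object following only predicates in @allowed.
# An empty @allowed acts as a wildcard.
sub is_descendant_of {
    my ($subject, $object, @allowed) = @_;
    my %ok = map { $_ => 1 } @allowed;
    my %seen;
    my @queue = ($subject);
    while (defined(my $node = shift @queue)) {
        next if $seen{$node}++;
        for my $triple (@triples) {
            my ($s, $p, $o) = @$triple;
            next unless $s eq $node;
            next if @allowed && !$ok{$p};   # predicate not permitted
            return 1 if $o eq $object;
            push @queue, $o;
        }
    }
    return 0;
}

my $any_path    = is_descendant_of('A', 'C');
my $isa_only    = is_descendant_of('A', 'C', 'isa');
my $partof_only = is_descendant_of('A', 'C', 'partof');
my $either      = is_descendant_of('A', 'C', 'isa', 'partof');

print "wildcard: $any_path, isa only: $isa_only, ",
      "partof only: $partof_only, isa or partof: $either\n";
```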

> III. Same as II, but with an "anonymous" term:
>
>>   # this might not be the right class for this
>>   # Hilmar and Chris to agree
>>   $ontology = Bio::Ontology::Factory->new( -ontology => 'SO');
>
>   $gene = $ontology->get_term("gene");
>
>   # ...
>
>   if($sf->type->is_child_of($gene)) {
>     # do something
>   }
>
> BTW, why didn't this start as:
>
> $ontology = new Bio::Ontology 'SO';

So you want this to magically instantiate a fully populated ontology? 
Could be added ("hardcoded" ontologies which are distributed with 
Bioperl), but I'm not sure we should be doing this. Creates a 
maintenance headache. I'd leave ontology distribution to ontology 
maintainers ...

To populate ontologies you either have them in biosql or read them from 
files:

	$stream = Bio::OntologyIO->new(-format => 'so', -file => "so.file");
	$ontology = $stream->next_ontology();

Note as an aside that -format 'so' is not implemented yet. There is 
only 'go' and 'interpro'.

>
> or, if necessary:
>
> $ontology = new Bio::Ontology -name => 'SO',
>                               -factory => "MyFunkyOntologyFactory";
>
>
>
> Q. Should "is_child_of" be better as "is_subject_of" to reflect the
> subject/object/predicate ontology term paradigm?
>
> Q. Does is_subject_of need to know ontology namespace as well as
> applicable predicate term(s)?
>

The relationship type is-a Ontology::TermI and hence has an ontology() 
method, so it's implicit if you constrain by relationship type. Now the 
question is what you do if the relationship type matches by name but not 
by namespace. This is why Matt and Thomas wanted the namespace also on 
the relationship (I believe). You ask for the relationship between 
"SOFA"::"isa" and "GO"::"isa". If it is an "isa" you decide whether you 
trust it based on its namespace (and authority).
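
As a rough sketch of that decision, the check could look like the following. The hash layout, the function reltype_matches(), and the idea of passing a list of trusted namespaces are all assumptions made for illustration, not an existing Bioperl interface.

```perl
use strict;
use warnings;

# Hypothetical: a relationship type is a hash with a name and a namespace.
# Returns 1 if $found has the wanted name and, when a list of trusted
# namespaces is given, comes from one of them.
sub reltype_matches {
    my ($found, $wanted_name, @trusted_namespaces) = @_;
    return 0 unless $found->{name} eq $wanted_name;
    return 1 unless @trusted_namespaces;   # no constraint: name suffices
    return (grep { $_ eq $found->{namespace} } @trusted_namespaces) ? 1 : 0;
}

my $sofa_isa = { name => 'isa', namespace => 'SOFA' };
my $go_isa   = { name => 'isa', namespace => 'GO'   };

my $by_name_only = reltype_matches($go_isa,   'isa');
my $go_trusted   = reltype_matches($go_isa,   'isa', 'SOFA');
my $sofa_trusted = reltype_matches($sofa_isa, 'isa', 'SOFA');

print "name only: $by_name_only, GO isa under SOFA trust: $go_trusted, ",
      "SOFA isa under SOFA trust: $sofa_trusted\n";
```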

I apologize for my verbosity - I'm sitting in a cafe (named Wired, but 
it's not wired).

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


