[Biojava-l] Gene Ontology

Phillip Lord p.lord@russet.org.uk
03 Dec 2002 12:17:09 +0000


>>>>> "Matthew" == Matthew Pocock <matthew_pocock@yahoo.co.uk> writes:

  >> GO allows multiple parents along both is-a, and part-of. In the
  >> case of GO, there are only two relationship types, but some of
  >> the GOBO ontologies define others. It wouldn't really make sense
  >> to have an API supporting GO, but not other DAG ontologies.

  Matthew> OK - so GO is potentially a DAG, not a tree (for any given
  Matthew> relation). Can you represent part-whole hierarchies
  Matthew> (i.e. partitive with cycles)? 

No. It's acyclic (hence DAG!). 


  Matthew> I can think of a simple meta-data one:

  Matthew> |-> biological_process has-a (biological_process as
  Matthew> sub_process)*

  Matthew> You can remove the apparent cycle here by saying:

  Matthew> |-> sub_process is-a biological_process &&
  Matthew> biological_process has-a (sub_process)*

I'm not sure that I am understanding what you are saying here. I don't
think that this is a cycle, because the arrows are the wrong way
around. 

So for instance

macrophagy has-a autophagic membrane degradation (the latter is a
subprocess of the former), 

autophagic membrane degradation is-a membrane degradation, which is-a
biological process. 
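
To make the direction of the arrows concrete, here is a rough sketch of
that example as a graph where every edge points from a term up to one
of its parents. This is not the DAG-Edit or BioJava API; GoDagSketch,
Rel, Edge, link and hasCycle are all names I have made up for
illustration. A cycle would need some chain of parent links to lead
back to the term it started from, and nothing here does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a DAG whose edges run from child term to parent term.
class GoDagSketch {
    enum Rel { IS_A, PART_OF }

    static final class Edge {
        final Rel rel;
        final String parent;
        Edge(Rel rel, String parent) { this.rel = rel; this.parent = parent; }
    }

    // child term -> edges pointing at its parents (both relation types)
    private final Map<String, List<Edge>> parents = new HashMap<>();

    void link(String child, Rel rel, String parent) {
        parents.computeIfAbsent(child, k -> new ArrayList<>()).add(new Edge(rel, parent));
        // make sure the parent is a known node too, even if it has no parents
        parents.computeIfAbsent(parent, k -> new ArrayList<>());
    }

    // Depth-first search; a term met again while still on the current path is a cycle.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String term : parents.keySet()) {
            if (visit(term, done, onPath)) return true;
        }
        return false;
    }

    private boolean visit(String term, Set<String> done, Set<String> onPath) {
        if (onPath.contains(term)) return true;
        if (done.contains(term)) return false;
        onPath.add(term);
        for (Edge e : parents.get(term)) {
            if (visit(e.parent, done, onPath)) return true;
        }
        onPath.remove(term);
        done.add(term);
        return false;
    }

    // The macrophagy example from this mail, with "has-a" stored as its inverse, part-of.
    static GoDagSketch example() {
        GoDagSketch g = new GoDagSketch();
        g.link("autophagic membrane degradation", Rel.PART_OF, "macrophagy");
        g.link("autophagic membrane degradation", Rel.IS_A, "membrane degradation");
        g.link("membrane degradation", Rel.IS_A, "biological process");
        return g;
    }
}
```

Calling GoDagSketch.example().hasCycle() on this should report no cycle;
you would only get one by adding a link that runs back down, such as
making "biological process" part-of one of its own descendants.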


Actually there is a slight gotcha here. Although, logically, all
concepts should be a kind of something else, in GO this is not
explicitly represented in the DAG. So a "large ribosome subunit" is
part-of a "ribosome", but is not a kind of anything. We call these
"orphan nodes", and they make life slightly confusing, although the GO
people did them for sensible reasons (they didn't want terms like
"ribosome component", which are not, of themselves, useful for
annotating). 

There is a brief discussion of this in one of my papers. 

http://www.cs.man.ac.uk/~phillord/download/publications/bioinf-sim-2002.pdf

It caused us a few problems, so our solution was, er, to fudge it. 
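
If you want to see which terms are affected, a minimal sketch of
spotting orphan nodes might look like the following. This only detects
them; it is not the fudge from the paper. The class and method names
(OrphanNodeSketch, addIsA, addPartOf, orphans) are invented for the
example and do not come from any GO library.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: find terms that are part-of something but is-a nothing.
class OrphanNodeSketch {
    // child term -> parents, kept separately for each relation type
    private final Map<String, Set<String>> isA = new HashMap<>();
    private final Map<String, Set<String>> partOf = new HashMap<>();

    void addIsA(String child, String parent) {
        isA.computeIfAbsent(child, k -> new HashSet<>()).add(parent);
    }

    void addPartOf(String child, String parent) {
        partOf.computeIfAbsent(child, k -> new HashSet<>()).add(parent);
    }

    // A term is an orphan here if it has a part-of parent and no is-a parent.
    Set<String> orphans() {
        Set<String> result = new TreeSet<>();
        for (String term : partOf.keySet()) {
            if (!isA.containsKey(term)) {
                result.add(term);
            }
        }
        return result;
    }

    // The two examples mentioned in this mail.
    static OrphanNodeSketch example() {
        OrphanNodeSketch g = new OrphanNodeSketch();
        g.addPartOf("large ribosome subunit", "ribosome");
        g.addPartOf("autophagic membrane degradation", "macrophagy");
        g.addIsA("autophagic membrane degradation", "membrane degradation");
        return g;
    }
}
```

With the two examples above, only "large ribosome subunit" should come
back from orphans(), since autophagic membrane degradation has an is-a
parent as well as a part-of one.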


  Matthew> but now we have to simultaneously reason over is-a and
  Matthew> part-of/has-a to figure anything out. I know this is being
  Matthew> pedantic, but are there cases when something with an
  Matthew> equivalent structure to this could be/has been built in GO?

You do have to graph over both link types.
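
As an illustration of why, here is a small sketch of collecting a
term's ancestors while following a chosen set of link types. Again the
names (AncestorSketch, Rel, link, ancestors) are assumptions made up
for this mail, not an existing API, and it walks an in-memory map
rather than the GO database.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: breadth-first walk up the graph over selected link types.
class AncestorSketch {
    enum Rel { IS_A, PART_OF }

    // relation type -> (child term -> parent terms)
    private final Map<Rel, Map<String, Set<String>>> parents = new EnumMap<>(Rel.class);

    void link(String child, Rel rel, String parent) {
        parents.computeIfAbsent(rel, r -> new HashMap<>())
               .computeIfAbsent(child, c -> new HashSet<>())
               .add(parent);
    }

    // All terms reachable from 'term' by following only the relations in 'follow'.
    Set<String> ancestors(String term, Set<Rel> follow) {
        Set<String> seen = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(term);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Rel rel : follow) {
                Map<String, Set<String>> byChild =
                        parents.getOrDefault(rel, Collections.emptyMap());
                for (String parent : byChild.getOrDefault(current, Collections.emptySet())) {
                    if (seen.add(parent)) {
                        queue.add(parent); // first visit, so explore its parents too
                    }
                }
            }
        }
        return seen;
    }

    // The macrophagy example from earlier in this mail.
    static AncestorSketch example() {
        AncestorSketch g = new AncestorSketch();
        g.link("autophagic membrane degradation", Rel.PART_OF, "macrophagy");
        g.link("autophagic membrane degradation", Rel.IS_A, "membrane degradation");
        g.link("membrane degradation", Rel.IS_A, "biological process");
        return g;
    }
}
```

Following only IS_A from "autophagic membrane degradation" misses
macrophagy entirely; you only pick it up by passing both link types,
for example EnumSet.allOf(AncestorSketch.Rel.class).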


  >> It's also worth mentioning that the GO database has transitive
  >> closure information pre-calculated, so you need methods for
  >> accessing paths, if you want to take advantage of this. I know from
  >> experience that graphing around GO using the "get_parent" methods
  >> is very slow, over the SQL database. (I won't bore you with the
  >> details, but I couldn't use the path methods). If memory serves
  >> the GO perl API has first class Path objects to allow this sort
  >> of access.

  Matthew> OK - worth knowing. Thanks.

  Matthew> My 2p is that we should suck GO in using whatever API
  Matthew> because it's useful, and do it with beauty in BioJava
  Matthew> V2. If someone posts javadoc for GO APIs, we will plump for
  Matthew> one.

I'd have a look at this lot first. I don't know what the future plans
are for it, but you could ask!

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/geneontology/go-dev/java/dagedit/sources/org/bdgp/apps/dagedit/datamodel/

Cheers

Phil