[Bioperl-l] GO ontology browser module available
Mark Wilkinson
mwilkinson@gene.pbi.nrc.ca
Wed, 31 Jan 2001 09:25:51 -0600
Ewan Birney wrote:
> > > Wouldn't it make sense to add it to bioperl-gui?
> > >
> > > Hilmar
> > >
> > Inasmuch as it is completely separate from SeqCanvas, and we are still
> > thinking bioperl-gui=SeqCanvas, no; but since bioperl-gui could be greater
> > than SeqCanvas, maybe. Mark? I think it would be okay.
>
> Sounds like the right place to me....
Indeed, that was where I intended to put it once it was a little more
"polished"... I am just hesitant to use the BioPerl CVS repository to store my
half-baked code.
There are several things which "don't work right" (tm). I think a lot of this
has to do with the fact that I cannot get my hands on the GO.dtd. It isn't
available on the GO website, although all of the XML files are (and those same
XML files reference the DTD). Nor have I received a response to inquiries sent
to the consortium e-mail address.
The consequence is that XML::Parser doesn't know what to do with the HTML-like
formatting tags that they are using in some of their "free text", and in some
cases tries to treat them as sub-level tags (for example, what should be a
subscript or superscript will become a sub-element of the preceding word, so
Carbon<down>14</down> parses as $GO->{Carbon}->{14}... which is ridiculous of
course....). In addition they use HTML designations for the Greek alpha, beta,
gamma, and so on, preceded by an ampersand and ending with a semicolon. These
cannot be parsed by XML::Parser *at all* unless it is specifically told what
these entities stand for... which requires a DTD.... which I don't have.
So, GO_Browser (for the time being) hacks away at the XML in its first parsing
pass, replacing these tags and entities with things that will not break
XML::Parser, and then reads from this hacked data. As a result, what you get is
not "strict" GO ontology, but a slightly modified version of the same.... which
effectively defeats the purpose of GO, which is that everyone should use a
consensus nomenclature. :-(
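To give an idea of what such a first pass could look like, here is a minimal
sketch. It is an illustration only, not the actual GO_Browser code: the sub
name sanitize_go_text, the <up> tag, the entity table, and the "_" / "^"
replacement markers are all assumptions made up for the example. Only the
<down> tag and the alpha/beta/gamma entities are taken from the description
above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical entity table: plain-text stand-ins for the Greek-letter
# entities mentioned above. A real table would need every entity in the data.
my %ENTITY = (
    alpha => 'alpha',
    beta  => 'beta',
    gamma => 'gamma',
);

# Flatten inline formatting tags and named entities so that a parser
# without the DTD sees only ordinary character data.
sub sanitize_go_text {
    my ($xml) = @_;

    # Carbon<down>14</down> becomes Carbon_14 (subscript marker is assumed)
    $xml =~ s{<down>(.*?)</down>}{_$1}gs;

    # <up> is an assumed name for the superscript tag
    $xml =~ s{<up>(.*?)</up>}{^$1}gs;

    # &beta; becomes beta; entities not in the table (e.g. &amp;) are kept
    $xml =~ s{&(\w+);}{ exists $ENTITY{$1} ? $ENTITY{$1} : "&$1;" }ge;

    return $xml;
}

print sanitize_go_text(
    '<name>Carbon<down>14</down> &beta;-oxidation &amp; more</name>'
), "\n";
# prints: <name>Carbon_14 beta-oxidation &amp; more</name>
```

The obvious cost is the one already described: the text that comes out is no
longer the exact GO wording, so any such mapping would have to be undone (or at
least documented) before the terms are compared against the official ontology.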
In any case, after all that griping, I am perfectly willing to cvs add this
module to bioperl-gui, so long as I am not judged too harshly by it - I know it's
a hack!! :-)
I'll get on to that later this afternoon.
b.t.w. If anyone can assist me in getting ahold of a GO.dtd please speak up! It
would make my miserable life a bit brighter!!
--
Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK
Canada