[Bioperl-l] GO annotations in BioPerl. A tentative proposal...

Mark Wilkinson mwilkinson@gene.pbi.nrc.ca
Thu, 05 Apr 2001 17:53:45 -0600

> > At 04:17 PM 4/5/2001 , Jeffrey Chang wrote:
> > >I urge you to keep the representation of GO separate from the annotation
> > >on the sequence.  The sequence should contain only a very minimal
> > >reference to an annotation, possibly as a perl pointer or a GO ID.

The Sequence/SeqFeature models already allow you to have sequences with no
annotation, or sequences with minimal "generic" features, or sequences with
full-blown features & subfeatures and annotations up the yin-yang.  I am only
proposing to add a method to a Feature object which would add/return an
(optional) complex & structured annotation object on the Feature (or on any
other BioPerl object which, now or in the future, we might want to annotate...
as Hilmar pointed out, microarray spots are an example of annotatable,
non-Feature objects).
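
To make the shape of the proposal a bit more concrete, here is a rough sketch.
It is written in Python only to keep it short, and every name in it
(StructuredAnnotation, Annotatable, add_annotation, annotations, ArraySpot) is
made up for illustration; none of this is existing BioPerl API.  The GO ID and
the evidence strings are just example values.  The point is only that the
annotation object is optional and that non-Feature objects can carry it too.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructuredAnnotation:
    """Hypothetical container: one GO term plus the evidence gathered for it."""
    go_id: str                                   # e.g. "GO:0003677"
    evidence: List[str] = field(default_factory=list)
    tentative: bool = True                       # not yet sent to a GO curator


class Annotatable:
    """Hypothetical base for anything that may carry annotations."""

    def __init__(self) -> None:
        # Nothing is allocated until an annotation is actually added,
        # so unannotated objects stay small.
        self._annotations: Optional[List[StructuredAnnotation]] = None

    def add_annotation(self, annotation: StructuredAnnotation) -> None:
        if self._annotations is None:
            self._annotations = []
        self._annotations.append(annotation)

    def annotations(self) -> List[StructuredAnnotation]:
        # Always return a list (possibly empty) so callers need no None check.
        return list(self._annotations or [])


class Feature(Annotatable):
    """Stand-in for a sequence feature with a location."""

    def __init__(self, start: int, end: int) -> None:
        super().__init__()
        self.start = start
        self.end = end


class ArraySpot(Annotatable):
    """Stand-in for an annotatable, non-Feature object."""

    def __init__(self, spot_id: str) -> None:
        super().__init__()
        self.spot_id = spot_id


# A feature with no annotation at all.
bare = Feature(1, 300)

# A feature carrying a tentative annotation with locally held evidence.
rich = Feature(500, 1200)
rich.add_annotation(
    StructuredAnnotation("GO:0003677", evidence=["illustrative BLAST hit"])
)

# A microarray spot annotated through the same interface.
spot = ArraySpot("A01")
spot.add_annotation(StructuredAnnotation("GO:0003677"))
```

Here "bare" costs nothing extra, "rich" keeps its evidence alongside the
feature, and "spot" reuses the same two methods without being a Feature.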

I would, on the contrary, strongly disagree that a sequence feature is not the
place to hang the annotation of that feature... or am I misunderstanding you?
Certainly, speaking as an annotator, the *last* thing I need is to have access
to only a minimally annotated sequence.  I need my sequence/feature objects to
be loaded with as much annotation as is available so that I can evaluate it.

> In theory, this is great.  However, if you are annotating 'tentatively'
> and you are not ready to submit an annotation to the GO curator for your
> organism, where do you put the evidence you are assembling?

Exactly... and I suspect that the GO curator would tear her hair out at the
thought of the GO database being used as the primary repository for annotation
information from 'unpolished' genome databases... I might be wrong, but I
think she has a full plate already :-)

Regardless, it is to be expected that people will want to have the annotation
information for their favorite genome sitting in their own genome database, at
which point such annotation objects become necessary.  The GO database and
the genome database serve entirely different functions in my mind.

And at the end of the day, as I said before, all of these things are optional
anyway... so there is no need to worry about your sequence objects suddenly
becoming huge.

Am I completely misunderstanding your concerns?  If so, please set me straight.


Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK