[Bioperl-l] load_ontology and GO - progress!
Dave Howorth
dhoworth at mrc-lmb.cam.ac.uk
Thu Apr 22 07:39:59 EDT 2004
Hi everybody,
I'm very pleased to report that the upgraded branch-1-4 now passes its
tests to an acceptable/understandable level and has installed.
Furthermore, it has fixed the original trouble with the load_ontology
program.
So big thanks to everybody who has put effort into helping me, fixing
problems and updating docs so quickly.
There's always a gotcha, of course ...
load_ontology is now failing with messages like this:
Loading ontology Gene Ontology:
... terms
-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
("GO:0001529","elastin","OBSOLETE (was not defined before being made
obsolete).","X") FKs (2)
Duplicate entry 'elastin-2' for key 2
---------------------------------------------------
Could not store GO:0001529 (elastin):
------------- EXCEPTION -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be
found by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
/usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store
/usr/local/share/perl/5.6.1/Bio/DB/Persistent/PersistentObject.pm:270
STACK (eval) load_ontology.pl:508
STACK toplevel load_ontology.pl:490
--------------------------------------
What appears to be happening is that the GO.defs file defines two terms
called 'elastin' (0001528 and 0001529), one of which is used in
component.ontology and the other in function.ontology. Both are
declared OBSOLETE.
This isn't an isolated case; it occurs for 'collagen' as well, for
example. Neither is it new in this release of the files; I found it in
an old copy I had.
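To see how widespread the duplication is, something like the following could be run over the definitions file. This is only a rough sketch: it assumes each record in GO.defs carries a `term:` line followed by a `goid:` line, and the function name `find_duplicate_term_names` and the sample text are made up for illustration (only the two elastin IDs come from the file itself).

```python
from collections import defaultdict


def find_duplicate_term_names(defs_text):
    """Return {term name: sorted GO ids} for names shared by more than one id.

    Assumes (not verified against the format document) that each record
    has a 'term:' line followed by a 'goid:' line.
    """
    ids_by_name = defaultdict(set)
    current_name = None
    for raw_line in defs_text.splitlines():
        line = raw_line.strip()
        if line.startswith("term:"):
            # Remember the name until the matching goid line turns up.
            current_name = line[len("term:"):].strip()
        elif line.startswith("goid:") and current_name is not None:
            ids_by_name[current_name].add(line[len("goid:"):].strip())
    # Keep only names that map to two or more distinct ids.
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }


# Illustrative sample only: the definition text and the third record
# are placeholders, not copied from the real GO.defs.
SAMPLE = """\
term: elastin
goid: GO:0001528
definition: (placeholder)

term: elastin
goid: GO:0001529
definition: (placeholder)

term: some other term
goid: GO:9999999
definition: (placeholder)
"""

if __name__ == "__main__":
    for name, ids in find_duplicate_term_names(SAMPLE).items():
        print(name, ", ".join(ids))
```

Pointing the same function at the contents of the real file would list every name that is attached to more than one GO ID, which should show whether the clashes are confined to obsolete terms.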
The GO flat file definition format document isn't very helpful in
describing what should be in the file, so my question is whether this
duplication is a legitimate occurrence in GO?
That is, is there a bug in the GO distribution (duplicate term names) or
a bug in the biosql model (each of the three GO ontologies needs a
separate ontology ID)?
Also of interest would be any ideas for a workaround :)
Cheers, Dave
--
Dave Howorth
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH
01223 252960