[BioSQL-l] Re: gene ontology questions (bug)

Marc Colosimo mcolosim at brandeis.edu
Mon Jun 2 18:16:32 EDT 2003


On Mon, 2 Jun 2003, Hilmar Lapp wrote:

> 
> On Sunday, June 1, 2003, at 01:46  PM, Marc Colosimo wrote:
> 
> > I have several questions about the GO structure.
> >
> > First, the old ER diagram only has ontology_term.
> 
> That diagram is dated; there is no current ERD. I need to reproduce it, 
> sorry about that. Will get to it today or tomorrow.
> 
> > But the version I have
> > has ontology and ontology_term.
> 
> Then you have mixed versions. There should be no ontology_term table, 
> and indeed the ontology table is new. The name of ontology_term was 
> changed to term. Do you run a recent CVS checkout? Did you try to 
> instantiate over a previous schema?

No! They had cross-references; that was how I found it. I got the latest 
(today's) CVS version and dropped the old one. Do I need the 
accelerators and views?

> 
> > I want to add my own terms, but with so
> > many constraints, I have no clue where to start. Here is a sample
> > autogenerated list of what I want to add:
> >
> > Term ID  Term Name     Frequency
> > 3        reproduction  248
> > 8        thioredoxin   39
> >
> > (This was made by dChip, with Affymetrix CSV files, for those 
> > interested).
> 
> Right now there is no term_qualifier_value table, but there is 
> term_relationship, which is valueless though. However, I guess the 
> frequency is the value to be associated with a chip target, so either a 
> bioentry or seqfeature. For both there are *_qualifier_value tables. If 
> you want better help, you need to be more specific about what you want 
> to represent for which purpose.

If Affymetrix used the correct GO terms, then all I need is the 
function.ontology (I think). I have a list of probe_set_names and 
accession numbers with associated terms. So, I loaded in the first 
part and I wanted to link the terms to their descriptions. Also, 
they have a protein domain ontology. What would that correspond to?

> 
> >
> > Second, I tried to load in stuff (function.ontology) from 
> > geneontology.org
> > using the script load_ontology.pl.

[snip]

> 
> You need to upgrade to the latest schema version. Ontology_term was 
> renamed to term in the Singapore-version (which is essentially the CVS 
> head).

I got today's CVS for bioperl and tried one more time to load it. It still 
dies, but much later. I don't know if this is a real bug in load_ontology 
or in the GO file.

perl load_ontology.pl --dbname bioseqdb --dbuser mcolosim --driver Pg 
~/Affymetrix/function.ontology
Parsing input ...
Loading ontology Gene Ontology:
        ... terms
Could not store GO:0030693 (caspase activity):

------------- EXCEPTION  -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be 
found by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:249
STACK Bio::DB::Persistent::PersistentObject::store 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:266
STACK (eval) load_ontology.pl:489
STACK toplevel load_ontology.pl:471

--------------------------------------
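
Since the failure is reported against a unique key, one way to check whether the GO file itself is at fault is to look for term names that occur with more than one accession. The following is only a rough sketch under two assumptions that this thread does not confirm: that the unique key on the term table involves the term name, and that each goflat line starts with indentation, a relationship marker (%, < or $), the term name, and then " ; GO:nnnnnnn". The function name conflicting_names is made up for illustration.

```python
import re
import sys
from collections import defaultdict

# Assumed goflat layout (not confirmed in this thread): indentation,
# a relationship marker (%, < or $), the term name, then " ; GO:nnnnnnn".
TERM_RE = re.compile(r"^\s*[%<$]\s*(.+?)\s*;\s*(GO:\d{7})")


def conflicting_names(lines):
    """Return {term name: sorted accessions} for names seen with >1 accession."""
    by_name = defaultdict(set)
    for line in lines:
        if line.startswith("!"):  # header/comment lines
            continue
        match = TERM_RE.match(line)
        if match:
            name, accession = match.groups()
            # Compare names case-insensitively; only the first term on a
            # line (the child) is considered, not the parents after it.
            by_name[name.lower()].add(accession)
    return {n: sorted(a) for n, a in by_name.items() if len(a) > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for name, accessions in sorted(conflicting_names(handle).items()):
            print(name, ", ".join(accessions))
```

Run against the ontology file (for example function.ontology), an empty result would point away from a name clash in the file; it would not rule out other causes.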


> 
> > Finally, I was wondering if anyone has written a script to parse the 
> > gene
> > association file type found at
> > <http://www.geneontology.org/doc/GO.annotation.html#file>
> > and for the files at <http://www.geneontology.org/#ontologies>?
> >
> 
> As for the latter, the bioperl ontologyIO parser parses that format 
> (--format goflat). Read the POD documentation of load_ontology.pl, it 
> tells you how to load this. Be sure to look at the --fmtargs option for 
> how to supply the definitions file (the example shows it though).
> 
> To load GO without hassle, obtain the latest bioperl 
> CVS update from either the HEAD or the stable branch (tag branch-1-2) 
> if you have bioperl 1.2.1 installed. There were several bugs in 1.2.1 
> that I had to fix. We'll release bioperl 1.2.2 pretty soon, which will 
> contain the fixes too.
> 
> As for the association file, no. Shouldn't be too hard though to write 
> a quick converter that outputs SQL INSERT statements into the 
> bioentry_qualifier_value table, which you then feed to your SQL shell 
> (psql for instance):

I'll look this over when I get the GO files loaded.
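
For reference, a converter of the kind suggested above could be sketched roughly as follows. This is a minimal sketch, not a tested loader, and it rests on several assumptions: the association file is tab-delimited with comment lines starting with "!", the object ID in column 2, the GO ID in column 5 and the evidence code in column 7; bioentry_qualifier_value has the columns bioentry_id, term_id, value and rank; and bioentry.accession and term.identifier are the right lookup columns. Storing the evidence code as the value and keeping only one row per (ID, GO term) pair are arbitrary choices, and the function names are invented for illustration.

```python
import sys


def sql_quote(text):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def association_to_sql(lines):
    """Yield one INSERT statement per distinct (object ID, GO ID) pair."""
    seen = set()
    for line in lines:
        if not line.strip() or line.startswith("!"):  # blank or comment
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:  # too short to hold an evidence code
            continue
        # Assumed columns: 2 = object ID, 5 = GO ID, 7 = evidence code.
        accession, go_id, evidence = cols[1], cols[4], cols[6]
        if (accession, go_id) in seen:
            continue
        seen.add((accession, go_id))
        # Assumed schema: table and column names are not confirmed here.
        yield (
            "INSERT INTO bioentry_qualifier_value "
            "(bioentry_id, term_id, value, rank) "
            "SELECT b.bioentry_id, t.term_id, %s, 0 "
            "FROM bioentry b, term t "
            "WHERE b.accession = %s AND t.identifier = %s;"
            % (sql_quote(evidence), sql_quote(accession), sql_quote(go_id))
        )


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for statement in association_to_sql(handle):
            print(statement)
```

The output could then be piped into the SQL shell, e.g. `python assoc2sql.py my_associations.txt | psql bioseqdb` (both file names are placeholders). Rows whose accession or GO ID is not already in the database simply insert nothing, because the SELECT returns no rows.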

Thanks,
Marc


