[BioSQL-l] RE: gene ontology questions (bug)
Hilmar Lapp
hlapp at gnf.org
Tue Jun 3 10:56:14 EDT 2003
> -----Original Message-----
> From: Marc Colosimo [mailto:mcolosim at brandeis.edu]
> Sent: Tuesday, June 03, 2003 7:36 AM
> To: Hilmar Lapp
> Cc: biosql-l at open-bio.org
> Subject: Re: gene ontology questions (bug)
>
> >
> > You can check by GO_ID whether you actually have the respective
> > term in the database (and if so, whether it has the same name).
>
> There is a bug! But whose does it belong to, I don't know.
>
> bioseqdb=# select term_id, name, identifier from term where name = 'caspase activity';
>  term_id |       name       | identifier
> ---------+------------------+------------
>      681 | caspase activity | GO:0004199
> (1 row)
>
> As you see, caspase activity was entered with a different identifier.
I suspected that, but normally the error message gives a better hint of
it. This is almost certainly a problem with either
a) the input file, in that it mentions the term twice with
different identifiers,
or
b) the bioperl ontology dagflat parser.
Could you please check whether a) is true? If it's not (i.e., the term
'caspase activity' only occurs with one and the same identifier), send
me the file so that I can debug the parser.
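
A rough sketch of how check a) could be scripted, in case it helps. It
assumes each term line of the flat file looks like an optional indent, a
relationship character ($, % or <), the term name, " ; " and then the
primary GO identifier, and that lines starting with "!" are comments. It
only looks at the leading term on each line. The function name and the
sample lines are made up for illustration; only GO:0004199 comes from
the query output above.

```python
import re
from collections import defaultdict

# Assumed line shape: indent, relationship char, term name, " ; ", GO id.
TERM_RE = re.compile(r"^\s*[$%<]\s*(?P<name>.+?)\s*;\s*(?P<id>GO:\d{7})")


def names_with_multiple_ids(lines):
    """Return {term name: sorted identifiers} for names seen with >1 identifier."""
    ids_by_name = defaultdict(set)
    for line in lines:
        if line.startswith("!"):  # comment line, skip it
            continue
        match = TERM_RE.match(line)
        if match:
            ids_by_name[match.group("name")].add(match.group("id"))
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }


# Invented sample lines; GO:9999999 and GO:9999998 are placeholders.
sample = [
    "! a comment line",
    " %caspase activity ; GO:0004199",
    "  %caspase activity ; GO:9999999",
    " %some other term ; GO:9999998",
]
print(names_with_multiple_ids(sample))
```

To run it against a real file, pass the open file handle instead of the
sample list. An empty result would point away from a) and towards the
parser.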
>
> bioseqdb=# \d term
> [snip]
> Indexes: term_pkey primary key btree (term_id),
> term_identifier_key unique btree (identifier),
> term_name_key unique btree (name, ontology_id),
> term_ont btree (ontology_id)
>
> And I think the name needs to be unique! D'oh!
Note that name is unique only within an ontology. I don't think an
ontology that does not guarantee this is a very sane one. GO (and SOFA
as well) does comply with this.
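
To illustrate what the (name, ontology_id) key allows and rejects, here
is a small sketch using an in-memory SQLite table as a stand-in. It is
not the BioSQL schema, only the four columns and the two unique keys
shown in the \d output above; the helper name, the ontology ids and the
identifiers other than GO:0004199 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table: only the columns and unique keys discussed in this thread.
conn.execute(
    """
    CREATE TABLE term (
        term_id     INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        identifier  TEXT UNIQUE,
        ontology_id INTEGER NOT NULL,
        UNIQUE (name, ontology_id)
    )
    """
)


def try_insert(name, identifier, ontology_id):
    """Insert one term; return False if a unique key rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO term (name, identifier, ontology_id) "
                "VALUES (?, ?, ?)",
                (name, identifier, ontology_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("caspase activity", "GO:0004199", 1)
# Same name in a different ontology: allowed by the composite key.
other_ontology = try_insert("caspase activity", "XX:0000001", 2)
# Same name, same ontology, different identifier: rejected.
same_ontology = try_insert("caspase activity", "GO:9999999", 1)
print(first, other_ontology, same_ontology)
```

The third insert is the situation in this thread: a second identifier
arriving for a name that already exists in the same ontology.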
-hilmar