[BioSQL-l] Ontology names

Hilmar Lapp hlapp@gnf.org
Mon, 30 Sep 2002 11:04:31 -0700


Your points are well taken, and I agree with most of them. But given our domain of expertise and our schedule, I just don't feel we should take the lead here ("we" meaning our local group, not the audience of this mailing list).

OTOH we're kind of taking a lead already because we need to get it working somehow in a reasonable way ... So, yes, the category does mix different things together; I just took it from ChrisM's proposal, and the simple self-referencing FK does work for us for the time being (it shouldn't be hard to change it though). I also agree that Ontologies should reference a namespace with authority, which a simple term doesn't. Or could the category be regarded as the namespace?
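
A minimal sketch of the self-referencing FK described above, using Python's sqlite3 module with an in-memory database. Only ontology_term and category_id come from this thread; the term_id and term_name columns and the terms_in() helper are assumptions made for illustration, not the actual BioSQL schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical layout: term_id and term_name are assumed column names.
# category_id points back at ontology_term itself (the self-referencing FK).
conn.execute("""
    CREATE TABLE ontology_term (
        term_id     INTEGER PRIMARY KEY,
        term_name   TEXT NOT NULL,
        category_id INTEGER REFERENCES ontology_term (term_id)
    )
""")

# The ontology name is itself a term, stored here with NULL in category_id.
cur = conn.execute(
    "INSERT INTO ontology_term (term_name, category_id) VALUES (?, NULL)",
    ("SeqFeature Keys",),
)
ontology_id = cur.lastrowid

# Ordinary terms reference the ontology name through category_id.
conn.executemany(
    "INSERT INTO ontology_term (term_name, category_id) VALUES (?, ?)",
    [("CDS", ontology_id), ("mRNA", ontology_id)],
)


def terms_in(conn, ontology_name):
    """Return the names of all terms whose category is the named ontology."""
    rows = conn.execute(
        """
        SELECT t.term_name
        FROM ontology_term t
        JOIN ontology_term o ON t.category_id = o.term_id
        WHERE o.term_name = ?
        ORDER BY t.term_name
        """,
        (ontology_name,),
    )
    return [row[0] for row in rows]


print(terms_in(conn, "SeqFeature Keys"))
```

In this layout the category doubles as a crude namespace: two terms with the same name can coexist as long as their category_id differs, but nothing records who has authority over that namespace.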

As for the schema aspect of these ontologies, yes, that's not dealt with at all right now. It's not exactly the focal point of our use case here, although I feel that ontology maps will hit us sooner rather than later (e.g., which tags denote a dbxref: db_xref, dbxref, DR, ...)
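
To make the dbxref example concrete, here is a rough sketch of the simplest possible ontology map: a set of tag spellings that are all treated as denoting a dbxref. The three spellings are the ones listed above; the names DBXREF_TAG_SYNONYMS and is_dbxref_tag, and the decision to ignore case and surrounding whitespace, are assumptions for illustration only.

```python
# Tag spellings treated as equivalent, taken from the examples above
# and stored lowercased (case-insensitive matching is an assumption).
DBXREF_TAG_SYNONYMS = {"db_xref", "dbxref", "dr"}


def is_dbxref_tag(tag):
    """Return True if the annotation tag is one of the known dbxref spellings."""
    return tag.strip().lower() in DBXREF_TAG_SYNONYMS


for tag in ("db_xref", "DR", "gene"):
    print(tag, is_dbxref_tag(tag))
```

A real map would presumably live in the database next to the terms rather than in code, so that every Bio* package reads the same equivalences.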

Sorry if this was confusing, I'm in a rush... bottom line is I'm glad if other people with more or less different use cases than ours weigh in with their requirements and proposals, and I'm happy to (help) make them all work together if there is a reasonable way.

	-hilmar

> -----Original Message-----
> From: Thomas Down [mailto:td2@sanger.ac.uk]
> Sent: Monday, September 30, 2002 2:46 AM
> To: Hilmar Lapp
> Cc: Biosql; David Block; OBDA
> Subject: Re: [BioSQL-l] Ontology names
> 
> 
> On Fri, Sep 27, 2002 at 11:52:56AM -0700, Hilmar Lapp wrote:
> >
> > Ontology names will likely (but are not required to) have NULL in 
> > category_id.
> > 
> > Is everyone OK with this so far?
> > 
> > In order to get things out by a Bio* package other than the one that
> > put it in, we need to agree on ontology names in the first place 
> > (but also on terms).
> > 
> > I am right now using the following ontology names:
> > 
> > - 'Annotation Tags': the keys (tags, qualifier names) for simple 
> > annotation values (qualifier values)
> > - 'SeqFeature Keys': the keys of seqfeatures ($feat->primary_tag() 
> > slot in bioperl; e.g., the genbank feature key, or swissprot
> > feature key, like 'CDS', 'mRNA', ...)
> > - 'SeqFeature Sources': the source names of seqfeatures 
> > ($feat->source_tag() slot in bioperl; like 'swissprot', 'genscan', 
> > etc).
> > 
> > There is already a pre-defined number of terms for location 
> > properties (min_start, etc), but without an ontology. I'd like to 
> > put them into an ontology and suggest the name 'Location Tags' for 
> > it.
> 
> Sorry to reply a bit late to this thread -- I've been having
> a few problems with e-mails to and from these mailing lists
> (probably DNS-related, and seem to be sorted out now).
> 
> Anyway, to me this all feels like it's trying to mix together
> several different concepts.  Many (though by no means all)
> ontology_terms are really defining properties of objects.
> The keys used in seqfeature_qualifier_value are a very good
> example of this.  Similarly the location qualifiers.
> 
> Looking specifically at properties, they can be defined by:
> 
>   - Their domain -- the class (or classes) of object to which
>     they apply.
> 
>   - Their range -- the set of values which are allowed.
> 
>   - Their cardinality -- e.g., 0..1, exactly 1, 0..infinity
> 
> The domain might just be `seqfeature' or `seqfeature_location'.
> But the interesting cases come when you set more restrictive
> domains (say, "A feature of type SNP must have one or more
> variants").  A more mundane application might be to define
> the required set of qualifiers for a given feature type in
> an EMBL feature table.
> 
> We're now taking ontology_terms somewhat beyond being a simple
> controlled vocabulary, and into schema-land.  I don't know what
> people's feelings are on this.  My understanding is that the
> original plan with ontology_term was to leave it totally opaque,
> then join on some extra tables which included relationship/schema
> information.
> 
> As I understand it (please correct me if I've got the wrong
> end of this), the `category' concept seems to be trying to
> mix up aspects of property domains (for ontology_terms which
> define names of properties) and property ranges (for terms which
> are used as values -- e.g. seqfeature_key).  Is this actually
> a sensible thing to do?
> 
> 
> Hilmar: I know you're on a tight schedule with this.  If adding
> a category field solves your problem, today, then go for it.
> However, it might be better to put this on a separate table,
> for ease of untangling stuff in the future (it also avoids having
> an FK to self, although you still get a circular reference, of
> course).
> 
>      Thomas.
> 
> 
> PS. The way I've discussed properties here is very DAML-esque.
>     At some point in the past, I remember a discussion about doing
>     DAML definitions for the open-bio datamodels.  Did this
>     ever get off the ground?
>