[BioSQL-l] more consistency
Hilmar Lapp
hlapp at gnf.org
Wed Mar 12 01:15:11 EST 2003
On Wednesday, March 12, 2003, at 12:51 AM, Yves Bastide wrote:
> By the way, a rationale for the Singapore changes would be great for
> us users.
Aaron has written a nice document describing the schema and some of the
ideas behind it for the audience of a 'small lab' that wants to store
and manage their sequences. I don't think he emphasized the Singapore
changes and the rationale behind them though. Aaron?
> E. g., why the split between ontology and (ontology_)term?
Category being a loop back to term was poor design that just happened
to work nicely despite being poor :)
Ontology is the namespace for a term, which really is not a term, even
though sometimes it resembles one (and you could even think about an
ontology of ontologies). We collectively decided that creating a new
table instead of re-using bionamespace was the 'right' thing to do.
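To illustrate the namespace idea, here is a minimal sketch using Python's built-in sqlite3 module. The column set is deliberately simplified and the ontology names and term name are invented for the example; it is not the actual BioSQL DDL. It only shows that with a separate ontology table the same term name can live in two ontologies, while a duplicate within one ontology is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified, assumed columns; the real schema has more.
conn.executescript("""
CREATE TABLE ontology (
    ontology_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    term_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    ontology_id INTEGER NOT NULL REFERENCES ontology (ontology_id),
    UNIQUE (name, ontology_id)
);
""")

# Two hypothetical ontologies acting as namespaces.
conn.executemany(
    "INSERT INTO ontology (ontology_id, name) VALUES (?, ?)",
    [(1, "Ontology A"), (2, "Ontology B")],
)

# The same term name is fine as long as the ontology differs.
conn.executemany(
    "INSERT INTO term (name, ontology_id) VALUES (?, ?)",
    [("kinase", 1), ("kinase", 2)],
)

# A second 'kinase' inside ontology 1 violates the unique key.
try:
    conn.execute("INSERT INTO term (name, ontology_id) VALUES ('kinase', 1)")
    same_namespace_duplicate_rejected = False
except sqlite3.IntegrityError:
    same_namespace_duplicate_rejected = True

term_count = conn.execute("SELECT COUNT(*) FROM term").fetchone()[0]
```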
> Why do reference use dbxref (as a one to one relationship, so one
> cannot store both Pubmed and Medline ids)? Etc.
>
Good point actually. The reason they now have a FK to dbxref was that
upon discussing what would be the proper generic name for the document
database ID we concluded that in fact this is just a dbxref as any
other, so why not make it one then. I think this was a good decision.
The reason it is also a UK is that document_id (or medline_id) was a UK
before. If you want multiple dbxrefs per reference, you'd need an
association table, which means the UK constraint goes out the window
(i.e., it is not straightforward [=impossible in MySQL] to enforce
that for one medline ID there is only one reference entry). Possible,
but makes me wary.
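As a rough sketch of what the FK-plus-UK combination buys you, the following uses Python's built-in sqlite3 module with a cut-down version of the two tables. The columns are assumptions made for the example (not the real DDL), and the accession is made up. The point is only that a second reference row pointing at the same dbxref is refused by the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified, assumed columns; reference.dbxref_id is both FK and UK.
conn.executescript("""
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    dbname    TEXT NOT NULL,
    accession TEXT NOT NULL,
    version   INTEGER NOT NULL DEFAULT 0,
    UNIQUE (dbname, accession, version)
);
CREATE TABLE reference (
    reference_id INTEGER PRIMARY KEY,
    dbxref_id    INTEGER UNIQUE REFERENCES dbxref (dbxref_id),
    title        TEXT
);
""")

# One document ID (invented accession) and the reference that owns it.
conn.execute(
    "INSERT INTO dbxref (dbxref_id, dbname, accession) "
    "VALUES (1, 'MEDLINE', '12345678')"
)
conn.execute("INSERT INTO reference (dbxref_id, title) VALUES (1, 'First entry')")

# A second reference for the same document ID hits the unique key.
try:
    conn.execute(
        "INSERT INTO reference (dbxref_id, title) VALUES (1, 'Duplicate entry')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

reference_count = conn.execute("SELECT COUNT(*) FROM reference").fetchone()[0]
```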
What the present schema does handle without trouble is associating an
arbitrary, but specified, database with the document ID (and even a
version if you wanted to). So, you can store either medline ID *or*
pubmed ID, and you would easily know which one you chose (which is
different from, and richer than, before).
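A small sketch of that, again in Python with sqlite3 and the same simplified, assumed table layout (not the real DDL). The helper add_reference, the titles, and the accessions are hypothetical and exist only for this example. Each reference carries exactly one dbxref, and a join tells you which database the ID came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, assumed columns, as a stand-in for the real tables.
conn.executescript("""
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    dbname    TEXT NOT NULL,
    accession TEXT NOT NULL,
    version   INTEGER NOT NULL DEFAULT 0,
    UNIQUE (dbname, accession, version)
);
CREATE TABLE reference (
    reference_id INTEGER PRIMARY KEY,
    dbxref_id    INTEGER UNIQUE REFERENCES dbxref (dbxref_id),
    title        TEXT
);
""")


def add_reference(conn, dbname, accession, title):
    """Hypothetical helper: store one reference with its single document ID."""
    cur = conn.execute(
        "INSERT INTO dbxref (dbname, accession) VALUES (?, ?)",
        (dbname, accession),
    )
    conn.execute(
        "INSERT INTO reference (dbxref_id, title) VALUES (?, ?)",
        (cur.lastrowid, title),
    )


# One reference keyed by a medline ID, another by a pubmed ID (invented values).
add_reference(conn, "MEDLINE", "11111111", "Paper A")
add_reference(conn, "PUBMED", "22222222", "Paper B")

# The join recovers which database each document ID belongs to.
rows = conn.execute("""
    SELECT r.title, x.dbname, x.accession
    FROM reference r
    JOIN dbxref x ON x.dbxref_id = r.dbxref_id
    ORDER BY r.reference_id
""").fetchall()
```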
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------