[BioSQL-l] database extensions

Mon Aug 7 13:45:16 UTC 2006

Hilmar Lapp wrote:
> Hi Angel, sorry for the belated response, I was at BOSC. See my 
> comments below.
>
Yes, I missed my chance to go this year. Maybe next year!
> On Aug 3, 2006, at 2:28 PM, Angel Pizarro wrote:
>>
>> Second, are primary keys up for discussion any time soon? I realize that
>> a lot of external projects rely on this schema, so it has to remain
>> stable, but the inconsistent use of UIDs, compound keys, or even the
>> lack of a key really hinders the use of off-the-shelf ORMs.
>
> Can you elaborate? By now most tables do have a surrogate key. Only 
> those that serve as association tables and aren't referenced 
> themselves (and only very few association tables are referenced by 
> foreign key) do not (they still have a unique key constraint though).
>
> Just to make sure - you're looking at the CVS check-out version, not 
> at 0.1 or something?

I am looking at the CVS 1.0 schema.

By "inconsistent" I mean that certain tables have a single-column PK, 
others have a compound PK spanning multiple columns, and yet others have 
none. Alternate keys are not the issue here. Many of the simple 
off-the-shelf object-relational mapping APIs, particularly those tied to 
the web app suites, assume a single primary key and that every 
persistent object has one. Personally, as a database guy, I really don't 
see a problem with the data model, but it is making my life a little 
more difficult than it needs to be in the app and language binding 
space, particularly in Python.
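
To illustrate the three cases, here is a minimal sketch using Python's 
standard sqlite3 module. The table names (entry, term, entry_term, 
entry_path) and the helper functions are hypothetical stand-ins made up 
for this example; they are not the actual BioSQL tables. The sketch only 
shows how a mapper that expects exactly one key column could classify 
tables, assuming SQLite as the backend.

```python
import sqlite3


def primary_key_columns(conn, table):
    """Return the primary-key column names of `table`, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk is the
    # 1-based position of the column within the primary key, or 0 if the
    # column is not part of it.
    keyed = sorted((row[5], row[1]) for row in rows if row[5] > 0)
    return [name for _, name in keyed]


def classify(conn, table):
    """Label a table by the shape of its primary key."""
    count = len(primary_key_columns(conn, table))
    if count == 0:
        return "none"
    return "single" if count == 1 else "compound"


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables, for illustration only.
    CREATE TABLE entry (entry_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE term  (term_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Association table keyed by a compound primary key.
    CREATE TABLE entry_term (
        entry_id INTEGER NOT NULL REFERENCES entry(entry_id),
        term_id  INTEGER NOT NULL REFERENCES term(term_id),
        rank     INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (entry_id, term_id, rank)
    );

    -- Association table with a unique constraint but no primary key.
    CREATE TABLE entry_path (
        parent_id INTEGER NOT NULL,
        child_id  INTEGER NOT NULL,
        UNIQUE (parent_id, child_id)
    );
    """
)

tables = ("entry", "term", "entry_term", "entry_path")
report = {table: classify(conn, table) for table in tables}
for table, kind in report.items():
    print(f"{table}: {kind}")
```

Only the tables reported as "single" would fit a mapper that insists on 
one key column per persistent object; the other two would need either a 
surrogate key or special handling in the binding layer.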

Lastly, since I do want to make some schema proposals, guidelines on how 
to encode the proposed data models would be nice and would mean less 
work for the reviewers.

My extra needs are:

Experimental results: There is no schema component for storing 
experimental results from high-throughput technologies like microarrays 
and proteomics.

Experimental context: You can't divorce the experimental context from 
the results of microarray, proteomics and other high-throughput 
experimental technologies.

Pathways and networks: Hilmar has provided a start in the previous reply, 
but it may need to be extended to cover kinetic information. I probably 
won't get to this, but I do notice that it is missing.

-angel

>>
>> Third, how does one go about submitting proposals for schema extensions?
>> I want to extend the schema with a few modules, mainly ripped out
>> of GUS and/or chado, as well as add a module for 
>> proteomics data.
>
> You would send those to the list, ideally accompanied by some 
> comments on motivation and why the existing tables can't deal with the 
> data the new entities are supposed to capture. That would give people 
> a chance to comment.
>
> I enthusiastically welcome proposals for additions, especially if those 
> help to promote the utility of BioSQL.