[BioSQL-l] Affymetrix SQL for PostgreSQL

Hilmar Lapp hlapp at gnf.org
Thu May 1 15:20:08 EDT 2003


Sounds great. Here are a few comments, for my $0.02 ...

There are probably as many expression data schemas out there as labs 
hosting expression data. There aren't that many big efforts attempting 
to generalize, but there are some (GEO, ArrayExpress, GeneX, 
RAD, SMD, and I'm sure a couple more).

If gene expression tables are to go into the 'official' BioSQL (everyone 
can - and many will - have his/her own build, extended or whatever), a 
design that attempts to be generic and technology-agnostic would be most 
attractive to me.

Because gene expression has never been within the scope of BioSQL so far, 
I'd prefer to take as much advantage of existing open-source schemas as 
possible, since then the reality check has already happened and the 
software support may come with it.

Lately GMOD/Chado faced a similar situation, and Allen, who I believe 
took the lead on that project, settled on integrating the respective 
parts of GUS/RAD.

Allen, how did that work out? Could we just build on your work and RAD?

Marc, what made you decide to disregard the big expression schemas? (No 
offense whatsoever, I'm just curious.)

The way I could envision a different design of a gene expression model 
in BioSQL is as a warehouse star-schema, where there'd be essentially 
one (or very few) analytical data tables, and all the rest is hosted by 
the existing biosql tables (i.e., mostly the term table). It would be 
understood then that people would host their expression data in another 
schema, and the biosql table(s) would be used as a warehouse only.
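
To make the star-schema idea a bit more concrete, here is a minimal sketch, 
not a proposal for actual BioSQL DDL. It uses Python's sqlite3 module in 
place of PostgreSQL so that it runs standalone. The `term` and `bioentry` 
tables are heavily simplified stand-ins for the BioSQL tables of the same 
name, and the `expression_fact` table with all of its columns is purely 
hypothetical.

```python
import sqlite3


def build_warehouse(conn):
    """Create simplified dimension tables and one hypothetical fact table."""
    conn.executescript("""
        -- Simplified stand-in for the BioSQL term table (assumption).
        CREATE TABLE term (
            term_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE
        );
        -- Simplified stand-in for the BioSQL bioentry table (assumption).
        CREATE TABLE bioentry (
            bioentry_id INTEGER PRIMARY KEY,
            accession   TEXT NOT NULL UNIQUE
        );
        -- Hypothetical single analytical (fact) table of the star schema.
        CREATE TABLE expression_fact (
            bioentry_id     INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
            sample_term_id  INTEGER NOT NULL REFERENCES term(term_id),
            measure_term_id INTEGER NOT NULL REFERENCES term(term_id),
            value           REAL NOT NULL,
            PRIMARY KEY (bioentry_id, sample_term_id, measure_term_id)
        );
    """)


def get_term_id(conn, name):
    """Return the id of a term, creating the term if it does not exist."""
    conn.execute("INSERT OR IGNORE INTO term (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT term_id FROM term WHERE name = ?", (name,)).fetchone()
    return row[0]


def get_bioentry_id(conn, accession):
    """Return the id of a bioentry, creating it if it does not exist."""
    conn.execute(
        "INSERT OR IGNORE INTO bioentry (accession) VALUES (?)", (accession,))
    row = conn.execute(
        "SELECT bioentry_id FROM bioentry WHERE accession = ?",
        (accession,)).fetchone()
    return row[0]


def add_fact(conn, accession, sample, measure, value):
    """Store one measurement; everything descriptive lives in the dimensions."""
    conn.execute(
        "INSERT INTO expression_fact VALUES (?, ?, ?, ?)",
        (get_bioentry_id(conn, accession), get_term_id(conn, sample),
         get_term_id(conn, measure), value))


def mean_by_sample(conn, measure):
    """Typical warehouse query: average one measure per sample term."""
    rows = conn.execute("""
        SELECT s.name, AVG(f.value)
        FROM expression_fact f
        JOIN term s ON s.term_id = f.sample_term_id
        JOIN term m ON m.term_id = f.measure_term_id
        WHERE m.name = ?
        GROUP BY s.name
        ORDER BY s.name
    """, (measure,)).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
add_fact(conn, "GENE_A", "liver", "signal", 100.0)
add_fact(conn, "GENE_B", "liver", "signal", 300.0)
add_fact(conn, "GENE_A", "brain", "signal", 50.0)
result = mean_by_sample(conn, "signal")
```

The point of the sketch is only that samples and measurement types are 
ordinary terms, so the fact table is the one new table; the operational 
expression data would still live in whatever schema a lab already uses.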

	-hilmar

On Thursday, May 1, 2003, at 12:08  PM, Marc Colosimo wrote:

>
> Since I couldn't easily find a good schema, I made my own based on
> Affymetrix's GATC schema. My hope is that as I develop it, it will
> use parts of BioSQL to handle the non-array stuff (taxon, sequence
> databases, etc.). I only have a few tables made and they are not
> normalized (one, I think, is actually best left denormalized). Oh, I am
> keeping MIAME stuff in mind.
>
> I have one script that is almost finished that loads in CEL files. I
> just have a few complex regexes to write/debug, and need to add support
> for bulk loading on a local machine (piping it to psql). Now that I have
> played around with DBI, loading CDF files is next.
>
> If people are interested in the code to try it out, let me know.
>
> Marc
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


