[BioSQL-l] RE: BioSQL and Bioperl-db inching towards release

Hilmar Lapp hlapp at gnf.org
Thu May 29 12:10:36 EDT 2003


What I have in mind is a warehouse-like table, like

CREATE TABLE biosignal (
	--
	-- dimensions:
	--
	-- the biological entity dimension may be a feature or a bioentry
	bioentry_id INTEGER REFERENCES bioentry (bioentry_id),
	seqfeature_id INTEGER REFERENCES seqfeature (seqfeature_id),
	-- quantitation type
	quantitation_id INTEGER NOT NULL REFERENCES term (term_id),
	-- bio-experiment, project, or screen name
	project_id INTEGER NOT NULL REFERENCES term (term_id),
	-- sample name
	sample_id INTEGER REFERENCES term (term_id),
	--
	-- numerical value for the quantitation type
	--
	value DOUBLE PRECISION NOT NULL
);

(Technically, this is closer to a snowflake than to a pure star-schema
design.)
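
To illustrate how a table like this might be queried, here is a minimal
sketch using Python's built-in sqlite3 module. The bioentry, seqfeature and
term tables below are cut-down stand-ins rather than the real BioSQL
definitions, and the sample rows, term names and the mean_by_sample helper
are made up for the example.

```python
import sqlite3

# Stand-in schema: only the columns this example needs. The real BioSQL
# bioentry, seqfeature and term tables have more columns than shown here.
SCHEMA = """
CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE seqfeature (seqfeature_id INTEGER PRIMARY KEY);
CREATE TABLE term (term_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE biosignal (
    bioentry_id INTEGER REFERENCES bioentry (bioentry_id),
    seqfeature_id INTEGER REFERENCES seqfeature (seqfeature_id),
    quantitation_id INTEGER NOT NULL REFERENCES term (term_id),
    project_id INTEGER NOT NULL REFERENCES term (term_id),
    sample_id INTEGER REFERENCES term (term_id),
    value DOUBLE PRECISION NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few invented measurements."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO bioentry VALUES (?, ?)",
        [(1, "geneA"), (2, "geneB")],
    )
    conn.executemany(
        "INSERT INTO term VALUES (?, ?)",
        [(10, "signal"), (20, "tissue atlas"), (30, "liver"), (31, "brain")],
    )
    # Every row uses quantitation type 10 and project 20; only the
    # bioentry, the sample and the value vary.
    conn.executemany(
        "INSERT INTO biosignal "
        "(bioentry_id, quantitation_id, project_id, sample_id, value) "
        "VALUES (?, 10, 20, ?, ?)",
        [(1, 30, 100.0), (1, 31, 50.0), (2, 30, 10.0), (2, 31, 30.0)],
    )
    return conn


def mean_by_sample(conn, project):
    """Average the values of one project along the sample dimension."""
    rows = conn.execute(
        """
        SELECT s.name, AVG(b.value)
        FROM biosignal b
        JOIN term s ON s.term_id = b.sample_id
        JOIN term p ON p.term_id = b.project_id
        WHERE p.name = ?
        GROUP BY s.name
        ORDER BY s.name
        """,
        (project,),
    ).fetchall()
    return dict(rows)
```

Calling mean_by_sample(build_demo_db(), "tissue atlas") collapses the
bioentry dimension and leaves one averaged value per sample, which is the
kind of slicing a warehouse-style table is meant to make cheap.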

You don't want to use this for natively hosting your raw values and LIMS
annotation. Instead, you have another 'real' db for expression data etc.,
and once a week or so you repopulate the above table from it.
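
A periodic repopulation could look roughly like the sketch below, again
using Python's sqlite3 module. It assumes the rows have already been
extracted from the 'real' database into plain tuples; the make_warehouse and
repopulate_biosignal names are hypothetical, and the foreign keys are left
out to keep the example short. The delete and the reload run in one
transaction, so a failed load leaves the previous contents in place.

```python
import sqlite3


def make_warehouse():
    """In-memory stand-in for the warehouse (foreign keys omitted)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE biosignal (
            bioentry_id INTEGER,
            quantitation_id INTEGER NOT NULL,
            project_id INTEGER NOT NULL,
            sample_id INTEGER,
            value DOUBLE PRECISION NOT NULL
        )
        """
    )
    return conn


def repopulate_biosignal(warehouse, rows):
    """Replace the whole contents of biosignal with freshly extracted rows.

    rows: iterable of
    (bioentry_id, quantitation_id, project_id, sample_id, value) tuples.
    Returns the number of rows in the table afterwards.
    """
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if it raises, so the old rows survive a
    # failed reload.
    with warehouse:
        warehouse.execute("DELETE FROM biosignal")
        warehouse.executemany(
            "INSERT INTO biosignal "
            "(bioentry_id, quantitation_id, project_id, sample_id, value) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
    return warehouse.execute("SELECT COUNT(*) FROM biosignal").fetchone()[0]
```

A full delete-and-reload is the simplest option for a weekly job; an
incremental load would need some way to tell which source rows have changed.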

You could use the same table and the same principle for other biological
data your lab(s) produce(s) so long as they are numeric.

As for a real native expression database in Biosql, I am happy to see
someone else drive this; as things stand now I won't. I maintain the
points I've expressed earlier though, namely let's not re-invent the
wheel here, and if you do, that wheel in biosql should be generic and
technology agnostic (which is not without significant challenges).

I'm not sure but I believe I failed to list a couple of links to open
source gene expression databases before when you asked for that. Here
goes, but quite frankly I'm surprised that you couldn't find them on the
web (they are all googleable).

RAD (RNA Abundance Database) is part of GUS from UPenn:
http://www.gusdb.org/ (RAD is a hyperlink there).

GeneX (also check out GeneX-lite linked from the page below)
http://genex.ncgr.org/

TIGR's entire Microarray analysis suite, includes database and lots of
beautiful software that supposedly even works (as they use it
themselves)
http://www.tigr.org/software/tm4/

BASE (BioArray Software Environment) (there are local installations of
BASE running at various institutions world-wide already)
http://base.thep.lu.se/

I'm sure none of the above is perfect under every condition, so you may
still reach the conclusion that you want to roll your own. I just
wouldn't without having looked closely at them.

	-hilmar

> -----Original Message-----
> From: Marc Colosimo [mailto:mcolosim at brandeis.edu] 
> Sent: Thursday, May 29, 2003 9:12 AM
> To: biosql-l at open-bio.org
> Subject: [BioSQL-l] RE: BioSQL and Bioperl-db inching towards release
> 
> 
> 
> On Fri, 23 May 2003 11:39:19 -0700  Hilmar Lapp wrote:
> >>
> >>> The next feature I'm going to add to biosql is a table for
> >>> expression data that basically resembles the data cube with
> >>> various dimensions (roughly, bioentry, term [quantitation type],
> >>> term [sample annotation]). No LIMS info (yet) other than sample
> >>> annotation as far as
> >>
> >> Is the plan to add LIMS, etc info at some point?  Fully represent
> >> the MAGE-OM?
> >
> > No, not really. At least as far as I'm concerned. I'm happy though
> > to have somebody else drive this piece to be fully MAGE-OM
> > compatible. I just think that there are enough decent open-source
> > expression schemas out there already who try to do the full monty.
> >
> > There may be more LIMS at some point as far as it is needed for
> > sample annotation, and where that can't be expressed very well
> > using the term table. Our use case here is tissue expression
> > profiles, so the sample annotation will be relatively simple and
> > flat, at least initially.
> >
> > 	-hilmar
> 
> Do you have some sort of ER model in mind for storing 
> expression data? 
> I did start on adding affymetrix tables to store cel data. I posted 
> this awhile back. I really don't want to continue working on it if it 
> will be replaced by something else. So, should I just stop working on 
> stuff to be added and just make something that works for me?
> 
> Marc
> 
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at open-bio.org http://open-bio.org/mailman/listinfo/biosql-l
> 


