[Bioperl-l] want to bring tools together to help small labs

T.D. Houfek tdhoufek@unity.ncsu.edu
Tue, 30 Oct 2001 16:50:06 -0500 (EST)

Hi all,

Recently Jason Stajich visited our lab and gave us a lot of good information
as well as encouragement to participate here.  But I'm new to this forum,
so please excuse me (yet still tell me) if I stray too far from its proper subject
matter.  Besides whatever my lab puts on our plate at any given moment,
we're chiefly interested in working on freely available open-source software
geared towards the needs of small-to-medium-sized laboratories doing
sequencing.  Smaller labs, with their correspondingly small computer hardware
and bioinformatics salary budgets, have an extremely daunting task on
their hands even if their ambitions for analysis are modest.  Ultimately
there is no cure for this problem, but we'd like to do something to ease
the pain... and I'd greatly appreciate any help anyone can give us.

Since small labs do more EST sequencing than large genomic assemblies, I'd
like to develop a distributable Linux/UNIX web application package that:
	a) facilitates batching of various analyses for ESTs
	b) allows specification of different processing "pipelines" for
	   different sets of incoming data.
	c) stores sequence data, quality data, meta-data, analysis
	   results, etc. in a relational database.
	d) gives easy web browsing access to this data, allowing specification of
	   different levels of access permissions for different data sets.
	e) seriously eases data management burdens, including:
		1) file organization
		2) sequence data quality control
		3) data backups
		4) logging of analysis histories
	f) installs easily
	g) allows almost all ongoing administration to be done by
	   researchers or technicians  (non-power-users) through CGI.
	h) requires only one fairly decent ( <=$5,000 ) computer, but
	   allows a number of ways to distribute the system over more
	   machines (so that a lab can separate the workhorse and the
	   web server, or grow a small compute farm).
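
Items (b) and (e)(4) could be sketched roughly as follows. This is only an
illustration under stated assumptions, not a design: it is in Python purely
for brevity, and the EstRecord type, the step names, and the data set names
are all hypothetical. A real version would call actual analysis tools and
record results and histories in the relational database.

```python
from dataclasses import dataclass, field


@dataclass
class EstRecord:
    """Hypothetical in-memory stand-in for one EST and its analysis history."""
    name: str
    sequence: str
    history: list = field(default_factory=list)  # step names, in order applied


def trim_trailing_n(rec):
    # Toy quality-control step: drop ambiguous bases from the 3' end.
    rec.sequence = rec.sequence.rstrip("Nn")
    return rec


def uppercase(rec):
    # Toy normalization step.
    rec.sequence = rec.sequence.upper()
    return rec


# Registry of available steps (names are made up for this sketch).
STEPS = {"trim_trailing_n": trim_trailing_n, "uppercase": uppercase}

# Each incoming data set is assigned its own ordered list of steps.
PIPELINES = {
    "library_a": ["trim_trailing_n", "uppercase"],
    "library_b": ["uppercase"],
}


def run_pipeline(data_set, records):
    """Apply the steps configured for data_set and log each one per record."""
    for step_name in PIPELINES[data_set]:
        step = STEPS[step_name]
        for rec in records:
            step(rec)
            rec.history.append(step_name)
    return records
```

For example, run_pipeline("library_a", [EstRecord("est1", "acgtnn")]) would
trim and then uppercase the sequence, leaving both step names in the record's
history. Keeping the pipeline definitions as plain data is what would let a
non-power-user edit them through a web form.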

There being no point in reinventing the wheel, I'd like to use BioPerl /
BioJava / etc. wherever I can.  If anyone has any thoughts about how these
toolkits might (or might not) fit into such a scheme, or has helpful
information about what smaller labs they have known might want or need,
I'd be most grateful.

T.D. Houfek

system administrator
NCSU Fungal Genomics Laboratory