[Bioperl-pipeline] Gramene BioPipe

Lenny Teytelman teytelma@cshl.edu
Mon, 21 Oct 2002 18:02:02 -0400 (EDT)


I want to thank you guys for surprising me.


I genuinely did not expect to get as far as we did in the setup of the
Gramene pipeline.  Of course, there is still a lot to do, but the pipeline
already does the heart of what I needed it for.  The basic
dataset-to-rice genome alignment (which I do for fifteen different
datasets) is now in the XML file.

We started by breaking up the jobs on a per-contig basis, which was a
step backwards (a job that took 30 minutes on one CPU became a 2-hour
compute on a 24-CPU cluster).  Because of the flexibility of BioPipe,
though, we were able to redefine how the jobs are split up.  The day
before Kiran left, an alignment that takes 15 hours on one machine
finished in 13 minutes in the pipeline.

The credit goes to Kiran for one week of slavery and to the BioPipe team
for creating the system.

I've documented our setup process, so you will have an additional use
case.  Luckily, I basically have all of November to devote to the
pipeline.  My plan is to finalize the XML (at least to the point of
duplicating, from beginning to end, what I do now) and do a complete build
of the Gramene sequence database in the next two to three weeks.  After
that, I hope to
recruit other Steinlab members to use the pipeline with their own XMLs.
Already, there is another, completely distinct, pipeline that needs to be
implemented for Gramene.  So, I expect several more use cases for you guys
in the near future.

Thanks again,

Lenny