[Bioperl-l] RE: bioperl pipeline picking momentum

Lincoln Stein lstein@cshl.org
Fri, 12 Apr 2002 11:20:04 -0400


I just talked to a couple of people from AVAKI yesterday.  They claim that 
their grid architecture has an advantage over GLOBUS because it incorporates 
a virtual filesystem that grids out the data as well as the CPUs (and has 
better performance than NFS).  Anyone have insight into these claims?

Lincoln

On Friday 12 April 2002 05:45, Larry Ang wrote:
> Elia,
>
> Very comprehensive plan. We will work together.
>
> Larry
>
> -----Original Message-----
> From: Elia Stupka [mailto:elia@fugu-sg.org]
> Sent: Friday, April 12, 2002 5:36 PM
> To: Bioperl
> Cc: Ensembl dev list; Prasanna R Kolatkar; Lai Loong Fong; Larry Ang
> Subject: bioperl pipeline picking momentum
>
>
> Dear all,
>
> just thought I would let you know that the effort to create a
> bioperl-pipeline based on extending and improving the already very capable
> ensembl-pipeline is well underway.
>
> Jer Ming Chia and Shawn Hoon have spent two useful weeks at Hinxton
> discussing with ensemblers the specs of the new pipeline, and have started
> coding it all up.
>
> Moreover, here in Singapore a few institutes have taken an interest in
> it, so we are likely to see strong interest as well as a broader set of
> coders working on the project.
>
> Some of the aims of the new pipeline as compared to the previous one are
> (in order of ease of achievement):
>
> 1) Making the system very flexible in terms of where the input data comes
> from and where the output results should be stored. This used to be all in
> one MySQL db; now it should be able to come from anywhere, provided
> adaptors are in place to communicate with the resource.
> [already underway]
>
> 2) Making the system less LSF dependent. As a first step we are starting to
> play with PBS on both an Alpha cluster and an Itanium cluster, and will code
> the modules needed to make it interact with PBS. PBS is free, and thus if
> we can make it work stably it opens the pipeline up to a much wider
> set of people, even for small multiprocessor systems.
> [will start next week]
>
> 3) Making the pipeline GRID aware. This means making the pipeline code talk
> to GLOBUS and being able to use, within a local pipeline, resources
> (data/CPU) available elsewhere seamlessly, or almost ;)
> [will start on this in a few weeks]
>
> 4) Taking advantage of the GRID-awareness to start reasoning in terms of
> allocating analysis runs according to where they are most suited or where
> there are more resources. In other words, run CPU-intensive jobs on SMP
> systems, small-but-many jobs on MPP systems, and of course allocate "a la
> LSF" according to resources available.
> [wishful thinking?]
>
> Just thought I should let you know :)
>
> Elia
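
For illustration only: aim 2 in the quoted plan (making the pipeline less LSF dependent) might be sketched roughly as below. This is not code from ensembl-pipeline or bioperl-pipeline. The names BatchSystem, LSF, PBS and build_submission are hypothetical, and so are the default queue name "normal" and log file "job.log". The only real interfaces assumed are the "-q" (queue) and "-o" (output file) options of the bsub and qsub commands. The sketch only builds the command line and does not submit anything.

```python
from abc import ABC, abstractmethod
from typing import List


class BatchSystem(ABC):
    """Hypothetical interface the pipeline would code against,
    instead of calling one batch system directly."""

    @abstractmethod
    def submit_command(self, script: str, queue: str, log: str) -> List[str]:
        """Return the argument list that would submit `script`."""


class LSF(BatchSystem):
    def submit_command(self, script: str, queue: str, log: str) -> List[str]:
        # bsub takes the command to run as its trailing argument(s);
        # here the command is assumed to be an executable script.
        return ["bsub", "-q", queue, "-o", log, script]


class PBS(BatchSystem):
    def submit_command(self, script: str, queue: str, log: str) -> List[str]:
        # qsub takes the path of a job script as its trailing argument.
        return ["qsub", "-q", queue, "-o", log, script]


def build_submission(backend: BatchSystem, script: str,
                     queue: str = "normal", log: str = "job.log") -> List[str]:
    """Build (but do not run) the submission command for one job.

    The defaults for `queue` and `log` are made-up placeholders.
    """
    return backend.submit_command(script, queue, log)


if __name__ == "__main__":
    # The calling code is identical for both back ends.
    for backend in (LSF(), PBS()):
        print(" ".join(build_submission(backend, "run_analysis.sh")))
```

In such a design the rest of the pipeline would only see the BatchSystem interface, so supporting another scheduler would mean adding one more subclass rather than touching the job-handling code.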