[Bioperl-l] new cvs module bioperl-runnable

Catherine Letondal letondal@pasteur.fr
Thu, 20 Jun 2002 10:32:05 +0200


Elia Stupka wrote:
> > There are LOTS of files now in Bio::Tools::Run.  For people who aren't
> > running analysis this is a bit of a bite for the building and downloading
> > and I think it is appropriate for the core of bioperl to be a little
> > lighter.  If the opportunity and (lack of) hassle were there, I would want
> > bioperl-live to become bioperl-core.

Indeed, Pise adds not only 276 files but also 3.8 MB to bioperl-live.
I have of course thought about having fewer files, for instance by not having one
file for each program in Bio/Tools/Run/PiseApplication. How does EMBOSSApplication
handle this? I guess the information from the ACD file has to live somewhere, at least?
The PiseApplication/program.pm files do contain really useful material:
all the information about each parameter, plus its documentation (so that
you can run perldoc Bio::Tools::Run::PiseApplication::program). Hence the 3.8 MB. This
could perhaps be reduced by removing information that is only needed on the server or
in HTML forms.

I do think that a separate bioperl-run is a good idea, though, so that people who do
not need to run analyses can leave it out.

> The whole extra effort of writing stuff in Bio::Tools::Run from our side
> was because we wanted to make sure people could use our work for
> standalone analyses, so I wouldn't like to see them move to
> bioperl-pipeline. I like the bioperl-run CVS module idea, that is a nice
> split. Basically it's important for people to realise that the pipeline is
> workflow management code, and not much else, and it wraps whatever
> databases and run modules one wants to wrap.
> 
> Another side comment is about the Pise run modules. This is in no way a
> criticism, there is lots of useful code, but I wonder if we have just
> "overlapped it" with bioperl run, and whether some rationalisation should
> be done (i.e. what is covered by Pise, what is covered by
> Bio::Tools::Run) 

If Bio::Tools::Run is intended to contain the API for running programs (local or remote),
there is no real difference: both cover the same need.
There is a nice Bio::Tools::Genscan module, but where is the Bio::Tools::Run::Genscan?
Here: Bio::Tools::Run::PiseApplication::genscan.
There is a Bio::Tools::Grail module, and there is Bio::Tools::Run::PiseApplication::grailclnt.
There is no Bio::Tools::Toppred yet (but one is ready at
ftp://ftp.pasteur.fr/pub/GenSoft/unix/protein/toppred/ see ToppredXML.tar.gz); again, it
could be combined with Bio::Tools::Run::PiseApplication::toppred. And so on.

So these modules let people not only run programs without having to install
them locally, but also use the available bioperl parsers, all within a single script.

Another benefit is the short programming time needed for an additional program: depending
on the number of parameters, adding a new program to Pise takes from a few minutes to a
few hours.

However, the main question remains: is bioperl the right place for code that runs
bioinformatics programs? :-)

--
Catherine Letondal -- Pasteur Institute Computing Center