[Bioperl-l] bioperl-run; size/complexity of bioperl for 1.2

Ewan Birney birney@ebi.ac.uk
Fri, 22 Nov 2002 22:15:18 +0000 (GMT)


We are starting to run into a problem due to the sheer size of Bioperl.
We have 500-odd modules in bioperl-live (perhaps better thought of as
bioperl-core), but some of the first "use case" problems, like running
BLAST, we have quite sensibly separated out into bioperl-run, which itself
has some 306 modules. In bioperl-run we hope to put lots of generic run
functionality; not least, it is the bridge between biopipe and the core
(or, in Ensembl speak, it is the runnables), and we'd probably like to put
generic job control, primitive queue management and similar things in there.

So... separate cvs modules - good. But... for new users... not having
"BLAST a sequence" in the first thing they download - Bad.


What we are struggling with is that the way we have logically divided the
code cuts across the "starting functionality" set that a new user needs.
This, I am sure, is something many projects have faced before. Does anyone
know how they square this circle? Does anyone square this circle?


More practically/importantly, for bioperl-1.2, do we:


  (a) distribute bioperl-1.2 and bioperl-run-1.2 and say "if you want to get
remote BLAST parsing, you have to download and install both" (I don't like
this - new users are getting freaked out enough just by installing one of
these beasts)

  (b) Have a bioperl-all-1.2.tar.gz, which is everything, in which case:
    - how is it structured internally?
    - do we do this with cvs aliases or scripts?
    - does bioperl-db come in? bioperl-ext? Oh... vey...

  (c) Have bioperl-1.2 actually be "starter-pack bioperl", which is a
merge-and-prune of bioperl-core and bioperl-run (and perhaps others), and
then distribute bioperl-live as bioperl-core-1.2.tar.gz, bioperl-run as
bioperl-run-1.2.tar.gz, etc.



Any ideas? I sort of favour (c) and am happy to write the necessary
scripts and/or learn deep cvs aliasing magic for this.
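
To make the "merge-and-prune" idea in (c) concrete, here is a rough sketch
of what such a script could do. It is only an illustration under stated
assumptions, not a proposal for the actual tooling: it is written in Python
for brevity, the function name build_starter_pack is made up, and it assumes
the starter pack is defined by a whitelist of glob patterns applied to
several local checkouts.

```python
import fnmatch
import shutil
from pathlib import Path


def build_starter_pack(source_trees, keep_patterns, dest):
    """Merge several checkouts into dest, keeping only whitelisted files.

    source_trees:  checkout directories in priority order (earlier wins).
    keep_patterns: glob patterns matched against each file's path relative
                   to its tree (note: fnmatch lets '*' match '/' as well).
    Returns the sorted list of relative paths that were copied.
    """
    dest = Path(dest)
    copied = []
    for tree in map(Path, source_trees):
        for path in sorted(tree.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(tree).as_posix()
            # Prune: skip anything that is not on the starter-pack whitelist.
            if not any(fnmatch.fnmatch(rel, pat) for pat in keep_patterns):
                continue
            target = dest / rel
            # Merge: the first tree to supply a given path wins.
            if target.exists():
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied.append(rel)
    return sorted(copied)
```

A whitelist (rather than a list of things to remove) means a newly added
module stays out of the starter pack until someone deliberately includes it,
which seems the safer default for keeping the first download small.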



Basically the aim is to keep the learning curve as-flat-as-possible for
newbies without having
everything-in-one-cvs-module-and-everything-a-function-in-one-file for
developers.



The eternal problem for software engineers, I am sure. Any thoughts out
there?


e.