[Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Sendu Bala
bix at sendu.me.uk
Tue Jun 19 07:41:57 UTC 2007

Steve Chervitz wrote:
> Might this be a good opportunity to investigate partitioning
> bioperl-live into sub-repositories? There has been talk in the past of
> defining a set of "core" modules separate from other functionally
> related groups of modules that would be viewed as optional extensions.
> The goal being to help manage growth and simplify releases. There are
> currently 892 modules under Bio/.
>
> In addition to simplifying the migration to SVN, it would also have
> other benefits. Say some new functionality or a slew of fixes were
> added to Bio::Graphics. We could turn around a new Bio::Graphics
> release quickly without having to work on getting various other parts
> up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> Search etc.). Maintenance and releases of the various extensions would
> be more parallelizable, orchestrated by separate ring leaders.
>
> Over time, as a set of functionality matures, it would see fewer
> updates and there would be less of a need for users to
> download/install/test it. This could make bioperl easier to customize,
> extend, and grok in general.
>
> Long term, it should ease development and release cycles

I actually take the opposite view. Breaking things up makes testing and
releases more difficult.

If one person acts as pumpkin for all the sub-parts, his work-load
increases almost linearly with the number of sub-parts. If each sub-part
gets its own pumpkin, where do all these pumpkins come from? It seems to
me that frequently authors will write modules but inevitably their
circumstances change and they can no longer devote the time to look
after them. Having a single pumpkin and 'forcing' him to make sure
everything works (regardless of his personal interest in the module)
seems more reliable than hoping there will be a person interested enough
in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set
of Bioperl modules, they need to be tested and potentially re-released
every time the true core is updated. And since some sub-parts will
interact with other sub-parts, there will need to be coordinated
joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version
they're running. Right now '1.5.2' means a specific thing, and it's
trivial for someone to confirm the same problem by installing 1.5.2.
What happens when users have to list out all the versions of all the
sub-parts they have? Who is going to consistently recreate a user's
hodge-podge of versions in order to confirm a bug? Won't the advice
instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a
single new version number every time one sub-part is updated
(significantly). In which case, why have sub-parts at all? Keeping
things the way they are now means ease of release for the pumpkin and
ease of installation for end-users (only one install command to issue to
CPAN). Having 'true' sub-parts (each with its own pumpkin), in my
fatalistic view, is just going to lead to some useful sub-parts being
abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its
respective author. As I see it, one of the main values of 'Bioperl' is
that it's one (reasonably) consistent collection of modules that lowers
the barrier to entry for new Bioinformaticians, giving them extremely
easy access to a whole host of functionality with a single install.