[Bioperl-l] Splitting Bioperl
Chris Fields
cjfields at uiuc.edu
Wed Jul 4 14:53:45 UTC 2007
On Jul 4, 2007, at 5:00 AM, Sendu Bala wrote:
> To summarise some previous threads:
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15338/focus=15409
>
> # Bioperl is currently one monolithic distribution of ~900 modules
> # There is some desire to split it up into smaller functional groups
> # There are some problems with that proposal
> # An extreme variant of that proposal is to make the groups individual
> modules
>
>
> Following this discussion:
> http://www.nntp.perl.org/group/perl.modules/2007/07/msg55160.html
> (especially Adam Kennedy's postings of 4/07, soon to appear in that
> archive), the extreme variant doesn't seem like a good idea.
brian d foy made some sound arguments against it as well.
> I'm now suggesting that Steve's original split idea, as
> modified/expanded by Adam's driver and other ideas, is the best
> choice.
> The problems I previously identified can be solved in the same way they
> were solved in my extreme variant: the splits are done by Build.PL
> automation working on a single repository/code-base, not by splitting
> things up at the repository level.
>
> As I see it, the way forward now is for someone interested enough to
> decide on the specifics of how things will be split and offer them up to
> the group for discussion. I don't mean vague possibilities of what might
> work as a split, but rather some real thought should go into it to make
> sure the split makes sense and will actually work in practice.
We've already identified a few candidate groups (SearchIO, Tools,
GBrowse-related, etc.).
...
> If there isn't sufficient interest to make this happen, I don't see that
> as a terrible thing. There are benefits to keeping Bioperl monolithic,
> and some of the problems (eg. lack of updates) can be solved without
> changing its nature.
If so, proposals that solve this problem need to be made as well.
If we stay monolithic, then here's mine: we start having fixed,
regularly timed dev releases like Parrot, monthly or bimonthly (quite
common on CPAN), with brief release reports on which bugs have been
fixed, what code has been added, and so on. Not every bug has to be
fixed per dev release; if that were true there would never be releases
for some of the XML parser packages. No RCs for dev releases (it's a
dev release!). These would be 1.x.y. We can then, every once in a
while, have a bug-squashing session, hackathon, etc., and have a
regular non-dev release (1.x) that all core devs accept and that
passes a particular milestone.
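To make the numbering concrete, here is a minimal sketch (Python, purely
illustrative) of telling the two release kinds apart. It assumes dev
releases always carry three numeric parts (1.x.y) and non-dev releases
exactly two (1.x); the function name release_kind is hypothetical and
not part of any existing tooling.

```python
import re

# Assumption: versions are purely numeric, "1.x" (non-dev) or "1.x.y" (dev).
_VERSION = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?$")


def release_kind(version):
    """Return "dev" for 1.x.y style versions and "stable" for 1.x style."""
    match = _VERSION.match(version)
    if match is None:
        # Anything else (e.g. RC suffixes) is outside this simple scheme.
        raise ValueError("unrecognised version string: %r" % version)
    # A third numeric component marks a dev release.
    return "dev" if match.group(3) is not None else "stable"
```

Under this scheme an RC-style string is simply rejected, which matches
the "no RCs for dev releases" rule.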
As for the advantage of a split approach, as mentioned previously it
is to focus modules/tests/scripts into groups with related
functions. Even just splitting off ones with external reqs (XML
parsers, GD, etc) into an 'aux' release would be an advantage, as it
doesn't confront a new user with the burden of installing a large
list of dependencies, some of which may be complicated for a Perl
newbie to either install from scratch (DBD::mysql, GD) or to get the
latest bug-fixed prereq release for their OS (the recent debacle with
XML::SAX::Expat comes to mind, where the fixed release wasn't
immediately available for win32 as a PPM).
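As a rough sketch of that 'aux' split (Python, illustrative only): given
a map from module name to its external prerequisites, anything with at
least one external prereq goes to 'aux' and the rest stays in core. The
module names and the prereq map below are made up for the example, and
split_core_aux is a hypothetical helper, not something in Bioperl.

```python
from typing import Dict, List, Set, Tuple


def split_core_aux(prereqs: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Partition modules into (core, aux) by whether they need external prereqs."""
    core: List[str] = []
    aux: List[str] = []
    for module in sorted(prereqs):
        # Modules with any external prerequisite go to the 'aux' group.
        (aux if prereqs[module] else core).append(module)
    return core, aux


# Hypothetical module names; only the prereq names come from the discussion.
example = {
    "Bio::Hypothetical::PlainParser": set(),
    "Bio::Hypothetical::Drawing": {"GD"},
    "Bio::Hypothetical::DBStore": {"DBD::mysql"},
    "Bio::Hypothetical::XMLReader": {"XML::SAX::Expat"},
}
core, aux = split_core_aux(example)
```

A new user installing only the core group would then never be asked for
GD, DBD::mysql or an XML parser.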
I'm fairly open to any approach as long as it's reasonably thought
out, though I am admittedly a bit biased towards the split approach.
I do think some change is in order; I worry about there ever being a
1.6 release at this point.
chris