[Bioperl-l] Splitting Bioperl

Wed Jul 4 14:53:45 UTC 2007

On Jul 4, 2007, at 5:00 AM, Sendu Bala wrote:

> To summarise some previous threads:
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15338/focus=15409
>
> # Bioperl is currently one monolithic distribution of ~900 modules
> # There is some desire to split it up into smaller functional groups
> # There are some problems with that proposal
> # An extreme variant of that proposal is to make the groups individual
> modules
>
>
> Following this discussion:
> http://www.nntp.perl.org/group/perl.modules/2007/07/msg55160.html
> (especially Adam Kennedy's postings of 4/07, soon to appear in that
> archive), the extreme variant doesn't seem like a good idea.

brian d foy made some sound arguments against it as well.

> I'm now suggesting that Steve's original split idea, as
> modified/expanded by Adam's driver and other ideas, is the best choice.
> The problems I previously identified can be solved in the same way they
> were solved in my extreme variant: the splits are done by Build.PL
> automation working on a single repository/code-base, not by splitting
> things up at the repository level.
>
> As I see it, the way forward now is for someone interested enough to
> decide on the specifics of how things will be split and offer them up to
> the group for discussion. I don't mean vague possibilities of what might
> work as a split, but rather some real thought should go into it to make
> sure the split makes sense and will actually work in practice.

We've already identified a few (SearchIO, Tools, GBrowse-related, etc).
...
> If there isn't sufficient interest to make this happen, I don't see that
> as a terrible thing. There are benefits to keeping Bioperl monolithic,
> and some of the problems (eg. lack of updates) can be solved without
> changing its nature.

If so, proposals that solve this problem need to be made as well.

If we stay monolithic, then here's mine: we start having fixed,
regularly timed dev releases like Parrot, monthly or bimonthly (quite
common on CPAN), with brief release reports on which bugs have been
fixed, what code has been added, and so on. Not every bug has to be
fixed per dev release; if that were true there would never be releases
for some of the XML parser packages. No RCs for dev releases (it's a
dev release!). These would be 1.x.y. We can then, every once in a
while, have a bug-squashing session, hackathon, etc., and have a
regular non-dev release (1.x) that all core devs accept and that passes
a particular milestone.

As for the advantage of a split approach, as mentioned previously it
is to focus modules/tests/scripts into groups with related
functions. Even just splitting off the ones with external reqs (XML
parsers, GD, etc.) into an 'aux' release would be an advantage, as it
doesn't confront a new user with the burden of installing a large
list of dependencies, some of which may be complicated for a Perl
newbie to either install from scratch (DBD::mysql, GD) or to get the
latest bug-fixed prereq release for their OS (the recent debacle with
XML::SAX::Expat comes to mind; the fixed release wasn't immediately
available for win32 as a PPM).

I'm fairly open to any approach as long as it's reasonably thought
out, though I am admittedly a bit biased towards the split approach.
I do think some change is in order; I worry about there ever being a
1.6 release at this point.

chris