[Bioperl-l] GSoC/BioPerl Reorganization Project

Thu Apr 28 19:53:49 UTC 2011

Chris,

We haven't talked much about the versioning yet, but it is on the list of
things to figure out ASAP.

So far, the plan is to split out Bio::Root first, followed by a couple of
modules that depend only on Bio::Root. The plan I proposed was Bio::Das,
Bio::Event, then Bio::Location. Depending on how much time is remaining for
the GSoC project, the next to split out would be Bio::Factory and
Bio::Coordinate, because they depend on Bio::Root and Bio::Location. I plan
to still help with the reorganization after the internship is over, but I
obviously have to have a stopping point for the GSoC project.

Rob provided me with a really nice script to list dependencies of the modules,
so I plan to make a roadmap toward the end of the summer that will help
guide the rest of the reorganization. At that point, we'll have to deal with
the circular dependencies carefully.
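
As a rough illustration of how such a roadmap could be ordered, here is a minimal sketch in Python (chosen only for brevity; this is not Rob's script, which is not shown here). It assumes the dependency map is available as a plain dictionary, groups modules into rounds where each round depends only on earlier rounds, and reports a cycle if one exists. The names `DEPENDS_ON`, `split_rounds`, and `find_cycle` are hypothetical, and the map contains only the dependencies stated in this message.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# Dependencies as described in this message only; a real map would be
# generated by a dependency-listing script (assumption).
DEPENDS_ON = {
    "Bio::Root": set(),
    "Bio::Das": {"Bio::Root"},
    "Bio::Event": {"Bio::Root"},
    "Bio::Location": {"Bio::Root"},
    "Bio::Factory": {"Bio::Root", "Bio::Location"},
    "Bio::Coordinate": {"Bio::Root", "Bio::Location"},
}


def split_rounds(depends_on):
    """Group modules into rounds; each round needs only earlier rounds."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    rounds = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything splittable now
        rounds.append(ready)
        sorter.done(*ready)
    return rounds


def find_cycle(depends_on):
    """Return one circular dependency as a list of modules, or None."""
    try:
        TopologicalSorter(depends_on).prepare()
    except CycleError as err:
        return err.args[1]  # the cycle; first and last entries are the same
    return None


if __name__ == "__main__":
    for number, modules in enumerate(split_rounds(DEPENDS_ON), start=1):
        print(f"round {number}: {', '.join(modules)}")
```

Running `find_cycle` on the full dependency map before planning the rounds would point out which modules need special handling before they can be split.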

This is a huge project, much bigger than I can do in one summer. But I plan
to get it started in a way that makes it easy for others to contribute.

Sheena

On Wed, Apr 27, 2011 at 12:35 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Sheena,
>
> Congrats on being accepted! We've talked about doing this over the years,
> but it's not an easy task and it needs a dedicated project to get the ball
> rolling, so to speak.  Hopefully this isn't tl;dr.  I'll start off with a
> few of my questions/thoughts (Rob could probably chime in as well, but I
> think his general thoughts on the project parallel mine):
>
> 1) The current BioPerl CPAN distribution could just be a simple install
> script, acting like a 'Task' or 'Bundle' module, installing the actual Bio-specific
> distributions.  Doing it this way would allow you to iteratively split off
> additional code but retain the original Task/Bundle-based approach to
> installation.  For instance, the first pass could split out Root, then have
> a dependency-light and 'extras' distribution, 2nd round split further based
> on function, and so on:
>
>  1st round (v 1.9)   :  BioPerl (just an installer) -> installs root, min-deps, extra-deps
>  2nd round (v 1.901) :  BioPerl (just an installer) -> root, seq/feature, other-min-deps, extra-deps
>  ...
>  Xth round (v 1.99)  :  BioPerl (just an installer) -> root, tools, seq, tree, align, coord, map, everything-else
>  ...
>
> Also, one could potentially install modules in various ways: interactively,
> in predetermined groups, using a user-defined list, etc (one could
> effectively create custom BioPerl installs for GBrowse or other tools for
> instance).  Of course I would only pick the easiest route to start, but
> maybe that gives some ideas.  Regardless, if the dependency tree is set up
> correctly any reliance on other Bio* modules would be defined in the various
> Build.PL/Makefile.PL and then installed via CPAN (as is any dependency).
>
> 2) The Bio::Root modules are probably the true core modules and are the
> most stable with regard to changes, so those could be moved to something
> like BioPerl-Core.  Beyond that, what are the proposed splits?  (we've
> discussed this on-list before, but it's appropriate to bring this up again)
>
> 3) How do we want to handle versioning?  We can't (and probably shouldn't)
> release everything on a synchronized versioning scheme (via
> Bio::Root::Version, for instance); that'll quickly fall apart.  Personally I
> can foresee each split-off dist having its own version, with the BioPerl
> network of modules being in effect its own mini-CPAN.
>
> 5) Related to versioning, in my opinion we should maybe aim to eventually
> call this BioPerl v2.0 and start with a simpler X.Y versioning scheme.
>  Lincoln has already done something like this with Bio::Graphics, which was
> originally part of BioPerl but split off prior to v 1.6.0.
>
> 6) In some cases I can see particularly thorny problems, such as circular
> dependencies.  I can think of a few ways to address that (creating a simple
> lightweight Bio::Species class as a fallback if Bio::Tree code isn't
> present, for instance), but any additional thoughts on this would be
> helpful.
>
> 7) Do we want to set up something like 'git submodule' for the devs to pull
> down all BioPerl-relevant code?
>
> Other thoughts?
>
> chris
>
> On Apr 27, 2011, at 12:17 AM, Sheena Scroggins wrote:
>
> > Hey everyone,
> >
> > I wanted to take a minute to introduce myself as one of the Google Summer of
> > Code interns. I was the lucky one chosen to work on the BioPerl
> > Reorganization (*crowd cheers*). I am a grad student in bioinformatics, and
> > somewhat new to this level of programming so bear with me as I learn the
> > technical jargon. Luckily I have both Rob and Chris to mentor me this
> > summer!
> >
> > Reading through the mailing list archives, I see there have been many
> > discussions and differing opinions about tackling this project. Given the
> > time frame for GSoC and my limited experience, there is no way I will
> > complete this project on my own but I will at least be able to start it,
> > which will hopefully motivate others to pitch in. So far, the plan for the
> > GSoC project is to start by breaking out Bio::Root, followed by a couple of
> > other modules based on their dependencies and the time allowed. Each will be
> > published to CPAN independently. You can follow the project (once it starts)
> > on GitHub at https://github.com/sheenams.
> >
> > I look forward to collaborating with many of you on the reorganization (hint
> > hint)!
> >
> > Sheena