[Bioperl-l] GSoC/BioPerl Reorganization Project
Chris Fields
cjfields at illinois.edu
Thu Apr 28 21:04:51 UTC 2011
Sounds fine; I think (as you indicate) we can deal with issues along the way. Rob, anything to add?
chris
On Apr 28, 2011, at 2:53 PM, Sheena Scroggins wrote:
> Chris,
>
> We haven't talked much about the versioning yet, but it will be on the list to figure out asap.
>
> So far, the plan is to split out Bio::Root first, followed by a couple modules that depend only on Bio::Root. The plan I proposed was Bio::Das, Bio::Event then Bio::Location. Depending on how much time is remaining for the GSoC project, the next to split out would be Bio::Factory and Bio::Coordinate, because they depend on Bio::Root and Bio::Location. I plan to still help with the reorganization after the internship is over, but I obviously have to have a stopping point for the GSoC project.
>
> Rob provided me with a really nice script to list the dependencies of the modules, so I plan to make a roadmap towards the end of the summer that will help guide the rest of the reorganization. At that point, we'll have to deal with the circular dependencies carefully.
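A minimal sketch of how a split-out roadmap could be derived from a dependency listing. It is written in Python purely for illustration (the project itself is Perl), the dependency map copies only what is stated in the message above, and `split_rounds` is a hypothetical helper, not the script mentioned in the thread. Modules are grouped into rounds so that each round depends only on modules from earlier rounds, and a leftover set that can never be scheduled is reported as circular or missing.

```python
from typing import Dict, List, Set

# Dependencies as described in the message above. A real map would come
# from a dependency-listing script, which is not shown in this thread.
DEPS: Dict[str, Set[str]] = {
    "Bio::Root": set(),
    "Bio::Das": {"Bio::Root"},
    "Bio::Event": {"Bio::Root"},
    "Bio::Location": {"Bio::Root"},
    "Bio::Factory": {"Bio::Root", "Bio::Location"},
    "Bio::Coordinate": {"Bio::Root", "Bio::Location"},
}


def split_rounds(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group modules into rounds; each round needs only earlier rounds."""
    remaining = {name: set(needs) for name, needs in deps.items()}
    done: Set[str] = set()
    rounds: List[List[str]] = []
    while remaining:
        # A module is ready once everything it depends on is already split out.
        ready = sorted(n for n, needs in remaining.items() if needs <= done)
        if not ready:
            # Nothing can be scheduled: the leftovers depend on each other
            # (a cycle) or on a module that is absent from the map.
            raise ValueError(
                "unresolvable (circular or missing): "
                + ", ".join(sorted(remaining))
            )
        rounds.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return rounds


if __name__ == "__main__":
    for number, names in enumerate(split_rounds(DEPS), start=1):
        print(number, ", ".join(names))
```

With the map above, this puts Bio::Root in the first round, Bio::Das, Bio::Event and Bio::Location in the second, and Bio::Coordinate and Bio::Factory in the third, which matches the order proposed in the message.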
>
> This is a huge project, much bigger than I can do in one summer. But I plan to get it started in a way that makes it easy for others to contribute.
>
> Sheena
>
>
> On Wed, Apr 27, 2011 at 12:35 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Sheena,
>
> Congrats on being accepted! We've talked about doing this over the years, but it's not an easy task and it needs a dedicated project to get the ball rolling, so to speak. Hopefully this isn't tl;dr. I'll start off with a few of my questions/thoughts (Rob could probably chime in as well, but I think his general thoughts on the project parallel mine):
>
> 1) The current BioPerl CPAN distribution could just be a simple install script, acting like a 'Task' or 'Bundle' module, installing the actual Bio-specific distributions. Doing it this way would allow you to iteratively split off additional code but retain the original Task/Bundle-based approach to installation. For instance, the first pass could split out Root, then have a dependency-light and 'extras' distribution, 2nd round split further based on function, and so on:
>
> 1st round (v 1.9) : BioPerl (just an installer) -> installs root, min-deps, extra-deps
> 2nd round (v 1.901) : BioPerl (just an installer) -> root, seq/feature, other-min-deps, extra-deps
> ...
> Xth round (v 1.99) : BioPerl (just an installer) -> root, tools, seq, tree, align, coord, map, everything-else
> ...
>
> Also, one could potentially install modules in various ways: interactively, in predetermined groups, using a user-defined list, etc (one could effectively create custom BioPerl installs for GBrowse or other tools for instance). Of course I would only pick the easiest route to start, but maybe that gives some ideas. Regardless, if the dependency tree is set up correctly any reliance on other Bio* modules would be defined in the various Build.PL/Makefile.PL and then installed via CPAN (as is any dependency).
>
> 2) The Bio::Root modules are probably the true core modules and are the most stable with regards to changes, so those could be moved to something like BioPerl-Core. Beyond that, what are the proposed splits? (we've discussed this on-list before, but it's appropriate to bring this up again)
>
> 3) How do we want to handle versioning? We can't (and probably shouldn't) release everything on a synchronized versioning scheme (via Bio::Root::Version, for instance); that'll quickly fall apart. Personally I can foresee each split-off dist having its own version, with the BioPerl network of modules being in effect its own mini-CPAN.
>
> 5) Related to versioning, in my opinion we should maybe aim at eventually calling this BioPerl v2.0 and starting with a simpler X.Y versioning scheme. Lincoln has already done something like this with Bio::Graphics, which was originally part of BioPerl but split off prior to v 1.6.0.
>
> 6) In some cases I can see particularly thorny problems, such as circular dependencies. I can think of a few ways to address that (creating a simple lightweight Bio::Species class as a fallback if Bio::Tree code isn't present, for instance), but any additional thoughts on this would be helpful.
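A minimal sketch of the fallback idea in the point above: prefer a full-featured class when its distribution is installed, and otherwise use a lightweight stand-in. It is written in Python purely for illustration (the project itself is Perl). Every name here is hypothetical: `SimpleSpecies`, `load_species_class`, and the module name `bio_tree_species` are placeholders, not BioPerl APIs.

```python
import importlib


class SimpleSpecies:
    """Lightweight stand-in: carries a name, no tree or taxonomy behaviour."""

    def __init__(self, name: str) -> None:
        self.name = name


def load_species_class(module_name: str = "bio_tree_species"):
    """Return the full class if its module is installed, else the fallback.

    `module_name` is a placeholder for wherever the tree-backed class lives.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The optional distribution is absent, so use the stand-in.
        return SimpleSpecies
    # Fall back as well if the module exists but lacks the expected class.
    return getattr(module, "Species", SimpleSpecies)
```

The design point is that the dependent code never names the heavy distribution directly, so the lighter one can be installed alone and the dependency loop is broken at install time.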
>
> 7) Do we want to set up something like 'git submodule' for the devs to pull down all BioPerl-relevant code?
>
> Other thoughts?
>
> chris
>
> On Apr 27, 2011, at 12:17 AM, Sheena Scroggins wrote:
>
> > Hey everyone,
> >
> > I wanted to take a minute to introduce myself as one of the Google Summer of
> > Code interns. I was the lucky one chosen to work on the BioPerl
> > Reorganization (*crowd cheers*). I am a grad student in bioinformatics, and
> > somewhat new to this level of programming so bear with me as I learn the
> > technical jargon. Luckily I have both Rob and Chris to mentor me this
> > summer!
> >
> > Reading through the mailing list archives, I see there have been many
> > discussions and differing opinions about tackling this project. Given the
> > time frame for GSoC and my limited experience, there is no way I will
> > complete this project on my own but I will at least be able to start it,
> > which will hopefully motivate others to pitch in. So far, the plan for the
> > GSoC project is to start by breaking out Bio::Root, followed by a couple
> > other modules based on their dependencies and the time allowed. Each will be
> > published to CPAN independently. You can follow the project (once it starts)
> > on github at https://github.com/sheenams.
> >
> > I look forward to collaborating with many of you on the reorganization (hint
> > hint)!
> >
> > Sheena