[Bioperl-l] Splits again
Chris Fields
cjfields at uiuc.edu
Thu Jun 28 05:17:01 UTC 2007
D'oh! Just when I wanted to go to bed. It's not fair, you're in
California...
On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote:
> Hey guys - I'm wading in a bit late as I haven't had time to keep up
> with the whole discussion.
>
> So you are suggesting 800+ individual CPAN modules? I don't think
> that is a good idea. Why would you split up Bio::Seq::RichSeq and
> Bio::Seq into two separate packages for example? I think if you
> really want to move away from the monolithic install it has to be
> more logical by function - but I am not that optimistic that this is
> going to actually be easier for people. Maybe I'm misunderstanding.
Okay, so maybe it wasn't just me.
> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code? It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work? Do we need to consider
> how much this would suck for people who can't use CPAN or
> Module::Build to automate dependency tracking and installation? How
> does it work when modules are deprecated?
What I envision for core is maybe not just one distribution, but a
cluster of distributions:
base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated
modules. Bare bones, with as few dependencies as possible.
aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires
additional modules.
search - Bio::Search and SearchIO
tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related
stuff?
graphics - Bio::Graphics. Maybe GMOD-related stuff here?
The last four would list bioperl-core as a dependency themselves
along with any other modules necessary. We could also have the core
Build.PL ask the user if they want to install the other non-base
distros, and maybe include bioperl-db, bioperl-network, and bioperl-
run in the loop if requested.
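
As a rough illustration of the prompting idea above, here is a minimal
Build.PL sketch for the base distribution. The distribution names
(bioperl-base, bioperl-aux, and so on), the placeholder version, and the
'optional_dists' notes key are all assumptions made up for this example,
not an agreed layout. It only records the user's answers; actually
fetching and installing the other distributions is left out.

```perl
#!/usr/bin/perl
# Build.PL sketch for a hypothetical "bioperl-base" distribution.
# All distribution names and the version below are illustrative only.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    dist_name     => 'bioperl-base',    # hypothetical name
    dist_version  => '0.01',            # placeholder version
    dist_abstract => 'Bare-bones Bio::Seq, Bio::SeqIO and Bio::AlignIO',
    license       => 'perl',
    requires      => {},                # base: as few dependencies as possible
);

# Hypothetical names for the non-base distributions in the cluster.
my @optional = qw(bioperl-aux bioperl-search bioperl-tools bioperl-graphics);

# Ask about each one; 'n' is the default, which is also what an
# unattended install gets.
my @wanted = grep { $build->y_n("Also install $_? y/n", 'n') } @optional;

# Remember the answers so a later build action could act on them.
$build->notes( optional_dists => \@wanted );

$build->create_build_script;
```

A non-base distribution's own Build.PL would then name a module shipped
by the base distribution in its requires hash, e.g.
requires => { 'Bio::Seq' => 0 }, since CPAN dependencies are declared on
modules rather than on distribution names.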
All would be installed as a bundle similar to Bundle::BioPerl, but
would have regular CPAN point releases (1.x.x) independently of one
another, i.e. for bug fixes, with a yearly/biyearly timed full release
(1.x) of the whole shebang. Any point release for any 'core'
distribution would have to be tested against the others prior to
release.
This is basically following Steve's train of thought, though more
elaborated:
http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315
> I'm not sure I have made up my mind on what I'd like to see, but at
> some point I think we need to get a clearer idea of what audience we
> are trying to serve best. If we want it to be easy to install, maybe we
> should invest time into making OSX double-click installers, RPMs, and
> the Windows stuff easily installable. Or do we want to serve the
> developers who aren't using SVN, so we want to push out releases of
> modules ASAP? I just am not clear on the motivation for some of the
> proposed changes.
I think regular CPAN releases with updated PPMs hosted via portal
work fine for the most part, but it would be nice to host RPMs.
Others (Allen Day, for instance) have donated time to generate RPMs
but they seem to lag behind a bit more.
The original idea for svn arose from an unrelated thread with Mark
Johnson discussing something (Glimmer maybe?) and took off from
there. I was actually pretty surprised it took on a life of its
own. As for the motivation to switch, I haven't specifically used it
myself, but the large number of responses seems to indicate others
have and seem happy with it. Rutger Vos had also indicated he would
move Bio::Phylo over to the repo if we used svn. We def. should
address the issues you bring up (why _WE_ need svn) more succinctly
but that shouldn't be an issue.
> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)? It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.
Agreed. We prob. need to schedule a good couple of days (or so) to
squash bugs.
> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?
Um, not likely as nothing has been addressed Feature/Annotation-wise
(overloads are still there, methods have not been deprecated, etc).
There was an underlying assumption these would have an effect on GMOD-
related stuff (I remember reading a post from Scott Cain in the mail
archive mentioning something along these lines after the 1.5 release
hubbub).
Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall?
> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now? I realize not everyone will be there but
> maybe it will be easier to interact on this then.
How many are going to be there? I can't go this year except on my
own dime (which I don't have many of, student loans and all, sorry),
though I'll likely be in a new lab by spring which is likely more
amenable to funding. If there is a hackathon in the late fall (post-
Sept) I'll make it a point to go regardless.
> I think it will also be time to talk with Lincoln/Scott about how
> Gbrowse is structured and if that is working for them. There is too
> much code in different places that I think we need to figure out how
> to structure it properly so those packages can be released. It would
> probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules. Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do? I don't think we really
> fully support GFF3 that well -- the X2GFF scripts probably need some
> more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL,
> etc...) and/or migration to the proper GFF writing.
>
>
> -jason
Will Lincoln or Scott be at BOSC?
chris