[Biojava-dev] The future of BioJava

george waldon gwaldon at geneinfinity.org
Sat Sep 22 16:35:22 UTC 2007


Richard,

You cannot kill biojava, and it is not Vista; you cannot force people to use it. I have a project with hundreds of classes that use biojava and work without a glitch. My choice is either to stay with it or to switch to bj3 in the middle of a rewrite of around 1500 classes that may take months or years to complete. I may just never switch to the new biojava. Most likely a lot of people are going to be in a similar situation, and most likely bj3 will also have to support old biojava classes - great!

I agree that you cannot change interfaces, but you can deprecate them and toss them after one release cycle, or put them into a deprecated module that is not included in releases.
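
As a rough illustration of the deprecate-then-remove idea, here is a minimal sketch. All names (OldSequenceLookup, SequenceLookup, OldLookupAdapter) are hypothetical and are not BioJava classes; the point is only that an old interface can be marked @Deprecated and bridged onto its replacement, so existing callers keep compiling for one release cycle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in BioJava.

/** Old interface, kept for one release cycle so existing callers still compile. */
@Deprecated
interface OldSequenceLookup {
    /** Returns the sequence text, or null when the id is unknown. */
    String fetch(String id);
}

/** Replacement interface that new code would target. */
interface SequenceLookup {
    boolean contains(String id);
    String get(String id);
}

/** Bridges old callers onto the new interface until the old one is removed. */
@SuppressWarnings("deprecation")
final class OldLookupAdapter implements OldSequenceLookup {
    private final SequenceLookup delegate;

    OldLookupAdapter(SequenceLookup delegate) {
        this.delegate = delegate;
    }

    public String fetch(String id) {
        return delegate.contains(id) ? delegate.get(id) : null;
    }
}

@SuppressWarnings("deprecation")
final class DeprecationSketch {
    /** Simulates untouched client code that only knows the old interface. */
    static String lookupViaOldApi(String id) {
        final Map<String, String> store = new HashMap<String, String>();
        store.put("seq1", "acgt");
        SequenceLookup modern = new SequenceLookup() {
            public boolean contains(String key) { return store.containsKey(key); }
            public String get(String key) { return store.get(key); }
        };
        OldSequenceLookup old = new OldLookupAdapter(modern);
        return old.fetch(id);
    }

    public static void main(String[] args) {
        System.out.println(lookupViaOldApi("seq1")); // prints "acgt"
    }
}
```

Once the release cycle is over, the deprecated interface and its adapter could be deleted or moved to a separate module without touching code written against the replacement.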

The question becomes: what are the fundamental problems of biojava that truly justify a rewrite from the ground up? Certainly, the need for a new symbol model could be one; maintenance and testing are not; a modular structure is not; and the use of generics is not - they do not break old code.
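
To make the point about generics concrete, here is a small sketch. FeatureBag is a hypothetical class, not part of BioJava. Pre-generics client code that uses the raw type and a cast still compiles against the generified class (with an unchecked warning), next to new code that uses the type parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: FeatureBag is not a BioJava class.

/** A container that has been generified in a newer release. */
final class FeatureBag<T> {
    private final List<T> items = new ArrayList<T>();

    void add(T item) {
        items.add(item);
    }

    T get(int index) {
        return items.get(index);
    }
}

final class GenericsSketch {
    /** Old client code: raw type plus a cast. It still compiles and runs. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String oldStyleClient() {
        FeatureBag bag = new FeatureBag();
        bag.add("promoter");
        return (String) bag.get(0);
    }

    /** New client code: the type parameter makes the cast unnecessary. */
    static String newStyleClient() {
        FeatureBag<String> bag = new FeatureBag<String>();
        bag.add("promoter");
        return bag.get(0);
    }

    public static void main(String[] args) {
        System.out.println(oldStyleClient()); // prints "promoter"
        System.out.println(newStyleClient()); // prints "promoter"
    }
}
```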

George


> -----Original Message-----
> From: Richard Holland
> To: george waldon
> Cc: biojava-dev at biojava.org
> Sent: 9/21/2007 12:54 AM
> Subject: Re: [Biojava-dev] The future of BioJava
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi George.
> 
> By 'stop development' I really meant just that active development
> efforts would be focused on the new codebase rather than modifying the
> existing one (except of course for fixing bugs, which is always
> important and we wouldn't stop doing that until the new codebase was
> well established as an alternative).
> 
> I agree that modifying the existing codebase would alleviate many of the
> problems currently experienced with it - code abstraction being just one
> of them. BioJavaX was an attempt at doing this. The big stumbling block
> was interfaces - users do not expect interfaces to change as it breaks
> all code that already uses that interface. They also do not expect the
> defined behaviour of methods in interfaces to change - which meant, for
> instance, that I had real problems trying to get
> RichFeature/RichLocation and RichLocation/Location to match up as some
> parts of Feature and Location conflicted with the more realistic
> requirements of their Rich* equivalents (e.g. circularity).
> 
> If you change interfaces, you might as well start from scratch in terms
> of the effect it has on end users' code. Also, if we start from scratch,
> it allows us to build up from the very basics the kind of robustness and
> flexibility we need throughout the system. As mentioned in the original
> posting the existing system is heavily sequence-focused, meaning that
> even the simple task of scanning a set of features cannot be done
> without also loading the associated sequences because the two are so
> closely integrated. We need to make it much more flexible and I think
> new code would give us a better opportunity to do so without being tied
> into complying with existing interfaces or behaviour expectations.
> 
> Having said that, I do expect large parts of the new codebase to be only
> slightly modified copies of the original code, particularly regarding
> recent developments such as genetic algorithms and phylogenetics. It
> would be silly to write such logic all over again where the code is
> relatively self-contained.
> 
> cheers,
> Richard
> 
> 
> 
> george waldon wrote:
> > Hello,
> >
> > All this is very exciting. I would certainly contribute to something
> > like that. Here are a few remarks that came to mind while reading all
> > these emails.
> >
> > I noticed that the tutorial has seriously improved – thanks for the
> > work. I remember my initial steps toward understanding Symbol and
> > cross-alphabets (…)  Still, from time to time, I have difficulties with
> > basic things that are not intuitive to me such as “token”, e.g.
> > Alphabet.getTokenization(“token”) or
> > SymbolTokenization.tokenizeSymbolList(SymbolList).
> >
> > I am surprised by all the requests to use String instead of
> > SymbolList. The CookBook tells precisely, and with code examples, how to
> > perform most basic operations. Maybe someone could illustrate the
> > new kind of code versus the old one? I bet many newbies (and older ones)
> > actually get their answers in the Cookbook.
> >
> > Richard wrote:
> >> It is suggested that development stops on the existing Biojava(…)
> > Well, I don’t think the license can let you do that :-)
> > Writing new code might be easier, but making old code better will
> > certainly improve the level of code abstraction. Therefore I am promoting
> > improving existing Biojava code versus a hazardous code rewrite. I can see
> > some of the initial steps on the roadmap:
> > - Switch to Subversion repository
> > - Change the build process to be compatible with the creation of modules
> > - Improve the testing framework (mentioned several times)
> > - Create white papers for coding practices, build releases,
> >   (others?)
> >
> > Then maybe the proper work of restructuring Biojava may start. We can
> > either divide the existing mammoth into multiple modules at first or -
> > my preference – build modules one by one by selectively picking
> > classes. This way it will be easy to find classes that can be
> > deprecated (for lack of users) and we can even have a deprecated module
> > at the end. Some coupling may need to be loosened. We will also need a
> > list of API changes for developers who will use the newer version.  I am
> > sure that the kind of data structures proposed by Richard could find
> > their place as well as some of the proposed patterns (beans, others?)
> >
> > Anyway, all these are simple ideas. I am not an expert in build
> > processes, but I can help with improving javadocs, writing examples and
> > test cases. I also have a fair knowledge of the molecular biology
> > package.
> >
> > Hope it helps,
> > George
> >
> >
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2.2 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> 
> iD8DBQFG83jK4C5LeMEKA/QRAtOFAJsF9YNdgdsOm1KY65GyRehsO1ElYwCfeUfi
> yXWTMXSzn3mXZqXXo9999rw=
> =WbAQ
> -----END PGP SIGNATURE-----



