[Biojava-l] [Biojava-dev] BioJava 3 Begins - Volunteers please!

Tue Oct 21 09:06:41 UTC 2008

>
>
> License: Since it seems we will end up copying code from biojava 1.6
> to biojava 3.0, we need to keep the license the same (LGPL 2.1). I.e.
> people should still use the same biojava license headers when
> committing new files and all code will be considered to be LGPL, if no
> header is present. Do NOT commit code under other licenses.
>
> Installation: We need some installation instructions on the wiki site,
> e.g. how to get the maven setup running.  What are the code
> conventions for the new version?

Not sure where best to put it in the Wiki, but I agree it needs to go there
somewhere.

Installation is a one-liner from within the top level of the project:

   mvn install

This compiles and installs the JARs into your local Maven repository, and
also downloads and installs any external dependencies. Then you can add the
installed modules as dependencies in your own Maven projects.

If you need to write a launcher script for your project, or you want to use
the JAR files outside Maven, the command below generates the CLASSPATH for
you. It only includes external dependencies, so you'll also need to add to
it the individual JAR files from inside the various target/ folders that
Maven built for you:

  mvn dependency:build-classpath

Code conventions are simple:

1. I'm not fussed about the specific formatter people use in each module, as
long as the code is all formatted using some kind of consistent method. I
personally just use the default settings from Format code in NetBeans.

2. Use 'this' wherever possible, and for static references, use the
classname prefix (e.g. MyClass.staticField). I hate having to try and work
out in my head which references are going where, and which are static and
which are not!

3. Comment every single method, even if it's private. This helps understand
the flow of your code. Also comment liberally inside methods if they are
longer than just a few lines (i.e. if you can't fit the entire method within
the code panel in NetBeans, it's going to need internal comments).

4. When writing getters/setters, follow the Java beans conventions so that
automated frameworks like Spring can easily pick them up and work with them.

5. Please write tests for your code using JUnit conventions, inside the
test/ folder of each module. I know I haven't done this myself yet, but I'm
going to!
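
As a rough illustration of conventions 2 to 4, here is a minimal sketch. The
class, its fields and the normalise() helper are all made up for this example
and are not part of BioJava; it only shows the style, not any real API.

```java
/**
 * Hypothetical example bean (not a BioJava class) showing the
 * 'this' / class-name prefixes, per-method comments and bean-style
 * getters and setters.
 */
class SequenceRecord {

    /** Number of records created so far; always referenced via the class name. */
    private static int instanceCount = 0;

    /** Display name of this record. */
    private String name;

    /** Raw sequence, kept as a plain String. */
    private String sequence;

    /** Creates an empty record and increments the shared counter. */
    public SequenceRecord() {
        SequenceRecord.instanceCount++;
    }

    /** Returns how many records have been created. */
    public static int getInstanceCount() {
        return SequenceRecord.instanceCount;
    }

    /** Bean-style getter for the name property. */
    public String getName() {
        return this.name;
    }

    /** Bean-style setter for the name property. */
    public void setName(String name) {
        this.name = name;
    }

    /** Bean-style getter for the sequence property. */
    public String getSequence() {
        return this.sequence;
    }

    /** Bean-style setter; strips surrounding whitespace before storing. */
    public void setSequence(String sequence) {
        this.sequence = SequenceRecord.normalise(sequence);
    }

    /** Private helpers get a comment too: trims the input, passing null through. */
    private static String normalise(String raw) {
        return raw == null ? null : raw.trim();
    }

    /** Small demo so the sketch can be run on its own. */
    public static void main(String[] args) {
        SequenceRecord record = new SequenceRecord();
        record.setName("example");
        record.setSequence(" acgt ");
        System.out.println(record.getName() + ": " + record.getSequence());
    }
}
```

The getName/setName and getSequence/setSequence pairs are what bean-aware
frameworks look for when they discover the 'name' and 'sequence' properties.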

>
>
> Blast: the Blast parsing modules are among the most frequently used
> ones in biojava 1.6. To make people use biojava v3 it will be crucial
> to have a port of them to the new version. Does anybody want to take
> care of that?

I'll second that. Blast is vital. We'd really appreciate a volunteer,
please!

>
> Automated builds: is it interesting to have automated builds set up
> for the new version at this stage, or should we wait until a more
> mature stage? I could easily add another auto-build similar to the one
> for biojava 1.6 at http://www.spice-3d.org/cruise/

You could do, although I don't think they'd be much use yet. But why not
start early, so that we don't forget to do it later?

Richard

>
> Andreas
>
> On Sun, Oct 19, 2008 at 5:18 PM, Richard Holland
> <holland at eaglegenomics.com> wrote:
> > Hi all,
> >
> > I've just committed some new code to the biojava3 branch of the biojava-live
> > subversion repository. It's the foundations of a brand new alphabet+symbol
> > set of classes, and an example of how to use them to represent DNA. You'll
> > notice that the new code is very lightweight and allows for a lot more
> > flexibility than the old code - for instance, the concept of Alphabet has
> > changed radically. It also makes much more extensive use of the Collections
> > API.
> >
> > I haven't got any test cases or usage examples yet but give me a shout if
> > you don't understand the code and I'll explain how it works. (Hint:
> > SymbolFormat is there to convert Strings into SymbolList objects, and vice
> > versa).
> >
> > So, now we want some volunteers! We're starting from scratch here so there's
> > a lot of work to do. The whole of BioJava needs 'translating' into BJ3,
> > whether it be copy-and-paste existing classes and modify them to suit the
> > new style, or write completely new ones to provide equivalent functionality.
> >
> > I'll post an example of how to do file parsing soon, probably starting with
> > FASTA. In the meantime, a good place to start would be for people to design
> > object models to represent their favourite data types (e.g. Genbank, or
> > microarray data). Utility classes to manipulate those objects would be great
> > too.
> >
> > The object models need to be normalised as much as possible - e.g. if your
> > data has a lot of comments, and the order of those comments is important,
> > then give your object model a collection of comment objects. The object
> > model for each data type should be completely independent and use basic data
> > types wherever possible (e.g. store sequences as strings, don't attempt to
> > parse them into anything fancy like SymbolLists). The closer the object
> > model is to the original data format, the better. There's going to be clever
> > tricks when it comes to converting data between different object models
> > (e.g. Genbank to INSDSeq), which I will explain later when I put the file
> > parsing examples up.
> >
> > You'll notice how the biojava3 branch uses Maven instead of Ant. This is
> > because we want to make it as modular as possible, so if you want to write
> > microarray stuff, create a new microarray sub-project (as per the dna
> > example that's already there). This way if someone only wants the microarray
> > bit of BJ3, they only need install the appropriate JAR file and can ignore
> > the rest. (The 'core' module is for stuff that is so generic it could be
> > used anywhere, or is used in every single other module.)
> >
> > If coding isn't your cup of tea, then we would very much welcome testers
> > (particularly those who enjoy writing test cases!), documenters
> > (particularly code commenters), translators (for internationalisation of the
> > code), and of course all those who wish to contribute ideas and suggestions
> > no matter how off-the-wall they might be. In particular if you'd like to
> > take charge of an area of the development process, e.g. Documentation Chief,
> > or Protein Champion, then that would be much appreciated.
> >
> > I'm very much looking forward to working with everyone on this. Good luck,
> > and happy coding!
> >
> > cheers,
> > Richard
> >
> > PS. Please don't forget to attach the appropriate licence to your code. You
> > can copy-and-paste it from the existing classes I just committed this
> > evening.
> >
> > PPS. For those who are worried about backwards compatibility - this was
> > discussed on the lists a while back and it was made clear that BJ3 is a
> > clean break. However, the existing code will continue to be maintained and
> > bugfixed for a couple of years so you don't have to upgrade if you don't
> > want to - it just won't have any new features developed for it. This is
> > largely because it'll probably take just that long to write all the new BJ3
> > code. When we do decide to desupport the existing BJ code, plenty of notice
> > will be given (i.e. years as opposed to months).
> >
> >
> > --
> > Richard Holland, BSc MBCS
> > Finance Director, Eagle Genomics Ltd
> > M: +44 7500 438846 | E: holland at eaglegenomics.com
> > http://www.eaglegenomics.com/
> >
>

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/