[BioRuby] Bringing the fun back to programming! (The first BioRuby IRC conference on Dec 19th)
Pjotr Prins
pjotr.public14 at thebird.nl
Sat Dec 11 09:46:54 UTC 2010
Hi Rubyistas,
We have a special community, with a special language. Ruby is one of
the most fun languages to work with, and we know it.
Here I want to argue that for bioinformatics JRuby is one of the most
exciting developments. Not only is it pretty fast, once compiled,
but it also allows easy integration with Java (and BioJava). Before
you recoil in horror, it also allows integration with some really cool
programming languages, such as Scala, Clojure and Groovy.
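To give a feel for the Java integration, here is a minimal sketch that uses a plain Java collection from Ruby code. It only does something under JRuby; on other Ruby implementations the guarded block is skipped. The sequences are made-up example data, and nothing here uses BioJava itself.

```ruby
# Only JRuby can load Java classes, so guard on the running engine.
sizes = nil
if RUBY_ENGINE == "jruby"
  require "java"

  # A java.util.ArrayList, created and filled from Ruby code.
  list = java.util.ArrayList.new
  list.add("ACGT")
  list.add("GGCC")

  # JRuby lets Java collections be iterated with Ruby blocks.
  sizes = list.map { |s| s.length }
end
```

The same mechanism is what would let Ruby code call into BioJava, Scala or Clojure classes once their jars are on the classpath.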
You know Ruby is great. But it has some weaknesses too. For
Bioinformatics (B) and big data (BD) the problems are:
(1) Weak B functionality
(2) BD performance issues
(3) So-so parallel computing support (for BD)
(4) Only partial functional programming support
(1) and (2) can be resolved by using JRuby, BioJava and the JVM. (3)
and (4) can be resolved by tapping into Scala and Clojure.
Let me try to explain.
(1) Weak B functionality: BioRuby is a great achievement, but I have a
number of criticisms. First, it is not suitable for BD: almost every
module loads all data into RAM, and there is no concept of parallel
computation in the design. Finally, development is not fast - we are
suffering from the fact that we are a small community.
You could argue about reasons, but I don't think we should spend
energy on the past, when there is such an obvious way forward. Let me
continue.
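To make the RAM point concrete, here is a minimal sketch of the alternative: reading FASTA records one at a time so that only the current record is held in memory. The helper name each_fasta_record is hypothetical and is not BioRuby API, and the input is made-up data in a StringIO standing in for a large file.

```ruby
require "stringio"

# Hypothetical helper (not part of BioRuby): yields [header, sequence]
# pairs one at a time, so only the current record is kept in memory.
def each_fasta_record(io)
  return enum_for(:each_fasta_record, io) unless block_given?

  header = nil
  seq = String.new
  io.each_line do |line|
    line = line.chomp
    if line.start_with?(">")
      # A new header closes the previous record, if there was one.
      yield header, seq unless header.nil?
      header = line[1..-1]
      seq = String.new
    else
      seq << line
    end
  end
  # Emit the last record once the input is exhausted.
  yield header, seq unless header.nil?
end

# Made-up example data; a real run would pass an open File instead.
fasta = ">seq1\nACGT\nACGT\n>seq2\nGGCC\n"

lengths = each_fasta_record(StringIO.new(fasta)).map { |h, s| [h, s.length] }
```

Because the helper takes any IO object, the same code would work on a multi-gigabyte file without reading it in up front.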
(2) BD performance issues: Ruby is slow compared to compiled languages
such as C, Java and Scala, which are statically typed, and Clojure,
which compiles to JVM bytecode. It is pretty amazing to see how much
speed Ruby 1.9 has. But for BD it breaks down quickly. Ruby's strength
is in beauty of code, not in raw power.
(3) So-so parallel computing support (for BD): Functional languages
(Haskell, Erlang, Scala, Clojure) have immutable data, and
abstractions for parallelization, such as software transactional memory and actors,
which make it much easier to write parallelized code. For performance
and BD, this is extremely useful.
(4) Only partial functional programming support: Once you get into
functional programming you realize Ruby gets in the way. Support for
functional programming in Ruby is patchy, though there is some.
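As an illustration of "patchy, though there is some", here is a small sketch of what plain Ruby does offer: lambdas, currying and higher-order functions work fine, but immutability is opt-in through freeze. All names and sequences below are made up for the example.

```ruby
# Fraction of G and C bases in a sequence, as a first-class function.
gc_fraction = lambda { |seq| seq.count("GC").to_f / seq.length }

# Currying: fix the threshold now, supply the value later.
at_least = lambda { |threshold, value| value >= threshold }.curry
gc_rich = at_least[0.5]

# Data is mutable by default; freeze makes this array immutable,
# but only shallowly (the strings inside are not frozen by this call).
seqs = ["ACGT", "GGCC", "ATAT"].freeze

rich  = seqs.select { |s| gc_rich[gc_fraction[s]] }
total = seqs.map(&:length).inject(0) { |sum, n| sum + n }

# Appending to a frozen array raises an error instead of mutating it.
frozen_error = begin
  seqs << "AAAA"
  nil
rescue => e
  e.class
end
```

Nothing stops you from forgetting the freeze call, which is the kind of thing that gets in the way once you are used to immutable data being the default.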
It is no accident that I have started the BioScala project, and Jan Aerts has
started the BioClojure project, in 2010 (!). BioRuby has spin-offs.
My experience with Scala has been great. Scala is statically typed,
and very fast. It also allows beautiful code with functional
programming and parallelization thrown in. For me, there is a clear
path where I use Ruby and Scala on a 50/50 basis, essentially using
the best of both worlds. JRuby is key to combining them.
And, you know what? It is great fun!! I would get frustrated if I were
locked into either language. But now moving between the two is
seamless, thanks to the JVM, which, btw, these days can in some cases
outperform even C code.
Believe me, even two years ago, I would not have thought I would
*ever* champion the JVM. But as a saaientist, you go by evidence.
Programming is fun. And it has never been this great. I want to
share that with you, and I would like to use the coming holiday
season, and years after, to pass that on. I would guess Jan thinks the
same way about Ruby and Clojure.
Who is interested in getting back into the fun of programming? Who
wants to experiment and become an even more productive programmer?
It could be the goal of BioRuby in 2011 to show the way to other Bio*
projects of handling development in such a way that we can easily move
between the strengths of dynamic programming languages and
high-performance functional languages.
I would like that.
Pj.