[BioRuby] Beautiful code for Bioinformatics

Pjotr Prins pjotr.public14 at thebird.nl
Tue Feb 14 08:24:32 UTC 2012


On Mon, Feb 13, 2012 at 11:46:26AM +0100, Raoul Bonnal wrote:
> in this ML I found that the quality of code and its beauty increase
> only if you chat/talk with people and you are available to accept
> critics and contributes. 

I feel we can keep the momentum going if we use this list as a more
general outlet of our personal development. It may be a metamorphosis
of the old style of ML. Bio* is no longer about specialized libraries,
it is mainly about the problem of software development in biology. I
feel Ruby attracts the right type of people - that is why we have
'beautiful code' in the subject ;).

The ML is the first place to share information. This is what new
potential recruits may find interesting. Have them find us.

In that vein I am presenting another piece of beautiful code, the
omnipresent FlatFile handler of BioRuby. See 

  https://github.com/bioruby/bioruby/blob/master/sample/any2fasta.rb

e.g.

  ARGV.each do | fn |
    ff = Bio::FlatFile.auto(fn)
    ff.each_entry do |entry|
      if regex != nil
        next if eval("entry.seq !~ #{regex}")
      end
      print entry.seq.to_fasta(entry.definition,70)
    end
  end

which does a lot of work in a few lines, with remarkable flexibility,
including automatic data format detection and a regex search defined
at runtime.

Nowadays we would probably do it a little differently. The eval can be
taken out of the loop and the regex compiled once (in a Pythonesque way).
Also, both each_entry and the Sequence objects should be lazy (they are
not underneath) and iterate properly, to avoid loading everything into
RAM and parsing too much. But hey, it is still a great example of
what we can do with Ruby anyway!
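
To make those two changes concrete, here is a minimal sketch that uses
only plain Ruby, not BioRuby itself. It assumes the pattern arrives as a
string such as "/GATC/i", and the names Entry, compile_pattern,
each_entry_lazily, to_fasta and filter_fasta are hypothetical stand-ins
for illustration, not part of the BioRuby API. The regex is compiled once
with Regexp.new instead of eval, and entries are produced one at a time
through a lazy enumerator.

```ruby
# Hypothetical record type standing in for a parsed entry.
Entry = Struct.new(:definition, :seq)

# Compile a pattern such as "/GATC/i" (or plain "GATC") once, without eval.
def compile_pattern(text)
  m = text.match(%r{\A/(.*)/([imx]*)\z}m)
  return Regexp.new(text) unless m

  opts = 0
  opts |= Regexp::IGNORECASE if m[2].include?('i')
  opts |= Regexp::MULTILINE  if m[2].include?('m')
  opts |= Regexp::EXTENDED   if m[2].include?('x')
  Regexp.new(m[1], opts)
end

# Read FASTA records from an IO one at a time. Nothing is read until the
# caller asks for the next entry, so only one record is held in memory.
def each_entry_lazily(io)
  Enumerator.new do |out|
    definition = nil
    seq = String.new
    io.each_line do |line|
      line = line.chomp
      if line.start_with?('>')
        out << Entry.new(definition, seq) if definition
        definition = line[1..-1]
        seq = String.new
      else
        seq << line.strip
      end
    end
    out << Entry.new(definition, seq) if definition
  end.lazy
end

# Format one record as FASTA, wrapping the sequence at `width` columns.
def to_fasta(definition, seq, width = 70)
  ">#{definition}\n" + seq.scan(/.{1,#{width}}/).join("\n") + "\n"
end

# Lazily select the entries matching `pattern` (all entries if nil)
# and format them as FASTA strings.
def filter_fasta(io, pattern = nil)
  regex = pattern && compile_pattern(pattern)  # compiled once, outside the loop
  each_entry_lazily(io)
    .select { |entry| regex.nil? || regex.match?(entry.seq) }
    .map { |entry| to_fasta(entry.definition, entry.seq) }
end
```

Something like File.open(fn) { |f| filter_fasta(f, '/GATC/i').each { |s| print s } }
would then stream through a file without reading it all at once. Unlike
Bio::FlatFile.auto, this sketch only understands FASTA input.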

I wrote this simple example any2fasta.rb 6 years ago, but the FlatFile
and Sequence code is not mine. It is mostly by Toshiaki and Naohisa
going all the way back to 2002. So the beautiful code is really by
those two geniuses who are at the heart of the BioRuby project.

Pj.


