[BioRuby] Beautiful code for Bioinformatics

Francesco Strozzi francesco.strozzi at gmail.com
Mon Feb 13 10:06:03 UTC 2012


Some code I particularly liked comes from the BioNGS Wrapper library (by
Raoul). The gem is in development and I'm contributing to it, so I have
been able to look at the code up close!

https://gist.github.com/1815501

What is interesting is the use of an internal DSL, which makes the code
particularly expressive and readable: writing a wrapper becomes only a
matter of defining the options that the command-line binary expects. The
library is quite large and also includes methods to define aliases for the
options.
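
To give a flavour of the idea without opening the gist, here is a minimal
sketch of such an option-declaring DSL. It is not the BioNGS code: the
names CommandWrapper, program, option and command_line, and the binary
name 'example-aligner', are all made up for illustration.

```ruby
# Hypothetical mini-DSL for wrapping a command-line binary.
# None of these names come from BioNGS; they are illustrative only.
module CommandWrapper
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declare (or read back) the binary this class wraps.
    def program(name = nil)
      @program = name if name
      @program
    end

    # Declare an option the binary accepts, with optional aliases.
    def option(name, type: :string, aliases: [])
      options[name] = { type: type, aliases: aliases }
    end

    def options
      @options ||= {}
    end
  end

  # Build the argument vector from a hash of option values.
  def command_line(values = {})
    args = [self.class.program]
    values.each do |key, value|
      name, spec = self.class.options.find do |opt, s|
        opt == key || s[:aliases].include?(key)
      end
      raise ArgumentError, "unknown option: #{key}" unless name

      flag = "--#{name.to_s.tr('_', '-')}"
      if spec[:type] == :boolean
        args << flag if value          # boolean flags take no value
      else
        args << flag << value.to_s
      end
    end
    args
  end
end

class ExampleAligner
  include CommandWrapper

  program 'example-aligner'            # hypothetical binary name
  option :threads, type: :numeric, aliases: [:t]
  option :quiet,   type: :boolean
end

p ExampleAligner.new.command_line(t: 4, quiet: true)
```

With a layout like this, the wrapper class itself contains nothing but
declarations, which is what makes this style so readable.
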
This module also makes use of the powerful Thor gem (used in Rails) to
create and define tasks that can be used to launch the binaries and to
include them in larger workflows and pipelines.
Now let's say you want to turn your freshly wrapped binary into a
mighty Thor task. All you need to do is:

https://gist.github.com/1815523

The amazing part is that the parameters passed to the block work as the
argument definitions for the Thor task itself. The code that makes this
possible behind the scenes is here:

https://gist.github.com/1815481

It also checks that the number of arguments passed matches the number of
block parameters. If not, it raises an error telling the user that he/she
is calling the task with the wrong number of arguments. To understand every
part you need to know a bit about the way the Thor library defines tasks
(more here: https://github.com/wycats/thor), but the code here in BioNGS is
definitely worth a look!
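
The general mechanism can be sketched in plain Ruby, without Thor. This is
only an illustration of how block parameters can double as a task's
argument list, using Proc#parameters; it is not the BioNGS implementation,
and TaskRegistry and define_task are hypothetical names.

```ruby
# Illustrative only: TaskRegistry and define_task are hypothetical names,
# not part of Thor or BioNGS.
module TaskRegistry
  @tasks = {}

  class << self
    attr_reader :tasks

    def define_task(name, &block)
      # Parameter names of the block, e.g. [:reads, :reference]
      arg_names = block.parameters.map { |_kind, param| param }

      @tasks[name] = lambda do |*args|
        # Reject calls whose argument count differs from the block's.
        unless args.size == arg_names.size
          raise ArgumentError,
                "#{name} expects #{arg_names.size} argument(s) " \
                "(#{arg_names.join(', ')}), got #{args.size}"
        end
        block.call(*args)
      end
    end
  end
end

TaskRegistry.define_task(:align) do |reads, reference|
  "aligning #{reads} against #{reference}"
end

puts TaskRegistry.tasks[:align].call('reads.fq', 'ref.fa')

begin
  TaskRegistry.tasks[:align].call('reads.fq')   # one argument too few
rescue ArgumentError => e
  puts e.message
end
```

The explicit check matters because a plain proc would silently pad missing
arguments with nil instead of complaining.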

P.S. I'm diving into Scala too :-). I got the Odersky book a few months ago
and now I'm about to start looking in detail at this new programming
language. Easy and powerful parallelism is the new goal we need to achieve
to keep up with the big data era.

Cheers

On Mon, Feb 13, 2012 at 08:54, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> OK, here is another candidate for the prize of beautiful code:
>
>
> https://github.com/trevor/bioruby-restriction_enzyme/blob/master/lib/bio/util/restriction_enzyme/range/sequence_range/calculated_cuts.rb
>
> Trevor has implemented some hairy logic in the RE code. I mean
> hairy in the sense that if it were done by someone else it would become
> spaghetti code (plenty of examples there in the real world!). You can see
> that even when choosing sensible names, and explaining the code with good
> comments, it may still be hard to understand! But I think
>
>  def add_cuts_from_cut_ranges(cut_ranges)
>
> pretty much sums it up :). Still, it is beautiful, because it is hard
> to think of doing it better. The Ruby code is short and
> self-explanatory, and the RE library has almost become a DSL for cutting
> sequences using restriction enzymes. That is beautiful.
>
> Pj.
>
> On Sat, Feb 11, 2012 at 11:27:44PM +0300, George Githinji wrote:
> > Hi All
> > Beauty is in the eyes of the beholder!
> > The Bio-Alignment plugin can read and interconvert a nucleotide
> > alignment to an amino acid alignment.  I liked the simplicity of how
> > PJ has implemented the codon to amino acid conversion helper method
> > while taking care of the gaps or undefined aa translations.
> >
> >       # lazily convert to Amino acid (once only)
> >       def to_aa
> >         aa = translate
> >         if not aa
> >           if gap?
> >             return '-'
> >           elsif undefined?
> >             return 'X'
> >           else
> >             raise 'What?'
> >           end
> >         end
> >         aa
> >       end
> >
> > This method does not have any ruby 'magic' and is self documenting.
> > The gap? and undefined? methods are implemented as simple one line
> > standalone methods.
> >
> > Again I like this simple 'trick' of getting an array of codons from a
> > sequence in the codonsequence class.
> >
> > seq.scan(/\S\S\S/) #gets an array of codons
> >
> > The longer alternative would be to create a Bio::Sequence::NA object
> > and iterate:
> > seq = Bio::Sequence::NA.new("blahahahha")
> > seq.window_search(3, 3) do |subseq|
> >   puts subseq
> > end
> >
> > It seems more intuitive to represent a sequence as an array of codon
> > objects. In this way the codons have some state and can carry
> > 'luggage'. Getting the string representation of the sequence is as
> > simple as
> > def to_s
> >  @seq.map { |codon| codon.to_s }.join(' ')
> > end
> >
> > To be more DRY, the to_nt method in the same class could be aliased
> > from the to_s method.
> >
> > It seems the bio-plugins are a rich source of tricks and great
> > learning!
> >
> >
> > On Sat, Feb 11, 2012 at 10:08 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> > > On Sat, Feb 11, 2012 at 5:46 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> > >> Correct me if I am wrong, but has everyone moved across to BioPython
> > >> now? Or even to the dark side?
> > >>
> > >> Pj.
> > >
> > > I haven't noticed any BioRuby developers posting on the Biopython
> > > mailing lists recently - but you'd be welcome ;)
> > >
> > > On a related topic, my first BioRuby pull request was merged, so
> > > there is a little direct cross project contribution going on :)
> > >
> > >
> > > https://github.com/bioruby/bioruby/commit/f33abf9bbd90c3c1e320f06447fdb54ffd094c5d
> > >
> > > Peter
> > > _______________________________________________
> > > BioRuby Project - http://www.bioruby.org/
> > > BioRuby mailing list
> > > BioRuby at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioruby
> >
> >
> >
> > --
> > ---------------
> > Sincerely
> > George
> > Skype: george_g2
> > Blog: http://biorelated.wordpress.com/
> > Twitter: http://twitter.com/#!/george_l
> >



-- 

Francesco


