[BioRuby] rake

MISHIMA, Hiroyuki missy at be.to
Thu Dec 23 01:13:16 UTC 2010

Hi Yannick and all,

I will try to answer your questions....

Yannick Wurm wrote (2010/12/22 19:06):
> Hiro-san's comment about Pwrake for workflows convinced me to
> finally have a look at Rake.
> So I was trying to make a Rakefile that gives me a tree from the
> gblocks run on the codon alignment from the backtranslated protein
> alignment from the protein sequence from nucleotide sequences. But I
> think I quickly bumped into a killer limitation with rules: that
> they apparently only do one level of inference.
> And I quote from
> http://onestepback.org/articles/buildingwithrake/rulelimitations.html
>> rule ".c" => [".y"] do |t| yacc(t.source) end
>> rule ".o" => [".c"] do |t| compile_c(t.source) end
>> If lex.y exists
>> … Rake will not Build lex.c from lex.y and lex.o from lex.c.

> Is this limitation still true? Or is something else wrong with my
> code? Do you have a workaround?

rule ".c" => [".y"] do |t|
   puts "run yacc(#{t.source})"
   touch t.source.ext('c')
end

rule ".o" => [".c"] do |t|
   puts "run compile_c(#{t.source})"
   touch t.source.ext('o')
end

$ touch lex.y
$ rake lex.o
run yacc(lex.y)
touch lex.c
run compile_c(lex.c)
touch lex.o

Hmm, it seems to work fine. Am I misunderstanding the slide?
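
The same check can be run from a plain Ruby script in a scratch
directory, so that no lex.* files are left behind. This is only a
sketch: it assumes a reasonably recent Rake, the name build_chain is
made up for illustration, and FileUtils.touch stands in for the real
yacc and compiler calls.

```ruby
require "rake"
require "fileutils"
require "tmpdir"

# Defines the two chained rules, asks Rake for lex.o in a scratch
# directory that contains only lex.y, and returns the steps Rake ran.
def build_chain
  steps = []
  Rake::Task.clear # forget tasks and rules from earlier runs

  Rake::Task.create_rule(".c" => [".y"]) do |t|
    steps << "yacc #{t.source}"
    FileUtils.touch(t.name) # stand-in for the real yacc call
  end

  Rake::Task.create_rule(".o" => [".c"]) do |t|
    steps << "compile #{t.source}"
    FileUtils.touch(t.name) # stand-in for the real compiler call
  end

  Dir.mktmpdir do |dir|
    Dir.chdir(dir) do
      FileUtils.touch("lex.y")
      Rake::Task["lex.o"].invoke
    end
  end
  steps
end
```

If both levels of inference work, build_chain returns the yacc step
followed by the compile step.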

> And question two: let's say I want a generic rule that cleans up fasta
> sequences (so that if I require 'aasdflkjsalkfjasdlkj.fasta', the
> file aasdflkjsalkfjasdlkj should be run through Emboss' seqret). Is
> that possible? The following hasn't been working:
>> rule ".fasta" => "" do |task|
>>   sh "seqret -sequence #{task.prerequisites.join} -outseq #{task.name}"
>> end

Try dynamic definition using the "file" method.

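Rakefile (put all the files to be cleaned into the source directory):
# Skip outputs of earlier runs so they are not cleaned a second time.
SOURCES = FileList["source/*"].exclude("source/*.fasta")
SOURCES.each do |src|
   file "#{src}.fasta" => src do |t|
     sh ["seqret",
         "-sequence #{t.prerequisites.join(" ")}",
         "-outseq #{t.name}",
        ].join(" ")
   end
end
# The default task has to depend on the outputs, not on the sources.
task :default => SOURCES.map { |src| "#{src}.fasta" }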
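
For comparison, "rule" also accepts a proc that computes the source
name from the target name, which covers a source file with no
extension. The following is a sketch rather than a tested recipe for
seqret: STRIP_FASTA and build_with_proc_rule are names invented here,
and FileUtils.cp stands in for the seqret call so that the script runs
without Emboss installed.

```ruby
require "rake"
require "fileutils"
require "tmpdir"

# Maps a target such as "source/seq1.fasta" to its source "source/seq1".
STRIP_FASTA = ->(task_name) { task_name.sub(/\.fasta\z/, "") }

# Builds source/seq1.fasta from source/seq1 in a scratch directory and
# returns the [source, target] pairs the rule was run for.
def build_with_proc_rule
  built = []
  Rake::Task.clear # forget tasks and rules from earlier runs

  Rake::Task.create_rule(".fasta" => [STRIP_FASTA]) do |t|
    built << [t.source, t.name]
    # Stand-in for: sh "seqret -sequence #{t.source} -outseq #{t.name}"
    FileUtils.cp(t.source, t.name)
  end

  Dir.mktmpdir do |dir|
    Dir.chdir(dir) do
      FileUtils.mkdir("source")
      File.write("source/seq1", ">seq1\nacgt\n")
      Rake::Task["source/seq1.fasta"].invoke
    end
  end
  built
end
```

In a Rakefile the same thing would be written with the "rule" DSL
method instead of Rake::Task.create_rule.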

The "rule" method can get complicated because it basically assumes
that every file has a single extension (".tgz" rather than ".tar.gz").
Recently I prefer "file" to "rule". Because Rakefiles are themselves
Ruby code, dynamic definitions using the "file" method are more
flexible. However, I think a small hack that introduces some new
methods (Rake DSL syntax) could make the description simpler...
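
One way to picture such a method is a small helper that wraps the
dynamic "file" definitions. The helper fasta_tasks below is purely
hypothetical (it is not part of Rake), and in the demonstration
FileUtils.cp stands in for the seqret call, so take it as a sketch of
the idea only.

```ruby
require "rake"
require "fileutils"
require "tmpdir"

# Hypothetical helper: defines one file task per source, each producing
# "<source>.fasta", and returns the target names. The block receives
# (source_path, target_path) and does the actual conversion.
def fasta_tasks(sources, &action)
  sources.map do |src|
    target = "#{src}.fasta"
    Rake::FileTask.define_task(target => src) do |t|
      action.call(t.prerequisites.first, t.name)
    end
    target
  end
end

# Tries the helper in a scratch directory and returns the files that
# exist under source/ afterwards.
def demo_fasta_tasks
  Rake::Task.clear # forget tasks and rules from earlier runs
  Dir.mktmpdir do |dir|
    Dir.chdir(dir) do
      FileUtils.mkdir("source")
      File.write("source/seq1", ">seq1\nacgt\n")
      targets = fasta_tasks(Dir["source/*"]) do |src, out|
        # Stand-in for: sh "seqret -sequence #{src} -outseq #{out}"
        FileUtils.cp(src, out)
      end
      targets.each { |name| Rake::Task[name].invoke }
      Dir["source/*"].sort
    end
  end
end
```

In a Rakefile, the list returned by fasta_tasks could be used directly
as the prerequisites of the default task.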

MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences

