[BioRuby] Ruby speed

Yannick Wurm yannick.wurm at unil.ch
Tue Nov 3 22:49:12 UTC 2009


Hi Mike,

thanks for your response. I'm running:
ruby 1.8.6 (2008-03-03 patchlevel 114) [x86_64-linux]
Starting to age, but on a production machine I'd rather stay with what  
works than risk breaking things by upgrading them.

the command sed 's/^>/>MyPrefix/' is indeed 30% faster than perl :)

My reasons for preferring ruby are the same as yours. But a 5 to 10x
speed difference is expensive (I'm calling the one-liner below about
10,000 times from a larger ruby script - YES, it's ugly, but
refactoring the script to avoid calling that type of one-liner would be
a pain since I use 10,000 different prefixes).

I have the feeling that it's mostly ruby's startup time. Running
the ruby one-liner on a fasta of 40,000 sequences takes 20 seconds;
running it on a fasta of only 10 lines still takes 13 seconds!!
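
One way to check the startup-time hunch is to time an interpreter launch
that does no work at all. The quoted timings further down show a wall-clock
time (real 20.379s) far above the CPU time (user 0.741s + sys 0.168s), which
would be consistent with time spent waiting rather than computing. The
sketch below is only an illustration: it assumes a Ruby recent enough to
provide RbConfig.ruby (not the 1.8.6 mentioned above), and the variable
names are made up.

```ruby
require 'benchmark'
require 'rbconfig'

# Path of the interpreter that is running this script (assumption: we want
# to measure the same ruby binary that the one-liner would use).
ruby = RbConfig.ruby

# Launch a child interpreter with an empty program, so that only startup
# and shutdown are measured, and record the wall-clock time.
startup = Benchmark.realtime { system(ruby, '-e', '') }

printf("interpreter startup: %.3f s\n", startup)
```

If this number is close to the 13 seconds seen on the 10-line file, the cost
is in launching ruby on that machine rather than in the gsub itself.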

I found some generic benchmarks indicating that ruby is generally only  
a bit slower than perl
http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=ruby&lang2=perl

So maybe I can keep using ruby - just avoiding one-liners!
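
A minimal sketch of what avoiding the one-liner could look like: the
prefixing is done inside the calling script, so no extra interpreter is
launched per prefix. The method name prefix_fasta_ids and the sample
sequences are made up for illustration; they are not from the actual script.

```ruby
require 'stringio'

# Hypothetical helper (not from the original script): copies FASTA text from
# `input` to `output`, inserting `prefix` right after the '>' that starts
# each header line. Sequence lines are copied unchanged.
def prefix_fasta_ids(input, output, prefix)
  input.each_line do |line|
    # sub rather than gsub: an anchored '>' can match at most once per line.
    output << (line.start_with?('>') ? line.sub(/\A>/, ">#{prefix}") : line)
  end
  output
end

# Usage with in-memory data; real code would pass File objects instead.
fasta  = ">seq1\nACGT\n>seq2\nGGCC\n"
result = prefix_fasta_ids(StringIO.new(fasta), StringIO.new, 'MyPrefix').string
puts result
```

In the larger script this would presumably be called once per prefix, with
File objects opened for reading and writing in place of the StringIO
objects.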

Best,
yannick

On 3 Nov 2009, at 22:26, Michael Barton wrote:

> What version of Ruby are you using?
> Ruby is an expressive language rather than a "fast" language.
> I use Ruby because it's easier to read and maintain my programs, rather
> than because of how fast it is.
>
> If you are interested purely in speed you could write in C?
> What are the benchmarks for something like this?
>
> time sed 's/^>/>MyPrefix/' clustering/dirsForAssembly/singlets.fasta > abc
>
> Mike
>
> 2009/11/3 Yannick Wurm <yannick.wurm at unil.ch>:
>> Hi,
>>
>> this is a more general ruby question, but since my application is
>> bioinformatics, I'm posting it here.
>>
>> Just wanted to prepend a few characters in front of FASTA  
>> identifiers.
>>
>>
>> $ time cat clustering/dirsForAssembly/singlets.fasta | ruby -pe "gsub(/^>/, '>MyPrefix')" > abc
>>        real    0m20.379s
>>        user    0m0.741s
>>        sys     0m0.168s
>>
>>
>> While the perl equivalent is one heck of a lot faster!!!
>>
>>
>> $ time cat clustering/dirsForAssembly/singlets.fasta | perl -p -i -e 's/^>/>MyPrefix/g' > ab
>>        real    0m2.165s
>>        user    0m0.266s
>>        sys     0m0.146s
>>
>>
>> Is there any hope for ruby?
>>
>> Thanks,
>> yannick
>>
>>
>> --------------------------------------------
>>          yannick . wurm @ unil . ch
>> Ant Genomics, Ecology & Evolution @ Lausanne
>>   http://www.unil.ch/dee/page28685_fr.html
>>
>>
>> _______________________________________________
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
>>



