[BioLib-dev] EMBOSS mapping in Biolib

Pjotr Prins pjotr.public14 at thebird.nl
Thu Nov 26 12:42:22 UTC 2009


Even better: the EMBOSS version of translating a C.elegans genome into
six reading frames is about 30 times faster than the Bioruby one:

Bioruby version:

#  22929 records 137574 times translated!
#   real    9m30.952s
#   user    8m42.877s
#   sys     0m32.878s

Biolib version:

#  22929 records 137574 times translated!
#   real    0m20.306s
#   user    0m15.997s
#   sys     0m1.344s
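
For reference, the ratio can be worked out directly from the two `time` outputs above. This is only a small sketch: the helper name `to_seconds` is made up for illustration, and the figures are copied from the listings above.

```ruby
# Convert a `time`-style string such as "9m30.952s" into seconds.
# (to_seconds is a hypothetical helper, not part of Biolib or BioRuby.)
def to_seconds(t)
  m = t.match(/\A(\d+)m([\d.]+)s\z/)
  raise ArgumentError, "unexpected time format: #{t}" unless m
  m[1].to_i * 60 + m[2].to_f
end

# Timings copied from the two runs above.
bioruby = { real: "9m30.952s", user: "8m42.877s" }
biolib  = { real: "0m20.306s", user: "0m15.997s" }

real_speedup = to_seconds(bioruby[:real]) / to_seconds(biolib[:real])
user_speedup = to_seconds(bioruby[:user]) / to_seconds(biolib[:user])

puts format("real: %.1fx  user: %.1fx", real_speedup, user_speedup)
```

Wall-clock and user time give slightly different ratios, which is why "about 30 times" is a fair summary of both.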

This includes IO, which is handled by Ruby. The code is:

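  # fn (FASTA file name) and iter (number of repeats) are set elsewhere
  nt = FastaReader.new(fn)
  # EMBOSS translation table 1 (the standard genetic code)
  trnTable = Biolib::Emboss.ajTrnNewI(1)
  nt.each { | rec |
    # iter.times runs exactly iter repeats, matching the count printed below
    iter.times do
      ajpseq   = Biolib::Emboss.ajSeqNewNameC(rec.seq,"Test sequence")
      # three forward and three reverse reading frames
      [-3,-2,-1,1,2,3].each do | frame |
        ajpseqt  = Biolib::Emboss.ajTrnSeqOrig(trnTable,ajpseq,frame)
        aa       = Biolib::Emboss.ajSeqGetSeqCopyC(ajpseqt)
        print "> ",rec.id," ",frame.to_s,"\n"
        print aa,"\n"
      end
    end
  }
  $stderr.print nt.size," records ",nt.size*6*iter," times translated!\n"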
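
To make clear what each pass of the inner loop computes, here is a minimal pure-Ruby sketch of six-frame translation. It is not the EMBOSS or BioRuby implementation. The names `CODON_TABLE`, `reverse_complement` and `translate_frame` are invented for illustration, and the handling of negative frames (read the reverse complement starting at offset |frame| - 1) is an assumption that may not match `ajTrnSeqOrig` exactly.

```ruby
# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = %w[T C A G].freeze
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {}
BASES.product(BASES, BASES).each_with_index do |codon, i|
  CODON_TABLE[codon.join] = AMINO[i]
end

# Reverse complement of a DNA string (A<->T, C<->G).
def reverse_complement(seq)
  seq.upcase.reverse.tr("ACGT", "TGCA")
end

# Translate one reading frame (1, 2, 3 forward; -1, -2, -3 reverse).
# Assumed convention: negative frames read the reverse complement
# from offset |frame| - 1. Unknown codons become "X"; a trailing
# partial codon is dropped.
def translate_frame(seq, frame)
  strand = frame > 0 ? seq.upcase : reverse_complement(seq)
  rest   = strand[(frame.abs - 1)..-1].to_s
  rest.scan(/.{3}/).map { |codon| CODON_TABLE.fetch(codon, "X") }.join
end

# Six translations per record, as in the loop above.
[-3, -2, -1, 1, 2, 3].each do |frame|
  puts "#{frame}: #{translate_frame('ATGGCCTAA', frame)}"
end
```

Each record yields six protein strings, which is where the factor of 6 in the translation count comes from.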

It just goes to show: for big data, C rules. With Biolib there are no
concessions :-)

Pj.


