[BioRuby] Parsing FASTA result

Fredrik Johansson fredjoha at bioreg.kyushu-u.ac.jp
Mon May 21 08:16:31 UTC 2007


I have run into a problem again when running FASTA. One sequence
produced a huge number of homologs (25 MB of FASTA output), and the
Bio::Fasta::Report class fails with this error during initialization:

format10.rb:21:in `sub!': failed to allocate memory (NoMemoryError)

So I made the following change to my code. It is just a quick fix, and
I am not sure about the 'else' case that I removed; it does not seem to
be covered by the line that I added. I also did not bother with the
@list variable, since it does not seem to be used anywhere.

/Fredrik

The patch:

--- fasta/format10.rb   2007-05-21 16:50:38.000000000 +0900
+++ fasta/format10.new.rb       2007-05-21 16:52:55.000000000 +0900
@@ -17,13 +17,7 @@
 
   def initialize(data)
     # header lines - brief list of the hits
-    if data.sub!(/.*\nThe best scores are/m, '')
-      data.sub!(/(.*)\n\n>>>/m, '')
-      @list = "The best scores are" + $1
-    else 
-      data.sub!(/.*\n!!\s+/m, '')
-      data.sub!(/.*/) { |x| @list = x; '' }
-    end
+    data = data[data.index("\n\n>>>")+5..data.size]
 
     # body lines - fasta execution result
     program, *hits = data.split(/\n>>/)
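
For illustration, the index-based slice from the patch can be written
with a guard for input that has no "\n\n>>>" marker, where String#index
returns nil. This is only a sketch: strip_fasta_header is a hypothetical
helper name, not part of BioRuby, and returning the input unchanged when
the marker is missing is an assumption. It does not reproduce the removed
'else' branch.

```ruby
# Hypothetical helper (not part of BioRuby): drop everything up to and
# including the first "\n\n>>>" marker without running a regex over the
# whole string.
def strip_fasta_header(data, marker = "\n\n>>>")
  pos = data.index(marker)
  # String#index returns nil when the marker is absent. Assumption: hand
  # the data back untouched instead of failing on nil + Integer.
  return data if pos.nil?
  data[(pos + marker.length)..-1]
end

# Tiny made-up input, only shaped like the part of the report that matters here.
sample = "header\nThe best scores are\nhit1\nhit2\n\n>>>query, 10 aa\n>>hit1\nbody"

body = strip_fasta_header(sample)
# Same split as in Report#initialize above.
program, *hits = body.split(/\n>>/)
```

The marker length is used instead of a literal 5 so that the offset stays
correct if the marker string is ever changed.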




