[BioRuby] Fastq.to_s

Tomoaki NISHIYAMA tomoakin at kenroku.kanazawa-u.ac.jp
Sat Aug 13 10:05:09 UTC 2011


Hi,

For flatfiles, I think it would be nice if we could output each entry's original
text as it was split from the file.
For example,

#!/usr/bin/env ruby

require 'bio'

ff1 = Bio::FlatFile.open(nil, ARGV[0])
ff2 = Bio::FlatFile.open(nil, ARGV[1])

ff1.each_entry do |fe1|
  fe2 = ff2.next_entry
  puts fe1
  puts fe2
end

should be able to merge read1 and read2, stored in separate files, into a single file.
This works with the FASTA format but not with the FASTQ format right now, because
Bio::Fastq does not have a to_s method.  Since Fastq does not actually keep the
original text, reconstructing it as in the following patch is perhaps a good way
(it avoids doubling memory use just for the to_s function). Or do we need to fold
the sequence to some (original or fixed) length?

diff --git a/lib/bio/db/fastq.rb b/lib/bio/db/fastq.rb
index f913e6d..5ff1a15 100644
--- a/lib/bio/db/fastq.rb
+++ b/lib/bio/db/fastq.rb
@@ -407,6 +407,10 @@ class Fastq
   # raw sequence data as a String object
   attr_reader :sequence_string
 
+  def to_s
+    "@#{@definition}\n#{@sequence_string}\n+#{@definition2}\n#{@quality_string}\n"
+  end
+
   # returns Bio::Sequence::NA
   def naseq
     unless defined? @naseq then
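
To make the folding question concrete, here is a rough standalone sketch that
does not use BioRuby at all. FastqRecord, its fold helper, and the optional
width argument are hypothetical names made up for illustration; only the
string layout follows the patch above. With no width it gives the same
four-line text as the patch, and with a width it folds both the sequence and
the quality lines.

```ruby
# Hypothetical stand-in for a parsed FASTQ entry (NOT the Bio::Fastq class).
# The field names mirror the instance variables used in the patch.
FastqRecord = Struct.new(:definition, :sequence_string,
                         :definition2, :quality_string) do
  # Rebuild the FASTQ text from the stored fields.
  # width = nil keeps sequence and quality on one line each (as in the patch);
  # a positive Integer folds both to that many characters per line.
  def to_s(width = nil)
    seq  = fold(sequence_string, width)
    qual = fold(quality_string, width)
    "@#{definition}\n#{seq}\n+#{definition2}\n#{qual}\n"
  end

  private

  # Split str into chunks of at most width characters, joined by newlines.
  def fold(str, width)
    return str if width.nil? || width <= 0
    str.scan(/.{1,#{width}}/).join("\n")
  end
end

rec = FastqRecord.new("read1/1", "ACGTACGTAC", nil, "IIIIIHHHGG")
puts rec.to_s      # unfolded, four lines
puts rec.to_s(4)   # folded to 4 characters per line
```

One point against folding: a folded quality line can start with "@", which
makes the output harder to parse again, so the unfolded form in the patch
seems the safer default.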

Best regards,
-- 
Tomoaki NISHIYAMA

Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi, 
Kanazawa, 920-0934, Japan





