[BioRuby] Proposal: Bio::FastaFormat#each_entry

MISHIMA, Hiroyuki missy at be.to
Fri Jan 29 06:46:15 UTC 2010


Hi all,

How about implementing the following methods?

	Bio::FastaFormat#each_entry
	Bio::FastaNumericFormat#each_entry

The following sample code generates a FASTQ string from a FASTA 
string and a FASTA.QUAL string. It may need Ruby 1.8.7 or later, 
since it relies on Enumerator#next.

I am afraid that a simpler or easier way may already exist in BioRuby...

Hiro.

-----
#!/usr/local/bin/ruby
require 'rubygems'
require 'bio'

module Bio
   class FastaFormat
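     # Yields each entry of a multi-entry FASTA string in turn.
     # Without a block, returns an enumerator over the entries.
     def each_entry
       return to_enum(:each_entry) unless block_given?
       @continue = self.dup
       loop do
         yield @continue
         # entry_overrun holds the rest of the input after the first
         # entry, or nil if nothing is left.
         overrun = @continue.entry_overrun
         break unless overrun
         @continue = Bio::FastaFormat.new(overrun)
       end
     end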
   end

   class FastaNumericFormat
     def each_entry
       return to_enum(:each_entry) unless block_given?
       @continue = self.dup
       loop do
         yield @continue
         overrun = @continue.entry_overrun
         break unless overrun
         @continue = Bio::FastaNumericFormat.new(overrun)
       end
     end
   end
end

fasta = <<EOS
>FXQB1I00000001
TATGGAATCTGTAGAATCAGTGGTAGGTGCAGCAGATGGAGGAAGG
>FXQB1I00000002
CTGGAGAATTCTGGATCCTCGACTTATGACTTGGTGGTTCTGGTAACTGTGAGCTTAGGATAGTCAG
EOS

qual = <<EOS
>FXQB1I00000001
30 30 29 42 25 24 5 30 30 30 30 30 28 30 26 9 30 30 30 30 30 42 25 30 30 
42 25 29 22 30 29 26 30 30 30 29 30 42 25 30 32 17 40 23 39 24
>FXQB1I00000002
30 30 33 19 28 30 26 9 32 12 30 30 33 20 30 30 32 15 27 27 30 28 28 34 
22 27 22 28 28 29 26 9 33 19 22 43 25 33 19 28 27 32 15 30 32 12 28 30 
27 30 30 26 27 30 40 23 30 40 23 30 29 29 30 30 30 29 30
EOS

enum_fasta = Bio::FastaFormat.new(fasta).each_entry
enum_qual = Bio::FastaNumericFormat.new(qual).each_entry

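# Kernel#loop rescues StopIteration, so this loop ends as soon as
# either enumerator runs out of entries.
loop do
   fastq = Bio::Sequence.adapter(enum_fasta.next,
                                 Bio::Sequence::Adapter::Fastq)
   fastq.quality_score_type = :phred
   fastq.quality_scores = enum_qual.next.data
   puts fastq.output(:fastq)
end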
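
For comparison, here is a rough sketch of the same pairing idea in plain
Ruby, without BioRuby. The module FastaQualSketch and its method names are
hypothetical (they are not BioRuby API). It assumes Sanger-style quality
encoding (Phred score + 33 as an ASCII character) and that both inputs
list their records in the same order.

```ruby
# Hypothetical sketch, not BioRuby API: pair FASTA and FASTA.QUAL records
# through two external enumerators and emit FASTQ text.
module FastaQualSketch
  # Yields [id, body_lines] for each ">"-delimited record in +text+.
  # Without a block, returns an enumerator (same pattern as each_entry).
  def self.each_record(text)
    return to_enum(:each_record, text) unless block_given?
    text.split(/^>/).each do |chunk|
      header, *body = chunk.lines.map(&:strip)
      next if header.nil? || header.empty?
      # Use the first word of the definition line as the record id.
      yield header.split.first, body.reject(&:empty?)
    end
  end

  # Formats one FASTQ record. Assumes Phred scores encoded with offset 33.
  def self.record_to_fastq(id, seq, scores)
    unless seq.length == scores.length
      raise ArgumentError, "length mismatch for #{id}"
    end
    quality = scores.map { |q| (q + 33).chr }.join
    "@#{id}\n#{seq}\n+\n#{quality}\n"
  end

  # Walks both inputs in step and returns the combined FASTQ string.
  def self.to_fastq(fasta, qual)
    seqs  = each_record(fasta)
    quals = each_record(qual)
    out = []
    # Kernel#loop stops quietly when either enumerator raises StopIteration.
    loop do
      id, seq_lines = seqs.next
      _qual_id, qual_lines = quals.next
      scores = qual_lines.join(" ").split.map { |s| Integer(s) }
      out << record_to_fastq(id, seq_lines.join, scores)
    end
    out.join
  end
end
```

With the fasta and qual strings defined above, this would be called as
puts FastaQualSketch.to_fastq(fasta, qual). Note that the sketch does not
check that the record IDs of the two inputs match; it only checks that the
sequence and the score list have the same length.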

-- 
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences


