From ngoto at dev.open-bio.org Wed May 7 02:17:55 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 06:17:55 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, 1.1.2.1, 1.1.2.2 Message-ID: <200805070617.m476Htsk000805@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv773/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 format_genbank.rb Log Message: Bug fix: string in given object may be broken Index: format_genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/Attic/format_genbank.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** format_genbank.rb 4 Mar 2008 11:19:16 -0000 1.1.2.1 --- format_genbank.rb 7 May 2008 06:17:52 -0000 1.1.2.2 *************** *** 64,72 **** pos = " (bases #{pos})" end ! journal = ref.journal.to_s ! volissue = ref.volume.to_s volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? ! journal += " #{volissue}," unless volissue.empty? ! journal += " #{ref.pages}" unless ref.pages.to_s.empty? journal += " (#{ref.year})" unless ref.year.to_s.empty? --- 64,72 ---- pos = " (bases #{pos})" end ! volissue = "#{ref.volume.to_s}" volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? ! journal = "#{ref.journal.to_s}" ! journal += " #{volissue}" unless volissue.empty? ! journal += ", #{ref.pages}" unless ref.pages.to_s.empty? journal += " (#{ref.year})" unless ref.year.to_s.empty? 
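Note on the change above: a minimal sketch of the journal-line assembly as it reads in revision 1.1.2.2, runnable outside BioRuby. RefStub and format_journal are hypothetical names used only for illustration; RefStub stands in for Bio::Reference and carries just the fields the hunk touches. The last two lines show one plausible reading of the log message: String#to_s returns the receiver itself, whereas string interpolation always builds a new String, so later in-place edits cannot reach the caller's object.

```ruby
# Hypothetical stand-in for Bio::Reference; only the fields used below.
RefStub = Struct.new(:journal, :volume, :issue, :pages, :year)

# Mirrors the journal-line assembly of revision 1.1.2.2 in the diff above.
# format_journal is an illustrative name, not a BioRuby method.
def format_journal(ref)
  volissue = "#{ref.volume}"            # interpolation yields a fresh String
  volissue += " (#{ref.issue})" unless ref.issue.to_s.empty?
  journal = "#{ref.journal}"
  journal += " #{volissue}" unless volissue.empty?
  journal += ", #{ref.pages}" unless ref.pages.to_s.empty?
  journal += " (#{ref.year})" unless ref.year.to_s.empty?
  journal
end

ref = RefStub.new('Nature', '1', '2', '3-4', '2008')
puts format_journal(ref)

# Why the copy matters: to_s on a String hands back the same object.
name = 'Nature'
same_object = name.to_s.equal?(name)    # true
fresh_copy  = !"#{name}".equal?(name)   # true
```

Compared with revision 1.1.2.1, the comma now precedes the pages instead of trailing the volume, so a reference without pages no longer ends in a dangling comma before the year.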
From ngoto at dev.open-bio.org Wed May 7 08:22:13 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:22:13 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.4,1.12.2.5 Message-ID: <200805071222.m47CMDaf007700@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7680/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: In Bio::EMBL#references method, authors' names are changed to be normalized as of GenBank-like style. For example, "van der Waals J.D." in EMBL is normalized into "van der Waals, J.D.". Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v retrieving revision 1.12.2.4 retrieving revision 1.12.2.5 diff -C2 -d -r1.12.2.4 -r1.12.2.5 *** common.rb 23 Apr 2008 18:52:18 -0000 1.12.2.4 --- common.rb 7 May 2008 12:22:10 -0000 1.12.2.5 *************** *** 272,276 **** unless @data['references'] ary = self.ref.map {|ent| ! hash = Hash.new('') ent.each {|key, value| case key --- 272,276 ---- unless @data['references'] ary = self.ref.map {|ent| ! hash = Hash.new ent.each {|key, value| case key *************** *** 287,291 **** hash['sequence_position'] = value when 'RA' ! hash['authors'] = value.split(/\, /) when 'RT' hash['title'] = value --- 287,295 ---- hash['sequence_position'] = value when 'RA' ! a = value.split(/\, /) ! a.each do |x| ! x.sub!(/( [^ ]+)\z/, ",\\1") ! end ! 
hash['authors'] = a when 'RT' hash['title'] = value From ngoto at dev.open-bio.org Wed May 7 08:24:28 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:24:28 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.3, 1.1.2.4 Message-ID: <200805071224.m47COSwB007749@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7729/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: Bug fix: in RA line, every author's name should not be splitted into two lines. Index: format_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v retrieving revision 1.1.2.3 retrieving revision 1.1.2.4 diff -C2 -d -r1.1.2.3 -r1.1.2.4 *** format_embl.rb 23 Apr 2008 18:52:18 -0000 1.1.2.3 --- format_embl.rb 7 May 2008 12:24:26 -0000 1.1.2.4 *************** *** 21,28 **** --- 21,52 ---- private + # wrapping with EMBL style def embl_wrap(prefix, str) wrap(str.to_s, 80, prefix) end + # Given words (an Array of String) are wrapping with EMBL style. + # Each word is never splitted inside the word. + def embl_wrap_words(prefix, array) + width = 80 + result = [] + str = nil + array.each do |x| + if str then + if str.length + 1 + x.length > width then + str = nil + else + str.concat ' ' + str.concat x + end + end + unless str then + str = prefix + x + result.push str + end + end + result.join("\n") + end + # format reference # ref:: Bio::Reference object *************** *** 53,58 **** lines << embl_wrap("RX ", "PUBMED; #{ref.pubmed}.") end ! unless ref.authors.empty? ! lines << embl_wrap('RA ', ref.authors.join(', ') + ';') end lines << embl_wrap('RT ', --- 77,90 ---- lines << embl_wrap("RX ", "PUBMED; #{ref.pubmed}.") end ! unless ref.authors.empty? then ! auth = ref.authors.collect do |x| ! y = x.to_s.strip.split(/\, *([^\,]+)\z/) ! y[1].gsub!(/\. 
+/, '.') if y[1] ! y.join(' ') ! end ! lastauth = auth.pop ! auth.each { |x| x.concat ',' } ! auth.push(lastauth.to_s + ';') ! lines << embl_wrap_words('RA ', auth) end lines << embl_wrap('RT ', From ngoto at dev.open-bio.org Wed May 7 08:25:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:25:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank common.rb, 1.11.2.3, 1.11.2.4 Message-ID: <200805071225.m47CPivB007816@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7778/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: added support for 'REMARK' (comment in reference) Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/common.rb,v retrieving revision 1.11.2.3 retrieving revision 1.11.2.4 diff -C2 -d -r1.11.2.3 -r1.11.2.4 *** common.rb 4 Mar 2008 10:32:55 -0000 1.11.2.3 --- common.rb 7 May 2008 12:25:42 -0000 1.11.2.4 *************** *** 138,142 **** ary = [] toptag2array(get('REFERENCE')).each do |ref| ! hash = Hash.new('') subtag2array(ref).each do |field| case tag_get(field) --- 138,142 ---- ary = [] toptag2array(get('REFERENCE')).each do |ref| ! 
hash = Hash.new subtag2array(ref).each do |field| case tag_get(field) *************** *** 175,178 **** --- 175,181 ---- when /PUBMED/ hash['pubmed'] = truncate(tag_cut(field)) + when /REMARK/ + hash['comments'] ||= [] + hash['comments'].push truncate(tag_cut(field)) end end From ngoto at dev.open-bio.org Wed May 7 08:28:58 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:28:58 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, 1.1.2.2, 1.1.2.3 Message-ID: <200805071228.m47CSw3A007865@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7845/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 format_genbank.rb Log Message: * added support for 'REMARK' (comment in reference). * Bug Fix: an author's name should not be separated into two lines. Index: format_genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/Attic/format_genbank.rb,v retrieving revision 1.1.2.2 retrieving revision 1.1.2.3 diff -C2 -d -r1.1.2.2 -r1.1.2.3 *** format_genbank.rb 7 May 2008 06:17:52 -0000 1.1.2.2 --- format_genbank.rb 7 May 2008 12:28:56 -0000 1.1.2.3 *************** *** 33,36 **** --- 33,104 ---- end + # Given words (an Array of String) are wrapping with EMBL style. + # Each word is never splitted inside the word. + def genbank_wrap_words(array) + width = 67 + result = [] + str = nil + array.each do |x| + if str then + if str.length + 1 + x.length > width then + str = nil + else + str.concat ' ' + str.concat x + end + end + unless str then + str = "#{x}" + result.push str + end + end + result.join("\n" + " " * 12) + end + + # formats references + def reference_format_genbank(ref, num) + pos = ref.sequence_position.to_s.gsub(/\s/, '') + pos.gsub!(/(\d+)\-(\d+)/, "\\1 to \\2") + pos.gsub!(/\s*\,\s*/, '; ') + if pos.empty? 
+ pos = '' + else + pos = " (bases #{pos})" + end + volissue = "#{ref.volume.to_s}" + volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? + journal = "#{ref.journal.to_s}" + journal += " #{volissue}" unless volissue.empty? + journal += ", #{ref.pages}" unless ref.pages.to_s.empty? + journal += " (#{ref.year})" unless ref.year.to_s.empty? + + alist = ref.authors.collect do |x| + y = x.to_s.strip.split(/\, *([^\,]+)\z/) + y[1].gsub!(/\. +/, '.') if y[1] + y.join(',') + end + lastauthor = alist.pop + last2author = alist.pop + alist.each { |x| x.concat ',' } + alist.push last2author if last2author + alist.push "and" unless alist.empty? + alist.push lastauthor.to_s + result = <<__END_OF_REFERENCE__ + REFERENCE #{ genbank_wrap(sprintf('%-2d%s', num, pos))} + AUTHORS #{ genbank_wrap_words(alist) } + TITLE #{ genbank_wrap(ref.title.to_s) } + JOURNAL #{ genbank_wrap(journal) } + __END_OF_REFERENCE__ + unless ref.pubmed.to_s.empty? then + result.concat " PUBMED #{ genbank_wrap(ref.pubmed) }\n" + end + if ref.comments and !(ref.comments.empty?) then + ref.comments.each do |c| + result.concat " REMARK #{ genbank_wrap(c) }\n" + end + end + result + end + # formats sequence lines as GenBank def each_genbank_seqline(str) #:yields: counter, seqline *************** *** 56,87 **** (references or []).each do |ref| n += 1 ! pos = ref.sequence_position.to_s.gsub(/\s/, '') ! pos.gsub!(/(\d+)\-(\d+)/, "\\1 to \\2") ! pos.gsub!(/\s*\,\s*/, '; ') ! if pos.empty? ! pos = '' ! else ! pos = " (bases #{pos})" ! end ! volissue = "#{ref.volume.to_s}" ! volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? ! journal = "#{ref.journal.to_s}" ! journal += " #{volissue}" unless volissue.empty? ! journal += ", #{ref.pages}" unless ref.pages.to_s.empty? ! journal += " (#{ref.year})" unless ref.year.to_s.empty? ! ! alist = ref.authors.collect { |x| x.gsub(/\, /, ',') } ! lastauthor = alist.pop ! authorsline = alist.join(', ') ! authorsline.concat(" and ") unless alist.empty? ! 
authorsline.concat lastauthor.to_s ! ! %>REFERENCE <%= genbank_wrap(sprintf('%-2d%s', n, pos)) %> ! AUTHORS <%= genbank_wrap(authorsline) %> ! TITLE <%= genbank_wrap(ref.title.to_s) %> ! JOURNAL <%= genbank_wrap(journal) %> ! <% unless ref.pubmed.to_s.empty? ! %> PUBMED <%= ref.pubmed %> ! <% end end %>FEATURES Location/Qualifiers --- 124,128 ---- (references or []).each do |ref| n += 1 ! %><%= reference_format_genbank(ref, n) %><% end %>FEATURES Location/Qualifiers From ngoto at dev.open-bio.org Thu May 8 01:38:03 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 08 May 2008 05:38:03 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio test_reference.rb, 1.3, 1.3.2.1 test_feature.rb, 1.5, 1.5.2.1 Message-ID: <200805080538.m485c31o010619@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio In directory dev.open-bio.org:/tmp/cvs-serv10599/test/unit/bio Modified Files: Tag: BRANCH-biohackathon2008 test_reference.rb test_feature.rb Log Message: Unit test codes are changed due to the changes of Bio::References and Bio::Features (those are obsoleted but exist for backaward compatibility). Index: test_feature.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/test_feature.rb,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 *** test_feature.rb 5 Apr 2007 23:35:42 -0000 1.5 --- test_feature.rb 8 May 2008 05:38:01 -0000 1.5.2.1 *************** *** 15,18 **** --- 15,19 ---- require 'test/unit' require 'bio/feature' + require 'bio/compat/features' *************** *** 89,96 **** --- 90,124 ---- end + class NullStderr + def initialize + @log = [] + end + + def write(*arg) + #p arg + @log.push([ :write, *arg ]) + nil + end + + def method_missing(*arg) + #p arg + @log.push arg + nil + end + end + class TestFeatures < Test::Unit::TestCase def setup + # To suppress warning messages, $stderr is replaced by dummy object. 
+ @stderr_orig = $stderr + $stderr = NullStderr.new + @obj = Bio::Features.new([Bio::Feature.new('gene', '1..615', [])]) end + + def teardown + # bring back $stderr + $stderr = @stderr_orig + end def test_features Index: test_reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/test_reference.rb,v retrieving revision 1.3 retrieving revision 1.3.2.1 diff -C2 -d -r1.3 -r1.3.2.1 *** test_reference.rb 5 Apr 2007 23:35:42 -0000 1.3 --- test_reference.rb 8 May 2008 05:38:01 -0000 1.3.2.1 *************** *** 15,18 **** --- 15,19 ---- require 'test/unit' require 'bio/reference' + require 'bio/compat/references' *************** *** 173,179 **** --- 174,202 ---- end + class NullStderr + def initialize + @log = [] + end + + def write(*arg) + #p arg + @log.push([ :write, *arg ]) + nil + end + + def method_missing(*arg) + #p arg + @log.push arg + nil + end + end + class TestReferences < Test::Unit::TestCase def setup + # To suppress warning messages, $stderr is replaced by dummy object. + @stderr_orig = $stderr + $stderr = NullStderr.new + hash = {} ary = [Bio::Reference.new(hash), *************** *** 182,185 **** --- 205,213 ---- end + def teardown + # bring back $stderr + $stderr = @stderr_orig + end + def test_append hash = {} From ngoto at dev.open-bio.org Thu May 8 22:32:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 09 May 2008 02:32:47 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_xmlparser.rb, 1.7, 1.7.2.1 Message-ID: <200805090232.m492Wlfl015068@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv15048/test/unit/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 test_xmlparser.rb Log Message: Bug fix: tests in test/unit/bio/appl/blast/test_report.rb was ignored because of conflicts of test classes' names (TestBlastReport, etc.). 
The class names in test/unit/bio/appl/blast/test_xmlparser.rb are changed because it contains fewer assertions than test_report.rb. Index: test_xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/appl/blast/test_xmlparser.rb,v retrieving revision 1.7 retrieving revision 1.7.2.1 diff -C2 -d -r1.7 -r1.7.2.1 *** test_xmlparser.rb 5 Apr 2007 23:35:43 -0000 1.7 --- test_xmlparser.rb 9 May 2008 02:32:44 -0000 1.7.2.1 *************** *** 16,20 **** ! module Bio class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s --- 16,20 ---- ! module Bio::TestBlastXMLParser class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s *************** *** 36,40 **** def setup ! @report = Bio::Blast::Report.new(Bio::TestBlastFormat7XMLParserData.output) end --- 36,40 ---- def setup ! @report = Bio::Blast::Report.new(TestBlastFormat7XMLParserData.output) end *************** *** 188,192 **** class TestBlastReportHit < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first --- 188,192 ---- class TestBlastReportHit < Test::Unit::TestCase def setup ! data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first *************** *** 293,297 **** class TestBlastReportHsp < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first --- 293,297 ---- class TestBlastReportHsp < Test::Unit::TestCase def setup ! 
data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first From ngoto at dev.open-bio.org Mon May 12 05:52:18 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 09:52:18 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.26,1.26.2.1 Message-ID: <200805120952.m4C9qIUl004178@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4135 Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: same changes as 1.26 => 1.27 in trunk: Fixed a bug when a null line is inserted after database title in some cases, reported by Tomoaki NISHIYAMA. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26 retrieving revision 1.26.2.1 diff -C2 -d -r1.26 -r1.26.2.1 *** format0.rb 12 Feb 2008 02:13:31 -0000 1.26 --- format0.rb 12 May 2008 09:52:15 -0000 1.26.2.1 *************** *** 294,297 **** --- 294,302 ---- @f0query = data.shift @f0database = data.shift + # In special case, a void line is inserted after database name. + if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then + @f0database.concat "\n" + @f0database.concat data.shift + end end From ngoto at dev.open-bio.org Mon May 12 07:16:19 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:16:19 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb, 1.26.2.1, 1.26.2.2 Message-ID: <200805121116.m4CBGJMX004407@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4387/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: Bug fix: Bio::Blast::Default::Report#eff_space returns wrong value ("Effective length of database"). It should return "Effective search space". 
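Note on the log message above: a small sketch of the lookup being fixed, assuming the statistics at the end of a default-format report are held in a Hash keyed by lowercase labels. The stats hash, its values, and parse_count are hypothetical and exist only for illustration; the two keys and the tr/to_i conversion are the ones quoted in the commit.

```ruby
# Hypothetical statistics hash; keys follow the commit, values are made up.
stats = {
  'effective length of database' => '1,096,340',
  'effective search space'       => '880,361,020'
}

# Strip thousands separators, then convert to Integer.
# parse_count is an illustrative helper, not part of BioRuby.
def parse_count(text)
  text.tr(',', '').to_i
end

# Before the fix, eff_space was read from the wrong label.
wrong_value = parse_count(stats['effective length of database'])
eff_space   = parse_count(stats['effective search space'])

puts "eff_space = #{eff_space} (was #{wrong_value})"
```

The two labels sit next to each other in the report footer and both hold comma-grouped integers, which is probably why the mix-up went unnoticed until the default-format tests were added.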
Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26.2.1 retrieving revision 1.26.2.2 diff -C2 -d -r1.26.2.1 -r1.26.2.2 *** format0.rb 12 May 2008 09:52:15 -0000 1.26.2.1 --- format0.rb 12 May 2008 11:16:17 -0000 1.26.2.2 *************** *** 388,392 **** #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! if val = @hash['effective length of database'] then @eff_space = val.tr(',', '').to_i end --- 388,392 ---- #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! if val = @hash['effective search space'] then @eff_space = val.tr(',', '').to_i end From ngoto at dev.open-bio.org Mon May 12 07:25:57 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:25:57 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.27,1.28 Message-ID: <200805121125.m4CBPvmO004456@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4436 Modified Files: format0.rb Log Message: The same change as 1.26.2.1 ==> 1.26.2.2: Bug fix: Bio::Blast::Default::Report#eff_space returns wrong value ("Effective length of database"). It should return "Effective search space". Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.27 retrieving revision 1.28 diff -C2 -d -r1.27 -r1.28 *** format0.rb 1 Apr 2008 06:31:35 -0000 1.27 --- format0.rb 12 May 2008 11:25:55 -0000 1.28 *************** *** 388,392 **** #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! 
if val = @hash['effective length of database'] then @eff_space = val.tr(',', '').to_i end --- 388,392 ---- #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! if val = @hash['effective search space'] then @eff_space = val.tr(',', '').to_i end From ngoto at dev.open-bio.org Mon May 12 07:49:10 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:49:10 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_report.rb, 1.5, 1.5.2.1 Message-ID: <200805121149.m4CBnAL5004568@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4548 Modified Files: Tag: BRANCH-biohackathon2008 test_report.rb Log Message: * Class TestBlastReportData is changed to module TestBlastReportHelper and improved. * Changed to test both rexml and xmlparser. * Added tests for Bio::Blast::Default::Report. * Name of a method Bio::TestBlastReport#test_extrez_query is changed to "test_entrez_query" because it may be a typo. * Some tests are changed to use assert_nothing_raised{ ... } instead of assert() (or no assertions). Index: test_report.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/appl/blast/test_report.rb,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 *** test_report.rb 5 Apr 2007 23:35:43 -0000 1.5 --- test_report.rb 12 May 2008 11:49:08 -0000 1.5.2.1 *************** *** 17,46 **** module Bio ! class TestBlastReportData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s TestDataBlast = Pathname.new(File.join(bioruby_root, 'test', 'data', 'blast')).cleanpath.to_s ! def self.input ! File.open(File.join(TestDataBlast, 'b0002.faa')).read end ! def self.output(format = 7) ! case format ! when 0 ! File.open(File.join(TestDataBlast, 'b0002.faa.m0')).read ! 
when 7 ! File.open(File.join(TestDataBlast, 'b0002.faa.m7')).read ! when 8 ! File.open(File.join(TestDataBlast, 'b0002.faa.m8')).read end end ! end - class TestBlastReport < Test::Unit::TestCase ! require 'bio/appl/blast/report' def setup ! @report = Bio::Blast::Report.new(Bio::TestBlastReportData.output) end --- 17,68 ---- module Bio ! ! module TestBlastReportHelper bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s TestDataBlast = Pathname.new(File.join(bioruby_root, 'test', 'data', 'blast')).cleanpath.to_s ! private ! ! def get_input_data(basename = 'b0002.faa') ! File.open(File.join(TestDataBlast, basename)).read end ! def get_output_data(basename = 'b0002.faa', format = 7) ! fn = basename + ".m#{format.to_i}" ! ! # available filenames: ! # 'b0002.faa.m0' ! # 'b0002.faa.m7' ! # 'b0002.faa.m8' ! ! File.open(File.join(TestDataBlast, fn)).read ! end ! ! def create_report_object(basename = 'b0002.faa') ! case self.class.name.to_s ! when /XMLParser/i ! text = get_output_data(basename, 7) ! Bio::Blast::Report.new(text, :xmlparser) ! when /REXML/i ! text = get_output_data(basename, 7) ! Bio::Blast::Report.new(text, :rexml) ! when /Default/i ! text = get_output_data(basename, 0) ! Bio::Blast::Default::Report.new(text) ! when /Tab/i ! text = get_output_data(basename, 8) ! Bio::Blast::Report.new(text) ! else ! text = get_output_data(basename, 7) ! Bio::Blast::Report.new(text) end end ! end #module TestBlastReportHelper class TestBlastReport < Test::Unit::TestCase ! include TestBlastReportHelper def setup ! @report = create_report_object end *************** *** 97,109 **** def test_inclusion ! assert(@report.inclusion) end def test_sc_match ! assert(@report.sc_match) end def test_sc_mismatch ! assert(@report.sc_mismatch) end --- 119,131 ---- def test_inclusion ! assert_nothing_raised { @report.inclusion } end def test_sc_match ! assert_nothing_raised { @report.sc_match } end def test_sc_mismatch ! 
assert_nothing_raised { @report.sc_mismatch } end *************** *** 124,137 **** end ! def test_extrez_query assert_equal(nil, @report.entrez_query) end def test_each_iteration ! @report.each_iteration { |itr| } end def test_each_hit ! @report.each_hit { |hit| } end --- 146,163 ---- end ! def test_entrez_query assert_equal(nil, @report.entrez_query) end def test_each_iteration ! assert_nothing_raised { ! @report.each_iteration { |itr| } ! } end def test_each_hit ! assert_nothing_raised { ! @report.each_hit { |hit| } ! } end *************** *** 178,184 **** class TestBlastReportIteration < Test::Unit::TestCase def setup ! data = Bio::TestBlastReportData.output ! report = Bio::Blast::Report.new(data) @itr = report.iterations.first end --- 204,211 ---- class TestBlastReportIteration < Test::Unit::TestCase + include TestBlastReportHelper + def setup ! report = create_report_object @itr = report.iterations.first end *************** *** 205,211 **** class TestBlastReportHit < Test::Unit::TestCase def setup ! data = Bio::TestBlastReportData.output ! report = Bio::Blast::Report.new(data) @hit = report.hits.first end --- 232,239 ---- class TestBlastReportHit < Test::Unit::TestCase + include TestBlastReportHelper + def setup ! report = create_report_object @hit = report.hits.first end *************** *** 316,322 **** class TestBlastReportHsp < Test::Unit::TestCase def setup ! data = Bio::TestBlastReportData.output ! report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first end --- 344,351 ---- class TestBlastReportHsp < Test::Unit::TestCase + include TestBlastReportHelper + def setup ! report = create_report_object @hsp = report.hits.first.hsps.first end *************** *** 343,347 **** def test_Hsp_gaps ! assert(@hsp.gaps) end --- 372,376 ---- def test_Hsp_gaps ! assert_nothing_raised { @hsp.gaps } end *************** *** 383,391 **** def test_Hsp_pattern_from ! @hsp.pattern_from end def test_Hsp_pattern_to ! 
@hsp.pattern_to end --- 412,420 ---- def test_Hsp_pattern_from ! assert_nothing_raised { @hsp.pattern_from } end def test_Hsp_pattern_to ! assert_nothing_raised { @hsp.pattern_to } end *************** *** 406,417 **** def test_Hsp_percent_identity ! @hsp.percent_identity end def test_Hsp_mismatch_count ! @hsp.mismatch_count end end end # module Bio --- 435,614 ---- def test_Hsp_percent_identity ! assert_nothing_raised { @hsp.percent_identity } end def test_Hsp_mismatch_count ! assert_nothing_raised { @hsp.mismatch_count } end end + class TestBlastReportREXML < TestBlastReport + end + + class TestBlastReportIterationREXML < TestBlastReportIteration + end + + class TestBlastReportHitREXML < TestBlastReportHit + end + + class TestBlastReportHspREXML < TestBlastReportHsp + end + + if defined? XMLParser then + + class TestBlastReportXMLParser < TestBlastReport + end + + class TestBlastReportIterationXMLParser < TestBlastReportIteration + end + + class TestBlastReportHitXMLParser < TestBlastReportHit + end + + class TestBlastReportHspXMLParser < TestBlastReportHsp + end + + end #if defined? XMLParser + + class TestBlastReportDefault < TestBlastReport + undef test_entrez_query + undef test_filter + undef test_hsp_len + undef test_inclusion + undef test_parameters + undef test_query_id + undef test_statistics + + def test_program + assert_equal('BLASTP', @report.program) + end + + def test_reference + text_str = 'Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.' 
+ assert_equal(text_str, @report.reference) + end + + def test_version + assert_equal('BLASTP 2.2.10 [Oct-19-2004]', @report.version) + end + + def test_kappa + assert_equal(0.134, @report.kappa) + end + + def test_lambda + assert_equal(0.319, @report.lambda) + end + + def test_entropy + assert_equal(0.383, @report.entropy) + end + + def test_gapped_kappa + assert_equal(0.0410, @report.gapped_kappa) + end + + def test_gapped_lambda + assert_equal(0.267, @report.gapped_lambda) + end + + def test_gapped_entropy + assert_equal(0.140, @report.gapped_entropy) + end + end + + class TestBlastReportIterationDefault < TestBlastReportIteration + undef test_statistics + end + + class TestBlastReportHitDefault < TestBlastReportHit + undef test_Hit_accession + undef test_Hit_hit_id + undef test_Hit_num + undef test_Hit_query_def + undef test_Hit_query_id + undef test_Hit_query_len + + def setup + @filtered_query_sequence = 'MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERIFAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEARGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYSAAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPCLIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLITQSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAALARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQxxxxxxxxxxxxxxALLEQLKRQQSWLKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAVADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELMKFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIEIEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFKVKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV' + super + end + + def test_Hit_bit_score + # differs from XML because of truncation in the default format + assert_equal(1567.0, @hit.bit_score) + end + + def test_Hit_identity + # differs from XML because filtered residues are not counted in the + # 
default format + assert_equal(806, @hit.identity) + end + + def test_Hit_midline + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, ' ') + assert_equal(seq, @hit.midline) + end + + def test_Hit_query_seq + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, 'X') + assert_equal(seq, @hit.query_seq) + end + end + + class TestBlastReportHspDefault < TestBlastReportHsp + undef test_Hsp_density + undef test_Hsp_mismatch_count + undef test_Hsp_num + undef test_Hsp_pattern_from + undef test_Hsp_pattern_to + + def setup + @filtered_query_sequence = 'MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERIFAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEARGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYSAAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPCLIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLITQSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAALARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQxxxxxxxxxxxxxxALLEQLKRQQSWLKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAVADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELMKFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIEIEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFKVKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV' + super + end + + def test_Hsp_identity + # differs from XML because filtered residues are not counted in the + # default format + assert_equal(806, @hsp.identity) + end + + def test_Hsp_positive + # differs from XML because filtered residues are not counted in the + # default format + assert_equal(806, @hsp.positive) + end + + def test_Hsp_midline + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, ' ') + assert_equal(seq, @hsp.midline) 
+ end + + def test_Hsp_qseq + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, 'X') + assert_equal(seq, @hsp.qseq) + end + + def test_Hsp_hit_score + # differs from XML because of truncation in the default format + assert_equal(1567.0, @hsp.bit_score) + end + + def test_Hsp_hit_frame + # differs from XML because not available in the default BLASTP format + assert_equal(nil, @hsp.hit_frame) + end + + def test_Hsp_query_frame + # differs from XML because not available in the default BLASTP format + assert_equal(nil, @hsp.query_frame) + end + end + end # module Bio From ngoto at dev.open-bio.org Mon May 12 07:50:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:50:44 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_xmlparser.rb, 1.7.2.1, NONE Message-ID: <200805121150.m4CBoiA4004596@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4576 Removed Files: Tag: BRANCH-biohackathon2008 test_xmlparser.rb Log Message: test_xmlparser.rb is removed because it has few assertions and its role is now merged into test_report.rb. --- test_xmlparser.rb DELETED --- From ngoto at dev.open-bio.org Mon May 12 08:01:22 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 12:01:22 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_xmlparser.rb, 1.7, 1.8 Message-ID: <200805121201.m4CC1Mog004646@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4624/test/unit/bio/appl/blast Modified Files: test_xmlparser.rb Log Message: Same changes as 1.7 to 1.7.2.1 in BRANCH-biohackathon2008: Bug fix: tests in test/unit/bio/appl/blast/test_report.rb was ignored because of conflicts of test classes' names (TestBlastReport, etc.). 
The class names in test/unit/bio/appl/blast/test_xmlparser.rb is changed because it contains less assertions than that of test_report.rb. Index: test_xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/appl/blast/test_xmlparser.rb,v retrieving revision 1.7 retrieving revision 1.8 diff -C2 -d -r1.7 -r1.8 *** test_xmlparser.rb 5 Apr 2007 23:35:43 -0000 1.7 --- test_xmlparser.rb 12 May 2008 12:01:20 -0000 1.8 *************** *** 16,20 **** ! module Bio class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s --- 16,20 ---- ! module Bio::TestBlastXMLParser class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s *************** *** 36,40 **** def setup ! @report = Bio::Blast::Report.new(Bio::TestBlastFormat7XMLParserData.output) end --- 36,40 ---- def setup ! @report = Bio::Blast::Report.new(TestBlastFormat7XMLParserData.output) end *************** *** 188,192 **** class TestBlastReportHit < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first --- 188,192 ---- class TestBlastReportHit < Test::Unit::TestCase def setup ! data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first *************** *** 293,297 **** class TestBlastReportHsp < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first --- 293,297 ---- class TestBlastReportHsp < Test::Unit::TestCase def setup ! 
data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first From ngoto at dev.open-bio.org Mon May 12 09:11:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 13:11:47 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rexml.rb, 1.12, 1.13 xmlparser.rb, 1.17, 1.18 Message-ID: <200805121311.m4CDBl0K004957@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4930/lib/bio/appl/blast Modified Files: rexml.rb xmlparser.rb Log Message: * lib/bio/appl/blast/xmlparser.rb, lib/bio/appl/blast/rexml.rb Bug fix: unit test sometime fails due to improper treatment of some Blast parameters and difference between rexml and xmlparser. To fix the bug, types of some parameters may be changed, e.g. Bio::Blast::Report#expect is changed to return Float or nil. * ChangeLog ChangeLog for today's changes to lib/bio/appl/blast/* and related files. Index: xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/xmlparser.rb,v retrieving revision 1.17 retrieving revision 1.18 diff -C2 -d -r1.17 -r1.18 *** xmlparser.rb 5 Apr 2007 23:35:39 -0000 1.17 --- xmlparser.rb 12 May 2008 13:11:45 -0000 1.18 *************** *** 116,139 **** end ! def xmlparser_parse_parameters(hash) ! labels = { ! 'matrix' => 'Parameters_matrix', ! 'expect' => 'Parameters_expect', ! 'include' => 'Parameters_include', ! 'sc-match' => 'Parameters_sc-match', ! 'sc-mismatch' => 'Parameters_sc-mismatch', ! 'gap-open' => 'Parameters_gap-open', ! 'gap-extend' => 'Parameters_gap-extend', ! 'filter' => 'Parameters_filter', ! 'pattern' => 'Parameters_pattern', ! 'entrez-query' => 'Parameters_entrez-query', ! } ! labels.each do |k,v| case k ! when 'filter', 'matrix' ! @parameters[k] = hash[v].to_s else ! @parameters[k] = hash[v].to_i end end end --- 116,148 ---- end ! 
# set parameter of the key as val ! def xml_set_parameter(key, val) ! #labels = { ! # 'matrix' => 'Parameters_matrix', ! # 'expect' => 'Parameters_expect', ! # 'include' => 'Parameters_include', ! # 'sc-match' => 'Parameters_sc-match', ! # 'sc-mismatch' => 'Parameters_sc-mismatch', ! # 'gap-open' => 'Parameters_gap-open', ! # 'gap-extend' => 'Parameters_gap-extend', ! # 'filter' => 'Parameters_filter', ! # 'pattern' => 'Parameters_pattern', ! # 'entrez-query' => 'Parameters_entrez-query', ! #} ! k = key.sub(/\AParameters\_/, '') ! @parameters[k] = case k ! when 'expect', 'include' ! val.to_f ! when /\Agap\-/, /\Asc\-/ ! val.to_i else ! val end + end + + def xmlparser_parse_parameters(hash) + hash.each do |k, v| + xml_set_parameter(k, v) end end Index: rexml.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/rexml.rb,v retrieving revision 1.12 retrieving revision 1.13 diff -C2 -d -r1.12 -r1.13 *** rexml.rb 5 Apr 2007 23:35:39 -0000 1.12 --- rexml.rb 12 May 2008 13:11:45 -0000 1.13 *************** *** 38,44 **** when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! k = p.name.sub(/Parameters_/, '') ! v = p.text =~ /\D/ ? p.text : p.text.to_i ! @parameters[k] = v end else --- 38,42 ---- when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! 
xml_set_parameter(p.name, p.text) end else From ngoto at dev.open-bio.org Mon May 12 09:11:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 13:11:47 +0000 Subject: [BioRuby-cvs] bioruby ChangeLog,1.85,1.86 Message-ID: <200805121311.m4CDBlOs004952@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv4930 Modified Files: ChangeLog Log Message: * lib/bio/appl/blast/xmlparser.rb, lib/bio/appl/blast/rexml.rb Bug fix: unit test sometime fails due to improper treatment of some Blast parameters and difference between rexml and xmlparser. To fix the bug, types of some parameters may be changed, e.g. Bio::Blast::Report#expect is changed to return Float or nil. * ChangeLog ChangeLog for today's changes to lib/bio/appl/blast/* and related files. Index: ChangeLog =================================================================== RCS file: /home/repository/bioruby/bioruby/ChangeLog,v retrieving revision 1.85 retrieving revision 1.86 diff -C2 -d -r1.85 -r1.86 *** ChangeLog 15 Apr 2008 13:54:38 -0000 1.85 --- ChangeLog 12 May 2008 13:11:45 -0000 1.86 *************** *** 1,2 **** --- 1,23 ---- + 2008-05-12 Naohisa Goto + + * lib/bio/appl/blast/xmlparser.rb, lib/bio/appl/blast/rexml.rb + + Bug fix: unit test sometime fails due to improper treatment of some + Blast parameters and difference between rexml and xmlparser. + To fix the bug, types of some parameters may be changed, e.g. + Bio::Blast::Report#expect is changed to return Float or nil. + + * lib/bio/appl/blast/format0.rb + + Bug fix: Bio::Blast::Default::Report#eff_space returns wrong value + ("Effective length of database"). It should return the value of + "Effective search space". + + * test/unit/bio/appl/blast/test_xmlparser.rb + + Bug fix: tests in test/unit/bio/appl/blast/test_report.rb were + ignored because of conflicts of the names of test classes. + Class name in test_xmlparser.rb is changed to fix the bug. 
+ 2008-04-15 Naohisa Goto From ngoto at dev.open-bio.org Mon May 12 09:19:35 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 13:19:35 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rexml.rb, 1.12, 1.12.2.1 xmlparser.rb, 1.17, 1.17.2.1 Message-ID: <200805121319.m4CDJZTf004986@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4966/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 rexml.rb xmlparser.rb Log Message: Merging differences between 1.17 and 1.18 into xmplarser.rb and between 1.12 and 1.13 into rexml.rb. Index: xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/xmlparser.rb,v retrieving revision 1.17 retrieving revision 1.17.2.1 diff -C2 -d -r1.17 -r1.17.2.1 *** xmlparser.rb 5 Apr 2007 23:35:39 -0000 1.17 --- xmlparser.rb 12 May 2008 13:19:32 -0000 1.17.2.1 *************** *** 116,139 **** end ! def xmlparser_parse_parameters(hash) ! labels = { ! 'matrix' => 'Parameters_matrix', ! 'expect' => 'Parameters_expect', ! 'include' => 'Parameters_include', ! 'sc-match' => 'Parameters_sc-match', ! 'sc-mismatch' => 'Parameters_sc-mismatch', ! 'gap-open' => 'Parameters_gap-open', ! 'gap-extend' => 'Parameters_gap-extend', ! 'filter' => 'Parameters_filter', ! 'pattern' => 'Parameters_pattern', ! 'entrez-query' => 'Parameters_entrez-query', ! } ! labels.each do |k,v| case k ! when 'filter', 'matrix' ! @parameters[k] = hash[v].to_s else ! @parameters[k] = hash[v].to_i end end end --- 116,148 ---- end ! # set parameter of the key as val ! def xml_set_parameter(key, val) ! #labels = { ! # 'matrix' => 'Parameters_matrix', ! # 'expect' => 'Parameters_expect', ! # 'include' => 'Parameters_include', ! # 'sc-match' => 'Parameters_sc-match', ! # 'sc-mismatch' => 'Parameters_sc-mismatch', ! # 'gap-open' => 'Parameters_gap-open', ! 
# 'gap-extend' => 'Parameters_gap-extend', ! # 'filter' => 'Parameters_filter', ! # 'pattern' => 'Parameters_pattern', ! # 'entrez-query' => 'Parameters_entrez-query', ! #} ! k = key.sub(/\AParameters\_/, '') ! @parameters[k] = case k ! when 'expect', 'include' ! val.to_f ! when /\Agap\-/, /\Asc\-/ ! val.to_i else ! val end + end + + def xmlparser_parse_parameters(hash) + hash.each do |k, v| + xml_set_parameter(k, v) end end Index: rexml.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/rexml.rb,v retrieving revision 1.12 retrieving revision 1.12.2.1 diff -C2 -d -r1.12 -r1.12.2.1 *** rexml.rb 5 Apr 2007 23:35:39 -0000 1.12 --- rexml.rb 12 May 2008 13:19:32 -0000 1.12.2.1 *************** *** 38,44 **** when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! k = p.name.sub(/Parameters_/, '') ! v = p.text =~ /\D/ ? p.text : p.text.to_i ! @parameters[k] = v end else --- 38,42 ---- when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! xml_set_parameter(p.name, p.text) end else From ngoto at dev.open-bio.org Tue May 13 07:19:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 13 May 2008 11:19:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb, 1.26.2.2, 1.26.2.3 Message-ID: <200805131119.m4DBJiw2008784@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv8700/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: Bio::Blast::Default::Report::Iteration#lambda, #kappa, #entropy, #gapped_lambda, #gapped_kappa, and #gapped_entropy, and the same methods in Bio::Blast::Default::Report class are changed to return float or nil instead of string or nil. 
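The change described in the log message above can be sketched in a few lines. This is only an illustration, not the code in format0.rb: the helper name float_or_nil and the sample hash are hypothetical, and only the "value present ? to_f : nil" idiom is taken from the diff that follows.

```ruby
# Hypothetical helper (not part of BioRuby): mirrors the
# "(v = h['Lambda']) ? v.to_f : nil" idiom used in the diff.
def float_or_nil(value)
  value ? value.to_f : nil
end

# Assumed sample of a parsed statistics hash; 'H' is left out on purpose
# to show that a missing value stays nil.
h = { 'Lambda' => '0.267', 'K' => '0.0410' }

stats = {
  :lambda  => float_or_nil(h['Lambda']),
  :kappa   => float_or_nil(h['K']),
  :entropy => float_or_nil(h['H'])
}
```

The explicit nil check matters because nil.to_f is 0.0 in Ruby, so calling to_f unconditionally would turn a missing statistic into a plausible-looking zero.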
Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26.2.2 retrieving revision 1.26.2.3 diff -C2 -d -r1.26.2.2 -r1.26.2.3 *** format0.rb 12 May 2008 11:16:17 -0000 1.26.2.2 --- format0.rb 13 May 2008 11:19:42 -0000 1.26.2.3 *************** *** 723,733 **** end if gapped then ! @gapped_lambda = h['Lambda'] ! @gapped_kappa = h['K'] ! @gapped_entropy = h['H'] else ! @lambda = h['Lambda'] ! @kappa = h['K'] ! @entropy = h['H'] end end #each --- 723,733 ---- end if gapped then ! @gapped_lambda = (v = h['Lambda']) ? v.to_f : nil ! @gapped_kappa = (v = h['K']) ? v.to_f : nil ! @gapped_entropy = (v = h['H']) ? v.to_f : nil else ! @lambda = (v = h['Lambda']) ? v.to_f : nil ! @kappa = (v = h['K']) ? v.to_f : nil ! @entropy = (v = h['H']) ? v.to_f : nil end end #each From ngoto at dev.open-bio.org Tue May 13 07:21:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 13 May 2008 11:21:47 +0000 Subject: [BioRuby-cvs] bioruby/doc Changes-1.3.rd,NONE,1.1.2.1 Message-ID: <200805131121.m4DBLlqg008833@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/doc In directory dev.open-bio.org:/tmp/cvs-serv8813/doc Added Files: Tag: BRANCH-biohackathon2008 Changes-1.3.rd Log Message: newly added documents describing important or incompatible changes in bioruby-1.3 (after bioruby-1.2.1). --- NEW FILE: Changes-1.3.rd --- = Incompatible and important changes since the BioRuby 1.2.1 release A lot of changes have been made to the BioRuby after the version 1.2.1 is released. == Incompatible changes --- Bio::Features Bio::Features is obsoleted and changed to an array of Bio::Feature object with some backward compatibility methods. The backward compatibility methods will soon be removed in the future. --- Bio::References Bio::References is obsoleted and changed to an array of Bio::Reference object with some backward compatibility methods. 
The backward compatibility methods will soon be removed in the future. --- Bio::BLAST::Default::Report, Bio::BLAST::Default::Report::Hit, Bio::BLAST::Default::Report::HSP, Bio::BLAST::WU::Report, Bio::BLAST::WU::Report::Hit, Bio::BLAST::WU::Report::HSP * Iteration#lambda, #kappa, #entropy, #gapped_lambda, #gapped_kappa, and #gapped_entropy, and the same methods in the Report class are changed to return float or nil instead of string or nil. From ngoto at dev.open-bio.org Wed May 14 09:30:15 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 14 May 2008 13:30:15 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.28,1.29 Message-ID: <200805141330.m4EDUFSv011996@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv11956/lib/bio/appl/blast Modified Files: format0.rb Log Message: Bug fix: For some PHI-BLAST (blastpgp) entries, possibly due to the changes of output format, Bio::Blast::Default::Report::Iteration#eff_space (and the shortcut method in the Report class) raises StringScanner::Error. In addition, Iteration#pattern and #pattern_positions returns incorrect values possibly due to the output format changes. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.28 retrieving revision 1.29 diff -C2 -d -r1.28 -r1.29 *** format0.rb 12 May 2008 11:25:55 -0000 1.28 --- format0.rb 14 May 2008 13:30:12 -0000 1.29 *************** *** 535,539 **** r = data.first break unless r ! if /^Significant alignments for pattern/ =~ r data.shift r = data.first --- 535,539 ---- r = data.first break unless r ! while /^Significant alignments for pattern/ =~ r data.shift r = data.first *************** *** 590,596 **** @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +(.+)$/) then @pattern = sc[1] unless @pattern ! 
sc.skip_until(/^ at position +(\d+)/) @pattern_positions << sc[1].to_i end --- 590,596 ---- @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +([^\s]+)/) then @pattern = sc[1] unless @pattern ! sc.skip_until(/(?:^ *| +)at position +(\d+) +of +query +sequence/) @pattern_positions << sc[1].to_i end From ngoto at dev.open-bio.org Wed May 14 09:37:05 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 14 May 2008 13:37:05 +0000 Subject: [BioRuby-cvs] bioruby ChangeLog,1.86,1.87 Message-ID: <200805141337.m4EDb514012065@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv12045 Modified Files: ChangeLog Log Message: ChangeLog for lib/bio/appl/blast/format0.rb from 1.28 to 1.29. Index: ChangeLog =================================================================== RCS file: /home/repository/bioruby/bioruby/ChangeLog,v retrieving revision 1.86 retrieving revision 1.87 diff -C2 -d -r1.86 -r1.87 *** ChangeLog 12 May 2008 13:11:45 -0000 1.86 --- ChangeLog 14 May 2008 13:37:03 -0000 1.87 *************** *** 1,2 **** --- 1,12 ---- + 2008-05-14 Naohisa Goto + + * lib/bio/appl/blast/format0.rb + + Bug fix: Possibly because of the output format changes of PHI-BLAST, + Bio::Blast::Default::Report::Iteration#eff_space (and the shortcut + method in the Report class) failed for PHI-BLAST (blastpgp) results, + and Iteration#pattern and #pattern_positions (and the + shortcut methods in the Report class) returned incorrect values. 
+ 2008-05-12 Naohisa Goto From ngoto at dev.open-bio.org Wed May 14 09:39:43 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 14 May 2008 13:39:43 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb, 1.26.2.3, 1.26.2.4 Message-ID: <200805141339.m4EDdh98012136@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv12115/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: Merging differences between 1.28 and 1.29 into format0.rb in BRANCH-biohackathon2008. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26.2.3 retrieving revision 1.26.2.4 diff -C2 -d -r1.26.2.3 -r1.26.2.4 *** format0.rb 13 May 2008 11:19:42 -0000 1.26.2.3 --- format0.rb 14 May 2008 13:39:41 -0000 1.26.2.4 *************** *** 535,539 **** r = data.first break unless r ! if /^Significant alignments for pattern/ =~ r data.shift r = data.first --- 535,539 ---- r = data.first break unless r ! while /^Significant alignments for pattern/ =~ r data.shift r = data.first *************** *** 590,596 **** @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +(.+)$/) then @pattern = sc[1] unless @pattern ! sc.skip_until(/^ at position +(\d+)/) @pattern_positions << sc[1].to_i end --- 590,596 ---- @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +([^\s]+)/) then @pattern = sc[1] unless @pattern ! 
sc.skip_until(/(?:^ *| +)at position +(\d+) +of +query +sequence/) @pattern_positions << sc[1].to_i end From pjotr at dev.open-bio.org Mon May 19 07:23:58 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Mon, 19 May 2008 11:23:58 +0000 Subject: [BioRuby-cvs] bioruby/sample fastasort.rb,NONE,1.1 Message-ID: <200805191123.m4JBNwdr000709@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/sample In directory dev.open-bio.org:/tmp/cvs-serv689 Added Files: fastasort.rb Log Message: Simple example for sorting a flatfile --- NEW FILE: fastasort.rb --- #!/usr/bin/env ruby # # fastasort: Sorts a FASTA file (in fact it can use any flat file input supported # by BIORUBY) while modifying the definition of each record in the # process. # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # $Id: fastasort.rb,v 1.1 2008/05/19 11:23:56 pjotr Exp $ # require 'bio' include Bio table = Hash.new # table to sort objects ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | # strip JALView extension from definition e.g. 
.../1-212 if item.definition =~ /\/\d+-\d+$/ item.definition = $` end table[item.definition] = item.data end end # Output sorted table table.sort.each do | definition, data | rec = Bio::FastaFormat.new('> '+definition.strip+"\n"+data) print rec end From pjotr at dev.open-bio.org Mon May 19 08:22:07 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Mon, 19 May 2008 12:22:07 +0000 Subject: [BioRuby-cvs] bioruby/doc Tutorial.rd,1.21,1.22 Message-ID: <200805191222.m4JCM7nM000852@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/doc In directory dev.open-bio.org:/tmp/cvs-serv829/doc Modified Files: Tutorial.rd Log Message: Piping FASTA files (examples and doc) Index: Tutorial.rd =================================================================== RCS file: /home/repository/bioruby/bioruby/doc/Tutorial.rd,v retrieving revision 1.21 retrieving revision 1.22 diff -C2 -d -r1.21 -r1.22 *** Tutorial.rd 13 Feb 2008 08:04:30 -0000 1.21 --- Tutorial.rd 19 May 2008 12:22:05 -0000 1.22 *************** *** 466,470 **** An example that can take any input, filter using a regular expression to output ! to a FASTA file can be found in sample/any2fasta.rb. Other methods to extract specific data from database objects can be --- 466,477 ---- An example that can take any input, filter using a regular expression to output ! to a FASTA file can be found in sample/any2fasta.rb. With this technique it is ! possible to write a Unix type grep/sort pipe for sequence information. One ! example using scripts in the BIORUBY sample folder: ! ! fastagrep.rb '/At|Dm/' database.seq | fastasort.rb ! ! greps the database for Arabidopsis and Drosophila entries and sorts the output ! to FASTA. 
Other methods to extract specific data from database objects can be From pjotr at dev.open-bio.org Mon May 19 08:22:07 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Mon, 19 May 2008 12:22:07 +0000 Subject: [BioRuby-cvs] bioruby/sample fastagrep.rb, NONE, 1.1 fastasort.rb, 1.1, 1.2 Message-ID: <200805191222.m4JCM7KO000857@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/sample In directory dev.open-bio.org:/tmp/cvs-serv829/sample Modified Files: fastasort.rb Added Files: fastagrep.rb Log Message: Piping FASTA files (examples and doc) --- NEW FILE: fastagrep.rb --- #!/usr/bin/env ruby # # fastagrep: Greps a FASTA file (in fact it can use any flat file input supported # by BIORUBY) and outputs sorted FASTA # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # $Id: fastagrep.rb,v 1.1 2008/05/19 12:22:05 pjotr Exp $ # require 'bio' include Bio usage = < reduced.fasta As the result is a FASTA stream you could pipe it for sorting: fastagrep.rb "/Arabidopsis|Drosophila/i" *.seq | fastasort.rb USAGE if ARGV.size == 0 print usage exit 1 end skip = (ARGV[0] == '-v') ARGV.shift if skip # ---- Valid regular expression - if it is not a file regex = ARGV[0] if regex=~/^\// and !File.exist?(regex) ARGV.shift else print usage exit 1 end ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | if skip next if eval("item.definition =~ #{regex}") else next if eval("item.definition !~ #{regex}") end rec = Bio::FastaFormat.new('> '+item.definition.strip+"\n"+item.data) print rec end end Index: fastasort.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/sample/fastasort.rb,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** fastasort.rb 19 May 2008 11:23:56 -0000 1.1 --- fastasort.rb 19 May 2008 12:22:05 -0000 1.2 *************** *** 3,7 **** # fastasort: Sorts a FASTA file (in fact it can use any flat file input supported # by BIORUBY) while modifying the definition of each record in the ! # process. # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins --- 3,8 ---- # fastasort: Sorts a FASTA file (in fact it can use any flat file input supported # by BIORUBY) while modifying the definition of each record in the ! # process so it is suitable for processing with (for example) pal2nal ! # and PAML. # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins *************** *** 27,35 **** ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | # strip JALView extension from definition e.g. .../1-212 if item.definition =~ /\/\d+-\d+$/ item.definition = $` end ! 
table[item.definition] = item.data end end --- 28,47 ---- ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | + # Some procession of the definition for external programs (just + # an example): + # strip JALView extension from definition e.g. .../1-212 if item.definition =~ /\/\d+-\d+$/ item.definition = $` end ! # substitute slashes: ! definition = item.definition.gsub(/\//,'-') ! # substitute quotes and ampersands: ! definition = item.definition.gsub(/['"&]/,'x') ! # prefix letters if the first position is a number: ! definition = 'seq'+definition if definition =~ /^\d/ ! ! # Now add the data to the sort table ! table[definition] = item.data end end From ngoto at dev.open-bio.org Wed May 21 07:28:56 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 21 May 2008 11:28:56 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format.erb,1.1.2.1,NONE Message-ID: <200805211128.m4LBSu3f009781@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv9740/embl Removed Files: Tag: BRANCH-biohackathon2008 format.erb Log Message: removed unused file lib/bio/db/embl/format.erb. The contents of this file is already moved to lib/bio/db/embl/format_embl.rb and modified. 
--- format.erb DELETED --- From ngoto at dev.open-bio.org Wed May 21 08:27:35 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 21 May 2008 12:27:35 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.4,1.29.2.5 Message-ID: <200805211227.m4LCRZo6009984@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv9964/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: added rDoc for Bio::EMBL#to_biosequence Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.4 retrieving revision 1.29.2.5 diff -C2 -d -r1.29.2.4 -r1.29.2.5 *** embl.rb 21 Mar 2008 06:24:42 -0000 1.29.2.4 --- embl.rb 21 May 2008 12:27:33 -0000 1.29.2.5 *************** *** 371,377 **** alias naseq seq alias ntseq seq ! # // Line; termination line (end; 1/entry) def to_biosequence bio_seq = Bio::Sequence.new(self.seq) --- 371,383 ---- alias naseq seq alias ntseq seq ! ! 
#-- # // Line; termination line (end; 1/entry) + #++ + # converts the entry to Bio::Sequence object + # --- + # *Arguments*:: + # *Returns*:: Bio::Sequence object def to_biosequence bio_seq = Bio::Sequence.new(self.seq) From ngoto at dev.open-bio.org Wed May 28 09:09:05 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 28 May 2008 13:09:05 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.5,1.29.2.6 Message-ID: <200805281309.m4SD9504013095@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv13075/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: fixed possible typo Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.5 retrieving revision 1.29.2.6 diff -C2 -d -r1.29.2.5 -r1.29.2.6 *** embl.rb 21 May 2008 12:27:33 -0000 1.29.2.5 --- embl.rb 28 May 2008 13:09:03 -0000 1.29.2.6 *************** *** 384,388 **** bio_seq.entry_id = self.entry_id bio_seq.primary_accession = self.accessions[0] ! bio_seq.secondary_accessions = self.accessions[1,-1] || [] bio_seq.molecule_type = self.molecule_type bio_seq.data_class = self.data_class --- 384,388 ---- bio_seq.entry_id = self.entry_id bio_seq.primary_accession = self.accessions[0] ! 
bio_seq.secondary_accessions = self.accessions[1..-1] || [] bio_seq.molecule_type = self.molecule_type bio_seq.data_class = self.data_class From ngoto at dev.open-bio.org Wed May 28 09:26:35 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 28 May 2008 13:26:35 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, 1.1.2.3, 1.1.2.4 Message-ID: <200805281326.m4SDQZDb013144@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv13124/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 format_genbank.rb Log Message: simplify sequence formatting routine Index: format_genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/Attic/format_genbank.rb,v retrieving revision 1.1.2.3 retrieving revision 1.1.2.4 diff -C2 -d -r1.1.2.3 -r1.1.2.4 *** format_genbank.rb 7 May 2008 12:28:56 -0000 1.1.2.3 --- format_genbank.rb 28 May 2008 13:26:33 -0000 1.1.2.4 *************** *** 102,111 **** # formats sequence lines as GenBank ! def each_genbank_seqline(str) #:yields: counter, seqline i = 1 ! a = str.scan(/.{1,60}/) do |s| ! yield i, s.gsub(/(.{1,10})/, " \\1") i += 60 end end --- 102,114 ---- # formats sequence lines as GenBank ! def seq_format_genbank(str) i = 1 ! result = str.gsub(/.{1,60}/) do |s| ! s = s.gsub(/.{1,10}/, ' \0') ! y = sprintf("%9d%s\n", i, s) i += 60 + y end + result end *************** *** 129,135 **** <%= format_features_genbank(features || []) %>ORIGIN ! <% each_genbank_seqline(seq) do |i, s| ! %><%= sprintf('%9d', i) %><%= s %> ! <% end %>// __END_OF_TEMPLATE__ --- 132,137 ---- <%= format_features_genbank(features || []) %>ORIGIN ! <%= seq_format_genbank(seq) ! 
%>// __END_OF_TEMPLATE__ From ngoto at dev.open-bio.org Wed May 28 09:38:09 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 28 May 2008 13:38:09 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.4, 1.1.2.5 Message-ID: <200805281338.m4SDc9Dh013213@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv13173/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: simplify code of seq_format_embl(), and SQ line is changed not to show non-ACGT single base contents (which should be shown together as "other"). Index: format_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v retrieving revision 1.1.2.4 retrieving revision 1.1.2.5 diff -C2 -d -r1.1.2.4 -r1.1.2.5 *** format_embl.rb 7 May 2008 12:24:26 -0000 1.1.2.4 --- format_embl.rb 28 May 2008 13:38:07 -0000 1.1.2.5 *************** *** 106,121 **** def seq_format_embl(seq) - output_lines = Array.new counter = 0 ! remainder = seq.window_search(60,60) do |subseq| ! counter += 60 ! subseq.gsub!(/(.{10})/, '\1 ') ! output_lines.push(' '*5 + subseq + counter.to_s.rjust(9)) end ! counter += remainder.length ! remainder = (remainder.to_s + ' '*(60-remainder.length)) ! remainder.gsub!(/(.{10})/, '\1 ') ! output_lines.push(' '*5 + remainder + counter.to_s.rjust(9)) ! return output_lines.join("\n") end --- 106,126 ---- def seq_format_embl(seq) counter = 0 ! result = seq.gsub(/.{1,60}/) do |x| ! counter += x.length ! x = x.gsub(/.{10}/, '\0 ') ! sprintf(" %-66s%9d\n", x, counter) end ! result.chomp! ! result ! end ! ! def seq_composition(seq) ! { :a => seq.count('aA'), ! :c => seq.count('cC'), ! :g => seq.count('gG'), ! :t => seq.count('tTuU'), ! :other => seq.count('^aAcCgGtTuU') ! } end *************** *** 140,144 **** FH <%= format_features_embl(features || []) %>XX ! 
SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> // --- 145,149 ---- FH <%= format_features_embl(features || []) %>XX ! SQ Sequence <%= seq.length %> BP; <% c = seq_composition(seq) %><%= c[:a] %> A; <%= c[:c] %> C; <%= c[:g] %> G; <%= c[:t] %> T; <%= c[:other] %> other; <%= seq_format_embl(seq) %> // From pjotr at dev.open-bio.org Thu May 29 07:25:47 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Thu, 29 May 2008 11:25:47 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24,1.25 Message-ID: <200805291125.m4TBPlWZ015209@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv15189 Modified Files: reference.rb Log Message: - Improved bibtex support (optional output of abstract - strip empty fields) - Put generated URL into separate method Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.24 retrieving revision 1.25 diff -C2 -d -r1.24 -r1.25 *** reference.rb 5 Apr 2007 23:35:39 -0000 1.24 --- reference.rb 29 May 2008 11:25:44 -0000 1.25 *************** *** 71,77 **** attr_reader :abstract - # An URL String. - attr_reader :url - # MeSH terms in an Array. attr_reader :mesh --- 71,74 ---- *************** *** 128,132 **** @medline = hash['medline'] # 98765432 @abstract = hash['abstract'] - @url = hash['url'] @mesh = hash['mesh'] @affiliations = hash['affiliations'] --- 125,128 ---- *************** *** 232,241 **** lines << "%P #{@pages}" unless @pages.empty? lines << "%M #{@pubmed}" unless @pubmed.to_s.empty? ! if @pubmed ! cgi = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi" ! opts = "cmd=Retrieve&db=PubMed&dopt=Citation&list_uids" ! @url = "#{cgi}?#{opts}=#{@pubmed}" ! end ! lines << "%U #{@url}" unless @url.empty? 
lines << "%X #{@abstract}" unless @abstract.empty? @mesh.each do |term| --- 228,232 ---- lines << "%P #{@pages}" unless @pages.empty? lines << "%M #{@pubmed}" unless @pubmed.to_s.empty? ! lines << "%U #{url}" unless url.empty? lines << "%X #{@abstract}" unless @abstract.empty? @mesh.each do |term| *************** *** 299,318 **** # *Arguments*: # * (optional) _section_: BiBTeX section as String # *Returns*:: String ! def bibtex(section = nil) section = "article" unless section authors = authors_join(' and ', ' and ') pages = @pages.sub('-', '--') ! return <<-"END".gsub(/\t/, '') ! @#{section}{PMID:#{@pubmed}, ! author = {#{authors}}, ! title = {#{@title}}, ! journal = {#{@journal}}, ! year = {#{@year}}, ! volume = {#{@volume}}, ! number = {#{@issue}}, ! pages = {#{pages}}, ! } ! END end --- 290,317 ---- # *Arguments*: # * (optional) _section_: BiBTeX section as String + # * (optional) _keywords_: Array of additional keywords, e.g. ['abstract'] # *Returns*:: String ! def bibtex(section = nil, add_keywords = []) section = "article" unless section authors = authors_join(' and ', ' and ') pages = @pages.sub('-', '--') ! keywords = "author title journal year volume number pages url".split(/ /) ! bib = "@#{section}{PMID:#{@pubmed},\n" ! (keywords+add_keywords).each do | kw | ! if kw == 'author' ! ref = authors ! elsif kw == 'title' ! # strip final dot from title ! ref = @title.sub(/\.$/,'') ! elsif kw == 'number' ! ref = @issue ! elsif kw == 'url' ! ref = url ! else ! ref = eval('@'+kw) ! end ! bib += " #{kw.ljust(12)} = {#{ref}},\n" if ref != '' ! end ! 
bib+"}\n" end *************** *** 500,503 **** --- 499,513 ---- end + # Returns a valid URL for pubmed records + # + # *Returns*:: String + def url + if @pubmed != '' + cgi = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi" + opts = "cmd=Retrieve&db=PubMed&dopt=Citation&list_uids" + return "#{cgi}?#{opts}=#{@pubmed}" + end + '' + end private *************** *** 527,530 **** --- 537,541 ---- end + end From pjotr at dev.open-bio.org Sat May 31 05:36:58 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Sat, 31 May 2008 09:36:58 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio test_reference.rb,1.3,1.4 Message-ID: <200805310936.m4V9aw7X020318@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio In directory dev.open-bio.org:/tmp/cvs-serv20293/test/unit/bio Modified Files: test_reference.rb Log Message: - Bibtex: reverted on url regression per comment Naohisa - now it gets overridden on empty for pubmed only. - Bibtex: fixed unit tests Index: test_reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/test_reference.rb,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** test_reference.rb 5 Apr 2007 23:35:42 -0000 1.3 --- test_reference.rb 31 May 2008 09:36:56 -0000 1.4 *************** *** 91,95 **** def test_format_endnote ! str = "%0 Journal Article\n%A Hoge, J.P.\n%A Fuga, F.B.\n%D 2001\n%T Title of the study.\n%J Theor. J. Hoge\n%V 12\n%N 3\n%P 123-145\n%M 12345678\n%U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12345678\n%X Hoge fuga. hoge fuga.\n%K Hoge\n%+ Tokyo" assert_equal(str, @obj.format('endnote')) assert_equal(str, @obj.endnote) --- 91,95 ---- def test_format_endnote ! str = "%0 Journal Article\n%A Hoge, J.P.\n%A Fuga, F.B.\n%D 2001\n%T Title of the study.\n%J Theor. J. Hoge\n%V 12\n%N 3\n%P 123-145\n%M 12345678\n%U http://example.com\n%X Hoge fuga. 
hoge fuga.\n%K Hoge\n%+ Tokyo" assert_equal(str, @obj.format('endnote')) assert_equal(str, @obj.endnote) *************** *** 103,117 **** def test_format_bibtex ! str =< Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv20293/lib/bio Modified Files: reference.rb Log Message: - Bibtex: reverted on url regression per comment Naohisa - now it gets overridden on empty for pubmed only. - Bibtex: fixed unit tests Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.25 retrieving revision 1.26 diff -C2 -d -r1.25 -r1.26 *** reference.rb 29 May 2008 11:25:44 -0000 1.25 --- reference.rb 31 May 2008 09:36:55 -0000 1.26 *************** *** 77,80 **** --- 77,83 ---- attr_reader :affiliations + # An URL String. + attr_reader :url + # Create a new Bio::Reference object from a Hash of values. # Data is extracted from the values for keys: *************** *** 125,128 **** --- 128,132 ---- @medline = hash['medline'] # 98765432 @abstract = hash['abstract'] + @url = hash['url'] @mesh = hash['mesh'] @affiliations = hash['affiliations'] *************** *** 503,506 **** --- 507,511 ---- # *Returns*:: String def url + return @url if @url and @url != '' if @pubmed != '' cgi = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi" 
From ngoto at dev.open-bio.org Wed May 7 12:24:28 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:24:28 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.3, 1.1.2.4 Message-ID: <200805071224.m47COSwB007749@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7729/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: Bug fix: in RA line, every author's name should not be splitted into two lines. 
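Editorial illustration of the log message above (not part of the commit): a minimal sketch of greedy wrapping in which each author entry is treated as one unbreakable word, so a name is never divided between two lines. The helper name wrap_words, the width of 30 and the sample author list are assumptions chosen for the example; the 'RA' code followed by three blanks follows the usual EMBL line layout.

```ruby
# Hypothetical helper, not BioRuby code: greedy word wrapping in which
# each element of +words+ is kept whole on a single line.
def wrap_words(prefix, words, width = 80)
  lines = []
  current = nil
  words.each do |word|
    if current && current.length + 1 + word.length <= width
      # The word still fits on the current line, so append it in place.
      current << ' ' << word
    else
      # Start a new line; an overlong word simply gets a line of its own.
      current = prefix + word
      lines << current
    end
  end
  lines.join("\n")
end

# Sample data made up for this sketch.
authors = ['van der Waals J.D.,', 'Hoge J.P.,', 'Fuga F.B.;']
puts wrap_words('RA   ', authors, 30)
```

With a plain character-based wrap, "van der Waals J.D.," could be broken at any of its internal blanks; passing the names as array elements is what prevents that here.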
Index: format_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v retrieving revision 1.1.2.3 retrieving revision 1.1.2.4 diff -C2 -d -r1.1.2.3 -r1.1.2.4 *** format_embl.rb 23 Apr 2008 18:52:18 -0000 1.1.2.3 --- format_embl.rb 7 May 2008 12:24:26 -0000 1.1.2.4 *************** *** 21,28 **** --- 21,52 ---- private + # wrapping with EMBL style def embl_wrap(prefix, str) wrap(str.to_s, 80, prefix) end + # Given words (an Array of String) are wrapping with EMBL style. + # Each word is never splitted inside the word. + def embl_wrap_words(prefix, array) + width = 80 + result = [] + str = nil + array.each do |x| + if str then + if str.length + 1 + x.length > width then + str = nil + else + str.concat ' ' + str.concat x + end + end + unless str then + str = prefix + x + result.push str + end + end + result.join("\n") + end + # format reference # ref:: Bio::Reference object *************** *** 53,58 **** lines << embl_wrap("RX ", "PUBMED; #{ref.pubmed}.") end ! unless ref.authors.empty? ! lines << embl_wrap('RA ', ref.authors.join(', ') + ';') end lines << embl_wrap('RT ', --- 77,90 ---- lines << embl_wrap("RX ", "PUBMED; #{ref.pubmed}.") end ! unless ref.authors.empty? then ! auth = ref.authors.collect do |x| ! y = x.to_s.strip.split(/\, *([^\,]+)\z/) ! y[1].gsub!(/\. +/, '.') if y[1] ! y.join(' ') ! end ! lastauth = auth.pop ! auth.each { |x| x.concat ',' } ! auth.push(lastauth.to_s + ';') ! 
lines << embl_wrap_words('RA ', auth) end lines << embl_wrap('RT ', From ngoto at dev.open-bio.org Wed May 7 12:25:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:25:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank common.rb, 1.11.2.3, 1.11.2.4 Message-ID: <200805071225.m47CPivB007816@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7778/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: added support for 'REMARK' (comment in reference) Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/common.rb,v retrieving revision 1.11.2.3 retrieving revision 1.11.2.4 diff -C2 -d -r1.11.2.3 -r1.11.2.4 *** common.rb 4 Mar 2008 10:32:55 -0000 1.11.2.3 --- common.rb 7 May 2008 12:25:42 -0000 1.11.2.4 *************** *** 138,142 **** ary = [] toptag2array(get('REFERENCE')).each do |ref| ! hash = Hash.new('') subtag2array(ref).each do |field| case tag_get(field) --- 138,142 ---- ary = [] toptag2array(get('REFERENCE')).each do |ref| ! hash = Hash.new subtag2array(ref).each do |field| case tag_get(field) *************** *** 175,178 **** --- 175,181 ---- when /PUBMED/ hash['pubmed'] = truncate(tag_cut(field)) + when /REMARK/ + hash['comments'] ||= [] + hash['comments'].push truncate(tag_cut(field)) end end From ngoto at dev.open-bio.org Wed May 7 12:28:58 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 07 May 2008 12:28:58 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, 1.1.2.2, 1.1.2.3 Message-ID: <200805071228.m47CSw3A007865@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7845/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 format_genbank.rb Log Message: * added support for 'REMARK' (comment in reference). 
* Bug Fix: an author's name should not be separated into two lines. Index: format_genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/Attic/format_genbank.rb,v retrieving revision 1.1.2.2 retrieving revision 1.1.2.3 diff -C2 -d -r1.1.2.2 -r1.1.2.3 *** format_genbank.rb 7 May 2008 06:17:52 -0000 1.1.2.2 --- format_genbank.rb 7 May 2008 12:28:56 -0000 1.1.2.3 *************** *** 33,36 **** --- 33,104 ---- end + # Given words (an Array of String) are wrapping with EMBL style. + # Each word is never splitted inside the word. + def genbank_wrap_words(array) + width = 67 + result = [] + str = nil + array.each do |x| + if str then + if str.length + 1 + x.length > width then + str = nil + else + str.concat ' ' + str.concat x + end + end + unless str then + str = "#{x}" + result.push str + end + end + result.join("\n" + " " * 12) + end + + # formats references + def reference_format_genbank(ref, num) + pos = ref.sequence_position.to_s.gsub(/\s/, '') + pos.gsub!(/(\d+)\-(\d+)/, "\\1 to \\2") + pos.gsub!(/\s*\,\s*/, '; ') + if pos.empty? + pos = '' + else + pos = " (bases #{pos})" + end + volissue = "#{ref.volume.to_s}" + volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? + journal = "#{ref.journal.to_s}" + journal += " #{volissue}" unless volissue.empty? + journal += ", #{ref.pages}" unless ref.pages.to_s.empty? + journal += " (#{ref.year})" unless ref.year.to_s.empty? + + alist = ref.authors.collect do |x| + y = x.to_s.strip.split(/\, *([^\,]+)\z/) + y[1].gsub!(/\. +/, '.') if y[1] + y.join(',') + end + lastauthor = alist.pop + last2author = alist.pop + alist.each { |x| x.concat ',' } + alist.push last2author if last2author + alist.push "and" unless alist.empty? 
+ alist.push lastauthor.to_s + result = <<__END_OF_REFERENCE__ + REFERENCE #{ genbank_wrap(sprintf('%-2d%s', num, pos))} + AUTHORS #{ genbank_wrap_words(alist) } + TITLE #{ genbank_wrap(ref.title.to_s) } + JOURNAL #{ genbank_wrap(journal) } + __END_OF_REFERENCE__ + unless ref.pubmed.to_s.empty? then + result.concat " PUBMED #{ genbank_wrap(ref.pubmed) }\n" + end + if ref.comments and !(ref.comments.empty?) then + ref.comments.each do |c| + result.concat " REMARK #{ genbank_wrap(c) }\n" + end + end + result + end + # formats sequence lines as GenBank def each_genbank_seqline(str) #:yields: counter, seqline *************** *** 56,87 **** (references or []).each do |ref| n += 1 ! pos = ref.sequence_position.to_s.gsub(/\s/, '') ! pos.gsub!(/(\d+)\-(\d+)/, "\\1 to \\2") ! pos.gsub!(/\s*\,\s*/, '; ') ! if pos.empty? ! pos = '' ! else ! pos = " (bases #{pos})" ! end ! volissue = "#{ref.volume.to_s}" ! volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? ! journal = "#{ref.journal.to_s}" ! journal += " #{volissue}" unless volissue.empty? ! journal += ", #{ref.pages}" unless ref.pages.to_s.empty? ! journal += " (#{ref.year})" unless ref.year.to_s.empty? ! ! alist = ref.authors.collect { |x| x.gsub(/\, /, ',') } ! lastauthor = alist.pop ! authorsline = alist.join(', ') ! authorsline.concat(" and ") unless alist.empty? ! authorsline.concat lastauthor.to_s ! ! %>REFERENCE <%= genbank_wrap(sprintf('%-2d%s', n, pos)) %> ! AUTHORS <%= genbank_wrap(authorsline) %> ! TITLE <%= genbank_wrap(ref.title.to_s) %> ! JOURNAL <%= genbank_wrap(journal) %> ! <% unless ref.pubmed.to_s.empty? ! %> PUBMED <%= ref.pubmed %> ! <% end end %>FEATURES Location/Qualifiers --- 124,128 ---- (references or []).each do |ref| n += 1 ! 
%><%= reference_format_genbank(ref, n) %><% end %>FEATURES Location/Qualifiers From ngoto at dev.open-bio.org Thu May 8 05:38:03 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 08 May 2008 05:38:03 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio test_reference.rb, 1.3, 1.3.2.1 test_feature.rb, 1.5, 1.5.2.1 Message-ID: <200805080538.m485c31o010619@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio In directory dev.open-bio.org:/tmp/cvs-serv10599/test/unit/bio Modified Files: Tag: BRANCH-biohackathon2008 test_reference.rb test_feature.rb Log Message: Unit test codes are changed due to the changes of Bio::References and Bio::Features (those are obsoleted but exist for backaward compatibility). Index: test_feature.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/test_feature.rb,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 *** test_feature.rb 5 Apr 2007 23:35:42 -0000 1.5 --- test_feature.rb 8 May 2008 05:38:01 -0000 1.5.2.1 *************** *** 15,18 **** --- 15,19 ---- require 'test/unit' require 'bio/feature' + require 'bio/compat/features' *************** *** 89,96 **** --- 90,124 ---- end + class NullStderr + def initialize + @log = [] + end + + def write(*arg) + #p arg + @log.push([ :write, *arg ]) + nil + end + + def method_missing(*arg) + #p arg + @log.push arg + nil + end + end + class TestFeatures < Test::Unit::TestCase def setup + # To suppress warning messages, $stderr is replaced by dummy object. 
+ @stderr_orig = $stderr + $stderr = NullStderr.new + @obj = Bio::Features.new([Bio::Feature.new('gene', '1..615', [])]) end + + def teardown + # bring back $stderr + $stderr = @stderr_orig + end def test_features Index: test_reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/test_reference.rb,v retrieving revision 1.3 retrieving revision 1.3.2.1 diff -C2 -d -r1.3 -r1.3.2.1 *** test_reference.rb 5 Apr 2007 23:35:42 -0000 1.3 --- test_reference.rb 8 May 2008 05:38:01 -0000 1.3.2.1 *************** *** 15,18 **** --- 15,19 ---- require 'test/unit' require 'bio/reference' + require 'bio/compat/references' *************** *** 173,179 **** --- 174,202 ---- end + class NullStderr + def initialize + @log = [] + end + + def write(*arg) + #p arg + @log.push([ :write, *arg ]) + nil + end + + def method_missing(*arg) + #p arg + @log.push arg + nil + end + end + class TestReferences < Test::Unit::TestCase def setup + # To suppress warning messages, $stderr is replaced by dummy object. + @stderr_orig = $stderr + $stderr = NullStderr.new + hash = {} ary = [Bio::Reference.new(hash), *************** *** 182,185 **** --- 205,213 ---- end + def teardown + # bring back $stderr + $stderr = @stderr_orig + end + def test_append hash = {} From ngoto at dev.open-bio.org Fri May 9 02:32:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 09 May 2008 02:32:47 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_xmlparser.rb, 1.7, 1.7.2.1 Message-ID: <200805090232.m492Wlfl015068@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv15048/test/unit/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 test_xmlparser.rb Log Message: Bug fix: tests in test/unit/bio/appl/blast/test_report.rb was ignored because of conflicts of test classes' names (TestBlastReport, etc.). 
The class names in test/unit/bio/appl/blast/test_xmlparser.rb is changed because it contains less assertions than that of test_report.rb. Index: test_xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/appl/blast/test_xmlparser.rb,v retrieving revision 1.7 retrieving revision 1.7.2.1 diff -C2 -d -r1.7 -r1.7.2.1 *** test_xmlparser.rb 5 Apr 2007 23:35:43 -0000 1.7 --- test_xmlparser.rb 9 May 2008 02:32:44 -0000 1.7.2.1 *************** *** 16,20 **** ! module Bio class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s --- 16,20 ---- ! module Bio::TestBlastXMLParser class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s *************** *** 36,40 **** def setup ! @report = Bio::Blast::Report.new(Bio::TestBlastFormat7XMLParserData.output) end --- 36,40 ---- def setup ! @report = Bio::Blast::Report.new(TestBlastFormat7XMLParserData.output) end *************** *** 188,192 **** class TestBlastReportHit < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first --- 188,192 ---- class TestBlastReportHit < Test::Unit::TestCase def setup ! data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first *************** *** 293,297 **** class TestBlastReportHsp < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first --- 293,297 ---- class TestBlastReportHsp < Test::Unit::TestCase def setup ! 
data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first From ngoto at dev.open-bio.org Mon May 12 09:52:18 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 09:52:18 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.26,1.26.2.1 Message-ID: <200805120952.m4C9qIUl004178@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4135 Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: same changes as 1.26 => 1.27 in trunk: Fixed a bug when a null line is inserted after database title in some cases, reported by Tomoaki NISHIYAMA. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26 retrieving revision 1.26.2.1 diff -C2 -d -r1.26 -r1.26.2.1 *** format0.rb 12 Feb 2008 02:13:31 -0000 1.26 --- format0.rb 12 May 2008 09:52:15 -0000 1.26.2.1 *************** *** 294,297 **** --- 294,302 ---- @f0query = data.shift @f0database = data.shift + # In special case, a void line is inserted after database name. + if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then + @f0database.concat "\n" + @f0database.concat data.shift + end end From ngoto at dev.open-bio.org Mon May 12 11:16:19 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:16:19 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb, 1.26.2.1, 1.26.2.2 Message-ID: <200805121116.m4CBGJMX004407@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4387/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: Bug fix: Bio::Blast::Default::Report#eff_space returns wrong value ("Effective length of database"). It should return "Effective search space". 
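Editorial illustration of the log message above (not part of the commit): a minimal sketch of why the key matters. The statistics excerpt, its numbers and the helper parse_stats are assumptions made up for this example; only the two key names and the tr(',', '').to_i conversion are taken from the change itself.

```ruby
# Illustrative excerpt only; the numbers are invented for this sketch.
stats_text = <<~TEXT
  Effective length of database: 1,234,567
  Effective search space: 89,012,345
TEXT

# Hypothetical helper: collect "key: value" pairs with lower-cased keys.
def parse_stats(text)
  text.each_line.with_object({}) do |line, hash|
    key, value = line.chomp.split(/:\s*/, 2)
    hash[key.downcase] = value if value
  end
end

stats = parse_stats(stats_text)

# The two entries are different quantities; eff_space should come from
# the "search space" entry, not from the database length.
eff_space  = stats['effective search space'].tr(',', '').to_i
db_eff_len = stats['effective length of database'].tr(',', '').to_i
puts eff_space
```

Reading the wrong key still yields a plausible-looking integer, which may be why the mix-up was easy to miss.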
Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26.2.1 retrieving revision 1.26.2.2 diff -C2 -d -r1.26.2.1 -r1.26.2.2 *** format0.rb 12 May 2008 09:52:15 -0000 1.26.2.1 --- format0.rb 12 May 2008 11:16:17 -0000 1.26.2.2 *************** *** 388,392 **** #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! if val = @hash['effective length of database'] then @eff_space = val.tr(',', '').to_i end --- 388,392 ---- #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! if val = @hash['effective search space'] then @eff_space = val.tr(',', '').to_i end From ngoto at dev.open-bio.org Mon May 12 11:25:57 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:25:57 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.27,1.28 Message-ID: <200805121125.m4CBPvmO004456@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4436 Modified Files: format0.rb Log Message: The same change as 1.26.2.1 ==> 1.26.2.2: Bug fix: Bio::Blast::Default::Report#eff_space returns wrong value ("Effective length of database"). It should return "Effective search space". Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.27 retrieving revision 1.28 diff -C2 -d -r1.27 -r1.28 *** format0.rb 1 Apr 2008 06:31:35 -0000 1.27 --- format0.rb 12 May 2008 11:25:55 -0000 1.28 *************** *** 388,392 **** #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! 
if val = @hash['effective length of database'] then @eff_space = val.tr(',', '').to_i end --- 388,392 ---- #@db_num = @hash['Number of Sequences'] unless defined?(@db_num) #@db_len = @hash['length of database'] unless defined?(@db_len) ! if val = @hash['effective search space'] then @eff_space = val.tr(',', '').to_i end From ngoto at dev.open-bio.org Mon May 12 11:49:10 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:49:10 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_report.rb, 1.5, 1.5.2.1 Message-ID: <200805121149.m4CBnAL5004568@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4548 Modified Files: Tag: BRANCH-biohackathon2008 test_report.rb Log Message: * Class TestBlastReportData is changed to module TestBlastReportHelper and improved. * Changed to test both rexml and xmlparser. * Added tests for Bio::Blast::Default::Report. * Name of a method Bio::TestBlastReport#test_extrez_query is changed to "test_entrez_query" because it may be a typo. * Some tests are changed to use assert_nothing_raised{ ... } instead of assert() (or no assertions). Index: test_report.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/appl/blast/test_report.rb,v retrieving revision 1.5 retrieving revision 1.5.2.1 diff -C2 -d -r1.5 -r1.5.2.1 *** test_report.rb 5 Apr 2007 23:35:43 -0000 1.5 --- test_report.rb 12 May 2008 11:49:08 -0000 1.5.2.1 *************** *** 17,46 **** module Bio ! class TestBlastReportData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s TestDataBlast = Pathname.new(File.join(bioruby_root, 'test', 'data', 'blast')).cleanpath.to_s ! def self.input ! File.open(File.join(TestDataBlast, 'b0002.faa')).read end ! def self.output(format = 7) ! case format ! when 0 ! File.open(File.join(TestDataBlast, 'b0002.faa.m0')).read ! 
when 7 ! File.open(File.join(TestDataBlast, 'b0002.faa.m7')).read ! when 8 ! File.open(File.join(TestDataBlast, 'b0002.faa.m8')).read end end ! end - class TestBlastReport < Test::Unit::TestCase ! require 'bio/appl/blast/report' def setup ! @report = Bio::Blast::Report.new(Bio::TestBlastReportData.output) end --- 17,68 ---- module Bio ! ! module TestBlastReportHelper bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s TestDataBlast = Pathname.new(File.join(bioruby_root, 'test', 'data', 'blast')).cleanpath.to_s ! private ! ! def get_input_data(basename = 'b0002.faa') ! File.open(File.join(TestDataBlast, basename)).read end ! def get_output_data(basename = 'b0002.faa', format = 7) ! fn = basename + ".m#{format.to_i}" ! ! # available filenames: ! # 'b0002.faa.m0' ! # 'b0002.faa.m7' ! # 'b0002.faa.m8' ! ! File.open(File.join(TestDataBlast, fn)).read ! end ! ! def create_report_object(basename = 'b0002.faa') ! case self.class.name.to_s ! when /XMLParser/i ! text = get_output_data(basename, 7) ! Bio::Blast::Report.new(text, :xmlparser) ! when /REXML/i ! text = get_output_data(basename, 7) ! Bio::Blast::Report.new(text, :rexml) ! when /Default/i ! text = get_output_data(basename, 0) ! Bio::Blast::Default::Report.new(text) ! when /Tab/i ! text = get_output_data(basename, 8) ! Bio::Blast::Report.new(text) ! else ! text = get_output_data(basename, 7) ! Bio::Blast::Report.new(text) end end ! end #module TestBlastReportHelper class TestBlastReport < Test::Unit::TestCase ! include TestBlastReportHelper def setup ! @report = create_report_object end *************** *** 97,109 **** def test_inclusion ! assert(@report.inclusion) end def test_sc_match ! assert(@report.sc_match) end def test_sc_mismatch ! assert(@report.sc_mismatch) end --- 119,131 ---- def test_inclusion ! assert_nothing_raised { @report.inclusion } end def test_sc_match ! assert_nothing_raised { @report.sc_match } end def test_sc_mismatch ! 
assert_nothing_raised { @report.sc_mismatch } end *************** *** 124,137 **** end ! def test_extrez_query assert_equal(nil, @report.entrez_query) end def test_each_iteration ! @report.each_iteration { |itr| } end def test_each_hit ! @report.each_hit { |hit| } end --- 146,163 ---- end ! def test_entrez_query assert_equal(nil, @report.entrez_query) end def test_each_iteration ! assert_nothing_raised { ! @report.each_iteration { |itr| } ! } end def test_each_hit ! assert_nothing_raised { ! @report.each_hit { |hit| } ! } end *************** *** 178,184 **** class TestBlastReportIteration < Test::Unit::TestCase def setup ! data = Bio::TestBlastReportData.output ! report = Bio::Blast::Report.new(data) @itr = report.iterations.first end --- 204,211 ---- class TestBlastReportIteration < Test::Unit::TestCase + include TestBlastReportHelper + def setup ! report = create_report_object @itr = report.iterations.first end *************** *** 205,211 **** class TestBlastReportHit < Test::Unit::TestCase def setup ! data = Bio::TestBlastReportData.output ! report = Bio::Blast::Report.new(data) @hit = report.hits.first end --- 232,239 ---- class TestBlastReportHit < Test::Unit::TestCase + include TestBlastReportHelper + def setup ! report = create_report_object @hit = report.hits.first end *************** *** 316,322 **** class TestBlastReportHsp < Test::Unit::TestCase def setup ! data = Bio::TestBlastReportData.output ! report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first end --- 344,351 ---- class TestBlastReportHsp < Test::Unit::TestCase + include TestBlastReportHelper + def setup ! report = create_report_object @hsp = report.hits.first.hsps.first end *************** *** 343,347 **** def test_Hsp_gaps ! assert(@hsp.gaps) end --- 372,376 ---- def test_Hsp_gaps ! assert_nothing_raised { @hsp.gaps } end *************** *** 383,391 **** def test_Hsp_pattern_from ! @hsp.pattern_from end def test_Hsp_pattern_to ! 
@hsp.pattern_to end --- 412,420 ---- def test_Hsp_pattern_from ! assert_nothing_raised { @hsp.pattern_from } end def test_Hsp_pattern_to ! assert_nothing_raised { @hsp.pattern_to } end *************** *** 406,417 **** def test_Hsp_percent_identity ! @hsp.percent_identity end def test_Hsp_mismatch_count ! @hsp.mismatch_count end end end # module Bio --- 435,614 ---- def test_Hsp_percent_identity ! assert_nothing_raised { @hsp.percent_identity } end def test_Hsp_mismatch_count ! assert_nothing_raised { @hsp.mismatch_count } end end + class TestBlastReportREXML < TestBlastReport + end + + class TestBlastReportIterationREXML < TestBlastReportIteration + end + + class TestBlastReportHitREXML < TestBlastReportHit + end + + class TestBlastReportHspREXML < TestBlastReportHsp + end + + if defined? XMLParser then + + class TestBlastReportXMLParser < TestBlastReport + end + + class TestBlastReportIterationXMLParser < TestBlastReportIteration + end + + class TestBlastReportHitXMLParser < TestBlastReportHit + end + + class TestBlastReportHspXMLParser < TestBlastReportHsp + end + + end #if defined? XMLParser + + class TestBlastReportDefault < TestBlastReport + undef test_entrez_query + undef test_filter + undef test_hsp_len + undef test_inclusion + undef test_parameters + undef test_query_id + undef test_statistics + + def test_program + assert_equal('BLASTP', @report.program) + end + + def test_reference + text_str = 'Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.' 
+ assert_equal(text_str, @report.reference) + end + + def test_version + assert_equal('BLASTP 2.2.10 [Oct-19-2004]', @report.version) + end + + def test_kappa + assert_equal(0.134, @report.kappa) + end + + def test_lambda + assert_equal(0.319, @report.lambda) + end + + def test_entropy + assert_equal(0.383, @report.entropy) + end + + def test_gapped_kappa + assert_equal(0.0410, @report.gapped_kappa) + end + + def test_gapped_lambda + assert_equal(0.267, @report.gapped_lambda) + end + + def test_gapped_entropy + assert_equal(0.140, @report.gapped_entropy) + end + end + + class TestBlastReportIterationDefault < TestBlastReportIteration + undef test_statistics + end + + class TestBlastReportHitDefault < TestBlastReportHit + undef test_Hit_accession + undef test_Hit_hit_id + undef test_Hit_num + undef test_Hit_query_def + undef test_Hit_query_id + undef test_Hit_query_len + + def setup + @filtered_query_sequence = 'MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERIFAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEARGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYSAAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPCLIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLITQSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAALARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQxxxxxxxxxxxxxxALLEQLKRQQSWLKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAVADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELMKFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIEIEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFKVKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV' + super + end + + def test_Hit_bit_score + # differs from XML because of truncation in the default format + assert_equal(1567.0, @hit.bit_score) + end + + def test_Hit_identity + # differs from XML because filtered residues are not counted in the + # 
default format + assert_equal(806, @hit.identity) + end + + def test_Hit_midline + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, ' ') + assert_equal(seq, @hit.midline) + end + + def test_Hit_query_seq + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, 'X') + assert_equal(seq, @hit.query_seq) + end + end + + class TestBlastReportHspDefault < TestBlastReportHsp + undef test_Hsp_density + undef test_Hsp_mismatch_count + undef test_Hsp_num + undef test_Hsp_pattern_from + undef test_Hsp_pattern_to + + def setup + @filtered_query_sequence = 'MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNISDAERIFAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINAALICRGEKMSIAIMAGVLEARGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIPADHMVLMAGFTAGNEKGELVVLGRNGSDYSAAVLAACLRADCCEIWTDVDGVYTCDPRQVPDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPCLIKNTGNPQAPGTLIGASRDEDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLITQSSSEYSISFCVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAALARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQxxxxxxxxxxxxxxALLEQLKRQQSWLKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRLVKEYHLLNPVIVDCTSSQAVADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSRRKFLYDTNVGAGLPVIENLQNLLNAGDELMKFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIEIEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFKVKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV' + super + end + + def test_Hsp_identity + # differs from XML because filtered residues are not counted in the + # default format + assert_equal(806, @hsp.identity) + end + + def test_Hsp_positive + # differs from XML because filtered residues are not counted in the + # default format + assert_equal(806, @hsp.positive) + end + + def test_Hsp_midline + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, ' ') + assert_equal(seq, @hsp.midline) 
+ end + + def test_Hsp_qseq + # differs from XML because filtered residues are not specified in XML + seq = @filtered_query_sequence.gsub(/x/, 'X') + assert_equal(seq, @hsp.qseq) + end + + def test_Hsp_hit_score + # differs from XML because of truncation in the default format + assert_equal(1567.0, @hsp.bit_score) + end + + def test_Hsp_hit_frame + # differs from XML because not available in the default BLASTP format + assert_equal(nil, @hsp.hit_frame) + end + + def test_Hsp_query_frame + # differs from XML because not available in the default BLASTP format + assert_equal(nil, @hsp.query_frame) + end + end + end # module Bio From ngoto at dev.open-bio.org Mon May 12 11:50:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 11:50:44 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_xmlparser.rb, 1.7.2.1, NONE Message-ID: <200805121150.m4CBoiA4004596@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4576 Removed Files: Tag: BRANCH-biohackathon2008 test_xmlparser.rb Log Message: test_xmlparser.rb is removed because it has few assertions and its role is now merged into test_report.rb. --- test_xmlparser.rb DELETED --- From ngoto at dev.open-bio.org Mon May 12 12:01:22 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 12:01:22 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio/appl/blast test_xmlparser.rb, 1.7, 1.8 Message-ID: <200805121201.m4CC1Mog004646@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4624/test/unit/bio/appl/blast Modified Files: test_xmlparser.rb Log Message: Same changes as 1.7 to 1.7.2.1 in BRANCH-biohackathon2008: Bug fix: tests in test/unit/bio/appl/blast/test_report.rb was ignored because of conflicts of test classes' names (TestBlastReport, etc.). 
The class names in test/unit/bio/appl/blast/test_xmlparser.rb are changed because it contains fewer assertions than test_report.rb. Index: test_xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/appl/blast/test_xmlparser.rb,v retrieving revision 1.7 retrieving revision 1.8 diff -C2 -d -r1.7 -r1.8 *** test_xmlparser.rb 5 Apr 2007 23:35:43 -0000 1.7 --- test_xmlparser.rb 12 May 2008 12:01:20 -0000 1.8 *************** *** 16,20 **** ! module Bio class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s --- 16,20 ---- ! module Bio::TestBlastXMLParser class TestBlastFormat7XMLParserData bioruby_root = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 5)).cleanpath.to_s *************** *** 36,40 **** def setup ! @report = Bio::Blast::Report.new(Bio::TestBlastFormat7XMLParserData.output) end --- 36,40 ---- def setup ! @report = Bio::Blast::Report.new(TestBlastFormat7XMLParserData.output) end *************** *** 188,192 **** class TestBlastReportHit < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first --- 188,192 ---- class TestBlastReportHit < Test::Unit::TestCase def setup ! data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hit = report.hits.first *************** *** 293,297 **** class TestBlastReportHsp < Test::Unit::TestCase def setup ! data = Bio::TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first --- 293,297 ---- class TestBlastReportHsp < Test::Unit::TestCase def setup !
data = TestBlastFormat7XMLParserData.output report = Bio::Blast::Report.new(data) @hsp = report.hits.first.hsps.first From ngoto at dev.open-bio.org Mon May 12 13:11:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 13:11:47 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rexml.rb, 1.12, 1.13 xmlparser.rb, 1.17, 1.18 Message-ID: <200805121311.m4CDBl0K004957@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4930/lib/bio/appl/blast Modified Files: rexml.rb xmlparser.rb Log Message: * lib/bio/appl/blast/xmlparser.rb, lib/bio/appl/blast/rexml.rb Bug fix: unit test sometime fails due to improper treatment of some Blast parameters and difference between rexml and xmlparser. To fix the bug, types of some parameters may be changed, e.g. Bio::Blast::Report#expect is changed to return Float or nil. * ChangeLog ChangeLog for today's changes to lib/bio/appl/blast/* and related files. Index: xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/xmlparser.rb,v retrieving revision 1.17 retrieving revision 1.18 diff -C2 -d -r1.17 -r1.18 *** xmlparser.rb 5 Apr 2007 23:35:39 -0000 1.17 --- xmlparser.rb 12 May 2008 13:11:45 -0000 1.18 *************** *** 116,139 **** end ! def xmlparser_parse_parameters(hash) ! labels = { ! 'matrix' => 'Parameters_matrix', ! 'expect' => 'Parameters_expect', ! 'include' => 'Parameters_include', ! 'sc-match' => 'Parameters_sc-match', ! 'sc-mismatch' => 'Parameters_sc-mismatch', ! 'gap-open' => 'Parameters_gap-open', ! 'gap-extend' => 'Parameters_gap-extend', ! 'filter' => 'Parameters_filter', ! 'pattern' => 'Parameters_pattern', ! 'entrez-query' => 'Parameters_entrez-query', ! } ! labels.each do |k,v| case k ! when 'filter', 'matrix' ! @parameters[k] = hash[v].to_s else ! @parameters[k] = hash[v].to_i end end end --- 116,148 ---- end ! 
# set parameter of the key as val ! def xml_set_parameter(key, val) ! #labels = { ! # 'matrix' => 'Parameters_matrix', ! # 'expect' => 'Parameters_expect', ! # 'include' => 'Parameters_include', ! # 'sc-match' => 'Parameters_sc-match', ! # 'sc-mismatch' => 'Parameters_sc-mismatch', ! # 'gap-open' => 'Parameters_gap-open', ! # 'gap-extend' => 'Parameters_gap-extend', ! # 'filter' => 'Parameters_filter', ! # 'pattern' => 'Parameters_pattern', ! # 'entrez-query' => 'Parameters_entrez-query', ! #} ! k = key.sub(/\AParameters\_/, '') ! @parameters[k] = case k ! when 'expect', 'include' ! val.to_f ! when /\Agap\-/, /\Asc\-/ ! val.to_i else ! val end + end + + def xmlparser_parse_parameters(hash) + hash.each do |k, v| + xml_set_parameter(k, v) end end Index: rexml.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/rexml.rb,v retrieving revision 1.12 retrieving revision 1.13 diff -C2 -d -r1.12 -r1.13 *** rexml.rb 5 Apr 2007 23:35:39 -0000 1.12 --- rexml.rb 12 May 2008 13:11:45 -0000 1.13 *************** *** 38,44 **** when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! k = p.name.sub(/Parameters_/, '') ! v = p.text =~ /\D/ ? p.text : p.text.to_i ! @parameters[k] = v end else --- 38,42 ---- when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! 
xml_set_parameter(p.name, p.text) end else From ngoto at dev.open-bio.org Mon May 12 13:11:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 13:11:47 +0000 Subject: [BioRuby-cvs] bioruby ChangeLog,1.85,1.86 Message-ID: <200805121311.m4CDBlOs004952@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv4930 Modified Files: ChangeLog Log Message: * lib/bio/appl/blast/xmlparser.rb, lib/bio/appl/blast/rexml.rb Bug fix: unit test sometime fails due to improper treatment of some Blast parameters and difference between rexml and xmlparser. To fix the bug, types of some parameters may be changed, e.g. Bio::Blast::Report#expect is changed to return Float or nil. * ChangeLog ChangeLog for today's changes to lib/bio/appl/blast/* and related files. Index: ChangeLog =================================================================== RCS file: /home/repository/bioruby/bioruby/ChangeLog,v retrieving revision 1.85 retrieving revision 1.86 diff -C2 -d -r1.85 -r1.86 *** ChangeLog 15 Apr 2008 13:54:38 -0000 1.85 --- ChangeLog 12 May 2008 13:11:45 -0000 1.86 *************** *** 1,2 **** --- 1,23 ---- + 2008-05-12 Naohisa Goto + + * lib/bio/appl/blast/xmlparser.rb, lib/bio/appl/blast/rexml.rb + + Bug fix: unit test sometime fails due to improper treatment of some + Blast parameters and difference between rexml and xmlparser. + To fix the bug, types of some parameters may be changed, e.g. + Bio::Blast::Report#expect is changed to return Float or nil. + + * lib/bio/appl/blast/format0.rb + + Bug fix: Bio::Blast::Default::Report#eff_space returns wrong value + ("Effective length of database"). It should return the value of + "Effective search space". + + * test/unit/bio/appl/blast/test_xmlparser.rb + + Bug fix: tests in test/unit/bio/appl/blast/test_report.rb were + ignored because of conflicts of the names of test classes. + Class name in test_xmlparser.rb is changed to fix the bug. 
+ 2008-04-15 Naohisa Goto From ngoto at dev.open-bio.org Mon May 12 13:19:35 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 12 May 2008 13:19:35 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rexml.rb, 1.12, 1.12.2.1 xmlparser.rb, 1.17, 1.17.2.1 Message-ID: <200805121319.m4CDJZTf004986@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv4966/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 rexml.rb xmlparser.rb Log Message: Merging differences between 1.17 and 1.18 into xmlparser.rb and between 1.12 and 1.13 into rexml.rb. Index: xmlparser.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/xmlparser.rb,v retrieving revision 1.17 retrieving revision 1.17.2.1 diff -C2 -d -r1.17 -r1.17.2.1 *** xmlparser.rb 5 Apr 2007 23:35:39 -0000 1.17 --- xmlparser.rb 12 May 2008 13:19:32 -0000 1.17.2.1 *************** *** 116,139 **** end ! def xmlparser_parse_parameters(hash) ! labels = { ! 'matrix' => 'Parameters_matrix', ! 'expect' => 'Parameters_expect', ! 'include' => 'Parameters_include', ! 'sc-match' => 'Parameters_sc-match', ! 'sc-mismatch' => 'Parameters_sc-mismatch', ! 'gap-open' => 'Parameters_gap-open', ! 'gap-extend' => 'Parameters_gap-extend', ! 'filter' => 'Parameters_filter', ! 'pattern' => 'Parameters_pattern', ! 'entrez-query' => 'Parameters_entrez-query', ! } ! labels.each do |k,v| case k ! when 'filter', 'matrix' ! @parameters[k] = hash[v].to_s else ! @parameters[k] = hash[v].to_i end end end --- 116,148 ---- end ! # set parameter of the key as val ! def xml_set_parameter(key, val) ! #labels = { ! # 'matrix' => 'Parameters_matrix', ! # 'expect' => 'Parameters_expect', ! # 'include' => 'Parameters_include', ! # 'sc-match' => 'Parameters_sc-match', ! # 'sc-mismatch' => 'Parameters_sc-mismatch', ! # 'gap-open' => 'Parameters_gap-open', !
# 'gap-extend' => 'Parameters_gap-extend', ! # 'filter' => 'Parameters_filter', ! # 'pattern' => 'Parameters_pattern', ! # 'entrez-query' => 'Parameters_entrez-query', ! #} ! k = key.sub(/\AParameters\_/, '') ! @parameters[k] = case k ! when 'expect', 'include' ! val.to_f ! when /\Agap\-/, /\Asc\-/ ! val.to_i else ! val end + end + + def xmlparser_parse_parameters(hash) + hash.each do |k, v| + xml_set_parameter(k, v) end end Index: rexml.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/rexml.rb,v retrieving revision 1.12 retrieving revision 1.12.2.1 diff -C2 -d -r1.12 -r1.12.2.1 *** rexml.rb 5 Apr 2007 23:35:39 -0000 1.12 --- rexml.rb 12 May 2008 13:19:32 -0000 1.12.2.1 *************** *** 38,44 **** when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! k = p.name.sub(/Parameters_/, '') ! v = p.text =~ /\D/ ? p.text : p.text.to_i ! @parameters[k] = v end else --- 38,42 ---- when 'BlastOutput_param' e.elements["Parameters"].each_element_with_text do |p| ! xml_set_parameter(p.name, p.text) end else From ngoto at dev.open-bio.org Tue May 13 11:19:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 13 May 2008 11:19:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb, 1.26.2.2, 1.26.2.3 Message-ID: <200805131119.m4DBJiw2008784@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv8700/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: Bio::Blast::Default::Report::Iteration#lambda, #kappa, #entropy, #gapped_lambda, #gapped_kappa, and #gapped_entropy, and the same methods in Bio::Blast::Default::Report class are changed to return float or nil instead of string or nil. 
Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26.2.2 retrieving revision 1.26.2.3 diff -C2 -d -r1.26.2.2 -r1.26.2.3 *** format0.rb 12 May 2008 11:16:17 -0000 1.26.2.2 --- format0.rb 13 May 2008 11:19:42 -0000 1.26.2.3 *************** *** 723,733 **** end if gapped then ! @gapped_lambda = h['Lambda'] ! @gapped_kappa = h['K'] ! @gapped_entropy = h['H'] else ! @lambda = h['Lambda'] ! @kappa = h['K'] ! @entropy = h['H'] end end #each --- 723,733 ---- end if gapped then ! @gapped_lambda = (v = h['Lambda']) ? v.to_f : nil ! @gapped_kappa = (v = h['K']) ? v.to_f : nil ! @gapped_entropy = (v = h['H']) ? v.to_f : nil else ! @lambda = (v = h['Lambda']) ? v.to_f : nil ! @kappa = (v = h['K']) ? v.to_f : nil ! @entropy = (v = h['H']) ? v.to_f : nil end end #each From ngoto at dev.open-bio.org Tue May 13 11:21:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 13 May 2008 11:21:47 +0000 Subject: [BioRuby-cvs] bioruby/doc Changes-1.3.rd,NONE,1.1.2.1 Message-ID: <200805131121.m4DBLlqg008833@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/doc In directory dev.open-bio.org:/tmp/cvs-serv8813/doc Added Files: Tag: BRANCH-biohackathon2008 Changes-1.3.rd Log Message: newly added documents describing important or incompatible changes in bioruby-1.3 (after bioruby-1.2.1). --- NEW FILE: Changes-1.3.rd --- = Incompatible and important changes since the BioRuby 1.2.1 release A lot of changes have been made to the BioRuby after the version 1.2.1 is released. == Incompatible changes --- Bio::Features Bio::Features is obsoleted and changed to an array of Bio::Feature object with some backward compatibility methods. The backward compatibility methods will soon be removed in the future. --- Bio::References Bio::References is obsoleted and changed to an array of Bio::Reference object with some backward compatibility methods. 
The backward compatibility methods will soon be removed in the future. --- Bio::BLAST::Default::Report, Bio::BLAST::Default::Report::Hit, Bio::BLAST::Default::Report::HSP, Bio::BLAST::WU::Report, Bio::BLAST::WU::Report::Hit, Bio::BLAST::WU::Report::HSP * Iteration#lambda, #kappa, #entropy, #gapped_lambda, #gapped_kappa, and #gapped_entropy, and the same methods in the Report class are changed to return float or nil instead of string or nil. From ngoto at dev.open-bio.org Wed May 14 13:30:15 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 14 May 2008 13:30:15 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.28,1.29 Message-ID: <200805141330.m4EDUFSv011996@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv11956/lib/bio/appl/blast Modified Files: format0.rb Log Message: Bug fix: For some PHI-BLAST (blastpgp) entries, possibly due to the changes of output format, Bio::Blast::Default::Report::Iteration#eff_space (and the shortcut method in the Report class) raises StringScanner::Error. In addition, Iteration#pattern and #pattern_positions returns incorrect values possibly due to the output format changes. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.28 retrieving revision 1.29 diff -C2 -d -r1.28 -r1.29 *** format0.rb 12 May 2008 11:25:55 -0000 1.28 --- format0.rb 14 May 2008 13:30:12 -0000 1.29 *************** *** 535,539 **** r = data.first break unless r ! if /^Significant alignments for pattern/ =~ r data.shift r = data.first --- 535,539 ---- r = data.first break unless r ! while /^Significant alignments for pattern/ =~ r data.shift r = data.first *************** *** 590,596 **** @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +(.+)$/) then @pattern = sc[1] unless @pattern ! 
sc.skip_until(/^ at position +(\d+)/) @pattern_positions << sc[1].to_i end --- 590,596 ---- @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +([^\s]+)/) then @pattern = sc[1] unless @pattern ! sc.skip_until(/(?:^ *| +)at position +(\d+) +of +query +sequence/) @pattern_positions << sc[1].to_i end From ngoto at dev.open-bio.org Wed May 14 13:37:05 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 14 May 2008 13:37:05 +0000 Subject: [BioRuby-cvs] bioruby ChangeLog,1.86,1.87 Message-ID: <200805141337.m4EDb514012065@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv12045 Modified Files: ChangeLog Log Message: ChangeLog for lib/bio/appl/blast/format0.rb from 1.28 to 1.29. Index: ChangeLog =================================================================== RCS file: /home/repository/bioruby/bioruby/ChangeLog,v retrieving revision 1.86 retrieving revision 1.87 diff -C2 -d -r1.86 -r1.87 *** ChangeLog 12 May 2008 13:11:45 -0000 1.86 --- ChangeLog 14 May 2008 13:37:03 -0000 1.87 *************** *** 1,2 **** --- 1,12 ---- + 2008-05-14 Naohisa Goto + + * lib/bio/appl/blast/format0.rb + + Bug fix: Possibly because of the output format changes of PHI-BLAST, + Bio::Blast::Default::Report::Iteration#eff_space (and the shortcut + method in the Report class) failed for PHI-BLAST (blastpgp) results, + and Iteration#pattern and #pattern_positions (and the + shortcut methods in the Report class) returned incorrect values. 
+ 2008-05-12 Naohisa Goto From ngoto at dev.open-bio.org Wed May 14 13:39:43 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 14 May 2008 13:39:43 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb, 1.26.2.3, 1.26.2.4 Message-ID: <200805141339.m4EDdh98012136@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv12115/lib/bio/appl/blast Modified Files: Tag: BRANCH-biohackathon2008 format0.rb Log Message: Merging differences between 1.28 and 1.29 into format0.rb in BRANCH-biohackathon2008. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26.2.3 retrieving revision 1.26.2.4 diff -C2 -d -r1.26.2.3 -r1.26.2.4 *** format0.rb 13 May 2008 11:19:42 -0000 1.26.2.3 --- format0.rb 14 May 2008 13:39:41 -0000 1.26.2.4 *************** *** 535,539 **** r = data.first break unless r ! if /^Significant alignments for pattern/ =~ r data.shift r = data.first --- 535,539 ---- r = data.first break unless r ! while /^Significant alignments for pattern/ =~ r data.shift r = data.first *************** *** 590,596 **** @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +(.+)$/) then @pattern = sc[1] unless @pattern ! sc.skip_until(/^ at position +(\d+)/) @pattern_positions << sc[1].to_i end --- 590,596 ---- @f0message.each do |r| sc = StringScanner.new(r) ! if sc.skip_until(/^ *pattern +([^\s]+)/) then @pattern = sc[1] unless @pattern ! 
sc.skip_until(/(?:^ *| +)at position +(\d+) +of +query +sequence/) @pattern_positions << sc[1].to_i end From pjotr at dev.open-bio.org Mon May 19 11:23:58 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Mon, 19 May 2008 11:23:58 +0000 Subject: [BioRuby-cvs] bioruby/sample fastasort.rb,NONE,1.1 Message-ID: <200805191123.m4JBNwdr000709@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/sample In directory dev.open-bio.org:/tmp/cvs-serv689 Added Files: fastasort.rb Log Message: Simple example for sorting a flatfile --- NEW FILE: fastasort.rb --- #!/usr/bin/env ruby # # fastasort: Sorts a FASTA file (in fact it can use any flat file input supported # by BIORUBY) while modifying the definition of each record in the # process. # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # $Id: fastasort.rb,v 1.1 2008/05/19 11:23:56 pjotr Exp $ # require 'bio' include Bio table = Hash.new # table to sort objects ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | # strip JALView extension from definition e.g. 
.../1-212 if item.definition =~ /\/\d+-\d+$/ item.definition = $` end table[item.definition] = item.data end end # Output sorted table table.sort.each do | definition, data | rec = Bio::FastaFormat.new('> '+definition.strip+"\n"+data) print rec end From pjotr at dev.open-bio.org Mon May 19 12:22:07 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Mon, 19 May 2008 12:22:07 +0000 Subject: [BioRuby-cvs] bioruby/doc Tutorial.rd,1.21,1.22 Message-ID: <200805191222.m4JCM7nM000852@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/doc In directory dev.open-bio.org:/tmp/cvs-serv829/doc Modified Files: Tutorial.rd Log Message: Piping FASTA files (examples and doc) Index: Tutorial.rd =================================================================== RCS file: /home/repository/bioruby/bioruby/doc/Tutorial.rd,v retrieving revision 1.21 retrieving revision 1.22 diff -C2 -d -r1.21 -r1.22 *** Tutorial.rd 13 Feb 2008 08:04:30 -0000 1.21 --- Tutorial.rd 19 May 2008 12:22:05 -0000 1.22 *************** *** 466,470 **** An example that can take any input, filter using a regular expression to output ! to a FASTA file can be found in sample/any2fasta.rb. Other methods to extract specific data from database objects can be --- 466,477 ---- An example that can take any input, filter using a regular expression to output ! to a FASTA file can be found in sample/any2fasta.rb. With this technique it is ! possible to write a Unix type grep/sort pipe for sequence information. One ! example using scripts in the BIORUBY sample folder: ! ! fastagrep.rb '/At|Dm/' database.seq | fastasort.rb ! ! greps the database for Arabidopsis and Drosophila entries and sorts the output ! to FASTA. 
Other methods to extract specific data from database objects can be From pjotr at dev.open-bio.org Mon May 19 12:22:07 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Mon, 19 May 2008 12:22:07 +0000 Subject: [BioRuby-cvs] bioruby/sample fastagrep.rb, NONE, 1.1 fastasort.rb, 1.1, 1.2 Message-ID: <200805191222.m4JCM7KO000857@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/sample In directory dev.open-bio.org:/tmp/cvs-serv829/sample Modified Files: fastasort.rb Added Files: fastagrep.rb Log Message: Piping FASTA files (examples and doc) --- NEW FILE: fastagrep.rb --- #!/usr/bin/env ruby # # fastagrep: Greps a FASTA file (in fact it can use any flat file input supported # by BIORUBY) and outputs sorted FASTA # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # $Id: fastagrep.rb,v 1.1 2008/05/19 12:22:05 pjotr Exp $ # require 'bio' include Bio usage = < reduced.fasta As the result is a FASTA stream you could pipe it for sorting: fastagrep.rb "/Arabidopsis|Drosophila/i" *.seq | fastasort.rb USAGE if ARGV.size == 0 print usage exit 1 end skip = (ARGV[0] == '-v') ARGV.shift if skip # ---- Valid regular expression - if it is not a file regex = ARGV[0] if regex=~/^\// and !File.exist?(regex) ARGV.shift else print usage exit 1 end ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | if skip next if eval("item.definition =~ #{regex}") else next if eval("item.definition !~ #{regex}") end rec = Bio::FastaFormat.new('> '+item.definition.strip+"\n"+item.data) print rec end end Index: fastasort.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/sample/fastasort.rb,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** fastasort.rb 19 May 2008 11:23:56 -0000 1.1 --- fastasort.rb 19 May 2008 12:22:05 -0000 1.2 *************** *** 3,7 **** # fastasort: Sorts a FASTA file (in fact it can use any flat file input supported # by BIORUBY) while modifying the definition of each record in the ! # process. # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins --- 3,8 ---- # fastasort: Sorts a FASTA file (in fact it can use any flat file input supported # by BIORUBY) while modifying the definition of each record in the ! # process so it is suitable for processing with (for example) pal2nal ! # and PAML. # # Copyright (C) 2008 KATAYAMA Toshiaki & Pjotr Prins *************** *** 27,35 **** ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | # strip JALView extension from definition e.g. .../1-212 if item.definition =~ /\/\d+-\d+$/ item.definition = $` end ! 
table[item.definition] = item.data end end --- 28,47 ---- ARGV.each do | fn | Bio::FlatFile.auto(fn).each do | item | + # Some procession of the definition for external programs (just + # an example): + # strip JALView extension from definition e.g. .../1-212 if item.definition =~ /\/\d+-\d+$/ item.definition = $` end ! # substitute slashes: ! definition = item.definition.gsub(/\//,'-') ! # substitute quotes and ampersands: ! definition = item.definition.gsub(/['"&]/,'x') ! # prefix letters if the first position is a number: ! definition = 'seq'+definition if definition =~ /^\d/ ! ! # Now add the data to the sort table ! table[definition] = item.data end end From ngoto at dev.open-bio.org Wed May 21 11:28:56 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 21 May 2008 11:28:56 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format.erb,1.1.2.1,NONE Message-ID: <200805211128.m4LBSu3f009781@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv9740/embl Removed Files: Tag: BRANCH-biohackathon2008 format.erb Log Message: Removed unused file lib/bio/db/embl/format.erb. The contents of this file have already been moved to lib/bio/db/embl/format_embl.rb and modified.
--- format.erb DELETED --- From ngoto at dev.open-bio.org Wed May 21 12:27:35 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 21 May 2008 12:27:35 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.4,1.29.2.5 Message-ID: <200805211227.m4LCRZo6009984@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv9964/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: added rDoc for Bio::EMBL#to_biosequence Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.4 retrieving revision 1.29.2.5 diff -C2 -d -r1.29.2.4 -r1.29.2.5 *** embl.rb 21 Mar 2008 06:24:42 -0000 1.29.2.4 --- embl.rb 21 May 2008 12:27:33 -0000 1.29.2.5 *************** *** 371,377 **** alias naseq seq alias ntseq seq ! # // Line; termination line (end; 1/entry) def to_biosequence bio_seq = Bio::Sequence.new(self.seq) --- 371,383 ---- alias naseq seq alias ntseq seq ! ! 
#-- # // Line; termination line (end; 1/entry) + #++ + # converts the entry to Bio::Sequence object + # --- + # *Arguments*:: + # *Returns*:: Bio::Sequence object def to_biosequence bio_seq = Bio::Sequence.new(self.seq) From ngoto at dev.open-bio.org Wed May 28 13:09:05 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 28 May 2008 13:09:05 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.5,1.29.2.6 Message-ID: <200805281309.m4SD9504013095@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv13075/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: fixed possible typo Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.5 retrieving revision 1.29.2.6 diff -C2 -d -r1.29.2.5 -r1.29.2.6 *** embl.rb 21 May 2008 12:27:33 -0000 1.29.2.5 --- embl.rb 28 May 2008 13:09:03 -0000 1.29.2.6 *************** *** 384,388 **** bio_seq.entry_id = self.entry_id bio_seq.primary_accession = self.accessions[0] ! bio_seq.secondary_accessions = self.accessions[1,-1] || [] bio_seq.molecule_type = self.molecule_type bio_seq.data_class = self.data_class --- 384,388 ---- bio_seq.entry_id = self.entry_id bio_seq.primary_accession = self.accessions[0] ! 
bio_seq.secondary_accessions = self.accessions[1..-1] || [] bio_seq.molecule_type = self.molecule_type bio_seq.data_class = self.data_class From ngoto at dev.open-bio.org Wed May 28 13:26:35 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 28 May 2008 13:26:35 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, 1.1.2.3, 1.1.2.4 Message-ID: <200805281326.m4SDQZDb013144@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv13124/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 format_genbank.rb Log Message: simplify sequence formatting routine Index: format_genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/Attic/format_genbank.rb,v retrieving revision 1.1.2.3 retrieving revision 1.1.2.4 diff -C2 -d -r1.1.2.3 -r1.1.2.4 *** format_genbank.rb 7 May 2008 12:28:56 -0000 1.1.2.3 --- format_genbank.rb 28 May 2008 13:26:33 -0000 1.1.2.4 *************** *** 102,111 **** # formats sequence lines as GenBank ! def each_genbank_seqline(str) #:yields: counter, seqline i = 1 ! a = str.scan(/.{1,60}/) do |s| ! yield i, s.gsub(/(.{1,10})/, " \\1") i += 60 end end --- 102,114 ---- # formats sequence lines as GenBank ! def seq_format_genbank(str) i = 1 ! result = str.gsub(/.{1,60}/) do |s| ! s = s.gsub(/.{1,10}/, ' \0') ! y = sprintf("%9d%s\n", i, s) i += 60 + y end + result end *************** *** 129,135 **** <%= format_features_genbank(features || []) %>ORIGIN ! <% each_genbank_seqline(seq) do |i, s| ! %><%= sprintf('%9d', i) %><%= s %> ! <% end %>// __END_OF_TEMPLATE__ --- 132,137 ---- <%= format_features_genbank(features || []) %>ORIGIN ! <%= seq_format_genbank(seq) ! 
%>// __END_OF_TEMPLATE__ From ngoto at dev.open-bio.org Wed May 28 13:38:09 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 28 May 2008 13:38:09 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.4, 1.1.2.5 Message-ID: <200805281338.m4SDc9Dh013213@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv13173/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: simplify code of seq_format_embl(), and SQ line is changed not to show non-ACGT single base contents (which should be shown together as "other"). Index: format_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v retrieving revision 1.1.2.4 retrieving revision 1.1.2.5 diff -C2 -d -r1.1.2.4 -r1.1.2.5 *** format_embl.rb 7 May 2008 12:24:26 -0000 1.1.2.4 --- format_embl.rb 28 May 2008 13:38:07 -0000 1.1.2.5 *************** *** 106,121 **** def seq_format_embl(seq) - output_lines = Array.new counter = 0 ! remainder = seq.window_search(60,60) do |subseq| ! counter += 60 ! subseq.gsub!(/(.{10})/, '\1 ') ! output_lines.push(' '*5 + subseq + counter.to_s.rjust(9)) end ! counter += remainder.length ! remainder = (remainder.to_s + ' '*(60-remainder.length)) ! remainder.gsub!(/(.{10})/, '\1 ') ! output_lines.push(' '*5 + remainder + counter.to_s.rjust(9)) ! return output_lines.join("\n") end --- 106,126 ---- def seq_format_embl(seq) counter = 0 ! result = seq.gsub(/.{1,60}/) do |x| ! counter += x.length ! x = x.gsub(/.{10}/, '\0 ') ! sprintf(" %-66s%9d\n", x, counter) end ! result.chomp! ! result ! end ! ! def seq_composition(seq) ! { :a => seq.count('aA'), ! :c => seq.count('cC'), ! :g => seq.count('gG'), ! :t => seq.count('tTuU'), ! :other => seq.count('^aAcCgGtTuU') ! } end *************** *** 140,144 **** FH <%= format_features_embl(features || []) %>XX ! 
SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> // --- 145,149 ---- FH <%= format_features_embl(features || []) %>XX ! SQ Sequence <%= seq.length %> BP; <% c = seq_composition(seq) %><%= c[:a] %> A; <%= c[:c] %> C; <%= c[:g] %> G; <%= c[:t] %> T; <%= c[:other] %> other; <%= seq_format_embl(seq) %> // From pjotr at dev.open-bio.org Thu May 29 11:25:47 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Thu, 29 May 2008 11:25:47 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24,1.25 Message-ID: <200805291125.m4TBPlWZ015209@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv15189 Modified Files: reference.rb Log Message: - Improved bibtex support (optional output of abstract - strip empty fields) - Put generated URL into separate method Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.24 retrieving revision 1.25 diff -C2 -d -r1.24 -r1.25 *** reference.rb 5 Apr 2007 23:35:39 -0000 1.24 --- reference.rb 29 May 2008 11:25:44 -0000 1.25 *************** *** 71,77 **** attr_reader :abstract - # An URL String. - attr_reader :url - # MeSH terms in an Array. attr_reader :mesh --- 71,74 ---- *************** *** 128,132 **** @medline = hash['medline'] # 98765432 @abstract = hash['abstract'] - @url = hash['url'] @mesh = hash['mesh'] @affiliations = hash['affiliations'] --- 125,128 ---- *************** *** 232,241 **** lines << "%P #{@pages}" unless @pages.empty? lines << "%M #{@pubmed}" unless @pubmed.to_s.empty? ! if @pubmed ! cgi = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi" ! opts = "cmd=Retrieve&db=PubMed&dopt=Citation&list_uids" ! @url = "#{cgi}?#{opts}=#{@pubmed}" ! end ! lines << "%U #{@url}" unless @url.empty? 
lines << "%X #{@abstract}" unless @abstract.empty? @mesh.each do |term| --- 228,232 ---- lines << "%P #{@pages}" unless @pages.empty? lines << "%M #{@pubmed}" unless @pubmed.to_s.empty? ! lines << "%U #{url}" unless url.empty? lines << "%X #{@abstract}" unless @abstract.empty? @mesh.each do |term| *************** *** 299,318 **** # *Arguments*: # * (optional) _section_: BiBTeX section as String # *Returns*:: String ! def bibtex(section = nil) section = "article" unless section authors = authors_join(' and ', ' and ') pages = @pages.sub('-', '--') ! return <<-"END".gsub(/\t/, '') ! @#{section}{PMID:#{@pubmed}, ! author = {#{authors}}, ! title = {#{@title}}, ! journal = {#{@journal}}, ! year = {#{@year}}, ! volume = {#{@volume}}, ! number = {#{@issue}}, ! pages = {#{pages}}, ! } ! END end --- 290,317 ---- # *Arguments*: # * (optional) _section_: BiBTeX section as String + # * (optional) _keywords_: Array of additional keywords, e.g. ['abstract'] # *Returns*:: String ! def bibtex(section = nil, add_keywords = []) section = "article" unless section authors = authors_join(' and ', ' and ') pages = @pages.sub('-', '--') ! keywords = "author title journal year volume number pages url".split(/ /) ! bib = "@#{section}{PMID:#{@pubmed},\n" ! (keywords+add_keywords).each do | kw | ! if kw == 'author' ! ref = authors ! elsif kw == 'title' ! # strip final dot from title ! ref = @title.sub(/\.$/,'') ! elsif kw == 'number' ! ref = @issue ! elsif kw == 'url' ! ref = url ! else ! ref = eval('@'+kw) ! end ! bib += " #{kw.ljust(12)} = {#{ref}},\n" if ref != '' ! end ! 
bib+"}\n" end *************** *** 500,503 **** --- 499,513 ---- end + # Returns a valid URL for pubmed records + # + # *Returns*:: String + def url + if @pubmed != '' + cgi = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi" + opts = "cmd=Retrieve&db=PubMed&dopt=Citation&list_uids" + return "#{cgi}?#{opts}=#{@pubmed}" + end + '' + end private *************** *** 527,530 **** --- 537,541 ---- end + end From pjotr at dev.open-bio.org Sat May 31 09:36:58 2008 From: pjotr at dev.open-bio.org (Pjotr Prins) Date: Sat, 31 May 2008 09:36:58 +0000 Subject: [BioRuby-cvs] bioruby/test/unit/bio test_reference.rb,1.3,1.4 Message-ID: <200805310936.m4V9aw7X020318@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/unit/bio In directory dev.open-bio.org:/tmp/cvs-serv20293/test/unit/bio Modified Files: test_reference.rb Log Message: - Bibtex: reverted on url regression per comment Naohisa - now it gets overridden on empty for pubmed only. - Bibtex: fixed unit tests Index: test_reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/unit/bio/test_reference.rb,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** test_reference.rb 5 Apr 2007 23:35:42 -0000 1.3 --- test_reference.rb 31 May 2008 09:36:56 -0000 1.4 *************** *** 91,95 **** def test_format_endnote ! str = "%0 Journal Article\n%A Hoge, J.P.\n%A Fuga, F.B.\n%D 2001\n%T Title of the study.\n%J Theor. J. Hoge\n%V 12\n%N 3\n%P 123-145\n%M 12345678\n%U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12345678\n%X Hoge fuga. hoge fuga.\n%K Hoge\n%+ Tokyo" assert_equal(str, @obj.format('endnote')) assert_equal(str, @obj.endnote) --- 91,95 ---- def test_format_endnote ! str = "%0 Journal Article\n%A Hoge, J.P.\n%A Fuga, F.B.\n%D 2001\n%T Title of the study.\n%J Theor. J. Hoge\n%V 12\n%N 3\n%P 123-145\n%M 12345678\n%U http://example.com\n%X Hoge fuga. 
hoge fuga.\n%K Hoge\n%+ Tokyo" assert_equal(str, @obj.format('endnote')) assert_equal(str, @obj.endnote) *************** *** 103,117 **** def test_format_bibtex ! str =< Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv20293/lib/bio Modified Files: reference.rb Log Message: - Bibtex: reverted on url regression per comment Naohisa - now it gets overridden on empty for pubmed only. - Bibtex: fixed unit tests Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.25 retrieving revision 1.26 diff -C2 -d -r1.25 -r1.26 *** reference.rb 29 May 2008 11:25:44 -0000 1.25 --- reference.rb 31 May 2008 09:36:55 -0000 1.26 *************** *** 77,80 **** --- 77,83 ---- attr_reader :affiliations + # An URL String. + attr_reader :url + # Create a new Bio::Reference object from a Hash of values. # Data is extracted from the values for keys: *************** *** 125,128 **** --- 128,132 ---- @medline = hash['medline'] # 98765432 @abstract = hash['abstract'] + @url = hash['url'] @mesh = hash['mesh'] @affiliations = hash['affiliations'] *************** *** 503,506 **** --- 507,511 ---- # *Returns*:: String def url + return @url if @url and @url != '' if @pubmed != '' cgi = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
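The lookup order introduced in revision 1.26 (an explicitly supplied URL first, then a URL generated from the PubMed ID, then an empty string) can be sketched in isolation. This is only an illustration, not the code in reference.rb: the class name ReferenceSketch and its constants are hypothetical, and the sketch normalizes a missing PubMed ID to an empty string, which is an assumption the original hunk does not show.

```ruby
# Hypothetical stand-in for Bio::Reference, reduced to the two fields
# that matter for URL resolution. Not the library's actual class.
class ReferenceSketch
  PUBMED_CGI  = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
  PUBMED_OPTS = "cmd=Retrieve&db=PubMed&dopt=Citation&list_uids"

  def initialize(hash)
    @url    = hash['url']
    # Assumption: a missing PubMed ID is treated as an empty string.
    @pubmed = hash['pubmed'].to_s
  end

  # Returns the explicit URL if one was given, otherwise a PubMed URL
  # built from the ID, otherwise an empty String.
  def url
    return @url if @url and @url != ''
    return "#{PUBMED_CGI}?#{PUBMED_OPTS}=#{@pubmed}" if @pubmed != ''
    ''
  end
end

with_url    = ReferenceSketch.new('url' => 'http://example.com', 'pubmed' => '12345678')
pubmed_only = ReferenceSketch.new('pubmed' => '12345678')

puts with_url.url     # the explicit URL takes precedence
puts pubmed_only.url  # falls back to the generated PubMed URL
```

This ordering is what lets the updated test_format_endnote expect "%U http://example.com" even though the test object also carries a PubMed ID.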