From ngoto at dev.open-bio.org Tue Apr 1 02:31:37 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 01 Apr 2008 06:31:37 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.26,1.27
Message-ID: <200804010631.m316VbfM002141@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast
In directory dev.open-bio.org:/tmp/cvs-serv2121/lib/bio/appl/blast

Modified Files:
    format0.rb
Log Message:
Fixed a bug that occurred when a blank line is inserted after the database
title in some cases, reported by Tomoaki NISHIYAMA.

Index: format0.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v
retrieving revision 1.26
retrieving revision 1.27
diff -C2 -d -r1.26 -r1.27
*** format0.rb 12 Feb 2008 02:13:31 -0000 1.26
--- format0.rb 1 Apr 2008 06:31:35 -0000 1.27
***************
*** 294,297 ****
--- 294,302 ----
            @f0query = data.shift
            @f0database = data.shift
+           # In special case, a void line is inserted after database name.
+           if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then
+             @f0database.concat "\n"
+             @f0database.concat data.shift
+           end
          end


From ngoto at dev.open-bio.org Tue Apr 1 06:36:47 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 01 Apr 2008 10:36:47 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/pdb chain.rb, 1.9, 1.10 pdb.rb, 1.27, 1.28
Message-ID: <200804011036.m31Aal0p009616@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/pdb
In directory dev.open-bio.org:/tmp/cvs-serv9574/lib/bio/db/pdb

Modified Files:
    chain.rb pdb.rb
Log Message:
* Fixed a bug that ArgumentError occurred in Bio::PDB::Chain#aaseq method
  for nucleic acid chains. The same error might also have occurred in
  Bio::PDB#seqres, which is also fixed.
* Fixed a bug that current residue/heterogen is not properly initialized
  when current chain is changed.
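
The fallback described in the log message above can be sketched without BioRuby. This is only an illustration: `THREE_TO_ONE`, `three2one` and `residues_to_string` are hypothetical stand-ins, and the assumption that the real lookup raises ArgumentError for unknown names is taken from the log message, not verified against the library.

```ruby
# Hypothetical stand-in for Bio::AminoAcid.three2one. It is assumed to
# raise ArgumentError for names it does not know, as the log message
# describes for nucleic acid residues.
THREE_TO_ONE = { 'Ala' => 'A', 'Gly' => 'G', 'Ser' => 'S' }.freeze

def three2one(name)
  THREE_TO_ONE.fetch(name) { raise ArgumentError, "unknown residue: #{name}" }
end

# Convert residue names to a one-letter string, using 'X' for anything
# that cannot be translated instead of letting the error propagate.
def residues_to_string(names)
  names.map do |name|
    begin
      three2one(name.capitalize)
    rescue ArgumentError
      'X'
    end
  end.join
end

puts residues_to_string(%w[ALA GLY DA SER])
```

Rescuing only ArgumentError keeps unrelated failures visible, which is the same choice the committed patch makes.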
Index: pdb.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/pdb.rb,v
retrieving revision 1.27
retrieving revision 1.28
diff -C2 -d -r1.27 -r1.28
*** pdb.rb 28 Dec 2007 14:43:44 -0000 1.27
--- pdb.rb 1 Apr 2008 10:36:44 -0000 1.28
***************
*** 1498,1501 ****
--- 1498,1504 ----
                chain = newChain
              end
+             # chain might be changed, clearing cResidue and cLigand
+             cResidue = nil
+             cLigand = nil
            end
          end
***************
*** 1551,1554 ****
--- 1554,1559 ----
            c_atom = nil
            cChain = nil
+           cResidue = nil
+           cLigand = nil
            if cModel.model_serial or cModel.chains.size > 0 then
              self.addModel(cModel)
***************
*** 1810,1814 ****
              #need to look up with Ala
              aa = aa.capitalize
!             (Bio::AminoAcid.three2one(aa) or 'X')
            end
            seq = Bio::Sequence::AA.new(a.to_s)
--- 1815,1823 ----
              #need to look up with Ala
              aa = aa.capitalize
!             (begin
!                Bio::AminoAcid.three2one(aa)
!              rescue ArgumentError
!                nil
!              end || 'X')
            end
            seq = Bio::Sequence::AA.new(a.to_s)
***************
*** 1816,1820 ****
            # nucleic acid sequence
            a.collect! do |na|
!             na = na.strip
              na.size == 1 ? na : 'n'
            end
--- 1825,1829 ----
            # nucleic acid sequence
            a.collect! do |na|
!             na = na.delete('^a-zA-Z')
              na.size == 1 ? na : 'n'
            end

Index: chain.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/chain.rb,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -d -r1.9 -r1.10
*** chain.rb 18 Dec 2007 13:48:42 -0000 1.9
--- chain.rb 1 Apr 2008 10:36:44 -0000 1.10
***************
*** 190,194 ****
            end
            tlc = residue.resName.capitalize
!           olc = (Bio::AminoAcid.three2one(tlc) or 'X')
            string << olc
          end
--- 190,198 ----
            end
            tlc = residue.resName.capitalize
!           olc = (begin
!                    Bio::AminoAcid.three2one(tlc)
!                  rescue ArgumentError
!                    nil
!
                 end || 'X')
            string << olc
          end


From ngoto at dev.open-bio.org Wed Apr 2 02:24:16 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 02 Apr 2008 06:24:16 +0000
Subject: [BioRuby-cvs] bioruby ChangeLog,1.83,1.84
Message-ID: <200804020624.m326OGwp011324@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv11304

Modified Files:
    ChangeLog
Log Message:
ChangeLog added for lib/bio/appl/blast/format0.rb,1.26,1.27,
lib/bio/db/pdb/chain.rb,1.9,1.10, and lib/bio/db/pdb/pdb.rb,1.27,1.28.

Index: ChangeLog
===================================================================
RCS file: /home/repository/bioruby/bioruby/ChangeLog,v
retrieving revision 1.83
retrieving revision 1.84
diff -C2 -d -r1.83 -r1.84
*** ChangeLog 12 Feb 2008 05:32:23 -0000 1.83
--- ChangeLog 2 Apr 2008 06:24:14 -0000 1.84
***************
*** 1,2 ****
--- 1,18 ----
+ 2008-04-01 Naohisa Goto
+
+     * lib/bio/appl/blast/format0.rb
+
+       Fixed a bug: Failed to parse database name in some cases.
+       Thanks to Tomoaki Nishiyama who reported the bug and sent patches
+       ([BioRuby-ja] BLAST format0 parser fails header parsing output
+       of specific databases).
+
+     * lib/bio/db/pdb/chain.rb, lib/bio/db/pdb/pdb.rb
+
+       Fixed bugs: Bio::PDB::Chain#aaseq failed for nucleotide chain;
+       Failed to parse chains for some entries (e.g. 1B2M).
+       Thanks to Semin Lee who reported the bugs and sent patches
+       ([BioRuby] Bio::PDB parsing problem (1B2M)).
+
  2008-02-12 Naohisa Goto


From ngoto at dev.open-bio.org Tue Apr 15 09:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rpsblast.rb,NONE,1.1
Message-ID: <200804151354.m3FDsfkK032072@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast
In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl/blast

Added Files:
    rpsblast.rb
Log Message:
Newly added RPS-Blast default (-m 0) output parser.
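
The new parser declares "\nRPS-BLAST" as its entry delimiter, so that a reader can cut concatenated output into one report per run. A minimal sketch of that idea follows, assuming every report starts on a line beginning with "RPS-BLAST". The helper `split_rpsblast_reports` and the sample text are hypothetical and are not part of BioRuby.

```ruby
# Split concatenated rpsblast output into individual reports.
# Each report is assumed to begin at a line that starts with "RPS-BLAST".
def split_rpsblast_reports(text)
  text.split(/(?=^RPS-BLAST)/).reject { |chunk| chunk.strip.empty? }
end

# Made-up sample: two reports written one after the other.
sample = "RPS-BLAST 2.2.18\n\nQuery= seq1\n\n" \
         "RPS-BLAST 2.2.18\n\nQuery= seq2\n"

reports = split_rpsblast_reports(sample)
puts reports.size
reports.each { |r| puts r.lines.first }
```

The lookahead keeps the "RPS-BLAST" line attached to the report it opens, so each chunk can be handed to a parser unchanged.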
--- NEW FILE: rpsblast.rb ---
#
# = bio/appl/blast/rpsblast.rb - NCBI RPS Blast default output parser
#
# Copyright:: Copyright (C) 2008 Naohisa Goto
# License:: The Ruby License
#
# $Id: rpsblast.rb,v 1.1 2008/04/15 13:54:39 ngoto Exp $
#
# == Description
#
# NCBI RPS Blast (Reversed Position Specific Blast) default
# (-m 0 option) output parser class, Bio::Blast::RPSBlast::Report
# and related classes/modules.
#
# == References
#
# * Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
#   Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
#   "Gapped BLAST and PSI-BLAST: a new generation of protein database search
#   programs", Nucleic Acids Res. 25:3389-3402.
# * ftp://ftp.ncbi.nih.gov/blast/documents/rpsblast.html
# * http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml
#

require 'bio/appl/blast/format0'

module Bio
class Blast

  # NCBI RPS Blast (Reversed Position Specific Blast) namespace.
  # Currently, this module exists only to separate the namespace.
  # To parse RPSBlast results, see Bio::Blast::RPSBlast::Report documents.
  module RPSBlast

    # NCBI RPS Blast (Reversed Position Specific Blast)
    # default output parser.
    #
    # It supports default (-m 0 option) output of the "rpsblast" command.
    #
    # Because this class inherits Bio::Blast::Default::Report,
    # almost all methods are equal to Bio::Blast::Default::Report.
    # Only DELIMITER (and RS) and a few methods are different.
    #
    # Note for multi-fasta result: When parsing output of rpsblast command
    # with multi-fasta sequences as input data,
    # each query's result is stored as an "iteration" of PSI-Blast,
    # because rpsblast's output with multi-fasta input is hard to split
    # by query.
    # This behavior may be changed in the future.
    #
    # Note for nucleotide results: This class is not tested with
    # nucleotide query and/or nucleotide databases.
    #
    class Report < Bio::Blast::Default::Report

      # Delimiter of each entry for RPS-BLAST. Bio::FlatFile uses it.
      DELIMITER = RS = "\nRPS-BLAST"

      # (Integer) excess read size included in DELIMITER.
      DELIMITER_OVERRUN = 9 # "RPS-BLAST"

      # Creates a new Report object from a string.
      #
      # Note for multi-fasta results: When parsing an output of rpsblast
      # command running with multi-fasta sequences,
      # each query's result is stored as an "iteration" of PSI-Blast,
      # because rpsblast's output with multi-fasta input is hard to split
      # by query.
      # This behavior may be changed in the future.
      #
      # Note for nucleotide results: This class is not tested with
      # nucleotide query and/or nucleotide databases.
      #
      def initialize(str)
        str = str.sub(/\A\s+/, '')
        # remove trailing entries for sure
        str.sub!(/\n(RPS\-BLAST.*)/m, "\n")
        @entry_overrun = $1
        @entry = str
        data = str.split(/(?:^[ \t]*\n)+/)

        format0_split_headers(data)
        @iterations = format0_split_search(data)
        format0_split_stat_params(data)
      end

      # Returns definition of the query.
      # For a result of multi-fasta input, the first query's definition
      # is returned (The same as iterations.first.query_def).
      def query_def
        iterations.first.query_def
      end

      # Returns length of the query.
      # For a result of multi-fasta input, the first query's length
      # is returned (The same as iterations.first.query_len).
      def query_len
        iterations.first.query_len
      end

      private

      # Splits headers into the first line, reference, query line and
      # database line.
      def format0_split_headers(data)
        @f0header = data.shift
        @f0references = []
        while data[0] and /\ADatabase\:/ !~ data[0]
          @f0references.push data.shift
        end
        @f0database = data.shift
        # In special case, a void line is inserted after database name.
        if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then
          @f0database.concat "\n"
          @f0database.concat data.shift
        end
      end

      # Splits the search results.
      def format0_split_search(data)
        iterations = []
        dummystr = 'Searching..................................................done'
        if r = data[0] and /^Searching/ =~ r then
          dummystr = data.shift
        end
        while r = data[0] and /^Query\=/ =~ r
          iterations << Iteration.new(data, dummystr)
        end
        iterations
      end

      # Iteration class for RPS-Blast.
      # Though RPS-Blast does not iterate like PSI-BLAST,
      # it aims to store a result of single query sequence.
      #
      # Normally, the instance of the class is generated
      # by Bio::Blast::RPSBlast::Report object.
      #
      class Iteration < Bio::Blast::Default::Report::Iteration
        # Creates a new Iteration object.
        # It is designed to be called only internally from
        # the Bio::Blast::RPSBlast::Report class.
        # Users shall not use the method directly.
        def initialize(data, dummystr)
          if /\AQuery\=/ =~ data[0] then
            sc = StringScanner.new(data.shift)
            sc.skip(/\s*/)
            if sc.skip_until(/Query\= */) then
              q = []
              begin
                q << sc.scan(/.*/)
                sc.skip(/\s*^ ?/)
              end until !sc.rest or r = sc.skip(/ *\( *([\,\d]+) *letters *\)\s*\z/)
              @query_len = sc[1].delete(',').to_i if r
              @query_def = q.join(' ')
            end
          end
          data.unshift(dummystr)

          super(data)
        end

        # definition of the query
        attr_reader :query_def

        # length of the query sequence
        attr_reader :query_len

      end #class Iteration

    end #class Report

  end #module RPSBlast

end #module Blast
end #module Bio


From ngoto at dev.open-bio.org Tue Apr 15 09:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby ChangeLog,1.84,1.85
Message-ID: <200804151354.m3FDsf3j032062@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv32038

Modified Files:
    ChangeLog
Log Message:
Newly added RPS-Blast default (-m 0) output parser.
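
The RPS-Blast parser added in this commit reads each query's definition and length from the "Query=" header block with StringScanner. The standalone sketch below shows the same idea in simplified form. It is not the committed method: `parse_query_header` and the sample header are hypothetical, and it assumes the "(N letters)" marker sits on its own line at the end of the block.

```ruby
require 'strscan'

# Parse a "Query= ..." header block as printed in BLAST default output.
# Returns [definition, length], or nil when no "Query=" is present.
# The length is nil when no "(N letters)" marker is found.
def parse_query_header(text)
  sc = StringScanner.new(text)
  return nil unless sc.skip_until(/Query= */)
  parts = []
  length = nil
  until sc.eos?
    # The block ends with a line such as "(1,234 letters)".
    if sc.scan(/ *\( *([,\d]+) *letters *\)\s*\z/)
      length = sc[1].delete(',').to_i
      break
    end
    parts << sc.scan(/.*/)   # rest of the current line
    sc.skip(/\s+/)           # newline and the continuation indent
  end
  [parts.join(' '), length]
end

header = "Query= gi|12345|ref|NP_000001.1| hypothetical protein\n" \
         "         [Homo sapiens]\n" \
         "         (1,234 letters)\n"
definition, length = parse_query_header(header)
puts definition
puts length
```

Joining the continuation lines with a single space gives one definition string even when the header was wrapped over several lines.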
Index: ChangeLog
===================================================================
RCS file: /home/repository/bioruby/bioruby/ChangeLog,v
retrieving revision 1.84
retrieving revision 1.85
diff -C2 -d -r1.84 -r1.85
*** ChangeLog 2 Apr 2008 06:24:14 -0000 1.84
--- ChangeLog 15 Apr 2008 13:54:38 -0000 1.85
***************
*** 1,2 ****
--- 1,8 ----
+ 2008-04-15 Naohisa Goto
+
+     * lib/bio/appl/blast/rpsblast.rb
+
+       Newly added RPS-Blast default (-m 0) output parser.
+
  2008-04-01 Naohisa Goto


From ngoto at dev.open-bio.org Tue Apr 15 09:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl blast.rb,1.34,1.35
Message-ID: <200804151354.m3FDsfmn032067@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl
In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl

Modified Files:
    blast.rb
Log Message:
Newly added RPS-Blast default (-m 0) output parser.

Index: blast.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast.rb,v
retrieving revision 1.34
retrieving revision 1.35
diff -C2 -d -r1.34 -r1.35
*** blast.rb 30 Jan 2008 17:43:34 -0000 1.34
--- blast.rb 15 Apr 2008 13:54:39 -0000 1.35
***************
*** 73,76 ****
--- 73,77 ----
    autoload :WU, 'bio/appl/blast/wublast'
    autoload :Bl2seq, 'bio/appl/bl2seq/report'
+   autoload :RPSBlast, 'bio/appl/blast/rpsblast'

    # This is a shortcut for Bio::Blast.new:


From ngoto at dev.open-bio.org Fri Apr 18 11:40:38 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Fri, 18 Apr 2008 15:40:38 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.37
Message-ID: <200804181540.m3IFecgN008057@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv8036/lib/bio/db/embl

Modified Files:
    sptr.rb
Log Message:
bug fix: Bio::SPTR#references raises NoMethodError since
lib/bio/db/embl/sptr.rb version 1.34.

Index: sptr.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v
retrieving revision 1.36
retrieving revision 1.37
diff -C2 -d -r1.36 -r1.37
*** sptr.rb 5 Apr 2007 23:35:40 -0000 1.36
--- sptr.rb 18 Apr 2008 15:40:36 -0000 1.37
***************
*** 507,514 ****
              end
            when 'RX' # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ] = xref
!             }
            end
          }
--- 507,513 ----
              end
            when 'RX' # PUBMED, MEDLINE
!             value.each do |tag, xref|
                hash[ tag.downcase ] = xref
!             end
            end
          }


From ngoto at dev.open-bio.org Wed Apr 23 12:48:28 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 16:48:28 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12,1.13
Message-ID: <200804231648.m3NGmSSa012476@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12456/lib/bio/db/embl

Modified Files:
    common.rb
Log Message:
Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue,
pages, and year. In addition, it might fail to parse PubMed ID.

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** common.rb 5 Apr 2007 23:35:40 -0000 1.12
--- common.rb 23 Apr 2008 16:48:25 -0000 1.13
***************
*** 279,294 ****
              hash['title'] = value
            when 'RL'
!             if value =~ /(.*) (\d+) \((\d+)\), (\d+-\d+) \((\d+)\)$/
!               hash['journal'] = $1
                hash['volume'] = $2
!               hash['issue'] = $3
!               hash['pages'] = $4
!               hash['year'] = $5
              else
                hash['journal'] = value
              end
            when 'RX' # PUBMED, MEDLINE
!             value.split('.').each {|item|
!
                tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ] = xref
              }
--- 279,294 ----
              hash['title'] = value
            when 'RL'
!             if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
!               hash['journal'] = $1.rstrip
                hash['volume'] = $2
!               hash['issue'] = $4
!               hash['pages'] = $6
!               hash['year'] = $7
              else
                hash['journal'] = value
              end
            when 'RX' # PUBMED, MEDLINE
!             value.split(/\. /).each {|item|
!               tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
                hash[ tag.downcase ] = xref
              }


From ngoto at dev.open-bio.org Wed Apr 23 13:34:17 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 17:34:17 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.1,1.12.2.2
Message-ID: <200804231734.m3NHYHMP012740@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12720/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
    common.rb
Log Message:
lib/bio/db/embl/common.rb in branch BRANCH-biohackathon2008 is copied from
CVS HEAD revision 1.13 because of the bug fixed in revision 1.13.
(Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue,
pages, and year. In addition, it might fail to parse PubMed ID.)

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.1
retrieving revision 1.12.2.2
diff -C2 -d -r1.12.2.1 -r1.12.2.2
*** common.rb 20 Feb 2008 09:56:22 -0000 1.12.2.1
--- common.rb 23 Apr 2008 17:34:15 -0000 1.12.2.2
***************
*** 241,305 ****
    def ref
      unless @data['R']
!       @data['R'] = Array.new
!       # Get the different references as 'blurbs' (the lines together)
!       reference_blurbs = get('R').split(/\nRN /)
!       reference_blurbs.each_index do |i|
!         reference_blurbs[i] = 'RN ' + reference_blurbs[i] unless reference_blurbs[i] =~ /^RN /
!       end
!
!
        # For each reference, we'll first create a hash that looks like below.
!       # Suppose the input is:
!       #   RA name1, name2, name3
!       #   RA name4
!       #   RT some part of the title that
!       #   RT did not fit on one line
!       # Then the hash looks like:
!       #   h = {
!       #     'RA' => ["name1, name2, name3", "name4"],
!       #     'RT' => ["some part of the title that", "did not fit on one line"]
!       #   }
!       reference_blurbs.each do |rb|
!         line_based_data = Hash.new
!         rb.split(/\n/).each do |line|
!           key, value = line.scan(/^(R[A-Z]) "?(\[?.*[A-Za-z0-9]\]?)/)[0]
!           if line_based_data[key].nil?
!             line_based_data[key] = Array.new
!           end
!           line_based_data[key].push(value)
!         end
!
!         # Now we have to sanitize the hash: the authors should be kept in an
!         # array, the title should be 1 string, ... So the hash should look like:
!         #   h = {
!         #     'RA' => ["name1", "name2", "name3", "name4"],
!         #     'RT' => 'some part of the title that did not fit on one line'
!         #   }
!         line_based_data.keys.each do |key|
!           if ['RC', 'RP', 'RT', 'RL'].include?(key)
!             line_based_data[key] = line_based_data[key].join(' ')
!           elsif ['RA', 'RX'].include?(key)
!             sanitized_data = Array.new
!             line_based_data[key].each do |v|
!               sanitized_data.push(v.split(/\s*,\s*/))
!             end
!             line_based_data[key] = sanitized_data.flatten
!           elsif key == 'RN'
!             line_based_data[key] = line_based_data[key][0].sub(/^\[/,'').sub(/\]$/,'').to_i
            end
          end
!
!         # And put it in @data. @data in the end looks like this:
!         #   data = [
!         #     {
!         #       'RA' => ["name1", "name2", "name3", "name4"],
!         #       'RT' => 'some part of the title that did not fit on one line'
!         #     },
!         #     {
!         #       'RA' => ["name1", "name2", "name3", "name4"],
!         #       'RT' => 'some part of the title that did not fit on one line'
!         #     }
!         #   ]
!         @data['R'].push(line_based_data)
        end
      end
      @data['R']
--- 241,265 ----
    def ref
      unless @data['R']
!       ary = Array.new
!       get('R').split(/\nRN /).each do |str|
!         raw = {'RN' => '', 'RC' => '', 'RP' => '', 'RX' => '',
!                'RA' => '', 'RT' => '', 'RL' => '', 'RG' => ''}
!
          str = 'RN ' + str unless /^RN / =~ str
!         str.split("\n").each do |line|
!           if /^(R[NPXARLCTG]) (.+)/ =~ line
!             raw[$1] += $2 + ' '
!           else
!             raise "Invalid format in R lines, \n[#{line}]\n"
            end
          end
!         raw.each_value {|v|
!           v.strip!
!           v.sub!(/^"/,'')
!           v.sub!(/;$/,'')
!           v.sub!(/"$/,'')
!         }
!         ary.push(raw)
        end
+       @data['R'] = ary
      end
      @data['R']
***************
*** 310,345 ****
    def references
      unless @data['references']
!       @data['references'] = Array.new
!       self.ref.each do |ref|
!         hash = Hash.new
!         ref.each do |key, value|
            case key
-           when 'RN'
-             hash['embl_gb_record_number'] = value
-           when 'RC'
-             hash['comments'] = value
-           when 'RX'
-             hash['xrefs'] = value
-           when 'RP'
-             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value
            when 'RT'
              hash['title'] = value
            when 'RL'
!             hash['journal'] = value
            when 'RX' # PUBMED, MEDLINE
!             value.each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ] = xref
              }
            end
!         end
!         @data['references'].push(Reference.new(hash))
!       end
      end
      @data['references']
    end
    # returns contents in the DR line.
    # * Bio::EMBLDB::Common#dr -> [ * ]
--- 270,306 ----
    def references
      unless @data['references']
!       ary = self.ref.map {|ent|
!         hash = Hash.new('')
!         ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
            when 'RL'
!             if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
!               hash['journal'] = $1.rstrip
!               hash['volume'] = $2
!               hash['issue'] = $4
!               hash['pages'] = $6
!               hash['year'] = $7
!             else
!               hash['journal'] = value
!             end
            when 'RX' # PUBMED, MEDLINE
!             value.split(/\. /).each {|item|
!               tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
                hash[ tag.downcase ] = xref
              }
            end
!         }
!         Reference.new(hash)
!       }
!       @data['references'] = References.new(ary)
      end
      @data['references']
    end
+
    # returns contents in the DR line.
    # * Bio::EMBLDB::Common#dr -> [ * ]


From ngoto at dev.open-bio.org Wed Apr 23 14:04:53 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:04:53 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.2,1.12.2.3
Message-ID: <200804231804.m3NI4rUv012864@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12842/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
    common.rb
Log Message:
Part of changes made between 1.12 and 1.12.2.1 is incorporated with
modifications.

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.2
retrieving revision 1.12.2.3
diff -C2 -d -r1.12.2.2 -r1.12.2.3
*** common.rb 23 Apr 2008 17:34:15 -0000 1.12.2.2
--- common.rb 23 Apr 2008 18:04:51 -0000 1.12.2.3
***************
*** 74,77 ****
--- 74,78 ----
  require 'bio/db'
  require 'bio/reference'
+ require 'bio/compat/references'

  module Bio
***************
*** 274,279 ****
          ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
--- 275,288 ----
          ent.each {|key, value|
            case key
+           when 'RN'
+             if /\[(\d+)\]/ =~ value.to_s
+               hash['embl_gb_record_number'] = $1.to_i
+             end
+           when 'RC'
+             hash['comment'] = value
+           when 'RP'
+             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value.split(/\, /)
            when 'RT'
              hash['title'] = value
***************
*** 288,292 ****
                hash['journal'] = value
              end
!           when 'RX' # PUBMED, MEDLINE
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
--- 297,301 ----
                hash['journal'] = value
              end
!           when 'RX' # PUBMED, DOI, (AGRICOLA)
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
***************
*** 297,301 ****
          Reference.new(hash)
        }
!
        @data['references'] = References.new(ary)
      end
      @data['references']
--- 306,310 ----
          Reference.new(hash)
        }
!       @data['references'] = ary.extend(Bio::References::BackwardCompatibility)
      end
      @data['references']


From ngoto at dev.open-bio.org Wed Apr 23 14:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.5,1.24.2.6
Message-ID: <200804231852.m3NIqKW0013081@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
    reference.rb
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to lib/bio/db/embl/format_embl.rb
    to improve tolerance for various data (e.g. references with no
    record numbers or with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.

Index: reference.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v
retrieving revision 1.24.2.5
retrieving revision 1.24.2.6
diff -C2 -d -r1.24.2.5 -r1.24.2.6
*** reference.rb 4 Mar 2008 11:31:45 -0000 1.24.2.5
--- reference.rb 23 Apr 2008 18:52:18 -0000 1.24.2.6
***************
*** 42,47 ****
    class Reference

-     include Bio::Sequence::Format::INSDFeatureHelper
-
      # Author names in an Array, [ "Hoge, J.P.", "Fuga, F.B." ].
      attr_reader :authors
--- 42,45 ----
***************
*** 70,73 ****
--- 68,74 ----
      # medline identifier (typically Fixnum)
      attr_reader :medline
+
+     # DOI identifier (typically String, e.g. "10.1126/science.1110418")
+     attr_reader :doi

      # Abstract text in String.
***************
*** 89,92 ****
--- 90,96 ----
      attr_reader :sequence_position

+     # Comments for the reference (typically Array of String, or nil)
+     attr_reader :comments
+
      # Create a new Bio::Reference object from a Hash of values.
      # Data is extracted from the values for keys:
***************
*** 126,150 ****
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       hash.default = ''
!       @authors = hash['authors'] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title = hash['title'] # "Title of the study."
!       @journal = hash['journal'] # "Theor. J. Hoge"
!       @volume = hash['volume'] # 12
!       @issue = hash['issue'] # 3
!       @pages = hash['pages'] # 123-145
!       @year = hash['year'] # 2001
!       @pubmed = hash['pubmed'] # 12345678
!       @medline = hash['medline'] # 98765432
!       @abstract = hash['abstract']
        @url = hash['url']
!       @mesh = hash['mesh']
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments = hash['comments'] || []
!       @xrefs = hash['xrefs'] || []
!       @affiliations = hash['affiliations']
!       @authors = [] if @authors.empty?
!       @mesh = [] if @mesh.empty?
!       @affiliations = [] if @affiliations.empty?
      end

--- 130,150 ----
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       @authors = hash['authors'] || [] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title = hash['title'] || '' # "Title of the study."
!       @journal = hash['journal'] || '' # "Theor. J. Hoge"
!       @volume = hash['volume'] || '' # 12
!       @issue = hash['issue'] || '' # 3
!       @pages = hash['pages'] || '' # 123-145
!       @year = hash['year'] || '' # 2001
!       @pubmed = hash['pubmed'] || '' # 12345678
!       @medline = hash['medline'] || '' # 98765432
!       @doi = hash['doi']
!       @abstract = hash['abstract'] || ''
        @url = hash['url']
!       @mesh = hash['mesh'] || []
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments = hash['comments']
!       @affiliations = hash['affiliations'] || []
      end

***************
*** 273,298 ****
      # RL Plant Mol.
Biol. 17(2):209-219(1991).
      def embl
!       lines = Array.new
!       if ! @embl_gb_record_number.nil?
!         lines << "RN [#{@embl_gb_record_number}]"
!       end
!       if @comments != []
!         @comments.each do |c|
!           lines << "RC #{c}"
!         end
!       end
!       if ! @sequence_position.nil?
!         lines << "RP #{@sequence_position}"
!       end
!       if ! @xrefs.nil?
!         @xrefs.each do |x|
!           lines << "RX #{x}"
!         end
!       end
!       lines << wrap(@authors.join(', '), 80, 'RA ') + ';' unless @authors.nil?
!       lines << (@title == '' ? 'RT ;' : wrap('"' + @title + '"', 80, 'RT ') + ';')
!       lines << wrap(@journal, 80, 'RL ') unless @journal == ''
!       lines << "XX"
!       return lines.join("\n")
      end

--- 273,280 ----
      # RL Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       r = self
!       Bio::Sequence::Format::NucFormatter::Embl.new('').instance_eval {
!         reference_format_embl(r)
!       }
      end



From ngoto at dev.open-bio.org Wed Apr 23 14:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.2, 1.1.2.3 common.rb, 1.12.2.3, 1.12.2.4
Message-ID: <200804231852.m3NIqKHG013084@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
    format_embl.rb common.rb
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to lib/bio/db/embl/format_embl.rb
    to improve tolerance for various data (e.g. references with no
    record numbers or with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.
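
The tolerance for missing or duplicated record numbers mentioned in the log message can be illustrated with a small standalone sketch. This is a simplified variant, not the committed reference_format_embl: `assign_reference_numbers` is a hypothetical helper, and unlike the committed code it records every number it hands out.

```ruby
# Assign an RN (reference number) to each reference. A number is kept
# when it is positive and not yet used; otherwise the number after the
# largest one used so far is assigned.
def assign_reference_numbers(record_numbers)
  used = {}
  record_numbers.map do |n|
    refno = n.to_i                     # nil.to_i is 0
    if refno <= 0 || used[refno]
      refno = used.keys.max.to_i + 1   # nil.to_i covers the empty case
    end
    used[refno] = true
    refno
  end
end

p assign_reference_numbers([1, 2, 2, nil, 7, 1])
```

Keeping the used numbers in a hash shared across all references of one entry is what lets a formatter emit unique RN lines even when the input repeats or omits them.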
Index: format_embl.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -C2 -d -r1.1.2.2 -r1.1.2.3
*** format_embl.rb 27 Mar 2008 13:38:31 -0000 1.1.2.2
--- format_embl.rb 23 Apr 2008 18:52:18 -0000 1.1.2.3
***************
*** 25,28 ****
--- 25,76 ----
      end

+     # format reference
+     # ref:: Bio::Reference object
+     # hash:: (optional) a hash for RN (reference number) administration
+     def reference_format_embl(ref, hash = nil)
+       lines = Array.new
+       if ref.embl_gb_record_number or hash then
+         refno = ref.embl_gb_record_number.to_i
+         hash ||= {}
+         if refno <= 0 or hash[refno] then
+           refno = hash.keys.sort[-1].to_i + 1
+           hash[refno] = true
+         end
+         lines << embl_wrap("RN ", "[#{refno}]")
+       end
+       if ref.comments then
+         ref.comments.each do |cmnt|
+           lines << embl_wrap("RC ", cmnt)
+         end
+       end
+       unless ref.sequence_position.to_s.empty? then
+         lines << embl_wrap("RP ", "#{ref.sequence_position}")
+       end
+       unless ref.doi.to_s.empty? then
+         lines << embl_wrap("RX ", "DOI; #{ref.doi}.")
+       end
+       unless ref.pubmed.to_s.empty? then
+         lines << embl_wrap("RX ", "PUBMED; #{ref.pubmed}.")
+       end
+       unless ref.authors.empty?
+         lines << embl_wrap('RA ', ref.authors.join(', ') + ';')
+       end
+       lines << embl_wrap('RT ',
+                          (ref.title.to_s.empty? ? '' :
+                           "\"#{ref.title}\"") + ';')
+       unless ref.journal.to_s.empty? then
+         volissue = "#{ref.volume.to_s}"
+         volissue = "#{volissue}(#{ref.issue})" unless ref.issue.to_s.empty?
+         rl = "#{ref.journal}"
+         rl += " #{volissue}" unless volissue.empty?
+         rl += ":#{ref.pages}" unless ref.pages.to_s.empty?
+         rl += "(#{ref.year})" unless ref.year.to_s.empty?
+         rl += '.'
+         lines << embl_wrap('RL ', rl)
+       end
+       lines << "XX"
+       return lines.join("\n")
+     end
+
      def seq_format_embl(seq)
        output_lines = Array.new
***************
*** 43,64 ****
      erb_template <<'__END_OF_TEMPLATE__'
  ID <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX
  <%= embl_wrap('AC ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX
  DT <%= date_created %>
  DT <%= date_modified %>
! XX
  <%= embl_wrap('DE ', definition) %>
! XX
  <%= embl_wrap('KW ', keywords.join('; ') + '.') %>
! XX
  OS <%= species %>
  <%= embl_wrap('OC ', classification.join('; ') + '.') %>
  XX
! <%= (references || []).collect{|ref| ref.format('embl')}.join("\n") %>
! XX
! FH Key Location/Qualifiers
! FH
! <%= format_features_embl(features || []) %>XX
  SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>
--- 91,111 ----
      erb_template <<'__END_OF_TEMPLATE__'
  ID <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX
  <%= embl_wrap('AC ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX
  DT <%= date_created %>
  DT <%= date_modified %>
! XX
  <%= embl_wrap('DE ', definition) %>
! XX
  <%= embl_wrap('KW ', keywords.join('; ') + '.') %>
! XX
  OS <%= species %>
  <%= embl_wrap('OC ', classification.join('; ') + '.') %>
  XX
! <% hash = {}; (references || []).each do |ref| %><%= reference_format_embl(ref, hash) %>
! <% end %>FH Key Location/Qualifiers
! FH
!
<%= format_features_embl(features || []) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v retrieving revision 1.12.2.3 retrieving revision 1.12.2.4 diff -C2 -d -r1.12.2.3 -r1.12.2.4 *** common.rb 23 Apr 2008 18:04:51 -0000 1.12.2.3 --- common.rb 23 Apr 2008 18:52:18 -0000 1.12.2.4 *************** *** 280,284 **** end when 'RC' ! hash['comment'] = value when 'RP' hash['sequence_position'] = value --- 280,287 ---- end when 'RC' ! unless value.to_s.strip.empty? ! hash['comments'] ||= [] ! hash['comments'].push value ! end when 'RP' hash['sequence_position'] = value From ngoto at dev.open-bio.org Thu Apr 24 09:49:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 24 Apr 2008 13:49:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.36.2.1 Message-ID: <200804241349.m3ODni9x015583@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv15545/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 sptr.rb Log Message: The same change (except comment) as of 1.36 => 1.37 in CVS HEAD is made (bug fix: Bio::SPTR#references raises NoMethodError since lib/bio/db/embl/sptr.rb version 1.34). Index: sptr.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v retrieving revision 1.36 retrieving revision 1.36.2.1 diff -C2 -d -r1.36 -r1.36.2.1 *** sptr.rb 5 Apr 2007 23:35:40 -0000 1.36 --- sptr.rb 24 Apr 2008 13:49:42 -0000 1.36.2.1 *************** *** 506,514 **** hash['journal'] = value end ! when 'RX' # PUBMED, MEDLINE ! value.split('.').each {|item| ! 
tag, xref = item.split(/; /).map {|i| i.strip }
  hash[ tag.downcase ] = xref
! }
  end
  }
--- 506,513 ----
  hash['journal'] = value
  end
! when 'RX' # PUBMED, MEDLINE, DOI
! value.each do |tag, xref|
  hash[ tag.downcase ] = xref
! end
  end
  }

From ngoto at dev.open-bio.org Thu Apr 24 10:28:27 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 14:28:27 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.10,0.58.2.11
Message-ID: <200804241428.m3OESRaY016145@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv15857/lib/bio

Modified Files:
Tag: BRANCH-biohackathon2008
sequence.rb
Log Message:
* Bio::Sequence.read is renamed to Bio::Sequence.input because this
  method is the counterpart of Bio::Sequence#output.
  Bio::Sequence.read still exists as an alias of Bio::Sequence.input.
* Added documentation for Bio::Sequence#accessions, and fixed it so that
  the returned array does not contain nil.

Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v
retrieving revision 0.58.2.10
retrieving revision 0.58.2.11
diff -C2 -d -r0.58.2.10 -r0.58.2.11
*** sequence.rb 27 Mar 2008 13:38:31 -0000 0.58.2.10
--- sequence.rb 24 Apr 2008 14:28:25 -0000 0.58.2.11
***************
*** 369,373 ****
  # (GenBank, EMBL, fasta format, etc.)
  #
! # s = Bio::Sequence.read(str)
  # ---
  # *Arguments*:
--- 369,373 ----
  # (GenBank, EMBL, fasta format, etc.)
  #
! # s = Bio::Sequence.input(str)
  # ---
  # *Arguments*:
***************
*** 375,379 ****
  # * (optional) _format_: format specification (class or nil)
  # *Returns*:: Bio::Sequence object
! def self.read(str, format = nil)
  if format then
  klass = format
--- 375,379 ----
  # * (optional) _format_: format specification (class or nil)
  # *Returns*:: Bio::Sequence object
! def self.input(str, format = nil)
  if format then
  klass = format
***************
*** 384,391 ****
  obj.to_biosequence
  end
!
! def accessions
!
return [@primary_accession, @secondary_accessions].flatten end --- 384,398 ---- obj.to_biosequence end ! ! # alias of Bio::Sequence.input ! def self.read(str, format = nil) ! input(str, format) ! end ! ! # accession numbers of the sequence ! # ! # *Returns*:: Array of String def accessions ! [ @primary_accession, @secondary_accessions ].flatten.compact end From helios at dev.open-bio.org Mon Apr 7 09:15:46 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Mon, 07 Apr 2008 13:15:46 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io sql.rb,1.8.2.1,1.8.2.2 Message-ID: <200804071315.m37DFcHI005486@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io In directory dev.open-bio.org:/tmp/cvs-serv5466/lib/bio/io Modified Files: Tag: BRANCH-biohackathon2008 sql.rb Log Message: added "hostname" to valid_keys configurations Index: sql.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/io/sql.rb,v retrieving revision 1.8.2.1 retrieving revision 1.8.2.2 diff -C2 -d -r1.8.2.1 -r1.8.2.2 *** sql.rb 25 Mar 2008 15:46:32 -0000 1.8.2.1 --- sql.rb 7 Apr 2008 13:15:36 -0000 1.8.2.2 *************** *** 25,29 **** #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil} configurations.assert_valid_keys('development', 'production','test') ! configurations[env].assert_valid_keys('database','adapter','username','password') DummyBase.configurations = configurations DummyBase.establish_connection "#{env}" --- 25,29 ---- #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil} configurations.assert_valid_keys('development', 'production','test') ! 
configurations[env].assert_valid_keys('hostname','database','adapter','username','password') DummyBase.configurations = configurations DummyBase.establish_connection "#{env}" *************** *** 43,46 **** --- 43,50 ---- end + def self.exists_database(name) + Bio::SQL::Biodatabase.find_by_name(name).nil? ? false : true + end + def self.list_entries Bio::SQL::Bioentry.find(:all).collect{|entry| *************** *** 117,121 **** pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development') #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result) ! if nil pp Bio::SQL.list_entries --- 121,125 ---- pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development') #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result) ! pp Bio::SQL.list_entries if nil pp Bio::SQL.list_entries From helios at dev.open-bio.org Mon Apr 7 09:17:55 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Mon, 07 Apr 2008 13:17:55 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql bioentry.rb, 1.1.2.1, 1.1.2.2 Message-ID: <200804071317.m37DHmQk005551@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql In directory dev.open-bio.org:/tmp/cvs-serv5531/lib/bio/io/biosql Modified Files: Tag: BRANCH-biohackathon2008 bioentry.rb Log Message: corrected table name "term" in conditions to get cdsfeatures "Shortcut". associated with the entry. 
Index: bioentry.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/biosql/Attic/bioentry.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** bioentry.rb 25 Mar 2008 15:46:32 -0000 1.1.2.1
--- bioentry.rb 7 Apr 2008 13:17:46 -0000 1.1.2.2
***************
*** 13,17 ****
  has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id"
  # not very convincing; I think this does not work correctly
! has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["terms.name='CDS'"], :include=>"typeterm"
  has_many :terms, :through=>:bioentry_qualifier_values
--- 13,17 ----
  has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id"
  # not very convincing; I think this does not work correctly
! has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["term.name='CDS'"], :include=>"type_term"
  has_many :terms, :through=>:bioentry_qualifier_values

From helios at dev.open-bio.org Mon Apr 7 09:18:19 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:18:19 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql sequence.rb, 1.1.2.1, 1.1.2.2
Message-ID: <200804071318.m37DIDFm005598@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5578/lib/bio/db/biosql

Modified Files:
Tag: BRANCH-biohackathon2008
sequence.rb
Log Message:
Use the GenBank test file; FASTA is not working.

Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/biosql/Attic/sequence.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** sequence.rb 25 Mar 2008 15:46:32 -0000 1.1.2.1
--- sequence.rb 7 Apr 2008
13:18:11 -0000 1.1.2.2 *************** *** 674,679 **** # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl') ! # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb') ! parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta') parser.each do |entry| --- 674,679 ---- # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl') ! parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb') ! #parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta') parser.each do |entry| *************** *** 686,689 **** --- 686,690 ---- # pp "Sequence" puts result.to_biosequence.output(:genbank) #:embl + result.delete end end From ngoto at dev.open-bio.org Tue Apr 1 06:31:37 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 01 Apr 2008 06:31:37 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.26,1.27 Message-ID: <200804010631.m316VbfM002141@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv2121/lib/bio/appl/blast Modified Files: format0.rb Log Message: Fixed a bug when a null line is inserted after database title in some cases, reported by Tomoaki NISHIYAMA. Index: format0.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v retrieving revision 1.26 retrieving revision 1.27 diff -C2 -d -r1.26 -r1.27 *** format0.rb 12 Feb 2008 02:13:31 -0000 1.26 --- format0.rb 1 Apr 2008 06:31:35 -0000 1.27 *************** *** 294,297 **** --- 294,302 ---- @f0query = data.shift @f0database = data.shift + # In special case, a void line is inserted after database name. 
+ if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then
+ @f0database.concat "\n"
+ @f0database.concat data.shift
+ end
  end

From ngoto at dev.open-bio.org Tue Apr 1 10:36:47 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 01 Apr 2008 10:36:47 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/pdb chain.rb, 1.9, 1.10 pdb.rb, 1.27, 1.28
Message-ID: <200804011036.m31Aal0p009616@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/pdb
In directory dev.open-bio.org:/tmp/cvs-serv9574/lib/bio/db/pdb

Modified Files:
chain.rb pdb.rb
Log Message:
* Fixed a bug where ArgumentError was raised in the Bio::PDB::Chain#aaseq
  method for nucleic acid chains. The same error could also occur in
  Bio::PDB#seqres, which is fixed as well.
* Fixed a bug where the current residue/heterogen was not properly reset
  when the current chain changed.

Index: pdb.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/pdb.rb,v
retrieving revision 1.27
retrieving revision 1.28
diff -C2 -d -r1.27 -r1.28
*** pdb.rb 28 Dec 2007 14:43:44 -0000 1.27
--- pdb.rb 1 Apr 2008 10:36:44 -0000 1.28
***************
*** 1498,1501 ****
--- 1498,1504 ----
  chain = newChain
  end
+ # chain might be changed, clearing cResidue and cLigand
+ cResidue = nil
+ cLigand = nil
  end
  end
***************
*** 1551,1554 ****
--- 1554,1559 ----
  c_atom = nil
  cChain = nil
+ cResidue = nil
+ cLigand = nil
  if cModel.model_serial or cModel.chains.size > 0 then
  self.addModel(cModel)
***************
*** 1810,1814 ****
  #need to look up with Ala
  aa = aa.capitalize
! (Bio::AminoAcid.three2one(aa) or 'X')
  end
  seq = Bio::Sequence::AA.new(a.to_s)
--- 1815,1823 ----
  #need to look up with Ala
  aa = aa.capitalize
! (begin
! Bio::AminoAcid.three2one(aa)
! rescue ArgumentError
! nil
! end || 'X')
  end
  seq = Bio::Sequence::AA.new(a.to_s)
***************
*** 1816,1820 ****
  # nucleic acid sequence
  a.collect! do |na|
! na = na.strip
  na.size == 1 ?
na : 'n' end --- 1825,1829 ---- # nucleic acid sequence a.collect! do |na| ! na = na.delete('^a-zA-Z') na.size == 1 ? na : 'n' end Index: chain.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/chain.rb,v retrieving revision 1.9 retrieving revision 1.10 diff -C2 -d -r1.9 -r1.10 *** chain.rb 18 Dec 2007 13:48:42 -0000 1.9 --- chain.rb 1 Apr 2008 10:36:44 -0000 1.10 *************** *** 190,194 **** end tlc = residue.resName.capitalize ! olc = (Bio::AminoAcid.three2one(tlc) or 'X') string << olc end --- 190,198 ---- end tlc = residue.resName.capitalize ! olc = (begin ! Bio::AminoAcid.three2one(tlc) ! rescue ArgumentError ! nil ! end || 'X') string << olc end From ngoto at dev.open-bio.org Wed Apr 2 06:24:16 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 02 Apr 2008 06:24:16 +0000 Subject: [BioRuby-cvs] bioruby ChangeLog,1.83,1.84 Message-ID: <200804020624.m326OGwp011324@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv11304 Modified Files: ChangeLog Log Message: ChangeLog added for lib/bio/appl/blast/format0.rb,1.26,1.27, lib/bio/db/pdb/chain.rb,1.9,1.10, and lib/bio/db/pdb/pdb.rb,1.27,1.28. Index: ChangeLog =================================================================== RCS file: /home/repository/bioruby/bioruby/ChangeLog,v retrieving revision 1.83 retrieving revision 1.84 diff -C2 -d -r1.83 -r1.84 *** ChangeLog 12 Feb 2008 05:32:23 -0000 1.83 --- ChangeLog 2 Apr 2008 06:24:14 -0000 1.84 *************** *** 1,2 **** --- 1,18 ---- + 2008-04-01 Naohisa Goto + + * lib/bio/appl/blast/format0.rb + + Fixed a bug: Failed to parse database name in some cases. + Thanks to Tomoaki Nishiyama who reported the bug and sent patches + ([BioRuby-ja] BLAST format0 parser fails header parsing output + of specific databases). 
+ + * lib/bio/db/pdb/chain.rb, lib/bio/db/pdb/pdb.rb + + Fixed bugs: Bio::PDB::Chain#aaseq failed for nucleotide chain; + Failed to parse chains for some entries (e.g. 1B2M). + Thanks to Semin Lee who reported the bugs and sent patches + ([BioRuby] Bio::PDB parsing problem (1B2M)). + 2008-02-12 Naohisa Goto From ngoto at dev.open-bio.org Tue Apr 15 13:54:41 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 15 Apr 2008 13:54:41 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rpsblast.rb,NONE,1.1 Message-ID: <200804151354.m3FDsfkK032072@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl/blast Added Files: rpsblast.rb Log Message: Newly added RPS-Blast default (-m 0) output parser. --- NEW FILE: rpsblast.rb --- # # = bio/appl/blast/rpsblast.rb - NCBI RPS Blast default output parser # # Copyright:: Copyright (C) 2008 Naohisa Goto # License:: The Ruby License # # $Id: rpsblast.rb,v 1.1 2008/04/15 13:54:39 ngoto Exp $ # # == Description # # NCBI RPS Blast (Reversed Position Specific Blast) default # (-m 0 option) output parser class, Bio::Blast::RPSBlast::Report # and related classes/modules. # # == References # # * Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, # Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), # "Gapped BLAST and PSI-BLAST: a new generation of protein database search # programs", Nucleic Acids Res. 25:3389-3402. # * ftp://ftp.ncbi.nih.gov/blast/documents/rpsblast.html # * http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml # require 'bio/appl/blast/format0' module Bio class Blast # NCBI RPS Blast (Reversed Position Specific Blast) namespace. # Currently, this module is existing only for separating namespace. # To parse RPSBlast results, see Bio::Blast::RPSBlast::Report documents. module RPSBlast # NCBI RPS Blast (Reversed Position Specific Blast) # default output parser. 
      #
      # It supports default (-m 0 option) output of the "rpsblast" command.
      #
      # Because this class inherits Bio::Blast::Default::Report,
      # almost all methods are equal to those of Bio::Blast::Default::Report.
      # Only DELIMITER (and RS) and a few methods are different.
      #
      # Note for multi-fasta result: When parsing output of rpsblast command
      # with multi-fasta sequences as input data,
      # each query's result is stored as an "iteration" of PSI-Blast,
      # because rpsblast's output with multi-fasta input is hard to split
      # by query.
      # This behavior may be changed in the future.
      #
      # Note for nucleotide results: This class is not tested with
      # nucleotide query and/or nucleotide databases.
      #
      class Report < Bio::Blast::Default::Report

        # Delimiter of each entry for RPS-BLAST. Bio::FlatFile uses it.
        DELIMITER = RS = "\nRPS-BLAST"

        # (Integer) excess read size included in DELIMITER.
        DELIMITER_OVERRUN = 9 # "RPS-BLAST"

        # Creates a new Report object from a string.
        #
        # Note for multi-fasta results: When parsing an output of rpsblast
        # command running with multi-fasta sequences,
        # each query's result is stored as an "iteration" of PSI-Blast,
        # because rpsblast's output with multi-fasta input is hard to split
        # by query.
        # This behavior may be changed in the future.
        #
        # Note for nucleotide results: This class is not tested with
        # nucleotide query and/or nucleotide databases.
        #
        def initialize(str)
          str = str.sub(/\A\s+/, '')
          # remove trailing entries for sure
          str.sub!(/\n(RPS\-BLAST.*)/m, "\n")
          @entry_overrun = $1
          @entry = str
          data = str.split(/(?:^[ \t]*\n)+/)

          format0_split_headers(data)
          @iterations = format0_split_search(data)
          format0_split_stat_params(data)
        end

        # Returns definition of the query.
        # For a result of multi-fasta input, the first query's definition
        # is returned (The same as iterations.first.query_def).
        def query_def
          iterations.first.query_def
        end

        # Returns length of the query.
        # For a result of multi-fasta input, the first query's length
        # is returned (The same as iterations.first.query_len).
def query_len iterations.first.query_len end private # Splits headers into the first line, reference, query line and # database line. def format0_split_headers(data) @f0header = data.shift @f0references = [] while data[0] and /\ADatabase\:/ !~ data[0] @f0references.push data.shift end @f0database = data.shift # In special case, a void line is inserted after database name. if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then @f0database.concat "\n" @f0database.concat data.shift end end # Splits the search results. def format0_split_search(data) iterations = [] dummystr = 'Searching..................................................done' if r = data[0] and /^Searching/ =~ r then dummystr = data.shift end while r = data[0] and /^Query\=/ =~ r iterations << Iteration.new(data, dummystr) end iterations end # Iteration class for RPS-Blast. # Though RPS-Blast does not iterate like PSI-BLAST, # it aims to store a result of single query sequence. # # Normally, the instance of the class is generated # by Bio::Blast::RPSBlast::Report object. # class Iteration < Bio::Blast::Default::Report::Iteration # Creates a new Iteration object. # It is designed to be called only internally from # the Bio::Blast::RPSBlast::Report class. # Users shall not use the method directly. 
def initialize(data, dummystr) if /\AQuery\=/ =~ data[0] then sc = StringScanner.new(data.shift) sc.skip(/\s*/) if sc.skip_until(/Query\= */) then q = [] begin q << sc.scan(/.*/) sc.skip(/\s*^ ?/) end until !sc.rest or r = sc.skip(/ *\( *([\,\d]+) *letters *\)\s*\z/) @query_len = sc[1].delete(',').to_i if r @query_def = q.join(' ') end end data.unshift(dummystr) super(data) end # definition of the query attr_reader :query_def # length of the query sequence attr_reader :query_len end #class Iteration end #class Report end #module RPSBlast end #module Blast end #module Bio From ngoto at dev.open-bio.org Tue Apr 15 13:54:41 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 15 Apr 2008 13:54:41 +0000 Subject: [BioRuby-cvs] bioruby ChangeLog,1.84,1.85 Message-ID: <200804151354.m3FDsf3j032062@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv32038 Modified Files: ChangeLog Log Message: Newly added RPS-Blast default (-m 0) output parser. Index: ChangeLog =================================================================== RCS file: /home/repository/bioruby/bioruby/ChangeLog,v retrieving revision 1.84 retrieving revision 1.85 diff -C2 -d -r1.84 -r1.85 *** ChangeLog 2 Apr 2008 06:24:14 -0000 1.84 --- ChangeLog 15 Apr 2008 13:54:38 -0000 1.85 *************** *** 1,2 **** --- 1,8 ---- + 2008-04-15 Naohisa Goto + + * lib/bio/appl/blast/rpsblast.rb + + Newly added RPS-Blast default (-m 0) output parser. + 2008-04-01 Naohisa Goto From ngoto at dev.open-bio.org Tue Apr 15 13:54:41 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 15 Apr 2008 13:54:41 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/appl blast.rb,1.34,1.35 Message-ID: <200804151354.m3FDsfmn032067@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/appl In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl Modified Files: blast.rb Log Message: Newly added RPS-Blast default (-m 0) output parser. 
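
Editorial illustration (not part of the commit): the query-header handling in Iteration#initialize of rpsblast.rb can be tried in isolation with plain Ruby. The sketch below is a simplified stand-in, not the committed code; the helper name parse_query_header and the sample header text are hypothetical, and only StringScanner from the standard library is assumed.

```ruby
require 'strscan'

# Hypothetical helper (not in BioRuby): extracts the query definition and
# length from one "Query= ..." block of default (-m 0) BLAST-style output.
def parse_query_header(block)
  sc = StringScanner.new(block)
  return nil unless sc.skip_until(/Query= */)

  parts  = []
  length = nil
  until sc.eos?
    # A trailing "(1,234 letters)" line ends the definition.
    if sc.scan(/ *\( *([,\d]+) *letters *\)\s*\z/)
      length = sc[1].delete(',').to_i
      break
    end
    part = sc.scan(/[^\n]+/)   # rest of the current line, if any
    parts << part if part
    sc.skip(/\n\s*/)           # newline plus continuation indent
  end
  { query_def: parts.join(' '), query_len: length }
end

# Made-up sample header for illustration only.
sample = "Query= gi|123|ref|NP_000001.1| hypothetical protein\n" \
         "         [Homo sapiens]\n" \
         "         (1,234 letters)\n"
result = parse_query_header(sample)
puts result[:query_def]
puts result[:query_len]
```

Unlike the committed loop, this sketch stops explicitly at the end of the input, so a header that lacks the "(N letters)" line simply yields a nil length.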
Index: blast.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast.rb,v retrieving revision 1.34 retrieving revision 1.35 diff -C2 -d -r1.34 -r1.35 *** blast.rb 30 Jan 2008 17:43:34 -0000 1.34 --- blast.rb 15 Apr 2008 13:54:39 -0000 1.35 *************** *** 73,76 **** --- 73,77 ---- autoload :WU, 'bio/appl/blast/wublast' autoload :Bl2seq, 'bio/appl/bl2seq/report' + autoload :RPSBlast, 'bio/appl/blast/rpsblast' # This is a shortcut for Bio::Blast.new: From ngoto at dev.open-bio.org Fri Apr 18 15:40:38 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 18 Apr 2008 15:40:38 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.37 Message-ID: <200804181540.m3IFecgN008057@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv8036/lib/bio/db/embl Modified Files: sptr.rb Log Message: bug fix: Bio::SPTR#references raises NoMethodError since lib/bio/db/embl/sptr.rb version 1.34. Index: sptr.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v retrieving revision 1.36 retrieving revision 1.37 diff -C2 -d -r1.36 -r1.37 *** sptr.rb 5 Apr 2007 23:35:40 -0000 1.36 --- sptr.rb 18 Apr 2008 15:40:36 -0000 1.37 *************** *** 507,514 **** end when 'RX' # PUBMED, MEDLINE ! value.split('.').each {|item| ! tag, xref = item.split(/; /).map {|i| i.strip } hash[ tag.downcase ] = xref ! } end } --- 507,513 ---- end when 'RX' # PUBMED, MEDLINE ! value.each do |tag, xref| hash[ tag.downcase ] = xref ! 
end
  end
  }

From ngoto at dev.open-bio.org Wed Apr 23 16:48:28 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 16:48:28 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12,1.13
Message-ID: <200804231648.m3NGmSSa012476@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12456/lib/bio/db/embl

Modified Files:
common.rb
Log Message:
Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue, pages, and year. In addition, it might fail to parse PubMed ID.

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** common.rb 5 Apr 2007 23:35:40 -0000 1.12
--- common.rb 23 Apr 2008 16:48:25 -0000 1.13
***************
*** 279,294 ****
  hash['title'] = value
  when 'RL'
! if value =~ /(.*) (\d+) \((\d+)\), (\d+-\d+) \((\d+)\)$/
! hash['journal'] = $1
  hash['volume'] = $2
! hash['issue'] = $3
! hash['pages'] = $4
! hash['year'] = $5
  else
  hash['journal'] = value
  end
  when 'RX' # PUBMED, MEDLINE
! value.split('.').each {|item|
! tag, xref = item.split(/; /).map {|i| i.strip }
  hash[ tag.downcase ] = xref
  }
--- 279,294 ----
  hash['title'] = value
  when 'RL'
! if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
! hash['journal'] = $1.rstrip
  hash['volume'] = $2
! hash['issue'] = $4
! hash['pages'] = $6
! hash['year'] = $7
  else
  hash['journal'] = value
  end
  when 'RX' # PUBMED, MEDLINE
! value.split(/\. /).each {|item|
!
tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') } hash[ tag.downcase ] = xref } From ngoto at dev.open-bio.org Wed Apr 23 17:34:17 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 23 Apr 2008 17:34:17 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.1,1.12.2.2 Message-ID: <200804231734.m3NHYHMP012740@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv12720/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: lib/bio/db/embl/common.rb in branch BRANCH-biohackathon2008 is copied from CVS HEAD revision 1.13 because of the bug fixed in revision 1.13. (Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue, pages, and year. In addition, it might fail to parse PubMed ID.) Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v retrieving revision 1.12.2.1 retrieving revision 1.12.2.2 diff -C2 -d -r1.12.2.1 -r1.12.2.2 *** common.rb 20 Feb 2008 09:56:22 -0000 1.12.2.1 --- common.rb 23 Apr 2008 17:34:15 -0000 1.12.2.2 *************** *** 241,305 **** def ref unless @data['R'] ! @data['R'] = Array.new ! # Get the different references as 'blurbs' (the lines together) ! reference_blurbs = get('R').split(/\nRN /) ! reference_blurbs.each_index do |i| ! reference_blurbs[i] = 'RN ' + reference_blurbs[i] unless reference_blurbs[i] =~ /^RN / ! end ! ! # For each reference, we'll first create a hash that looks like below. ! # Suppose the input is: ! # RA name1, name2, name3 ! # RA name4 ! # RT some part of the title that ! # RT did not fit on one line ! # Then the hash looks like: ! # h = { ! # 'RA' => ["name1, name2, name3", "name4"], ! # 'RT' => ["some part of the title that", "did not fit on one line"] ! # } ! reference_blurbs.each do |rb| ! line_based_data = Hash.new ! rb.split(/\n/).each do |line| ! 
key, value = line.scan(/^(R[A-Z]) "?(\[?.*[A-Za-z0-9]\]?)/)[0] ! if line_based_data[key].nil? ! line_based_data[key] = Array.new ! end ! line_based_data[key].push(value) ! end ! ! # Now we have to sanitize the hash: the authors should be kept in an ! # array, the title should be 1 string, ... So the hash should look like: ! # h = { ! # 'RA' => ["name1", "name2", "name3", "name4"], ! # 'RT' => 'some part of the title that did not fit on one line' ! # } ! line_based_data.keys.each do |key| ! if ['RC', 'RP', 'RT', 'RL'].include?(key) ! line_based_data[key] = line_based_data[key].join(' ') ! elsif ['RA', 'RX'].include?(key) ! sanitized_data = Array.new ! line_based_data[key].each do |v| ! sanitized_data.push(v.split(/\s*,\s*/)) ! end ! line_based_data[key] = sanitized_data.flatten ! elsif key == 'RN' ! line_based_data[key] = line_based_data[key][0].sub(/^\[/,'').sub(/\]$/,'').to_i end end ! ! # And put it in @data. @data in the end looks like this: ! # data = [ ! # { ! # 'RA' => ["name1", "name2", "name3", "name4"], ! # 'RT' => 'some part of the title that did not fit on one line' ! # }, ! # { ! # 'RA' => ["name1", "name2", "name3", "name4"], ! # 'RT' => 'some part of the title that did not fit on one line' ! # } ! # ] ! @data['R'].push(line_based_data) end end @data['R'] --- 241,265 ---- def ref unless @data['R'] ! ary = Array.new ! get('R').split(/\nRN /).each do |str| ! raw = {'RN' => '', 'RC' => '', 'RP' => '', 'RX' => '', ! 'RA' => '', 'RT' => '', 'RL' => '', 'RG' => ''} ! str = 'RN ' + str unless /^RN / =~ str ! str.split("\n").each do |line| ! if /^(R[NPXARLCTG]) (.+)/ =~ line ! raw[$1] += $2 + ' ' ! else ! raise "Invalid format in R lines, \n[#{line}]\n" end end ! raw.each_value {|v| ! v.strip! ! v.sub!(/^"/,'') ! v.sub!(/;$/,'') ! v.sub!(/"$/,'') ! } ! ary.push(raw) end + @data['R'] = ary end @data['R'] *************** *** 310,345 **** def references unless @data['references'] ! @data['references'] = Array.new ! self.ref.each do |ref| ! hash = Hash.new ! 
ref.each do |key, value| case key - when 'RN' - hash['embl_gb_record_number'] = value - when 'RC' - hash['comments'] = value - when 'RX' - hash['xrefs'] = value - when 'RP' - hash['sequence_position'] = value when 'RA' ! hash['authors'] = value when 'RT' hash['title'] = value when 'RL' ! hash['journal'] = value when 'RX' # PUBMED, MEDLINE ! value.each {|item| ! tag, xref = item.split(/; /).map {|i| i.strip } hash[ tag.downcase ] = xref } end ! end ! @data['references'].push(Reference.new(hash)) ! end end @data['references'] end # returns contents in the DR line. # * Bio::EMBLDB::Common#dr -> [ * ] --- 270,306 ---- def references unless @data['references'] ! ary = self.ref.map {|ent| ! hash = Hash.new('') ! ent.each {|key, value| case key when 'RA' ! hash['authors'] = value.split(/, /) when 'RT' hash['title'] = value when 'RL' ! if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s ! hash['journal'] = $1.rstrip ! hash['volume'] = $2 ! hash['issue'] = $4 ! hash['pages'] = $6 ! hash['year'] = $7 ! else ! hash['journal'] = value ! end when 'RX' # PUBMED, MEDLINE ! value.split(/\. /).each {|item| ! tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') } hash[ tag.downcase ] = xref } end ! } ! Reference.new(hash) ! } ! @data['references'] = References.new(ary) end @data['references'] end + # returns contents in the DR line. # * Bio::EMBLDB::Common#dr -> [ * ] From ngoto at dev.open-bio.org Wed Apr 23 18:04:53 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 23 Apr 2008 18:04:53 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.2,1.12.2.3 Message-ID: <200804231804.m3NI4rUv012864@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv12842/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: Part of changes made between 1.12 and 1.12.2.1 is incorporated with modifications. 
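
Editorial illustration (not part of the commit): the RL and RX handling introduced in common.rb revision 1.13 can be exercised outside BioRuby. The regular expression and the split logic below are copied from the diff; the helper names parse_rl and parse_rx and the sample strings are hypothetical and exist only for this sketch.

```ruby
# Pattern copied from the 'RL' branch of the common.rb diff.
RL_PATTERN = /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/

# Hypothetical helper: splits an RL value into journal, volume, issue,
# pages and year, falling back to the whole value as the journal.
def parse_rl(value)
  m = RL_PATTERN.match(value.to_s)
  return { 'journal' => value } unless m
  { 'journal' => m[1].rstrip, 'volume' => m[2], 'issue' => m[4],
    'pages' => m[6], 'year' => m[7] }
end

# Hypothetical helper: splits an RX value such as
# "DOI; 10.1126/science.1110418. PUBMED; 12345678." into tag => id pairs.
def parse_rx(value)
  hash = {}
  value.split(/\. /).each do |item|
    tag, xref = item.split(/\; /).map { |i| i.strip.sub(/\.\z/, '') }
    hash[tag.downcase] = xref
  end
  hash
end

# Made-up sample values for illustration only.
p parse_rl('Plant Mol. Biol. 39(2):221-233(1999).')
p parse_rl('J. Biol. Chem. 273 (48), 32281-32287 (1998)')
p parse_rx('DOI; 10.1126/science.1110418. PUBMED; 12345678.')
```

Splitting RX on a period followed by a space, rather than on every period as in revision 1.12, is what keeps a DOI with embedded periods in one piece.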
Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.2
retrieving revision 1.12.2.3
diff -C2 -d -r1.12.2.2 -r1.12.2.3
*** common.rb 23 Apr 2008 17:34:15 -0000 1.12.2.2
--- common.rb 23 Apr 2008 18:04:51 -0000 1.12.2.3
***************
*** 74,77 ****
--- 74,78 ----
  require 'bio/db'
  require 'bio/reference'
+ require 'bio/compat/references'

  module Bio
***************
*** 274,279 ****
          ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
--- 275,288 ----
          ent.each {|key, value|
            case key
+           when 'RN'
+             if /\[(\d+)\]/ =~ value.to_s
+               hash['embl_gb_record_number'] = $1.to_i
+             end
+           when 'RC'
+             hash['comment'] = value
+           when 'RP'
+             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value.split(/\, /)
            when 'RT'
              hash['title'] = value
***************
*** 288,292 ****
                hash['journal'] = value
              end
!           when 'RX' # PUBMED, MEDLINE
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
--- 297,301 ----
                hash['journal'] = value
              end
!           when 'RX' # PUBMED, DOI, (AGRICOLA)
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
***************
*** 297,301 ****
          Reference.new(hash)
        }
!       @data['references'] = References.new(ary)
      end
      @data['references']
--- 306,310 ----
          Reference.new(hash)
        }
!       @data['references'] = ary.extend(Bio::References::BackwardCompatibility)
      end
      @data['references']

From ngoto at dev.open-bio.org Wed Apr 23 18:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.5,1.24.2.6
Message-ID: <200804231852.m3NIqKW0013081@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
        reference.rb
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to
    lib/bio/db/embl/format_embl.rb to improve tolerance for
    various data (e.g. references with no record numbers or
    with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.

Index: reference.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v
retrieving revision 1.24.2.5
retrieving revision 1.24.2.6
diff -C2 -d -r1.24.2.5 -r1.24.2.6
*** reference.rb 4 Mar 2008 11:31:45 -0000 1.24.2.5
--- reference.rb 23 Apr 2008 18:52:18 -0000 1.24.2.6
***************
*** 42,47 ****
    class Reference

-     include Bio::Sequence::Format::INSDFeatureHelper
- 
      # Author names in an Array, [ "Hoge, J.P.", "Fuga, F.B." ].
      attr_reader :authors
--- 42,45 ----
***************
*** 70,73 ****
--- 68,74 ----
      # medline identifier (typically Fixnum)
      attr_reader :medline
+ 
+     # DOI identifier (typically String, e.g. "10.1126/science.1110418")
+     attr_reader :doi

      # Abstract text in String.
***************
*** 89,92 ****
--- 90,96 ----
      attr_reader :sequence_position

+     # Comments for the reference (typically Array of String, or nil)
+     attr_reader :comments
+ 
      # Create a new Bio::Reference object from a Hash of values.
      # Data is extracted from the values for keys:
***************
*** 126,150 ****
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       hash.default = ''
!       @authors = hash['authors'] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title = hash['title'] # "Title of the study."
!       @journal = hash['journal'] # "Theor. J. Hoge"
!       @volume = hash['volume'] # 12
!       @issue = hash['issue'] # 3
!       @pages = hash['pages'] # 123-145
!       @year = hash['year'] # 2001
!       @pubmed = hash['pubmed'] # 12345678
!       @medline = hash['medline'] # 98765432
!       @abstract = hash['abstract']
        @url = hash['url']
!       @mesh = hash['mesh']
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments = hash['comments'] || []
!       @xrefs = hash['xrefs'] || []
!       @affiliations = hash['affiliations']
!       @authors = [] if @authors.empty?
!       @mesh = [] if @mesh.empty?
!       @affiliations = [] if @affiliations.empty?
      end

--- 130,150 ----
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       @authors = hash['authors'] || [] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title = hash['title'] || '' # "Title of the study."
!       @journal = hash['journal'] || '' # "Theor. J. Hoge"
!       @volume = hash['volume'] || '' # 12
!       @issue = hash['issue'] || '' # 3
!       @pages = hash['pages'] || '' # 123-145
!       @year = hash['year'] || '' # 2001
!       @pubmed = hash['pubmed'] || '' # 12345678
!       @medline = hash['medline'] || '' # 98765432
!       @doi = hash['doi']
!       @abstract = hash['abstract'] || ''
        @url = hash['url']
!       @mesh = hash['mesh'] || []
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments = hash['comments']
!       @affiliations = hash['affiliations'] || []
      end

***************
*** 273,298 ****
      # RL Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       lines = Array.new
!       if ! @embl_gb_record_number.nil?
!         lines << "RN [#{@embl_gb_record_number}]"
!       end
!       if @comments != []
!         @comments.each do |c|
!           lines << "RC #{c}"
!         end
!       end
!       if ! @sequence_position.nil?
!         lines << "RP #{@sequence_position}"
!       end
!       if ! @xrefs.nil?
!         @xrefs.each do |x|
!           lines << "RX #{x}"
!         end
!       end
!       lines << wrap(@authors.join(', '), 80, 'RA ') + ';' unless @authors.nil?
!       lines << (@title == '' ? 'RT ;' : wrap('"' + @title + '"', 80, 'RT ') + ';')
!       lines << wrap(@journal, 80, 'RL ') unless @journal == ''
!       lines << "XX"
!       return lines.join("\n")
      end

--- 273,280 ----
      # RL Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       r = self
!       Bio::Sequence::Format::NucFormatter::Embl.new('').instance_eval {
!         reference_format_embl(r)
!       }
      end


From ngoto at dev.open-bio.org Wed Apr 23 18:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.2, 1.1.2.3 common.rb, 1.12.2.3, 1.12.2.4
Message-ID: <200804231852.m3NIqKHG013084@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
        format_embl.rb common.rb
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to
    lib/bio/db/embl/format_embl.rb to improve tolerance for
    various data (e.g. references with no record numbers or
    with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.
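
The tolerance for references with missing or duplicated record numbers, mentioned in the log message above, can be illustrated in isolation. The following is a minimal sketch only: assign_record_numbers is a hypothetical helper, not a BioRuby method, and it works on bare numbers rather than Bio::Reference objects. It records every number it hands out, which is a simplification chosen for this sketch rather than a transcription of the diff.

```ruby
# Hypothetical helper (not part of BioRuby): given the record numbers of
# a list of references (nil when a reference has none), return the RN
# values to print. A missing, non-positive, or already used number is
# replaced by the highest number used so far plus one.
def assign_record_numbers(numbers)
  used = {}
  numbers.map do |n|
    refno = n.to_i                      # nil.to_i is 0, i.e. "no number"
    if refno <= 0 || used[refno]
      refno = used.keys.max.to_i + 1    # [].max is nil, nil.to_i is 0
    end
    used[refno] = true                  # assumption of this sketch: always record
    refno
  end
end

# Second reference has no number, third repeats 1, fifth repeats 2.
p assign_record_numbers([1, nil, 1, 5, 2])
```

A Hash keyed by the numbers already used mirrors the optional hash argument of reference_format_embl, which the ERB template shares across all references of one entry.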
Index: format_embl.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -C2 -d -r1.1.2.2 -r1.1.2.3
*** format_embl.rb 27 Mar 2008 13:38:31 -0000 1.1.2.2
--- format_embl.rb 23 Apr 2008 18:52:18 -0000 1.1.2.3
***************
*** 25,28 ****
--- 25,76 ----
    end

+   # format reference
+   # ref:: Bio::Reference object
+   # hash:: (optional) a hash for RN (reference number) administration
+   def reference_format_embl(ref, hash = nil)
+     lines = Array.new
+     if ref.embl_gb_record_number or hash then
+       refno = ref.embl_gb_record_number.to_i
+       hash ||= {}
+       if refno <= 0 or hash[refno] then
+         refno = hash.keys.sort[-1].to_i + 1
+         hash[refno] = true
+       end
+       lines << embl_wrap("RN ", "[#{refno}]")
+     end
+     if ref.comments then
+       ref.comments.each do |cmnt|
+         lines << embl_wrap("RC ", cmnt)
+       end
+     end
+     unless ref.sequence_position.to_s.empty? then
+       lines << embl_wrap("RP ", "#{ref.sequence_position}")
+     end
+     unless ref.doi.to_s.empty? then
+       lines << embl_wrap("RX ", "DOI; #{ref.doi}.")
+     end
+     unless ref.pubmed.to_s.empty? then
+       lines << embl_wrap("RX ", "PUBMED; #{ref.pubmed}.")
+     end
+     unless ref.authors.empty?
+       lines << embl_wrap('RA ', ref.authors.join(', ') + ';')
+     end
+     lines << embl_wrap('RT ',
+                        (ref.title.to_s.empty? ? '' :
+                         "\"#{ref.title}\"") + ';')
+     unless ref.journal.to_s.empty? then
+       volissue = "#{ref.volume.to_s}"
+       volissue = "#{volissue}(#{ref.issue})" unless ref.issue.to_s.empty?
+       rl = "#{ref.journal}"
+       rl += " #{volissue}" unless volissue.empty?
+       rl += ":#{ref.pages}" unless ref.pages.to_s.empty?
+       rl += "(#{ref.year})" unless ref.year.to_s.empty?
+       rl += '.'
+       lines << embl_wrap('RL ', rl)
+     end
+     lines << "XX"
+     return lines.join("\n")
+   end
+ 
    def seq_format_embl(seq)
      output_lines = Array.new
***************
*** 43,64 ****
    erb_template <<'__END_OF_TEMPLATE__'
  ID <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX
  <%= embl_wrap('AC ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX
  DT <%= date_created %>
  DT <%= date_modified %>
! XX
  <%= embl_wrap('DE ', definition) %>
! XX
  <%= embl_wrap('KW ', keywords.join('; ') + '.') %>
! XX
  OS <%= species %>
  <%= embl_wrap('OC ', classification.join('; ') + '.') %>
  XX
! <%= (references || []).collect{|ref| ref.format('embl')}.join("\n") %>
! XX
! FH Key Location/Qualifiers
! FH
! <%= format_features_embl(features || []) %>XX
  SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>
--- 91,111 ----
    erb_template <<'__END_OF_TEMPLATE__'
  ID <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX
  <%= embl_wrap('AC ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX
  DT <%= date_created %>
  DT <%= date_modified %>
! XX
  <%= embl_wrap('DE ', definition) %>
! XX
  <%= embl_wrap('KW ', keywords.join('; ') + '.') %>
! XX
  OS <%= species %>
  <%= embl_wrap('OC ', classification.join('; ') + '.') %>
  XX
! <% hash = {}; (references || []).each do |ref| %><%= reference_format_embl(ref, hash) %>
! <% end %>FH Key Location/Qualifiers
! FH
! <%= format_features_embl(features || []) %>XX
  SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.3
retrieving revision 1.12.2.4
diff -C2 -d -r1.12.2.3 -r1.12.2.4
*** common.rb 23 Apr 2008 18:04:51 -0000 1.12.2.3
--- common.rb 23 Apr 2008 18:52:18 -0000 1.12.2.4
***************
*** 280,284 ****
              end
            when 'RC'
!             hash['comment'] = value
            when 'RP'
              hash['sequence_position'] = value
--- 280,287 ----
              end
            when 'RC'
!             unless value.to_s.strip.empty?
!               hash['comments'] ||= []
!               hash['comments'].push value
!             end
            when 'RP'
              hash['sequence_position'] = value

From ngoto at dev.open-bio.org Thu Apr 24 13:49:44 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 13:49:44 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.36.2.1
Message-ID: <200804241349.m3ODni9x015583@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv15545/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
        sptr.rb
Log Message:
The same change (except comment) as of 1.36 => 1.37 in CVS HEAD is made
(bug fix: Bio::SPTR#references raises NoMethodError since
lib/bio/db/embl/sptr.rb version 1.34).

Index: sptr.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v
retrieving revision 1.36
retrieving revision 1.36.2.1
diff -C2 -d -r1.36 -r1.36.2.1
*** sptr.rb 5 Apr 2007 23:35:40 -0000 1.36
--- sptr.rb 24 Apr 2008 13:49:42 -0000 1.36.2.1
***************
*** 506,514 ****
              hash['journal'] = value
            end
!         when 'RX' # PUBMED, MEDLINE
!           value.split('.').each {|item|
!             tag, xref = item.split(/; /).map {|i| i.strip }
              hash[ tag.downcase ] = xref
!           }
          end
        }
--- 506,513 ----
              hash['journal'] = value
            end
!         when 'RX' # PUBMED, MEDLINE, DOI
!           value.each do |tag, xref|
              hash[ tag.downcase ] = xref
!           end
          end
        }

From ngoto at dev.open-bio.org Thu Apr 24 14:28:27 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 14:28:27 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.10,0.58.2.11
Message-ID: <200804241428.m3OESRaY016145@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv15857/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
        sequence.rb
Log Message:
* Bio::Sequence.read is renamed to Bio::Sequence.input because this
  method is the counterpart of Bio::Sequence#output.
  Bio::Sequence.read still exists as an alias of Bio::Sequence.input.
* Added documentation for Bio::Sequence#accessions, and fixed it
  so that the returned array does not contain nil.

Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v
retrieving revision 0.58.2.10
retrieving revision 0.58.2.11
diff -C2 -d -r0.58.2.10 -r0.58.2.11
*** sequence.rb 27 Mar 2008 13:38:31 -0000 0.58.2.10
--- sequence.rb 24 Apr 2008 14:28:25 -0000 0.58.2.11
***************
*** 369,373 ****
    # (GenBank, EMBL, fasta format, etc.)
    #
!   # s = Bio::Sequence.read(str)
    # ---
    # *Arguments*:
--- 369,373 ----
    # (GenBank, EMBL, fasta format, etc.)
    #
!   # s = Bio::Sequence.input(str)
    # ---
    # *Arguments*:
***************
*** 375,379 ****
    # * (optional) _format_: format specification (class or nil)
    # *Returns*:: Bio::Sequence object
!   def self.read(str, format = nil)
      if format then
        klass = format
--- 375,379 ----
    # * (optional) _format_: format specification (class or nil)
    # *Returns*:: Bio::Sequence object
!   def self.input(str, format = nil)
      if format then
        klass = format
***************
*** 384,391 ****
      obj.to_biosequence
    end
! 
! 
    def accessions
!     return [@primary_accession, @secondary_accessions].flatten
    end

--- 384,398 ----
      obj.to_biosequence
    end
! 
!   # alias of Bio::Sequence.input
!   def self.read(str, format = nil)
!     input(str, format)
!   end
! 
!   # accession numbers of the sequence
!   #
!   # *Returns*:: Array of String
    def accessions
!     [ @primary_accession, @secondary_accessions ].flatten.compact
    end


From helios at dev.open-bio.org Mon Apr 7 13:15:46 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:15:46 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io sql.rb,1.8.2.1,1.8.2.2
Message-ID: <200804071315.m37DFcHI005486@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io
In directory dev.open-bio.org:/tmp/cvs-serv5466/lib/bio/io

Modified Files:
      Tag: BRANCH-biohackathon2008
        sql.rb
Log Message:
added "hostname" to valid_keys configurations

Index: sql.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/sql.rb,v
retrieving revision 1.8.2.1
retrieving revision 1.8.2.2
diff -C2 -d -r1.8.2.1 -r1.8.2.2
*** sql.rb 25 Mar 2008 15:46:32 -0000 1.8.2.1
--- sql.rb 7 Apr 2008 13:15:36 -0000 1.8.2.2
***************
*** 25,29 ****
        #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil}
        configurations.assert_valid_keys('development', 'production','test')
!       configurations[env].assert_valid_keys('database','adapter','username','password')
        DummyBase.configurations = configurations
        DummyBase.establish_connection "#{env}"
--- 25,29 ----
        #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil}
        configurations.assert_valid_keys('development', 'production','test')
!       configurations[env].assert_valid_keys('hostname','database','adapter','username','password')
        DummyBase.configurations = configurations
        DummyBase.establish_connection "#{env}"
***************
*** 43,46 ****
--- 43,50 ----
      end

+     def self.exists_database(name)
+       Bio::SQL::Biodatabase.find_by_name(name).nil? ? false : true
+     end
+ 
      def self.list_entries
        Bio::SQL::Bioentry.find(:all).collect{|entry|
***************
*** 117,121 ****
    pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
    #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result)
! 
    if nil
      pp Bio::SQL.list_entries
--- 121,125 ----
    pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
    #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result)
!   pp Bio::SQL.list_entries
    if nil
      pp Bio::SQL.list_entries

From helios at dev.open-bio.org Mon Apr 7 13:17:55 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:17:55 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql bioentry.rb, 1.1.2.1, 1.1.2.2
Message-ID: <200804071317.m37DHmQk005551@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5531/lib/bio/io/biosql

Modified Files:
      Tag: BRANCH-biohackathon2008
        bioentry.rb
Log Message:
corrected table name "term" in conditions to get cdsfeatures "Shortcut" associated with the entry.
Index: bioentry.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/biosql/Attic/bioentry.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** bioentry.rb 25 Mar 2008 15:46:32 -0000 1.1.2.1
--- bioentry.rb 7 Apr 2008 13:17:46 -0000 1.1.2.2
***************
*** 13,17 ****
        has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id"
        # I am not very convinced by this; I think it does not work correctly
!       has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["terms.name='CDS'"], :include=>"typeterm"
        has_many :terms, :through=>:bioentry_qualifier_values
--- 13,17 ----
        has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id"
        # I am not very convinced by this; I think it does not work correctly
!       has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["term.name='CDS'"], :include=>"type_term"
        has_many :terms, :through=>:bioentry_qualifier_values

From helios at dev.open-bio.org Mon Apr 7 13:18:19 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:18:19 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql sequence.rb, 1.1.2.1, 1.1.2.2
Message-ID: <200804071318.m37DIDFm005598@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5578/lib/bio/db/biosql

Modified Files:
      Tag: BRANCH-biohackathon2008
        sequence.rb
Log Message:
use genbank, fasta is not working

Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/biosql/Attic/sequence.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** sequence.rb 25 Mar 2008 15:46:32 -0000 1.1.2.1
--- sequence.rb 7 Apr 2008 13:18:11 -0000 1.1.2.2
***************
*** 674,679 ****
    # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
!   # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
!   parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')
    parser.each do |entry|
--- 674,679 ----
    # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
!   parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
!   #parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')
    parser.each do |entry|
***************
*** 686,689 ****
--- 686,690 ----
      # pp "Sequence"
      puts result.to_biosequence.output(:genbank) #:embl
+     result.delete
    end
  end