From ngoto at dev.open-bio.org  Tue Apr  1 02:31:37 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 01 Apr 2008 06:31:37 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast format0.rb,1.26,1.27
Message-ID: <200804010631.m316VbfM002141@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast
In directory dev.open-bio.org:/tmp/cvs-serv2121/lib/bio/appl/blast

Modified Files:
	format0.rb 
Log Message:
Fixed a bug triggered when an empty line follows the database title in some cases,
reported by Tomoaki NISHIYAMA.
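
The condition described in this log message can be sketched in plain Ruby as
follows. This is only an illustration under stated assumptions: the pattern is
the one added in revision 1.27, while the constant name, the helper name and the
sample strings are hypothetical and are not part of format0.rb.

```ruby
# Pattern taken from the 1.27 change: a block that holds only the
# "N sequences; M total letters" line.
COUNTS_LINE = /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/

# Hypothetical helper: takes the blank-line-separated blocks of a report
# (database block first) and returns the database text, pulling in the
# counts line when an empty line has split it off into its own block.
def take_database_block(data)
  database = data.shift.dup
  if COUNTS_LINE =~ data[0]
    database << "\n" << data.shift
  end
  database
end

# Made-up sample blocks, not real BLAST output.
blocks = ["Database: example database title\n",
          "           1,234 sequences; 567,890 total letters\n",
          "Searching...done\n"]
puts take_database_block(blocks)
```

When the counts line is already part of the database block, the pattern does not
match the next block and nothing extra is consumed.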


Index: format0.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast/format0.rb,v
retrieving revision 1.26
retrieving revision 1.27
diff -C2 -d -r1.26 -r1.27
*** format0.rb	12 Feb 2008 02:13:31 -0000	1.26
--- format0.rb	1 Apr 2008 06:31:35 -0000	1.27
***************
*** 294,297 ****
--- 294,302 ----
            @f0query = data.shift
            @f0database = data.shift
+           # In special case, a void line is inserted after database name.
+           if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then
+             @f0database.concat "\n"
+             @f0database.concat data.shift
+           end
          end
  

From ngoto at dev.open-bio.org  Tue Apr  1 06:36:47 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 01 Apr 2008 10:36:47 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/pdb chain.rb, 1.9, 1.10 pdb.rb,
	1.27, 1.28
Message-ID: <200804011036.m31Aal0p009616@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/pdb
In directory dev.open-bio.org:/tmp/cvs-serv9574/lib/bio/db/pdb

Modified Files:
	chain.rb pdb.rb 
Log Message:
* Fixed a bug where ArgumentError was raised in the Bio::PDB::Chain#aaseq method
  for nucleic acid chains. The same error could also occur in
  Bio::PDB#seqres, which is fixed as well.
* Fixed a bug where the current residue/heterogen was not properly reset when
  the current chain changed.


Index: pdb.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/pdb.rb,v
retrieving revision 1.27
retrieving revision 1.28
diff -C2 -d -r1.27 -r1.28
*** pdb.rb	28 Dec 2007 14:43:44 -0000	1.27
--- pdb.rb	1 Apr 2008 10:36:44 -0000	1.28
***************
*** 1498,1501 ****
--- 1498,1504 ----
                chain = newChain
              end
+             # chain might be changed, clearing cResidue and cLigand
+             cResidue = nil
+             cLigand = nil
            end
          end
***************
*** 1551,1554 ****
--- 1554,1559 ----
            c_atom = nil
            cChain = nil
+           cResidue = nil
+           cLigand = nil
            if cModel.model_serial or cModel.chains.size > 0 then
              self.addModel(cModel)
***************
*** 1810,1814 ****
                #need to look up with Ala
                aa = aa.capitalize
!               (Bio::AminoAcid.three2one(aa) or 'X')
              end
              seq = Bio::Sequence::AA.new(a.to_s)
--- 1815,1823 ----
                #need to look up with Ala
                aa = aa.capitalize
!               (begin
!                  Bio::AminoAcid.three2one(aa)
!                rescue ArgumentError
!                  nil
!                end || 'X')
              end
              seq = Bio::Sequence::AA.new(a.to_s)
***************
*** 1816,1820 ****
              # nucleic acid sequence
              a.collect! do |na|
!               na = na.strip
                na.size == 1 ? na : 'n'
              end
--- 1825,1829 ----
              # nucleic acid sequence
              a.collect! do |na|
!               na = na.delete('^a-zA-Z')
                na.size == 1 ? na : 'n'
              end
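
The difference between the old strip and the new delete('^a-zA-Z') in the
nucleic acid branch can be illustrated with a small sketch. The method names
and the residue-name strings below are hypothetical examples, not data taken
from a PDB entry.

```ruby
# Old behaviour: only surrounding whitespace is removed.
def na_code_with_strip(name)
  s = name.strip
  s.size == 1 ? s : 'n'
end

# New behaviour: every character that is not an ASCII letter is removed,
# so a stray symbol no longer turns a one-letter code into 'n'.
def na_code_with_delete(name)
  s = name.delete('^a-zA-Z')
  s.size == 1 ? s : 'n'
end

['  A', ' +U', 'PSU'].each do |name|
  puts [name.inspect, na_code_with_strip(name), na_code_with_delete(name)].join("\t")
end
```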

Index: chain.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/chain.rb,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -d -r1.9 -r1.10
*** chain.rb	18 Dec 2007 13:48:42 -0000	1.9
--- chain.rb	1 Apr 2008 10:36:44 -0000	1.10
***************
*** 190,194 ****
              end
              tlc = residue.resName.capitalize
!             olc = (Bio::AminoAcid.three2one(tlc) or 'X')
              string << olc
            end
--- 190,198 ----
              end
              tlc = residue.resName.capitalize
!             olc = (begin
!                      Bio::AminoAcid.three2one(tlc)
!                    rescue ArgumentError
!                      nil
!                    end || 'X')
              string << olc
            end
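
The begin/rescue fallback used in both files can be tried in isolation. In this
sketch, three2one_strict and its tiny table are hypothetical stand-ins for
Bio::AminoAcid.three2one; the only property borrowed from the log message is
that the lookup may raise ArgumentError for some input.

```ruby
# Hypothetical, deliberately tiny lookup table.
THREE_TO_ONE = { 'Ala' => 'A', 'Gly' => 'G' }.freeze

# Hypothetical strict lookup: raises for input that is not three letters
# long and returns nil for unknown three-letter codes.
def three2one_strict(code)
  raise ArgumentError, "not a three-letter code: #{code.inspect}" unless code.size == 3
  THREE_TO_ONE[code]
end

# Same shape as the patched code: an exception and a nil result both
# fall back to 'X'.
def one_letter_or_x(res_name)
  tlc = res_name.capitalize
  (begin
     three2one_strict(tlc)
   rescue ArgumentError
     nil
   end || 'X')
end

%w[ALA GLY XYZ A].each { |n| puts "#{n} -> #{one_letter_or_x(n)}" }
```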


From ngoto at dev.open-bio.org  Wed Apr  2 02:24:16 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 02 Apr 2008 06:24:16 +0000
Subject: [BioRuby-cvs] bioruby ChangeLog,1.83,1.84
Message-ID: <200804020624.m326OGwp011324@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv11304

Modified Files:
	ChangeLog 
Log Message:
ChangeLog added for lib/bio/appl/blast/format0.rb,1.26,1.27,
lib/bio/db/pdb/chain.rb,1.9,1.10, and lib/bio/db/pdb/pdb.rb,1.27,1.28.


Index: ChangeLog
===================================================================
RCS file: /home/repository/bioruby/bioruby/ChangeLog,v
retrieving revision 1.83
retrieving revision 1.84
diff -C2 -d -r1.83 -r1.84
*** ChangeLog	12 Feb 2008 05:32:23 -0000	1.83
--- ChangeLog	2 Apr 2008 06:24:14 -0000	1.84
***************
*** 1,2 ****
--- 1,18 ----
+ 2008-04-01  Naohisa Goto <ng at bioruby.org>
+ 
+ 	* lib/bio/appl/blast/format0.rb
+ 
+ 	  Fixed a bug: Failed to parse database name in some cases.
+ 	  Thanks to Tomoaki Nishiyama who reported the bug and sent patches
+ 	  ([BioRuby-ja] BLAST format0 parser fails header parsing output
+ 	  of specific databases).
+ 
+ 	* lib/bio/db/pdb/chain.rb, lib/bio/db/pdb/pdb.rb
+ 
+ 	  Fixed bugs: Bio::PDB::Chain#aaseq failed for nucleotide chain;
+ 	  Failed to parse chains for some entries (e.g. 1B2M).
+ 	  Thanks to Semin Lee who reported the bugs and sent patches
+ 	  ([BioRuby] Bio::PDB parsing problem (1B2M)).
+ 
  2008-02-12  Naohisa Goto <ng at bioruby.org>
  

From ngoto at dev.open-bio.org  Tue Apr 15 09:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rpsblast.rb,NONE,1.1
Message-ID: <200804151354.m3FDsfkK032072@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast
In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl/blast

Added Files:
	rpsblast.rb 
Log Message:
Newly added RPS-Blast default (-m 0) output parser.


--- NEW FILE: rpsblast.rb ---
#
# = bio/appl/blast/rpsblast.rb - NCBI RPS Blast default output parser
# 
# Copyright::  Copyright (C) 2008 Naohisa Goto <ng at bioruby.org>
# License::    The Ruby License
#
# $Id: rpsblast.rb,v 1.1 2008/04/15 13:54:39 ngoto Exp $
#
# == Description
#
# NCBI RPS Blast (Reversed Position Specific Blast) default
# (-m 0 option) output parser class, Bio::Blast::RPSBlast::Report
# and related classes/modules.
#
# == References
#
# * Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
#   Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
#   "Gapped BLAST and PSI-BLAST: a new generation of protein database search
#   programs", Nucleic Acids Res. 25:3389-3402.
# * ftp://ftp.ncbi.nih.gov/blast/documents/rpsblast.html
# * http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml
#

require 'bio/appl/blast/format0'

module Bio
class Blast

  # NCBI RPS Blast (Reversed Position Specific Blast) namespace.
  # Currently, this module exists only to provide a separate namespace.
  # To parse RPSBlast results, see the Bio::Blast::RPSBlast::Report documentation.
  module RPSBlast

    # NCBI RPS Blast (Reversed Position Specific Blast)
    # default output parser.
    #
    # It supports default (-m 0 option) output of the "rpsblast" command.
    #
    # Because this class inherits Bio::Blast::Default::Report,
    # almost all methods are the same as those of Bio::Blast::Default::Report.
    # Only DELIMITER (and RS) and a few methods are different.
    #
    # Note for multi-fasta result: When parsing output of rpsblast command
    # with multi-fasta sequences as input data,
    # each query's result is stored as an "iteration" of PSI-Blast,
    # because rpsblast's output with multi-fasta input is hard to split
    # by query.
    # This behavior may be changed in the future.
    #
    # Note for nucleotide results: This class is not tested with
    # nucleotide query and/or nucleotide databases.
    #
    class Report < Bio::Blast::Default::Report
      # Delimiter of each entry for RPS-BLAST. Bio::FlatFile uses it.
      DELIMITER = RS = "\nRPS-BLAST"

      # (Integer) excess read size included in DELIMITER.
      DELIMITER_OVERRUN = 9 # "RPS-BLAST"

      # Creates a new Report object from a string.
      #
      # Note for multi-fasta results: When parsing an output of rpsblast
      # command running with multi-fasta sequences,
      # each query's result is stored as an "iteration" of PSI-Blast,
      # because rpsblast's output with multi-fasta input is hard to split
      # by query.
      # This behavior may be changed in the future.
      #
      # Note for nucleotide results: This class is not tested with
      # nucleotide query and/or nucleotide databases.
      #
      def initialize(str)
        str = str.sub(/\A\s+/, '')
        # remove any trailing entries, to be safe
        str.sub!(/\n(RPS\-BLAST.*)/m, "\n") 
        @entry_overrun = $1
        @entry = str
        data = str.split(/(?:^[ \t]*\n)+/)

        format0_split_headers(data)
        @iterations = format0_split_search(data)
        format0_split_stat_params(data)
      end

      # Returns definition of the query.
      # For a result of multi-fasta input, the first query's definition
      # is returned (The same as <tt>iterations.first.query_def</tt>).
      def query_def
        iterations.first.query_def
      end

      # Returns length of the query.
      # For a result of multi-fasta input, the first query's length
      # is returned (The same as <tt>iterations.first.query_len</tt>).
      def query_len
        iterations.first.query_len
      end

      private

      # Splits headers into the first line, reference, query line and
      # database line.
      def format0_split_headers(data)
        @f0header = data.shift
        @f0references = []
        while data[0] and /\ADatabase\:/ !~ data[0]
          @f0references.push data.shift
        end
        @f0database = data.shift
        # In some special cases, an empty line is inserted after the database name.
        if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then
          @f0database.concat "\n"
          @f0database.concat data.shift
        end
      end

      # Splits the search results.
      def format0_split_search(data)
        iterations = []
        dummystr = 'Searching..................................................done'
        if r = data[0] and /^Searching/ =~ r then
          dummystr = data.shift
        end
        while r = data[0] and /^Query\=/ =~ r
          iterations << Iteration.new(data, dummystr)
        end
        iterations
      end

      # Iteration class for RPS-Blast.
      # Although RPS-Blast does not iterate like PSI-BLAST,
      # this class is used to store the result of a single query sequence.
      #
      # Normally, the instance of the class is generated
      # by Bio::Blast::RPSBlast::Report object.
      # 
      class Iteration < Bio::Blast::Default::Report::Iteration
        # Creates a new Iteration object.
        # It is designed to be called only internally from
        # the Bio::Blast::RPSBlast::Report class.
        # Users should not call this method directly.
        def initialize(data, dummystr)
          if /\AQuery\=/ =~ data[0] then
            sc = StringScanner.new(data.shift)
            sc.skip(/\s*/)
            if sc.skip_until(/Query\= */) then
              q = []
              begin
                q << sc.scan(/.*/)
                sc.skip(/\s*^ ?/)
              end until !sc.rest or r = sc.skip(/ *\( *([\,\d]+) *letters *\)\s*\z/)
              @query_len = sc[1].delete(',').to_i if r
              @query_def = q.join(' ')
            end
          end
          data.unshift(dummystr)
          
          super(data)
        end

        # definition of the query
        attr_reader :query_def

        # length of the query sequence
        attr_reader :query_len
        
      end #class Iteration
      
    end #class Report

  end #module RPSBlast

end #module Blast
end #module Bio
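
The first steps of Report#initialize in the file above (cutting off a trailing
entry at "\nRPS-BLAST" and splitting the rest on blank lines) can be tried
without BioRuby. This is a minimal sketch: split_report is a hypothetical
helper name, and the sample text is a made-up, heavily abbreviated stand-in for
rpsblast output rather than real program output.

```ruby
# Hypothetical helper mirroring the regular expressions used in
# Report#initialize; returns the blank-line-separated blocks and any
# trailing entry that was cut off (nil if there was none).
def split_report(str)
  entry = str.sub(/\A\s+/, '')
  overrun = entry[/\n(RPS\-BLAST.*)/m, 1]
  entry = entry.sub(/\n(RPS\-BLAST.*)/m, "\n")
  [entry.split(/(?:^[ \t]*\n)+/), overrun]
end

# Made-up sample; version strings and numbers are placeholders.
sample = "RPS-BLAST x.y.z\n\n" \
         "Database: example.db\n" \
         "           1,234 sequences; 567,890 total letters\n\n" \
         "Query= query1 sample\n" \
         "         (120 letters)\n\n" \
         "RPS-BLAST x.y.z\n"

blocks, overrun = split_report(sample)
blocks.each_with_index { |b, i| puts "block #{i}: #{b.inspect}" }
puts "overrun: #{overrun.inspect}"
```

The header, database and query blocks come out as separate array elements,
which is the form the format0_split_* methods then consume with data.shift.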


From ngoto at dev.open-bio.org  Tue Apr 15 09:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby ChangeLog,1.84,1.85
Message-ID: <200804151354.m3FDsf3j032062@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv32038

Modified Files:
	ChangeLog 
Log Message:
Newly added RPS-Blast default (-m 0) output parser.


Index: ChangeLog
===================================================================
RCS file: /home/repository/bioruby/bioruby/ChangeLog,v
retrieving revision 1.84
retrieving revision 1.85
diff -C2 -d -r1.84 -r1.85
*** ChangeLog	2 Apr 2008 06:24:14 -0000	1.84
--- ChangeLog	15 Apr 2008 13:54:38 -0000	1.85
***************
*** 1,2 ****
--- 1,8 ----
+ 2008-04-15  Naohisa Goto <ng at bioruby.org>
+ 
+ 	* lib/bio/appl/blast/rpsblast.rb
+ 
+ 	  Newly added RPS-Blast default (-m 0) output parser.
+ 
  2008-04-01  Naohisa Goto <ng at bioruby.org>
  

From ngoto at dev.open-bio.org  Tue Apr 15 09:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl blast.rb,1.34,1.35
Message-ID: <200804151354.m3FDsfmn032067@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl
In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl

Modified Files:
	blast.rb 
Log Message:
Newly added RPS-Blast default (-m 0) output parser.


Index: blast.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast.rb,v
retrieving revision 1.34
retrieving revision 1.35
diff -C2 -d -r1.34 -r1.35
*** blast.rb	30 Jan 2008 17:43:34 -0000	1.34
--- blast.rb	15 Apr 2008 13:54:39 -0000	1.35
***************
*** 73,76 ****
--- 73,77 ----
      autoload :WU,           'bio/appl/blast/wublast'
      autoload :Bl2seq,       'bio/appl/bl2seq/report'
+     autoload :RPSBlast,     'bio/appl/blast/rpsblast'
  
      # This is a shortcut for Bio::Blast.new:


From ngoto at dev.open-bio.org  Fri Apr 18 11:40:38 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Fri, 18 Apr 2008 15:40:38 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.37
Message-ID: <200804181540.m3IFecgN008057@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv8036/lib/bio/db/embl

Modified Files:
	sptr.rb 
Log Message:
Bug fix: Bio::SPTR#references has raised NoMethodError since
lib/bio/db/embl/sptr.rb version 1.34.


Index: sptr.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v
retrieving revision 1.36
retrieving revision 1.37
diff -C2 -d -r1.36 -r1.37
*** sptr.rb	5 Apr 2007 23:35:40 -0000	1.36
--- sptr.rb	18 Apr 2008 15:40:36 -0000	1.37
***************
*** 507,514 ****
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
!             }
            end
          }
--- 507,513 ----
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.each do |tag, xref|
                hash[ tag.downcase ]  = xref
!             end
            end
          }


From ngoto at dev.open-bio.org  Wed Apr 23 12:48:28 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 16:48:28 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12,1.13
Message-ID: <200804231648.m3NGmSSa012476@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12456/lib/bio/db/embl

Modified Files:
	common.rb 
Log Message:
Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue,
pages, and year. In addition, it might fail to parse PubMed ID.


Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** common.rb	5 Apr 2007 23:35:40 -0000	1.12
--- common.rb	23 Apr 2008 16:48:25 -0000	1.13
***************
*** 279,294 ****
              hash['title'] = value
            when 'RL'
!             if value =~ /(.*) (\d+) \((\d+)\), (\d+-\d+) \((\d+)\)$/
!               hash['journal'] = $1
                hash['volume']  = $2
!               hash['issue']   = $3
!               hash['pages']   = $4
!               hash['year']    = $5
              else
                hash['journal'] = value
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
              }
--- 279,294 ----
              hash['title'] = value
            when 'RL'
!             if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
!               hash['journal'] = $1.rstrip
                hash['volume']  = $2
!               hash['issue']   = $4
!               hash['pages']   = $6
!               hash['year']    = $7
              else
                hash['journal'] = value
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.split(/\. /).each {|item|
!               tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
                hash[ tag.downcase ]  = xref
              }
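
The new RL and RX handling can be exercised outside the class with a short
sketch. The two regular expressions are copied from revision 1.13; the method
names, the constant name and the sample RX value are hypothetical, and the RL
sample is the line quoted in the Bio::Reference#embl documentation.

```ruby
# Pattern copied from the 1.13 change.
RL_PATTERN = /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/

# Hypothetical wrapper around the 'RL' branch.
def parse_rl(value)
  hash = {}
  if (m = RL_PATTERN.match(value.to_s))
    hash['journal'] = m[1].rstrip
    hash['volume']  = m[2]
    hash['issue']   = m[4]
    hash['pages']   = m[6]
    hash['year']    = m[7]
  else
    hash['journal'] = value
  end
  hash
end

# Hypothetical wrapper around the 'RX' branch: items are separated by
# ". " and a trailing period is dropped from each part.
def parse_rx(value)
  hash = {}
  value.split(/\. /).each do |item|
    tag, xref = item.split(/\; /).map { |i| i.strip.sub(/\.\z/, '') }
    hash[tag.downcase] = xref
  end
  hash
end

p parse_rl('Plant Mol. Biol. 17(2):209-219(1991).')
p parse_rx('PUBMED; 12345678. DOI; 10.1126/science.1110418.')
```

Splitting on ". " instead of "." keeps identifiers that contain periods, such as
a DOI, in one piece.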


From ngoto at dev.open-bio.org  Wed Apr 23 13:34:17 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 17:34:17 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.1,1.12.2.2
Message-ID: <200804231734.m3NHYHMP012740@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12720/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	common.rb 
Log Message:
lib/bio/db/embl/common.rb in branch BRANCH-biohackathon2008 is copied from
CVS HEAD revision 1.13 because of the bug fixed in revision 1.13.
(Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue,
pages, and year. In addition, it might fail to parse PubMed ID.)


Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.1
retrieving revision 1.12.2.2
diff -C2 -d -r1.12.2.1 -r1.12.2.2
*** common.rb	20 Feb 2008 09:56:22 -0000	1.12.2.1
--- common.rb	23 Apr 2008 17:34:15 -0000	1.12.2.2
***************
*** 241,305 ****
    def ref
      unless @data['R']
!       @data['R'] = Array.new
!       # Get the different references as 'blurbs' (the lines together)
!       reference_blurbs = get('R').split(/\nRN   /)
!       reference_blurbs.each_index do |i|
!         reference_blurbs[i] = 'RN   ' + reference_blurbs[i] unless reference_blurbs[i] =~ /^RN   /
!       end
!       
!       # For each reference, we'll first create a hash that looks like below.
!       # Suppose the input is:
!       #   RA   name1, name2, name3
!       #   RA   name4
!       #   RT   some part of the title that
!       #   RT   did not fit on one line
!       # Then the hash looks like:
!       #   h = {
!       #         'RA' => ["name1, name2, name3", "name4"],
!       #         'RT' => ["some part of the title that", "did not fit on one line"]
!       #       }
!       reference_blurbs.each do |rb|
!         line_based_data = Hash.new
!         rb.split(/\n/).each do |line|
!           key, value = line.scan(/^(R[A-Z])   "?(\[?.*[A-Za-z0-9]\]?)/)[0]
!           if line_based_data[key].nil?
!             line_based_data[key] = Array.new
!           end
!           line_based_data[key].push(value)
!         end
! 
!         # Now we have to sanitize the hash: the authors should be kept in an 
!         # array, the title should be 1 string, ... So the hash should look like:
!         #  h = {
!         #        'RA' => ["name1", "name2", "name3", "name4"],
!         #        'RT' => 'some part of the title that did not fit on one line'
!         #      }
!         line_based_data.keys.each do |key|
!           if ['RC', 'RP', 'RT', 'RL'].include?(key)
!             line_based_data[key] = line_based_data[key].join(' ')
!           elsif ['RA', 'RX'].include?(key)
!             sanitized_data = Array.new
!             line_based_data[key].each do |v|
!               sanitized_data.push(v.split(/\s*,\s*/))
!             end
!             line_based_data[key] = sanitized_data.flatten
!           elsif key == 'RN'
!             line_based_data[key] = line_based_data[key][0].sub(/^\[/,'').sub(/\]$/,'').to_i
            end
          end
!         
!         # And put it in @data. @data in the end looks like this:
!         #  data = [
!         #           {
!         #             'RA' => ["name1", "name2", "name3", "name4"],
!         #             'RT' => 'some part of the title that did not fit on one line'
!         #           },
!         #           {
!         #             'RA' => ["name1", "name2", "name3", "name4"],
!         #             'RT' => 'some part of the title that did not fit on one line'
!         #           }
!         #         ]
!         @data['R'].push(line_based_data)
        end
      end
      @data['R']
--- 241,265 ----
    def ref
      unless @data['R']
!       ary = Array.new
!       get('R').split(/\nRN   /).each do |str|
!         raw = {'RN' => '', 'RC' => '', 'RP' => '', 'RX' => '', 
!                'RA' => '', 'RT' => '', 'RL' => '', 'RG' => ''}
!         str = 'RN   ' + str unless /^RN   / =~ str
!         str.split("\n").each do |line|
!           if /^(R[NPXARLCTG])   (.+)/ =~ line
!             raw[$1] += $2 + ' '
!           else
!             raise "Invalid format in R lines, \n[#{line}]\n"
            end
          end
!         raw.each_value {|v| 
!           v.strip! 
!           v.sub!(/^"/,'')
!           v.sub!(/;$/,'')
!           v.sub!(/"$/,'')
!         }
!         ary.push(raw)
        end
+       @data['R'] = ary
      end
      @data['R']
***************
*** 310,345 ****
    def references
      unless @data['references']
!       @data['references'] = Array.new
!       self.ref.each do |ref|
!         hash = Hash.new
!         ref.each do |key, value|
            case key
-           when 'RN'
-             hash['embl_gb_record_number'] = value
-           when 'RC'
-             hash['comments'] = value
-           when 'RX'
-             hash['xrefs'] = value
-           when 'RP'
-             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value
            when 'RT'
              hash['title'] = value
            when 'RL'
!             hash['journal'] = value
            when 'RX'  # PUBMED, MEDLINE
!             value.each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
              }
            end
!         end
!         @data['references'].push(Reference.new(hash))
!       end
      end
      @data['references']
    end
  
    # returns contents in the DR line.
    # * Bio::EMBLDB::Common#dr  -> [ <Database cross-reference Hash>* ]
--- 270,306 ----
    def references
      unless @data['references']
!       ary = self.ref.map {|ent|
!         hash = Hash.new('')
!         ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
            when 'RL'
!             if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
!               hash['journal'] = $1.rstrip
!               hash['volume']  = $2
!               hash['issue']   = $4
!               hash['pages']   = $6
!               hash['year']    = $7
!             else
!               hash['journal'] = value
!             end
            when 'RX'  # PUBMED, MEDLINE
!             value.split(/\. /).each {|item|
!               tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
                hash[ tag.downcase ]  = xref
              }
            end
!         }
!         Reference.new(hash)
!       }
!       @data['references'] = References.new(ary)
      end
      @data['references']
    end
  
+ 
    # returns contents in the DR line.
    # * Bio::EMBLDB::Common#dr  -> [ <Database cross-reference Hash>* ]
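
The rewritten ref method groups the R* lines of each reference into one string
per line code. A standalone sketch of that grouping is shown below; the method
name parse_r_lines and the sample reference text are hypothetical, and the
input is assumed to contain only R* lines, as in the code above.

```ruby
R_KEYS = %w[RN RC RP RX RA RT RL RG].freeze

# Hypothetical standalone version of the grouping step: returns one Hash
# per reference, with continuation lines joined by a single space.
def parse_r_lines(text)
  text.split(/\nRN   /).map do |str|
    raw = R_KEYS.to_h { |k| [k, String.new] }
    str = 'RN   ' + str unless /^RN   / =~ str
    str.split("\n").each do |line|
      if /^(R[NPXARLCTG])   (.+)/ =~ line
        raw[$1] += $2 + ' '
      else
        raise "Invalid format in R lines, \n[#{line}]\n"
      end
    end
    raw.each_value do |v|
      v.strip!
      v.sub!(/^"/, '')
      v.sub!(/;$/, '')
      v.sub!(/"$/, '')
    end
    raw
  end
end

# Made-up reference block for illustration.
sample = <<EOS
RN   [1]
RA   Hoge J.P., Fuga F.B.;
RT   "Title of the
RT   study.";
RL   Plant Mol. Biol. 17(2):209-219(1991).
RN   [2]
RA   Fuga F.B.;
RT   ;
RL   Unpublished.
EOS

parse_r_lines(sample).each { |h| p h }
```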


From ngoto at dev.open-bio.org  Wed Apr 23 14:04:53 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:04:53 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.2,1.12.2.3
Message-ID: <200804231804.m3NI4rUv012864@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12842/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	common.rb 
Log Message:
Some of the changes made between 1.12 and 1.12.2.1 are incorporated with
modifications.


Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.2
retrieving revision 1.12.2.3
diff -C2 -d -r1.12.2.2 -r1.12.2.3
*** common.rb	23 Apr 2008 17:34:15 -0000	1.12.2.2
--- common.rb	23 Apr 2008 18:04:51 -0000	1.12.2.3
***************
*** 74,77 ****
--- 74,78 ----
  require 'bio/db'
  require 'bio/reference'
+ require 'bio/compat/references'
  
  module Bio
***************
*** 274,279 ****
          ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
--- 275,288 ----
          ent.each {|key, value|
            case key
+           when 'RN'
+             if /\[(\d+)\]/ =~ value.to_s
+               hash['embl_gb_record_number'] = $1.to_i
+             end
+           when 'RC'
+             hash['comment'] = value
+           when 'RP'
+             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value.split(/\, /)
            when 'RT'
              hash['title'] = value
***************
*** 288,292 ****
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, MEDLINE
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
--- 297,301 ----
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, DOI, (AGRICOLA)
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
***************
*** 297,301 ****
          Reference.new(hash)
        }
!       @data['references'] = References.new(ary)
      end
      @data['references']
--- 306,310 ----
          Reference.new(hash)
        }
!       @data['references'] = ary.extend(Bio::References::BackwardCompatibility)
      end
      @data['references']
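
The last hunk replaces a wrapper object with a plain Array that is extended by
a module. The general technique can be sketched as below. The module here is a
hypothetical stand-in: the actual methods of
Bio::References::BackwardCompatibility are not shown in this message, so
nothing about them is assumed beyond the use of Object#extend.

```ruby
# Hypothetical compatibility module; it only illustrates how a single
# Array instance can gain extra methods for the benefit of old callers.
module LegacyReferencesAccess
  # Lets code written for a container object keep asking for the list.
  def references
    self
  end
end

ary = %w[ref1 ref2]
ary.extend(LegacyReferencesAccess)

p ary.references.size           # legacy-style access
p ary.map { |r| r.upcase }      # still an ordinary Array
p [].respond_to?(:references)   # other arrays are unaffected
```

Because extend affects only the one object, the return value stays a normal
Array for new code while old call sites can keep working.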


From ngoto at dev.open-bio.org  Wed Apr 23 14:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.5,1.24.2.6
Message-ID: <200804231852.m3NIqKW0013081@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
	reference.rb 
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to lib/bio/db/embl/format_embl.rb
    to improve tolerance for various data (e.g. references with no
    record numbers or with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.


Index: reference.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v
retrieving revision 1.24.2.5
retrieving revision 1.24.2.6
diff -C2 -d -r1.24.2.5 -r1.24.2.6
*** reference.rb	4 Mar 2008 11:31:45 -0000	1.24.2.5
--- reference.rb	23 Apr 2008 18:52:18 -0000	1.24.2.6
***************
*** 42,47 ****
    class Reference
  
-     include Bio::Sequence::Format::INSDFeatureHelper
- 
      # Author names in an Array, [ "Hoge, J.P.", "Fuga, F.B." ].
      attr_reader :authors
--- 42,45 ----
***************
*** 70,73 ****
--- 68,74 ----
      # medline identifier (typically Fixnum)
      attr_reader :medline
+ 
+     # DOI identifier (typically String, e.g. "10.1126/science.1110418")
+     attr_reader :doi
      
      # Abstract text in String.
***************
*** 89,92 ****
--- 90,96 ----
      attr_reader :sequence_position
  
+     # Comments for the reference (typically Array of String, or nil)
+     attr_reader :comments
+ 
      # Create a new Bio::Reference object from a Hash of values. 
      # Data is extracted from the values for keys:
***************
*** 126,150 ****
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       hash.default = ''
!       @authors  = hash['authors'] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title    = hash['title']   # "Title of the study."
!       @journal  = hash['journal'] # "Theor. J. Hoge"
!       @volume   = hash['volume']  # 12
!       @issue    = hash['issue']   # 3
!       @pages    = hash['pages']   # 123-145
!       @year     = hash['year']    # 2001
!       @pubmed   = hash['pubmed']  # 12345678
!       @medline  = hash['medline'] # 98765432
!       @abstract = hash['abstract']
        @url      = hash['url']
!       @mesh     = hash['mesh']
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments = hash['comments'] || []
!       @xrefs    = hash['xrefs'] || []
!       @affiliations = hash['affiliations']
!       @authors = [] if @authors.empty?
!       @mesh    = [] if @mesh.empty?
!       @affiliations = [] if @affiliations.empty?
      end
  
--- 130,150 ----
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       @authors  = hash['authors'] || [] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title    = hash['title']   || '' # "Title of the study."
!       @journal  = hash['journal'] || '' # "Theor. J. Hoge"
!       @volume   = hash['volume']  || '' # 12
!       @issue    = hash['issue']   || '' # 3
!       @pages    = hash['pages']   || '' # 123-145
!       @year     = hash['year']    || '' # 2001
!       @pubmed   = hash['pubmed']  || '' # 12345678
!       @medline  = hash['medline'] || '' # 98765432
!       @doi      = hash['doi']
!       @abstract = hash['abstract'] || '' 
        @url      = hash['url']
!       @mesh     = hash['mesh'] || []
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments  = hash['comments']
!       @affiliations = hash['affiliations'] || []
      end
  
***************
*** 273,298 ****
      #     RL   Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       lines = Array.new
!       if ! @embl_gb_record_number.nil?
!         lines << "RN   [#{@embl_gb_record_number}]"
!       end
!       if @comments != []
!         @comments.each do |c|
!           lines << "RC   #{c}"
!         end
!       end
!       if ! @sequence_position.nil?
!         lines << "RP   #{@sequence_position}"
!       end
!       if ! @xrefs.nil?
!         @xrefs.each do |x|
!           lines << "RX   #{x}"
!         end
!       end
!       lines << wrap(@authors.join(', '), 80, 'RA   ') + ';' unless @authors.nil?
!       lines << (@title == '' ? 'RT   ;' : wrap('"' + @title + '"', 80, 'RT   ') + ';')
!       lines << wrap(@journal, 80, 'RL   ') unless @journal == ''
!       lines << "XX"
!       return lines.join("\n")
      end
  
--- 273,280 ----
      #     RL   Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       r = self
!       Bio::Sequence::Format::NucFormatter::Embl.new('').instance_eval {
!         reference_format_embl(r)
!       }
      end
  

From ngoto at dev.open-bio.org  Wed Apr 23 14:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.2,
	1.1.2.3 common.rb, 1.12.2.3, 1.12.2.4
Message-ID: <200804231852.m3NIqKHG013084@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	format_embl.rb common.rb 
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to lib/bio/db/embl/format_embl.rb
    to improve tolerance for various data (e.g. references with no
    record numbers or with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.
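
The RN (reference number) handling described above can be sketched on its
own. This is a minimal illustration, not the committed code: the method name
next_reference_number and the variable seen are hypothetical, and unlike the
diff as posted (which records a number only when it had to be reassigned)
this variant records every number it returns, so a repeated record number is
always detected.

```ruby
# Hypothetical standalone helper modelled on the RN logic of
# reference_format_embl. `seen` plays the role of the optional hash argument.
def next_reference_number(record_number, seen)
  refno = record_number.to_i          # nil and non-numeric strings become 0
  if refno <= 0 || seen[refno]
    # Missing or duplicated number: take the highest number used so far plus 1.
    refno = seen.keys.max.to_i + 1
  end
  seen[refno] = true                  # assumption: record every emitted number
  refno
end

seen = {}
numbers = [1, 1, nil, 5, 'x'].map { |rn| next_reference_number(rn, seen) }
p numbers  # => [1, 2, 3, 5, 6]
```

Sharing one hash across all references of an entry is what lets the numbering
stay unique within that entry.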


Index: format_embl.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -C2 -d -r1.1.2.2 -r1.1.2.3
*** format_embl.rb	27 Mar 2008 13:38:31 -0000	1.1.2.2
--- format_embl.rb	23 Apr 2008 18:52:18 -0000	1.1.2.3
***************
*** 25,28 ****
--- 25,76 ----
      end
  
+     # format reference
+     # ref:: Bio::Reference object
+     # hash:: (optional) a hash for RN (reference number) administration
+     def reference_format_embl(ref, hash = nil)
+       lines = Array.new
+       if ref.embl_gb_record_number or hash then
+         refno = ref.embl_gb_record_number.to_i
+         hash ||= {}
+         if refno <= 0 or hash[refno] then
+           refno = hash.keys.sort[-1].to_i + 1
+           hash[refno] = true
+         end
+         lines << embl_wrap("RN   ", "[#{refno}]")
+       end
+       if ref.comments then
+         ref.comments.each do |cmnt|
+           lines << embl_wrap("RC   ", cmnt)
+         end
+       end
+       unless ref.sequence_position.to_s.empty? then
+         lines << embl_wrap("RP   ",   "#{ref.sequence_position}")
+       end
+       unless ref.doi.to_s.empty? then
+         lines << embl_wrap("RX   ",   "DOI; #{ref.doi}.")
+       end
+       unless ref.pubmed.to_s.empty? then
+         lines << embl_wrap("RX   ",   "PUBMED; #{ref.pubmed}.")
+       end
+       unless ref.authors.empty?
+         lines << embl_wrap('RA   ', ref.authors.join(', ') + ';')
+       end
+       lines << embl_wrap('RT   ',
+                          (ref.title.to_s.empty? ? '' :
+                           "\"#{ref.title}\"") + ';')
+       unless ref.journal.to_s.empty? then
+         volissue = "#{ref.volume.to_s}"
+         volissue = "#{volissue}(#{ref.issue})" unless ref.issue.to_s.empty? 
+         rl = "#{ref.journal}"
+         rl += " #{volissue}" unless volissue.empty? 
+         rl += ":#{ref.pages}" unless ref.pages.to_s.empty?
+         rl += "(#{ref.year})" unless ref.year.to_s.empty?
+         rl += '.'
+         lines << embl_wrap('RL   ', rl)
+       end
+       lines << "XX"
+       return lines.join("\n")
+     end
+ 
      def seq_format_embl(seq)
        output_lines = Array.new
***************
*** 43,64 ****
      erb_template <<'__END_OF_TEMPLATE__'
  ID   <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX
  <%= embl_wrap('AC   ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX
  DT   <%= date_created %>
  DT   <%= date_modified %>
! XX
  <%= embl_wrap('DE   ', definition) %>
! XX
  <%= embl_wrap('KW   ', keywords.join('; ') + '.') %>
! XX
  OS   <%= species %>
  <%= embl_wrap('OC   ', classification.join('; ') + '.') %>
  XX   
! <%= (references || []).collect{|ref| ref.format('embl')}.join("\n") %>
! XX
! FH   Key             Location/Qualifiers
! FH
! <%= format_features_embl(features || []) %>XX
  SQ   Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>
--- 91,111 ----
      erb_template <<'__END_OF_TEMPLATE__'
  ID   <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX   
  <%= embl_wrap('AC   ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX   
  DT   <%= date_created %>
  DT   <%= date_modified %>
! XX   
  <%= embl_wrap('DE   ', definition) %>
! XX   
  <%= embl_wrap('KW   ', keywords.join('; ') + '.') %>
! XX   
  OS   <%= species %>
  <%= embl_wrap('OC   ', classification.join('; ') + '.') %>
  XX   
! <% hash = {}; (references || []).each do |ref| %><%= reference_format_embl(ref, hash) %>
! <% end %>FH   Key             Location/Qualifiers
! FH   
! <%= format_features_embl(features || []) %>XX   
  SQ   Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>
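
The RL line assembly added in reference_format_embl above can be tried in
isolation. The following is a sketch only: journal_line and its keyword
arguments are hypothetical names, and line wrapping (embl_wrap) is left out.

```ruby
# Hypothetical helper reproducing the RL text assembly:
# journal, then volume(issue), then :pages, then (year), then a final period.
# Empty or nil parts are skipped.
def journal_line(journal:, volume: nil, issue: nil, pages: nil, year: nil)
  volissue = volume.to_s
  volissue = "#{volissue}(#{issue})" unless issue.to_s.empty?
  rl = journal.to_s
  rl += " #{volissue}" unless volissue.empty?
  rl += ":#{pages}" unless pages.to_s.empty?
  rl += "(#{year})" unless year.to_s.empty?
  rl + '.'
end

puts journal_line(journal: 'Plant Mol. Biol.', volume: 17, issue: 2,
                  pages: '209-219', year: 1991)
# prints "Plant Mol. Biol. 17(2):209-219(1991)."
```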

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.3
retrieving revision 1.12.2.4
diff -C2 -d -r1.12.2.3 -r1.12.2.4
*** common.rb	23 Apr 2008 18:04:51 -0000	1.12.2.3
--- common.rb	23 Apr 2008 18:52:18 -0000	1.12.2.4
***************
*** 280,284 ****
              end
            when 'RC'
!             hash['comment'] = value
            when 'RP'
              hash['sequence_position'] = value
--- 280,287 ----
              end
            when 'RC'
!             unless value.to_s.strip.empty?
!               hash['comments'] ||= []
!               hash['comments'].push value
!             end
            when 'RP'
              hash['sequence_position'] = value


From ngoto at dev.open-bio.org  Thu Apr 24 09:49:44 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 13:49:44 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.36.2.1
Message-ID: <200804241349.m3ODni9x015583@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv15545/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	sptr.rb 
Log Message:
The same change (except comment) as of 1.36 => 1.37 in CVS HEAD is made
(bug fix: Bio::SPTR#references raises NoMethodError since 
lib/bio/db/embl/sptr.rb version 1.34).
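
A small sketch of the fixed RX handling, under the assumption (not confirmed
by this message) that the RX value reaches this code as a list of
[tag, xref] pairs rather than as a raw string. The method name collect_xrefs
is hypothetical.

```ruby
# Assumption: `pairs` is an Array of [tag, xref] pairs, e.g. already split
# by an earlier parsing step. Calling String methods such as split on it
# would raise NoMethodError, which is the kind of failure the log describes.
def collect_xrefs(pairs)
  hash = {}
  pairs.each do |tag, xref|
    hash[tag.downcase] = xref
  end
  hash
end

p collect_xrefs([['MEDLINE', '99'], ['PubMed', '123'], ['DOI', '10.1/x']])
# => {"medline"=>"99", "pubmed"=>"123", "doi"=>"10.1/x"}
```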


Index: sptr.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v
retrieving revision 1.36
retrieving revision 1.36.2.1
diff -C2 -d -r1.36 -r1.36.2.1
*** sptr.rb	5 Apr 2007 23:35:40 -0000	1.36
--- sptr.rb	24 Apr 2008 13:49:42 -0000	1.36.2.1
***************
*** 506,514 ****
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
!             }
            end
          }
--- 506,513 ----
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, MEDLINE, DOI
!             value.each do |tag, xref|
                hash[ tag.downcase ]  = xref
!             end
            end
          }


From ngoto at dev.open-bio.org  Thu Apr 24 10:28:27 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 14:28:27 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.10,0.58.2.11
Message-ID: <200804241428.m3OESRaY016145@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv15857/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
	sequence.rb 
Log Message:
* Bio::Sequence.read is renamed to Bio::Sequence.input because this method is
  the counterpart of Bio::Sequence#output. Bio::Sequence.read still exists as
  an alias of Bio::Sequence.input.
* Added documentation for Bio::Sequence#accessions, and fixed it so that the
  returned array does not contain nil.
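
Both changes can be illustrated without BioRuby. The class SketchSequence
below is a hypothetical stand-in, not Bio::Sequence itself; it only shows the
delegating alias and the effect of flatten.compact.

```ruby
# Hypothetical stand-in class; names and return values are for illustration.
class SketchSequence
  def initialize(primary, secondary)
    @primary_accession = primary
    @secondary_accessions = secondary
  end

  # New name of the class method (here it just echoes its arguments).
  def self.input(str, format = nil)
    [str, format]
  end

  # Old name kept as a thin wrapper, as in the diff below.
  def self.read(str, format = nil)
    input(str, format)
  end

  # compact drops nil, so a missing accession no longer leaks into the result.
  def accessions
    [@primary_accession, @secondary_accessions].flatten.compact
  end
end

p SketchSequence.new('AB000001', ['X1', 'X2']).accessions
# => ["AB000001", "X1", "X2"]
p SketchSequence.new(nil, nil).accessions
# => []
```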


Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v
retrieving revision 0.58.2.10
retrieving revision 0.58.2.11
diff -C2 -d -r0.58.2.10 -r0.58.2.11
*** sequence.rb	27 Mar 2008 13:38:31 -0000	0.58.2.10
--- sequence.rb	24 Apr 2008 14:28:25 -0000	0.58.2.11
***************
*** 369,373 ****
    # (GenBank, EMBL, fasta format, etc.)
    #
!   #   s = Bio::Sequence.read(str)
    # ---
    # *Arguments*:
--- 369,373 ----
    # (GenBank, EMBL, fasta format, etc.)
    #
!   #   s = Bio::Sequence.input(str)
    # ---
    # *Arguments*:
***************
*** 375,379 ****
    # * (optional) _format_: format specification (class or nil)
    # *Returns*:: Bio::Sequence object
!   def self.read(str, format = nil)
      if format then
        klass = format
--- 375,379 ----
    # * (optional) _format_: format specification (class or nil)
    # *Returns*:: Bio::Sequence object
!   def self.input(str, format = nil)
      if format then
        klass = format
***************
*** 384,391 ****
      obj.to_biosequence
    end
!   
!   
    def accessions
!     return [@primary_accession, @secondary_accessions].flatten
    end
  
--- 384,398 ----
      obj.to_biosequence
    end
! 
!   # alias of Bio::Sequence.input
!   def self.read(str, format = nil)
!     input(str, format)
!   end
! 
!   # accession numbers of the sequence
!   #
!   # *Returns*:: Array of String
    def accessions
!     [ @primary_accession, @secondary_accessions ].flatten.compact
    end
  

From helios at dev.open-bio.org  Mon Apr  7 09:15:46 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:15:46 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io sql.rb,1.8.2.1,1.8.2.2
Message-ID: <200804071315.m37DFcHI005486@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io
In directory dev.open-bio.org:/tmp/cvs-serv5466/lib/bio/io

Modified Files:
      Tag: BRANCH-biohackathon2008
	sql.rb 
Log Message:
added "hostname" to valid_keys configurations

Index: sql.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/sql.rb,v
retrieving revision 1.8.2.1
retrieving revision 1.8.2.2
diff -C2 -d -r1.8.2.1 -r1.8.2.2
*** sql.rb	25 Mar 2008 15:46:32 -0000	1.8.2.1
--- sql.rb	7 Apr 2008 13:15:36 -0000	1.8.2.2
***************
*** 25,29 ****
        #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil}
        configurations.assert_valid_keys('development', 'production','test')
!       configurations[env].assert_valid_keys('database','adapter','username','password')
        DummyBase.configurations = configurations
        DummyBase.establish_connection "#{env}"
--- 25,29 ----
        #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil}
        configurations.assert_valid_keys('development', 'production','test')
!       configurations[env].assert_valid_keys('hostname','database','adapter','username','password')
        DummyBase.configurations = configurations
        DummyBase.establish_connection "#{env}"
***************
*** 43,46 ****
--- 43,50 ----
      end
      
+     def self.exists_database(name)
+       Bio::SQL::Biodatabase.find_by_name(name).nil? ? false : true
+     end
+     
      def self.list_entries
        Bio::SQL::Bioentry.find(:all).collect{|entry|
***************
*** 117,121 ****
    pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
    #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result)
!   
    if nil
      pp Bio::SQL.list_entries
--- 121,125 ----
    pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
    #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result)
!   pp Bio::SQL.list_entries
    if nil
      pp Bio::SQL.list_entries


From helios at dev.open-bio.org  Mon Apr  7 09:17:55 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:17:55 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql bioentry.rb, 1.1.2.1,
	1.1.2.2
Message-ID: <200804071317.m37DHmQk005551@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5531/lib/bio/io/biosql

Modified Files:
      Tag: BRANCH-biohackathon2008
	bioentry.rb 
Log Message:
Corrected the table name "term" in the conditions used to get the cdsfeatures shortcut associated with the entry.

Index: bioentry.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/biosql/Attic/bioentry.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** bioentry.rb	25 Mar 2008 15:46:32 -0000	1.1.2.1
--- bioentry.rb	7 Apr 2008 13:17:46 -0000	1.1.2.2
***************
*** 13,17 ****
  				has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id" # I'm not very convinced by this; I don't think it works correctly
  
! 				has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["terms.name='CDS'"], :include=>"typeterm"
          
          has_many :terms, :through=>:bioentry_qualifier_values
--- 13,17 ----
  				has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id" # I'm not very convinced by this; I don't think it works correctly
  
! 				has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["term.name='CDS'"], :include=>"type_term"
          
          has_many :terms, :through=>:bioentry_qualifier_values


From helios at dev.open-bio.org  Mon Apr  7 09:18:19 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:18:19 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql sequence.rb, 1.1.2.1,
	1.1.2.2
Message-ID: <200804071318.m37DIDFm005598@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5578/lib/bio/db/biosql

Modified Files:
      Tag: BRANCH-biohackathon2008
	sequence.rb 
Log Message:
use genbank, fasta is not working

Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/biosql/Attic/sequence.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** sequence.rb	25 Mar 2008 15:46:32 -0000	1.1.2.1
--- sequence.rb	7 Apr 2008 13:18:11 -0000	1.1.2.2
***************
*** 674,679 ****
    
    #  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
!   #  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
!   parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')
    
    parser.each do |entry|
--- 674,679 ----
    
    #  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
!   parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
!   #parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')
    
    parser.each do |entry|
***************
*** 686,689 ****
--- 686,690 ----
        #      pp "Sequence"
        puts result.to_biosequence.output(:genbank) #:embl
+       result.delete
      end   
    end


From ngoto at dev.open-bio.org  Tue Apr  1 10:36:47 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 01 Apr 2008 10:36:47 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/pdb chain.rb, 1.9, 1.10 pdb.rb,
	1.27, 1.28
Message-ID: <200804011036.m31Aal0p009616@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/pdb
In directory dev.open-bio.org:/tmp/cvs-serv9574/lib/bio/db/pdb

Modified Files:
	chain.rb pdb.rb 
Log Message:
* Fixed a bug where ArgumentError was raised in the Bio::PDB::Chain#aaseq
  method for nucleic acid chains. The same error could also occur in
  Bio::PDB#seqres, which is fixed as well.
* Fixed a bug where the current residue/heterogen was not properly
  reinitialized when the current chain changed.
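
The fallback pattern used for the first fix can be sketched as follows. The
lookup three2one_strict is a hypothetical stand-in that raises ArgumentError
for names it cannot handle; the real Bio::AminoAcid.three2one is not
reproduced here, and the table holds only three sample residues.

```ruby
# Hypothetical lookup table and strict lookup, for illustration only.
THREE_TO_ONE = { 'Ala' => 'A', 'Gly' => 'G', 'Ser' => 'S' }.freeze

def three2one_strict(code)
  raise ArgumentError, "not a three-letter code: #{code.inspect}" unless code.length == 3
  THREE_TO_ONE[code]
end

# Returns the one-letter code, or 'X' when the lookup raises or finds nothing.
def one_letter_or_x(res_name)
  (begin
     three2one_strict(res_name.capitalize)
   rescue ArgumentError
     nil
   end) || 'X'
end

# Keeps only ASCII letters; anything that is not a single letter becomes 'n'.
def normalize_nucleotide(na)
  na = na.delete('^a-zA-Z')
  na.size == 1 ? na : 'n'
end

p one_letter_or_x('ALA')       # => "A"
p one_letter_or_x('  A')       # => "X"
p normalize_nucleotide(' +U')  # => "U"
p normalize_nucleotide(' DA')  # => "n"
```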


Index: pdb.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/pdb.rb,v
retrieving revision 1.27
retrieving revision 1.28
diff -C2 -d -r1.27 -r1.28
*** pdb.rb	28 Dec 2007 14:43:44 -0000	1.27
--- pdb.rb	1 Apr 2008 10:36:44 -0000	1.28
***************
*** 1498,1501 ****
--- 1498,1504 ----
                chain = newChain
              end
+             # chain might be changed, clearing cResidue and cLigand
+             cResidue = nil
+             cLigand = nil
            end
          end
***************
*** 1551,1554 ****
--- 1554,1559 ----
            c_atom = nil
            cChain = nil
+           cResidue = nil
+           cLigand = nil
            if cModel.model_serial or cModel.chains.size > 0 then
              self.addModel(cModel)
***************
*** 1810,1814 ****
                #need to look up with Ala
                aa = aa.capitalize
!               (Bio::AminoAcid.three2one(aa) or 'X')
              end
              seq = Bio::Sequence::AA.new(a.to_s)
--- 1815,1823 ----
                #need to look up with Ala
                aa = aa.capitalize
!               (begin
!                  Bio::AminoAcid.three2one(aa)
!                rescue ArgumentError
!                  nil
!                end || 'X')
              end
              seq = Bio::Sequence::AA.new(a.to_s)
***************
*** 1816,1820 ****
              # nucleic acid sequence
              a.collect! do |na|
!               na = na.strip
                na.size == 1 ? na : 'n'
              end
--- 1825,1829 ----
              # nucleic acid sequence
              a.collect! do |na|
!               na = na.delete('^a-zA-Z')
                na.size == 1 ? na : 'n'
              end

Index: chain.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/pdb/chain.rb,v
retrieving revision 1.9
retrieving revision 1.10
diff -C2 -d -r1.9 -r1.10
*** chain.rb	18 Dec 2007 13:48:42 -0000	1.9
--- chain.rb	1 Apr 2008 10:36:44 -0000	1.10
***************
*** 190,194 ****
              end
              tlc = residue.resName.capitalize
!             olc = (Bio::AminoAcid.three2one(tlc) or 'X')
              string << olc
            end
--- 190,198 ----
              end
              tlc = residue.resName.capitalize
!             olc = (begin
!                      Bio::AminoAcid.three2one(tlc)
!                    rescue ArgumentError
!                      nil
!                    end || 'X')
              string << olc
            end


From ngoto at dev.open-bio.org  Wed Apr  2 06:24:16 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 02 Apr 2008 06:24:16 +0000
Subject: [BioRuby-cvs] bioruby ChangeLog,1.83,1.84
Message-ID: <200804020624.m326OGwp011324@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv11304

Modified Files:
	ChangeLog 
Log Message:
ChangeLog added for lib/bio/appl/blast/format0.rb,1.26,1.27,
lib/bio/db/pdb/chain.rb,1.9,1.10, and lib/bio/db/pdb/pdb.rb,1.27,1.28.


Index: ChangeLog
===================================================================
RCS file: /home/repository/bioruby/bioruby/ChangeLog,v
retrieving revision 1.83
retrieving revision 1.84
diff -C2 -d -r1.83 -r1.84
*** ChangeLog	12 Feb 2008 05:32:23 -0000	1.83
--- ChangeLog	2 Apr 2008 06:24:14 -0000	1.84
***************
*** 1,2 ****
--- 1,18 ----
+ 2008-04-01  Naohisa Goto <ng at bioruby.org>
+ 
+ 	* lib/bio/appl/blast/format0.rb
+ 
+ 	  Fixed a bug: Failed to parse database name in some cases.
+ 	  Thanks to Tomoaki Nishiyama who reported the bug and sent patches
+ 	  ([BioRuby-ja] BLAST format0 parser fails header parsing output
+ 	  of specific databases).
+ 
+ 	* lib/bio/db/pdb/chain.rb, lib/bio/db/pdb/pdb.rb
+ 
+ 	  Fixed bugs: Bio::PDB::Chain#aaseq failed for nucleotide chain;
+ 	  Failed to parse chains for some entries (e.g. 1B2M).
+ 	  Thanks to Semin Lee who reported the bugs and sent patches
+ 	  ([BioRuby] Bio::PDB parsing problem (1B2M)).
+ 
  2008-02-12  Naohisa Goto <ng at bioruby.org>
  

From ngoto at dev.open-bio.org  Tue Apr 15 13:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl/blast rpsblast.rb,NONE,1.1
Message-ID: <200804151354.m3FDsfkK032072@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl/blast
In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl/blast

Added Files:
	rpsblast.rb 
Log Message:
Newly added RPS-Blast default (-m 0) output parser.


--- NEW FILE: rpsblast.rb ---
#
# = bio/appl/blast/rpsblast.rb - NCBI RPS Blast default output parser
# 
# Copyright::  Copyright (C) 2008 Naohisa Goto <ng at bioruby.org>
# License::    The Ruby License
#
# $Id: rpsblast.rb,v 1.1 2008/04/15 13:54:39 ngoto Exp $
#
# == Description
#
# NCBI RPS Blast (Reversed Position Specific Blast) default
# (-m 0 option) output parser class, Bio::Blast::RPSBlast::Report
# and related classes/modules.
#
# == References
#
# * Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
#   Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
#   "Gapped BLAST and PSI-BLAST: a new generation of protein database search
#   programs", Nucleic Acids Res. 25:3389-3402.
# * ftp://ftp.ncbi.nih.gov/blast/documents/rpsblast.html
# * http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml
#

require 'bio/appl/blast/format0'

module Bio
class Blast

  # NCBI RPS Blast (Reversed Position Specific Blast) namespace.
  # Currently, this module exists only to separate the namespace.
  # To parse RPSBlast results, see the Bio::Blast::RPSBlast::Report documentation.
  module RPSBlast

    # NCBI RPS Blast (Reversed Position Specific Blast)
    # default output parser.
    #
    # It supports default (-m 0 option) output of the "rpsblast" command.
    #
    # Because this class inherits Bio::Blast::Default::Report,
    # almost all methods are equal to Bio::Blast::Default::Report.
    # Only DELIMITER (and RS) and a few methods are different.
    #
    # Note for multi-fasta result: When parsing output of rpsblast command
    # with multi-fasta sequences as input data,
    # each query's result is stored as an "iteration" of PSI-Blast,
    # because rpsblast's output with multi-fasta input is hard to split
    # by query.
    # This behavior may be changed in the future.
    #
    # Note for nucleotide results: This class is not tested with
    # nucleotide query and/or nucleotide databases.
    #
    class Report < Bio::Blast::Default::Report
      # Delimiter of each entry for RPS-BLAST. Bio::FlatFile uses it.
      DELIMITER = RS = "\nRPS-BLAST"

      # (Integer) excess read size included in DELIMITER.
      DELIMITER_OVERRUN = 9 # "RPS-BLAST"

      # Creates a new Report object from a string.
      #
      # Note for multi-fasta results: When parsing an output of rpsblast
      # command running with multi-fasta sequences,
      # each query's result is stored as an "iteration" of PSI-Blast,
      # because rpsblast's output with multi-fasta input is hard to split
      # by query.
      # This behavior may be changed in the future.
      #
      # Note for nucleotide results: This class is not tested with
      # nucleotide query and/or nucleotide databases.
      #
      def initialize(str)
        str = str.sub(/\A\s+/, '')
        # remove trailing entries for sure
        str.sub!(/\n(RPS\-BLAST.*)/m, "\n") 
        @entry_overrun = $1
        @entry = str
        data = str.split(/(?:^[ \t]*\n)+/)

        format0_split_headers(data)
        @iterations = format0_split_search(data)
        format0_split_stat_params(data)
      end

      # Returns definition of the query.
      # For a result of multi-fasta input, the first query's definition
      # is returned (The same as <tt>iterations.first.query_def</tt>).
      def query_def
        iterations.first.query_def
      end

      # Returns length of the query.
      # For a result of multi-fasta input, the first query's length
      # is returned (The same as <tt>iterations.first.query_len</tt>).
      def query_len
        iterations.first.query_len
      end

      private

      # Splits headers into the first line, reference, query line and
      # database line.
      def format0_split_headers(data)
        @f0header = data.shift
        @f0references = []
        while data[0] and /\ADatabase\:/ !~ data[0]
          @f0references.push data.shift
        end
        @f0database = data.shift
        # In special case, a void line is inserted after database name.
        if /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/ =~ data[0] then
          @f0database.concat "\n"
          @f0database.concat data.shift
        end
      end

      # Splits the search results.
      def format0_split_search(data)
        iterations = []
        dummystr = 'Searching..................................................done'
        if r = data[0] and /^Searching/ =~ r then
          dummystr = data.shift
        end
        while r = data[0] and /^Query\=/ =~ r
          iterations << Iteration.new(data, dummystr)
        end
        iterations
      end

      # Iteration class for RPS-Blast.
      # Though RPS-Blast does not iterate like PSI-BLAST, 
      # it aims to store a result of single query sequence.
      #
      # Normally, the instance of the class is generated
      # by Bio::Blast::RPSBlast::Report object.
      # 
      class Iteration < Bio::Blast::Default::Report::Iteration
        # Creates a new Iteration object.
        # It is designed to be called only internally from
        # the Bio::Blast::RPSBlast::Report class.
        # Users shall not use the method directly.
        def initialize(data, dummystr)
          if /\AQuery\=/ =~ data[0] then
            sc = StringScanner.new(data.shift)
            sc.skip(/\s*/)
            if sc.skip_until(/Query\= */) then
              q = []
              begin
                q << sc.scan(/.*/)
                sc.skip(/\s*^ ?/)
              end until !sc.rest or r = sc.skip(/ *\( *([\,\d]+) *letters *\)\s*\z/)
              @query_len = sc[1].delete(',').to_i if r
              @query_def = q.join(' ')
            end
          end
          data.unshift(dummystr)
          
          super(data)
        end

        # definition of the query
        attr_reader :query_def

        # length of the query sequence
        attr_reader :query_len
        
      end #class Iteration
      
    end #class Report

  end #module RPSBlast

end #module Blast
end #module Bio
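
The "void line after database name" handling in format0_split_headers can be
exercised without the parser. This is a sketch under the assumption that the
report has already been split into blank-line-separated blocks; the names
DB_STATS_LINE and take_database_block are hypothetical.

```ruby
# Same pattern as in the source: a block consisting only of the
# "N sequences; M total letters" statistics line.
DB_STATS_LINE = /\A +[\d\,]+ +sequences\; +[\d\,]+ total +letters\s*\z/

# Removes the database block from the front of `data`. If the statistics
# line was split off into its own block, it is joined back with a newline.
def take_database_block(data)
  database = data.shift.to_s
  if DB_STATS_LINE =~ data[0]
    database = database + "\n" + data.shift
  end
  database
end

blocks = ['Database: nr',
          '           1,234 sequences; 567,890 total letters',
          'Searching']
db = take_database_block(blocks)
puts db
p blocks.size  # => 1
```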


From ngoto at dev.open-bio.org  Tue Apr 15 13:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby ChangeLog,1.84,1.85
Message-ID: <200804151354.m3FDsf3j032062@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv32038

Modified Files:
	ChangeLog 
Log Message:
Newly added RPS-Blast default (-m 0) output parser.


Index: ChangeLog
===================================================================
RCS file: /home/repository/bioruby/bioruby/ChangeLog,v
retrieving revision 1.84
retrieving revision 1.85
diff -C2 -d -r1.84 -r1.85
*** ChangeLog	2 Apr 2008 06:24:14 -0000	1.84
--- ChangeLog	15 Apr 2008 13:54:38 -0000	1.85
***************
*** 1,2 ****
--- 1,8 ----
+ 2008-04-15  Naohisa Goto <ng at bioruby.org>
+ 
+ 	* lib/bio/appl/blast/rpsblast.rb
+ 
+ 	  Newly added RPS-Blast default (-m 0) output parser.
+ 
  2008-04-01  Naohisa Goto <ng at bioruby.org>
  

From ngoto at dev.open-bio.org  Tue Apr 15 13:54:41 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Tue, 15 Apr 2008 13:54:41 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/appl blast.rb,1.34,1.35
Message-ID: <200804151354.m3FDsfmn032067@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/appl
In directory dev.open-bio.org:/tmp/cvs-serv32038/lib/bio/appl

Modified Files:
	blast.rb 
Log Message:
Newly added RPS-Blast default (-m 0) output parser.


Index: blast.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/appl/blast.rb,v
retrieving revision 1.34
retrieving revision 1.35
diff -C2 -d -r1.34 -r1.35
*** blast.rb	30 Jan 2008 17:43:34 -0000	1.34
--- blast.rb	15 Apr 2008 13:54:39 -0000	1.35
***************
*** 73,76 ****
--- 73,77 ----
      autoload :WU,           'bio/appl/blast/wublast'
      autoload :Bl2seq,       'bio/appl/bl2seq/report'
+     autoload :RPSBlast,     'bio/appl/blast/rpsblast'
  
      # This is a shortcut for Bio::Blast.new:


From ngoto at dev.open-bio.org  Fri Apr 18 15:40:38 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Fri, 18 Apr 2008 15:40:38 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.37
Message-ID: <200804181540.m3IFecgN008057@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv8036/lib/bio/db/embl

Modified Files:
	sptr.rb 
Log Message:
bug fix: Bio::SPTR#references raises NoMethodError since 
lib/bio/db/embl/sptr.rb version 1.34.


Index: sptr.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v
retrieving revision 1.36
retrieving revision 1.37
diff -C2 -d -r1.36 -r1.37
*** sptr.rb	5 Apr 2007 23:35:40 -0000	1.36
--- sptr.rb	18 Apr 2008 15:40:36 -0000	1.37
***************
*** 507,514 ****
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
!             }
            end
          }
--- 507,513 ----
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.each do |tag, xref|
                hash[ tag.downcase ]  = xref
!             end
            end
          }


From ngoto at dev.open-bio.org  Wed Apr 23 16:48:28 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 16:48:28 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12,1.13
Message-ID: <200804231648.m3NGmSSa012476@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12456/lib/bio/db/embl

Modified Files:
	common.rb 
Log Message:
Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue,
pages, and year. In addition, it might fail to parse PubMed ID.


Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** common.rb	5 Apr 2007 23:35:40 -0000	1.12
--- common.rb	23 Apr 2008 16:48:25 -0000	1.13
***************
*** 279,294 ****
              hash['title'] = value
            when 'RL'
!             if value =~ /(.*) (\d+) \((\d+)\), (\d+-\d+) \((\d+)\)$/
!               hash['journal'] = $1
                hash['volume']  = $2
!               hash['issue']   = $3
!               hash['pages']   = $4
!               hash['year']    = $5
              else
                hash['journal'] = value
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
              }
--- 279,294 ----
              hash['title'] = value
            when 'RL'
!             if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
!               hash['journal'] = $1.rstrip
                hash['volume']  = $2
!               hash['issue']   = $4
!               hash['pages']   = $6
!               hash['year']    = $7
              else
                hash['journal'] = value
              end
            when 'RX'  # PUBMED, MEDLINE
!             value.split(/\. /).each {|item|
!               tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
                hash[ tag.downcase ]  = xref
              }
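
Note: a minimal sketch of the RL-line parsing changed above, not the BioRuby code itself. Both regular expressions are copied from the diff. The helper names parse_rl and old_rl_match? and the second sample line are assumptions made for illustration. The first sample is the RL line quoted in the Bio::Reference#embl documentation.

```ruby
# Hypothetical helper wrapping the revised 'RL' branch of Bio::EMBL#references.
def parse_rl(value)
  hash = {}
  if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
    hash['journal'] = $1.rstrip
    hash['volume']  = $2
    hash['issue']   = $4   # nil when the "(issue)" part is absent
    hash['pages']   = $6
    hash['year']    = $7
  else
    hash['journal'] = value   # fall back to the whole line
  end
  hash
end

# The pattern used before revision 1.13, kept only for comparison.
def old_rl_match?(value)
  !(value =~ /(.*) (\d+) \((\d+)\), (\d+-\d+) \((\d+)\)$/).nil?
end

line = 'Plant Mol. Biol. 17(2):209-219(1991).'
p old_rl_match?(line)   # the old pattern expects "vol (issue), pages (year)"
p parse_rl(line)
p parse_rl('Theor. J. Hoge 12:123-145(2001).')   # assumed sample without an issue
```

The old pattern needs a space before each parenthesis and a comma after the issue, so it cannot match the compact "17(2):209-219(1991)." form. The revised pattern accepts both layouts and treats the issue as optional.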


From ngoto at dev.open-bio.org  Wed Apr 23 17:34:17 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 17:34:17 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.1,1.12.2.2
Message-ID: <200804231734.m3NHYHMP012740@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12720/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	common.rb 
Log Message:
lib/bio/db/embl/common.rb in branch BRANCH-biohackathon2008 is copied from
CVS HEAD revision 1.13 because of the bug fixed in revision 1.13.
(Bug fix: Bio::EMBL#references failed to parse journal name, volume, issue,
pages, and year. In addition, it might fail to parse PubMed ID.)


Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.1
retrieving revision 1.12.2.2
diff -C2 -d -r1.12.2.1 -r1.12.2.2
*** common.rb	20 Feb 2008 09:56:22 -0000	1.12.2.1
--- common.rb	23 Apr 2008 17:34:15 -0000	1.12.2.2
***************
*** 241,305 ****
    def ref
      unless @data['R']
!       @data['R'] = Array.new
!       # Get the different references as 'blurbs' (the lines together)
!       reference_blurbs = get('R').split(/\nRN   /)
!       reference_blurbs.each_index do |i|
!         reference_blurbs[i] = 'RN   ' + reference_blurbs[i] unless reference_blurbs[i] =~ /^RN   /
!       end
!       
!       # For each reference, we'll first create a hash that looks like below.
!       # Suppose the input is:
!       #   RA   name1, name2, name3
!       #   RA   name4
!       #   RT   some part of the title that
!       #   RT   did not fit on one line
!       # Then the hash looks like:
!       #   h = {
!       #         'RA' => ["name1, name2, name3", "name4"],
!       #         'RT' => ["some part of the title that", "did not fit on one line"]
!       #       }
!       reference_blurbs.each do |rb|
!         line_based_data = Hash.new
!         rb.split(/\n/).each do |line|
!           key, value = line.scan(/^(R[A-Z])   "?(\[?.*[A-Za-z0-9]\]?)/)[0]
!           if line_based_data[key].nil?
!             line_based_data[key] = Array.new
!           end
!           line_based_data[key].push(value)
!         end
! 
!         # Now we have to sanitize the hash: the authors should be kept in an 
!         # array, the title should be 1 string, ... So the hash should look like:
!         #  h = {
!         #        'RA' => ["name1", "name2", "name3", "name4"],
!         #        'RT' => 'some part of the title that did not fit on one line'
!         #      }
!         line_based_data.keys.each do |key|
!           if ['RC', 'RP', 'RT', 'RL'].include?(key)
!             line_based_data[key] = line_based_data[key].join(' ')
!           elsif ['RA', 'RX'].include?(key)
!             sanitized_data = Array.new
!             line_based_data[key].each do |v|
!               sanitized_data.push(v.split(/\s*,\s*/))
!             end
!             line_based_data[key] = sanitized_data.flatten
!           elsif key == 'RN'
!             line_based_data[key] = line_based_data[key][0].sub(/^\[/,'').sub(/\]$/,'').to_i
            end
          end
!         
!         # And put it in @data. @data in the end looks like this:
!         #  data = [
!         #           {
!         #             'RA' => ["name1", "name2", "name3", "name4"],
!         #             'RT' => 'some part of the title that did not fit on one line'
!         #           },
!         #           {
!         #             'RA' => ["name1", "name2", "name3", "name4"],
!         #             'RT' => 'some part of the title that did not fit on one line'
!         #           }
!         #         ]
!         @data['R'].push(line_based_data)
        end
      end
      @data['R']
--- 241,265 ----
    def ref
      unless @data['R']
!       ary = Array.new
!       get('R').split(/\nRN   /).each do |str|
!         raw = {'RN' => '', 'RC' => '', 'RP' => '', 'RX' => '', 
!                'RA' => '', 'RT' => '', 'RL' => '', 'RG' => ''}
!         str = 'RN   ' + str unless /^RN   / =~ str
!         str.split("\n").each do |line|
!           if /^(R[NPXARLCTG])   (.+)/ =~ line
!             raw[$1] += $2 + ' '
!           else
!             raise "Invalid format in R lines, \n[#{line}]\n"
            end
          end
!         raw.each_value {|v| 
!           v.strip! 
!           v.sub!(/^"/,'')
!           v.sub!(/;$/,'')
!           v.sub!(/"$/,'')
!         }
!         ary.push(raw)
        end
+       @data['R'] = ary
      end
      @data['R']
***************
*** 310,345 ****
    def references
      unless @data['references']
!       @data['references'] = Array.new
!       self.ref.each do |ref|
!         hash = Hash.new
!         ref.each do |key, value|
            case key
-           when 'RN'
-             hash['embl_gb_record_number'] = value
-           when 'RC'
-             hash['comments'] = value
-           when 'RX'
-             hash['xrefs'] = value
-           when 'RP'
-             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value
            when 'RT'
              hash['title'] = value
            when 'RL'
!             hash['journal'] = value
            when 'RX'  # PUBMED, MEDLINE
!             value.each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
              }
            end
!         end
!         @data['references'].push(Reference.new(hash))
!       end
      end
      @data['references']
    end
  
    # returns contents in the DR line.
    # * Bio::EMBLDB::Common#dr  -> [ <Database cross-reference Hash>* ]
--- 270,306 ----
    def references
      unless @data['references']
!       ary = self.ref.map {|ent|
!         hash = Hash.new('')
!         ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
            when 'RL'
!             if /(.*) (\d+) *(\(([^\)]+)\))?(\, |\:)([a-zA-Z\d]+\-[a-zA-Z\d]+) *\((\d+)\)\.?\z/ =~ value.to_s
!               hash['journal'] = $1.rstrip
!               hash['volume']  = $2
!               hash['issue']   = $4
!               hash['pages']   = $6
!               hash['year']    = $7
!             else
!               hash['journal'] = value
!             end
            when 'RX'  # PUBMED, MEDLINE
!             value.split(/\. /).each {|item|
!               tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
                hash[ tag.downcase ]  = xref
              }
            end
!         }
!         Reference.new(hash)
!       }
!       @data['references'] = References.new(ary)
      end
      @data['references']
    end
  
+ 
    # returns contents in the DR line.
    # * Bio::EMBLDB::Common#dr  -> [ <Database cross-reference Hash>* ]
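
Note: a minimal sketch of how the rewritten ref method above folds the R lines of one reference into a hash of strings. The loop and the clean-up steps follow the diff. The method name parse_r_lines and the sample entry (built from the placeholder names in the removed comments) are assumptions for illustration.

```ruby
# Hypothetical stand-alone version of the per-reference loop in #ref.
def parse_r_lines(str)
  raw = {}
  %w[RN RC RP RX RA RT RL RG].each { |tag| raw[tag] = String.new }
  str.split("\n").each do |line|
    if /^(R[NPXARLCTG])   (.+)/ =~ line
      raw[$1] += $2 + ' '   # continuation lines are joined with a space
    else
      raise "Invalid format in R lines, \n[#{line}]\n"
    end
  end
  raw.each_value do |v|
    v.strip!
    v.sub!(/^"/, '')   # leading quote of a title
    v.sub!(/;$/, '')   # trailing semicolon
    v.sub!(/"$/, '')   # trailing quote of a title
  end
  raw
end

entry = <<~EOS
  RN   [1]
  RA   name1, name2, name3,
  RA   name4;
  RT   "some part of the title that
  RT   did not fit on one line";
EOS

raw = parse_r_lines(entry)
p raw['RA']   # one string; #references later splits it on ", "
p raw['RT']
```

Each value stays a plain string here. Splitting authors into an array is left to the references method, as in the diff.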


From ngoto at dev.open-bio.org  Wed Apr 23 18:04:53 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:04:53 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl common.rb,1.12.2.2,1.12.2.3
Message-ID: <200804231804.m3NI4rUv012864@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv12842/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	common.rb 
Log Message:
Part of the changes made between 1.12 and 1.12.2.1 is incorporated with
modifications.


Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.2
retrieving revision 1.12.2.3
diff -C2 -d -r1.12.2.2 -r1.12.2.3
*** common.rb	23 Apr 2008 17:34:15 -0000	1.12.2.2
--- common.rb	23 Apr 2008 18:04:51 -0000	1.12.2.3
***************
*** 74,77 ****
--- 74,78 ----
  require 'bio/db'
  require 'bio/reference'
+ require 'bio/compat/references'
  
  module Bio
***************
*** 274,279 ****
          ent.each {|key, value|
            case key
            when 'RA'
!             hash['authors'] = value.split(/, /)
            when 'RT'
              hash['title'] = value
--- 275,288 ----
          ent.each {|key, value|
            case key
+           when 'RN'
+             if /\[(\d+)\]/ =~ value.to_s
+               hash['embl_gb_record_number'] = $1.to_i
+             end
+           when 'RC'
+             hash['comment'] = value
+           when 'RP'
+             hash['sequence_position'] = value
            when 'RA'
!             hash['authors'] = value.split(/\, /)
            when 'RT'
              hash['title'] = value
***************
*** 288,292 ****
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, MEDLINE
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
--- 297,301 ----
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, DOI, (AGRICOLA)
              value.split(/\. /).each {|item|
                tag, xref = item.split(/\; /).map {|i| i.strip.sub(/\.\z/, '') }
***************
*** 297,301 ****
          Reference.new(hash)
        }
!       @data['references'] = References.new(ary)
      end
      @data['references']
--- 306,310 ----
          Reference.new(hash)
        }
!       @data['references'] = ary.extend(Bio::References::BackwardCompatibility)
      end
      @data['references']
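
Note: a minimal sketch of the RX handling shown in the hunks above, assuming the RX lines of one reference have already been joined into a single string. The split and clean-up calls are copied from the diff. The method name parse_rx is hypothetical. The DOI is the example value from the Bio::Reference documentation and the PubMed ID is a placeholder.

```ruby
# Hypothetical helper wrapping the 'RX' branch of #references.
def parse_rx(value)
  hash = {}
  value.split(/\. /).each do |item|
    # "TAG; id." becomes ["TAG", "id"]; a trailing period is dropped
    tag, xref = item.split(/\; /).map { |i| i.strip.sub(/\.\z/, '') }
    hash[tag.downcase] = xref
  end
  hash
end

value = 'DOI; 10.1126/science.1110418. PUBMED; 12345678.'
p parse_rx(value)

# For comparison: splitting on every period, as the pre-1.13 code did,
# cuts the DOI into pieces.
p value.split('.')
```

Splitting on a period followed by a space keeps identifiers that contain periods intact, which matters once DOI entries appear in RX lines.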


From ngoto at dev.open-bio.org  Wed Apr 23 18:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.5,1.24.2.6
Message-ID: <200804231852.m3NIqKW0013081@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
	reference.rb 
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to lib/bio/db/embl/format_embl.rb
    to improve tolerance for various data (e.g. references with no
    record numbers or with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.


Index: reference.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v
retrieving revision 1.24.2.5
retrieving revision 1.24.2.6
diff -C2 -d -r1.24.2.5 -r1.24.2.6
*** reference.rb	4 Mar 2008 11:31:45 -0000	1.24.2.5
--- reference.rb	23 Apr 2008 18:52:18 -0000	1.24.2.6
***************
*** 42,47 ****
    class Reference
  
-     include Bio::Sequence::Format::INSDFeatureHelper
- 
      # Author names in an Array, [ "Hoge, J.P.", "Fuga, F.B." ].
      attr_reader :authors
--- 42,45 ----
***************
*** 70,73 ****
--- 68,74 ----
      # medline identifier (typically Fixnum)
      attr_reader :medline
+ 
+     # DOI identifier (typically String, e.g. "10.1126/science.1110418")
+     attr_reader :doi
      
      # Abstract text in String.
***************
*** 89,92 ****
--- 90,96 ----
      attr_reader :sequence_position
  
+     # Comments for the reference (typically Array of String, or nil)
+     attr_reader :comments
+ 
      # Create a new Bio::Reference object from a Hash of values. 
      # Data is extracted from the values for keys:
***************
*** 126,150 ****
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       hash.default = ''
!       @authors  = hash['authors'] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title    = hash['title']   # "Title of the study."
!       @journal  = hash['journal'] # "Theor. J. Hoge"
!       @volume   = hash['volume']  # 12
!       @issue    = hash['issue']   # 3
!       @pages    = hash['pages']   # 123-145
!       @year     = hash['year']    # 2001
!       @pubmed   = hash['pubmed']  # 12345678
!       @medline  = hash['medline'] # 98765432
!       @abstract = hash['abstract']
        @url      = hash['url']
!       @mesh     = hash['mesh']
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments = hash['comments'] || []
!       @xrefs    = hash['xrefs'] || []
!       @affiliations = hash['affiliations']
!       @authors = [] if @authors.empty?
!       @mesh    = [] if @mesh.empty?
!       @affiliations = [] if @affiliations.empty?
      end
  
--- 130,150 ----
      # *Returns*:: Bio::Reference object
      def initialize(hash)
!       @authors  = hash['authors'] || [] # [ "Hoge, J.P.", "Fuga, F.B." ]
!       @title    = hash['title']   || '' # "Title of the study."
!       @journal  = hash['journal'] || '' # "Theor. J. Hoge"
!       @volume   = hash['volume']  || '' # 12
!       @issue    = hash['issue']   || '' # 3
!       @pages    = hash['pages']   || '' # 123-145
!       @year     = hash['year']    || '' # 2001
!       @pubmed   = hash['pubmed']  || '' # 12345678
!       @medline  = hash['medline'] || '' # 98765432
!       @doi      = hash['doi']
!       @abstract = hash['abstract'] || '' 
        @url      = hash['url']
!       @mesh     = hash['mesh'] || []
        @embl_gb_record_number = hash['embl_gb_record_number'] || nil
        @sequence_position = hash['sequence_position'] || nil
!       @comments  = hash['comments']
!       @affiliations = hash['affiliations'] || []
      end
  
***************
*** 273,298 ****
      #     RL   Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       lines = Array.new
!       if ! @embl_gb_record_number.nil?
!         lines << "RN   [#{@embl_gb_record_number}]"
!       end
!       if @comments != []
!         @comments.each do |c|
!           lines << "RC   #{c}"
!         end
!       end
!       if ! @sequence_position.nil?
!         lines << "RP   #{@sequence_position}"
!       end
!       if ! @xrefs.nil?
!         @xrefs.each do |x|
!           lines << "RX   #{x}"
!         end
!       end
!       lines << wrap(@authors.join(', '), 80, 'RA   ') + ';' unless @authors.nil?
!       lines << (@title == '' ? 'RT   ;' : wrap('"' + @title + '"', 80, 'RT   ') + ';')
!       lines << wrap(@journal, 80, 'RL   ') unless @journal == ''
!       lines << "XX"
!       return lines.join("\n")
      end
  
--- 273,280 ----
      #     RL   Plant Mol. Biol. 17(2):209-219(1991).
      def embl
!       r = self
!       Bio::Sequence::Format::NucFormatter::Embl.new('').instance_eval {
!         reference_format_embl(r)
!       }
      end
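
Note: a minimal sketch of the difference between the two initialize styles in the hunk above. It only illustrates plain Ruby Hash behaviour. The method names old_title and new_title are hypothetical and stand in for a single attribute assignment.

```ruby
# Old style: set a default on the caller's hash, then read the key.
def old_title(hash)
  hash.default = ''   # side effect: the caller's hash is modified
  hash['title']
end

# New style: leave the hash alone and fall back with ||.
def new_title(hash)
  hash['title'] || ''
end

h1 = {}
p old_title(h1)    # empty string
p h1['no_such']    # also an empty string now, because the default leaked

h2 = {}
p new_title(h2)    # empty string
p h2['no_such']    # still nil
```

With the || form, attributes such as doi and comments can also stay nil when absent, which the formatter checks for with to_s.empty? or a plain truth test.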
  

From ngoto at dev.open-bio.org  Wed Apr 23 18:52:20 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Wed, 23 Apr 2008 18:52:20 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.2,
	1.1.2.3 common.rb, 1.12.2.3, 1.12.2.4
Message-ID: <200804231852.m3NIqKHG013084@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv13059/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	format_embl.rb common.rb 
Log Message:
* lib/bio/reference.rb
  * New methods: Bio::Reference#comments, Bio::Reference#doi
  * Code of Bio::Reference#embl is moved to lib/bio/db/embl/format_embl.rb
    to improve tolerance for various data (e.g. references with no
    record numbers or with duplicated record numbers).
* lib/bio/db/embl/common.rb
  * Changes to support Bio::Reference#comments.
* lib/bio/db/embl/format_embl.rb
  * Bio::Sequence::Format::NucFormatter::Embl#reference_format_embl
    (private method) is added based on Bio::Reference#embl.
  * Changes to improve tolerance for various data.


Index: format_embl.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -C2 -d -r1.1.2.2 -r1.1.2.3
*** format_embl.rb	27 Mar 2008 13:38:31 -0000	1.1.2.2
--- format_embl.rb	23 Apr 2008 18:52:18 -0000	1.1.2.3
***************
*** 25,28 ****
--- 25,76 ----
      end
  
+     # format reference
+     # ref:: Bio::Reference object
+     # hash:: (optional) a hash for RN (reference number) administration
+     def reference_format_embl(ref, hash = nil)
+       lines = Array.new
+       if ref.embl_gb_record_number or hash then
+         refno = ref.embl_gb_record_number.to_i
+         hash ||= {}
+         if refno <= 0 or hash[refno] then
+           refno = hash.keys.sort[-1].to_i + 1
+           hash[refno] = true
+         end
+         lines << embl_wrap("RN   ", "[#{refno}]")
+       end
+       if ref.comments then
+         ref.comments.each do |cmnt|
+           lines << embl_wrap("RC   ", cmnt)
+         end
+       end
+       unless ref.sequence_position.to_s.empty? then
+         lines << embl_wrap("RP   ",   "#{ref.sequence_position}")
+       end
+       unless ref.doi.to_s.empty? then
+         lines << embl_wrap("RX   ",   "DOI; #{ref.doi}.")
+       end
+       unless ref.pubmed.to_s.empty? then
+         lines << embl_wrap("RX   ",   "PUBMED; #{ref.pubmed}.")
+       end
+       unless ref.authors.empty?
+         lines << embl_wrap('RA   ', ref.authors.join(', ') + ';')
+       end
+       lines << embl_wrap('RT   ',
+                          (ref.title.to_s.empty? ? '' :
+                           "\"#{ref.title}\"") + ';')
+       unless ref.journal.to_s.empty? then
+         volissue = "#{ref.volume.to_s}"
+         volissue = "#{volissue}(#{ref.issue})" unless ref.issue.to_s.empty? 
+         rl = "#{ref.journal}"
+         rl += " #{volissue}" unless volissue.empty? 
+         rl += ":#{ref.pages}" unless ref.pages.to_s.empty?
+         rl += "(#{ref.year})" unless ref.year.to_s.empty?
+         rl += '.'
+         lines << embl_wrap('RL   ', rl)
+       end
+       lines << "XX"
+       return lines.join("\n")
+     end
+ 
      def seq_format_embl(seq)
        output_lines = Array.new
***************
*** 43,64 ****
      erb_template <<'__END_OF_TEMPLATE__'
  ID   <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX
  <%= embl_wrap('AC   ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX
  DT   <%= date_created %>
  DT   <%= date_modified %>
! XX
  <%= embl_wrap('DE   ', definition) %>
! XX
  <%= embl_wrap('KW   ', keywords.join('; ') + '.') %>
! XX
  OS   <%= species %>
  <%= embl_wrap('OC   ', classification.join('; ') + '.') %>
  XX   
! <%= (references || []).collect{|ref| ref.format('embl')}.join("\n") %>
! XX
! FH   Key             Location/Qualifiers
! FH
! <%= format_features_embl(features || []) %>XX
  SQ   Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>
--- 91,111 ----
      erb_template <<'__END_OF_TEMPLATE__'
  ID   <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP.
! XX   
  <%= embl_wrap('AC   ', accessions.reject{|a| a.nil?}.join('; ') + ';') %>
! XX   
  DT   <%= date_created %>
  DT   <%= date_modified %>
! XX   
  <%= embl_wrap('DE   ', definition) %>
! XX   
  <%= embl_wrap('KW   ', keywords.join('; ') + '.') %>
! XX   
  OS   <%= species %>
  <%= embl_wrap('OC   ', classification.join('; ') + '.') %>
  XX   
! <% hash = {}; (references || []).each do |ref| %><%= reference_format_embl(ref, hash) %>
! <% end %>FH   Key             Location/Qualifiers
! FH   
! <%= format_features_embl(features || []) %>XX   
  SQ   Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %>
  <%= seq_format_embl(seq) %>

Index: common.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/common.rb,v
retrieving revision 1.12.2.3
retrieving revision 1.12.2.4
diff -C2 -d -r1.12.2.3 -r1.12.2.4
*** common.rb	23 Apr 2008 18:04:51 -0000	1.12.2.3
--- common.rb	23 Apr 2008 18:52:18 -0000	1.12.2.4
***************
*** 280,284 ****
              end
            when 'RC'
!             hash['comment'] = value
            when 'RP'
              hash['sequence_position'] = value
--- 280,287 ----
              end
            when 'RC'
!             unless value.to_s.strip.empty?
!               hash['comments'] ||= []
!               hash['comments'].push value
!             end
            when 'RP'
              hash['sequence_position'] = value
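
Note: a minimal sketch of how reference_format_embl above assembles the text of the RL line, without the embl_wrap line wrapping. The steps follow the diff. The method name rl_text and the Struct standing in for Bio::Reference are assumptions for illustration.

```ruby
# Hypothetical extraction of the RL text assembly; returns nil when there
# is no journal, mirroring the "unless ref.journal.to_s.empty?" guard.
def rl_text(ref)
  return nil if ref.journal.to_s.empty?
  volissue = ref.volume.to_s
  volissue = "#{volissue}(#{ref.issue})" unless ref.issue.to_s.empty?
  rl = ref.journal.to_s.dup
  rl += " #{volissue}" unless volissue.empty?
  rl += ":#{ref.pages}" unless ref.pages.to_s.empty?
  rl += "(#{ref.year})" unless ref.year.to_s.empty?
  rl + '.'
end

# Stand-in for a Bio::Reference object (assumption, not the real class).
ref_class = Struct.new(:journal, :volume, :issue, :pages, :year)

ref = ref_class.new('Plant Mol. Biol.', '17', '2', '209-219', '1991')
puts "RL   #{rl_text(ref)}"
```

The output has the compact "journal vol(issue):pages(year)." layout, which is the same layout the revised RL pattern in common.rb parses back into separate fields.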


From ngoto at dev.open-bio.org  Thu Apr 24 13:49:44 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 13:49:44 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl sptr.rb,1.36,1.36.2.1
Message-ID: <200804241349.m3ODni9x015583@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/embl
In directory dev.open-bio.org:/tmp/cvs-serv15545/lib/bio/db/embl

Modified Files:
      Tag: BRANCH-biohackathon2008
	sptr.rb 
Log Message:
The same change (except for a comment) as 1.36 => 1.37 in CVS HEAD is made
(bug fix: Bio::SPTR#references raises NoMethodError since 
lib/bio/db/embl/sptr.rb version 1.34).


Index: sptr.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/sptr.rb,v
retrieving revision 1.36
retrieving revision 1.36.2.1
diff -C2 -d -r1.36 -r1.36.2.1
*** sptr.rb	5 Apr 2007 23:35:40 -0000	1.36
--- sptr.rb	24 Apr 2008 13:49:42 -0000	1.36.2.1
***************
*** 506,514 ****
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, MEDLINE
!             value.split('.').each {|item|
!               tag, xref = item.split(/; /).map {|i| i.strip }
                hash[ tag.downcase ]  = xref
!             }
            end
          }
--- 506,513 ----
                hash['journal'] = value
              end
!           when 'RX'  # PUBMED, MEDLINE, DOI
!             value.each do |tag, xref|
                hash[ tag.downcase ]  = xref
!             end
            end
          }


From ngoto at dev.open-bio.org  Thu Apr 24 14:28:27 2008
From: ngoto at dev.open-bio.org (Naohisa Goto)
Date: Thu, 24 Apr 2008 14:28:27 +0000
Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.10,0.58.2.11
Message-ID: <200804241428.m3OESRaY016145@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio
In directory dev.open-bio.org:/tmp/cvs-serv15857/lib/bio

Modified Files:
      Tag: BRANCH-biohackathon2008
	sequence.rb 
Log Message:
* Bio::Sequence.read is renamed to Bio::Sequence.input because this method is
  the counterpart of Bio::Sequence#output. Bio::Sequence.read still exists as
  an alias of Bio::Sequence.input.
* Added documentation for Bio::Sequence#accessions, and fixed it so that the
  returned array does not contain nil.


Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v
retrieving revision 0.58.2.10
retrieving revision 0.58.2.11
diff -C2 -d -r0.58.2.10 -r0.58.2.11
*** sequence.rb	27 Mar 2008 13:38:31 -0000	0.58.2.10
--- sequence.rb	24 Apr 2008 14:28:25 -0000	0.58.2.11
***************
*** 369,373 ****
    # (GenBank, EMBL, fasta format, etc.)
    #
!   #   s = Bio::Sequence.read(str)
    # ---
    # *Arguments*:
--- 369,373 ----
    # (GenBank, EMBL, fasta format, etc.)
    #
!   #   s = Bio::Sequence.input(str)
    # ---
    # *Arguments*:
***************
*** 375,379 ****
    # * (optional) _format_: format specification (class or nil)
    # *Returns*:: Bio::Sequence object
!   def self.read(str, format = nil)
      if format then
        klass = format
--- 375,379 ----
    # * (optional) _format_: format specification (class or nil)
    # *Returns*:: Bio::Sequence object
!   def self.input(str, format = nil)
      if format then
        klass = format
***************
*** 384,391 ****
      obj.to_biosequence
    end
!   
!   
    def accessions
!     return [@primary_accession, @secondary_accessions].flatten
    end
  
--- 384,398 ----
      obj.to_biosequence
    end
! 
!   # alias of Bio::Sequence.input
!   def self.read(str, format = nil)
!     input(str, format)
!   end
! 
!   # accession numbers of the sequence
!   #
!   # *Returns*:: Array of String
    def accessions
!     [ @primary_accession, @secondary_accessions ].flatten.compact
    end
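
Note: a minimal sketch of the accessions fix above, using plain arrays. The two method names and the accession strings are made up for illustration. Only the flatten and compact calls come from the diff.

```ruby
# Behaviour before the change: nil entries survive flatten.
def accessions_old(primary, secondary)
  [primary, secondary].flatten
end

# Behaviour after the change: compact removes nil entries.
def accessions_new(primary, secondary)
  [primary, secondary].flatten.compact
end

p accessions_old(nil, ['AB000001', 'AB000002'])   # nil stays in the array
p accessions_new(nil, ['AB000001', 'AB000002'])   # only the strings remain
p accessions_new(nil, nil)                        # empty array
```

This also matters for the EMBL template, whose AC line joins the accessions with "; ".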
  

From helios at dev.open-bio.org  Mon Apr  7 13:15:46 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:15:46 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io sql.rb,1.8.2.1,1.8.2.2
Message-ID: <200804071315.m37DFcHI005486@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io
In directory dev.open-bio.org:/tmp/cvs-serv5466/lib/bio/io

Modified Files:
      Tag: BRANCH-biohackathon2008
	sql.rb 
Log Message:
added "hostname" to valid_keys configurations

Index: sql.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/sql.rb,v
retrieving revision 1.8.2.1
retrieving revision 1.8.2.2
diff -C2 -d -r1.8.2.1 -r1.8.2.2
*** sql.rb	25 Mar 2008 15:46:32 -0000	1.8.2.1
--- sql.rb	7 Apr 2008 13:15:36 -0000	1.8.2.2
***************
*** 25,29 ****
        #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil}
        configurations.assert_valid_keys('development', 'production','test')
!       configurations[env].assert_valid_keys('database','adapter','username','password')
        DummyBase.configurations = configurations
        DummyBase.establish_connection "#{env}"
--- 25,29 ----
        #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil}
        configurations.assert_valid_keys('development', 'production','test')
!       configurations[env].assert_valid_keys('hostname','database','adapter','username','password')
        DummyBase.configurations = configurations
        DummyBase.establish_connection "#{env}"
***************
*** 43,46 ****
--- 43,50 ----
      end
      
+     def self.exists_database(name)
+       Bio::SQL::Biodatabase.find_by_name(name).nil? ? false : true
+     end
+     
      def self.list_entries
        Bio::SQL::Bioentry.find(:all).collect{|entry|
***************
*** 117,121 ****
    pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
    #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result)
!   
    if nil
      pp Bio::SQL.list_entries
--- 121,125 ----
    pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
    #pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result)
!   pp Bio::SQL.list_entries
    if nil
      pp Bio::SQL.list_entries


From helios at dev.open-bio.org  Mon Apr  7 13:17:55 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:17:55 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql bioentry.rb, 1.1.2.1,
	1.1.2.2
Message-ID: <200804071317.m37DHmQk005551@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5531/lib/bio/io/biosql

Modified Files:
      Tag: BRANCH-biohackathon2008
	bioentry.rb 
Log Message:
Corrected the table name "term" in the conditions used to get the cdsfeatures "shortcut" associated with the entry.

Index: bioentry.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/io/biosql/Attic/bioentry.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** bioentry.rb	25 Mar 2008 15:46:32 -0000	1.1.2.1
--- bioentry.rb	7 Apr 2008 13:17:46 -0000	1.1.2.2
***************
*** 13,17 ****
  				has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id" # I'm not really convinced by this; I think it doesn't work correctly
  
! 				has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["terms.name='CDS'"], :include=>"typeterm"
          
          has_many :terms, :through=>:bioentry_qualifier_values
--- 13,17 ----
  				has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id" # I'm not really convinced by this; I think it doesn't work correctly
  
! 				has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["term.name='CDS'"], :include=>"type_term"
          
          has_many :terms, :through=>:bioentry_qualifier_values


From helios at dev.open-bio.org  Mon Apr  7 13:18:19 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Mon, 07 Apr 2008 13:18:19 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql sequence.rb, 1.1.2.1,
	1.1.2.2
Message-ID: <200804071318.m37DIDFm005598@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql
In directory dev.open-bio.org:/tmp/cvs-serv5578/lib/bio/db/biosql

Modified Files:
      Tag: BRANCH-biohackathon2008
	sequence.rb 
Log Message:
Use the GenBank test file; the FASTA one is not working.

Index: sequence.rb
===================================================================
RCS file: /home/repository/bioruby/bioruby/lib/bio/db/biosql/Attic/sequence.rb,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** sequence.rb	25 Mar 2008 15:46:32 -0000	1.1.2.1
--- sequence.rb	7 Apr 2008 13:18:11 -0000	1.1.2.2
***************
*** 674,679 ****
    
    #  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
!   #  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
!   parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')
    
    parser.each do |entry|
--- 674,679 ----
    
    #  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
!   parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
!   #parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')
    
    parser.each do |entry|
***************
*** 686,689 ****
--- 686,690 ----
        #      pp "Sequence"
        puts result.to_biosequence.output(:genbank) #:embl
+       result.delete
      end   
    end