From ngoto at dev.open-bio.org Mon Mar 3 13:30:53 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 03 Mar 2008 18:30:53 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank common.rb, 1.11.2.1, 1.11.2.2 genbank.rb, 0.40.2.1, 0.40.2.2 Message-ID: <200803031830.m23IUrSe005148@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv5128/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 common.rb genbank.rb Log Message: * lib/bio/db/genbank/common.rb * accessions method was broken * fixed a bug about embl_gb_record_number and sequence_position in references * lib/bio/db/genbank/genbank.rb * fixed some mistaken variable names in to_biosequence() Index: genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/genbank.rb,v retrieving revision 0.40.2.1 retrieving revision 0.40.2.2 diff -C2 -d -r0.40.2.1 -r0.40.2.2 *** genbank.rb 14 Feb 2008 08:51:45 -0000 0.40.2.1 --- genbank.rb 3 Mar 2008 18:30:50 -0000 0.40.2.2 *************** *** 142,154 **** sequence.sequence_version = self.version ! seq.date_created = nil #???? sequence.date_modified = self.date sequence.keywords = self.keywords sequence.species = self.organism ! sequence.classification = self.taxonomy ! sequence.organnella = nil # not used sequence.comments = self.comment sequence.references = self.references return sequence end --- 142,155 ---- sequence.sequence_version = self.version ! #sequence.date_created = nil #???? sequence.date_modified = self.date sequence.keywords = self.keywords sequence.species = self.organism ! sequence.classification = self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/) ! 
#sequence.organnella = nil # not used sequence.comments = self.comment sequence.references = self.references + sequence.features = self.features return sequence end Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/common.rb,v retrieving revision 1.11.2.1 retrieving revision 1.11.2.2 diff -C2 -d -r1.11.2.1 -r1.11.2.2 *** common.rb 28 Feb 2008 05:54:51 -0000 1.11.2.1 --- common.rb 3 Mar 2008 18:30:50 -0000 1.11.2.2 *************** *** 45,49 **** # ACCESSION -- Returns contents of the ACCESSION record as an Array. def accessions ! accession.split(/\s+/) end --- 45,49 ---- # ACCESSION -- Returns contents of the ACCESSION record as an Array. def accessions ! field_fetch('ACCESSION').strip.split(/\s+/) end *************** *** 141,148 **** subtag2array(ref).each do |field| case tag_get(field) ! when /^\s*REFERENCE\s+(\d+)(\s+\(bases\s+(\d+)\s+to\s+(\d+)\))?/ ! hash['embl_gb_record_number'] = $1.to_i ! if $2 then ! hash['sequence_position'] = "#{$3}-#{$4}" end when /AUTHORS/ --- 141,154 ---- subtag2array(ref).each do |field| case tag_get(field) ! when /REFERENCE/ ! if /(\d+)(\s*\((.+)\))?/m =~ tag_cut(field) then ! hash['embl_gb_record_number'] = $1.to_i ! if $3 and $3 != 'sites' then ! seqpos = $3 ! seqpos.sub!(/\A\s*bases\s+/, '') ! seqpos.gsub!(/(\d+)\s+to\s+(\d+)/, "\\1-\\2") ! seqpos.gsub!(/\s*\;\s*/, ', ') ! hash['sequence_position'] = seqpos ! 
end end when /AUTHORS/ From ngoto at dev.open-bio.org Tue Mar 4 04:22:38 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 09:22:38 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank genbank.rb, 0.40.2.2, 0.40.2.3 Message-ID: <200803040922.m249McN4007026@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7006/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 genbank.rb Log Message: in to_biosequence(), conversion of definition was missing. Index: genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/genbank.rb,v retrieving revision 0.40.2.2 retrieving revision 0.40.2.3 diff -C2 -d -r0.40.2.2 -r0.40.2.3 *** genbank.rb 3 Mar 2008 18:30:50 -0000 0.40.2.2 --- genbank.rb 4 Mar 2008 09:22:35 -0000 0.40.2.3 *************** *** 145,148 **** --- 145,149 ---- sequence.date_modified = self.date + sequence.definition = self.definition sequence.keywords = self.keywords sequence.species = self.organism From ngoto at dev.open-bio.org Tue Mar 4 04:46:12 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 09:46:12 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat - New directory Message-ID: <200803040946.m249kCjw007182@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory dev.open-bio.org:/tmp/cvs-serv7162/lib/bio/compat Log Message: Directory /home/repository/bioruby/bioruby/lib/bio/compat added to the repository --> Using per-directory sticky tag `BRANCH-biohackathon2008' From ngoto at dev.open-bio.org Tue Mar 4 05:07:51 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:07:51 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat references.rb,NONE,1.1.2.1 Message-ID: <200803041007.m24A7p8X007317@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory 
dev.open-bio.org:/tmp/cvs-serv7295/lib/bio/compat Added Files: Tag: BRANCH-biohackathon2008 references.rb Log Message: Bio::References and backward-compatibility module (renamed to Bio::References::BackwardCompatibility) is moved to lib/bio/compat/references.rb --- NEW FILE: references.rb --- # # = bio/compat/references.rb - Obsoleted References class # # Copyright:: Copyright (C) 2008 # Toshiaki Katayama , # Ryan Raaum , # Jan Aerts , # Naohisa Goto # License:: The Ruby License # # $Id: references.rb,v 1.1.2.1 2008/03/04 10:07:49 ngoto Exp $ # # == Description # # The Bio::References class was obsoleted after BioRuby 1.2.1. # To keep compatibility, some wrapper methods are provided in this file. # As the compatibility methods (and Bio::References) will soon be removed, # Please change your code not to use Bio::References. # # Note that Bio::Reference is different from Bio::References. # Bio::Reference still exists for storing a reference information # in sequence entries. module Bio # = DESCRIPTION # # This class is OBSOLETED, and will soon be removed. # Instead of this class, an array is to be used. # # # A container class for Bio::Reference objects. # # = USAGE # # This class should NOT be used. # # refs = Bio::References.new # refs.append(Bio::Reference.new(hash)) # refs.each do |reference| # ... # end # class References # module to keep backward compatibility with obsoleted Bio::References module BackwardCompatibility #:nodoc: # Backward compatibility with Bio::References#references. # Now, references are stored in an array, and # you should change your code not to use this method. def references warn 'Bio::References is obsoleted. Now, references are stored in an array.' self end # Backward compatibility with Bio::References#append. # Now, references are stored in an array, and # you should change your code not to use this method. def append(reference) warn 'Bio::References is obsoleted. Now, references are stored in an array.' 
self.push(reference) if reference.is_a? Reference self end end #module BackwardCompatibility # This method should not be used. # Only for backward compatibility of existing code. # # Since Bio::References is obsoleted, # Bio::References.new not returns Bio::References object, # but modifies given _ary_ and returns the _ary_. # # *Arguments*: # * (optional) __: Array of Bio::Reference objects # *Returns*:: the given array def self.new(ary = []) warn 'Bio::References is obsoleted. Some methods are added to given array to keep backward compatibility.' ary.extend(BackwardCompatibility) ary end # Array of Bio::Reference objects attr_accessor :references # Normally, users can not call this method. # # Create a new Bio::References object # # refs = Bio::References.new # --- # *Arguments*: # * (optional) __: Array of Bio::Reference objects # *Returns*:: Bio::References object def initialize(ary = []) @references = ary end # Add a Bio::Reference object to the container. # # refs.append(reference) # --- # *Arguments*: # * (required) _reference_: Bio::Reference object # *Returns*:: current Bio::References object def append(reference) @references.push(reference) if reference.is_a? Reference return self end # Iterate through Bio::Reference objects. # # refs.each do |reference| # ... 
# end # --- # *Block*:: yields each Bio::Reference object def each @references.each do |reference| yield reference end end end #class References end #module Bio From ngoto at dev.open-bio.org Tue Mar 4 05:07:51 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:07:51 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.3,1.24.2.4 Message-ID: <200803041007.m24A7pT0007322@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv7295/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 reference.rb Log Message: Bio::References and backward-compatibility module (renamed to Bio::References::BackwardCompatibility) is moved to lib/bio/compat/references.rb Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.24.2.3 retrieving revision 1.24.2.4 diff -C2 -d -r1.24.2.3 -r1.24.2.4 *** reference.rb 28 Feb 2008 05:51:03 -0000 1.24.2.3 --- reference.rb 4 Mar 2008 10:07:49 -0000 1.24.2.4 *************** *** 578,680 **** end - # = DESCRIPTION - # - # This class is OBSOLETED, and will soon be removed. - # Instead of this class, an array is to be used. - # - # - # A container class for Bio::Reference objects. - # - # = USAGE - # - # This class should NOT be used. - # - # refs = Bio::References.new - # refs.append(Bio::Reference.new(hash)) - # refs.each do |reference| - # ... - # end - # - class References - - # module to keep backward compatibility with obsoleted Bio::References - module BackwardCompatibilityForBioReferences #:nodoc: - - # Backward compatibility with Bio::References#references. - # Now, references are stored in an array, and - # you should change your code not to use this method. - def references - warn 'Bio::References is obsoleted. Now, references are stored in an array.' - self - end - - # Backward compatibility with Bio::References#append. 
- # Now, references are stored in an array, and - # you should change your code not to use this method. - def append(reference) - warn 'Bio::References is obsoleted. Now, references are stored in an array.' - self.push(reference) if reference.is_a? Reference - self - end - end #module BackwardCompatibilityForBioReferences - - # This method should not be used. - # Only for backward compatibility of existing code. - # - # Since Bio::References is obsoleted, - # Bio::References.new not returns Bio::References object, - # but modifies given _ary_ and returns the _ary_. - # - # *Arguments*: - # * (optional) __: Array of Bio::Reference objects - # *Returns*:: the given array - def self.new(ary = []) - warn 'Bio::References is obsoleted. Some methods are added to given array to keep backward compatibility.' - ary.extend(BackwardCompatibilityForBioReferences) - ary - end - - # Array of Bio::Reference objects - attr_accessor :references - - # Create a new Bio::References object - # - # refs = Bio::References.new - # --- - # *Arguments*: - # * (optional) __: Array of Bio::Reference objects - # *Returns*:: Bio::References object - def initialize(ary = []) - @references = ary - end - - - # Add a Bio::Reference object to the container. - # - # refs.append(reference) - # --- - # *Arguments*: - # * (required) _reference_: Bio::Reference object - # *Returns*:: current Bio::References object - def append(reference) - @references.push(reference) if reference.is_a? Reference - return self - end - - # Iterate through Bio::Reference objects. - # - # refs.each do |reference| - # ... 
- # end - # --- - # *Block*:: yields each Bio::Reference object - def each - @references.each do |reference| - yield reference - end - end - - end - end --- 578,581 ---- From ngoto at dev.open-bio.org Tue Mar 4 05:12:24 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:12:24 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat features.rb,NONE,1.1.2.1 Message-ID: <200803041012.m24ACObW007373@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory dev.open-bio.org:/tmp/cvs-serv7351/lib/bio/compat Added Files: Tag: BRANCH-biohackathon2008 features.rb Log Message: Bio::Features is moved to lib/bio/compat/features.rb, and a module to keep backward compatibility (Bio::Features::BackwardCompatibility) is added. --- NEW FILE: features.rb --- # # = bio/compat/features.rb - Obsoleted Features class # # Copyright:: Copyright (c) 2002, 2005 Toshiaki Katayama # 2006 Jan Aerts # 2008 Naohisa Goto # License:: The Ruby License # # $Id: features.rb,v 1.1.2.1 2008/03/04 10:12:22 ngoto Exp $ # # == Description # # The Bio::Features class was obsoleted after BioRuby 1.2.1. # To keep compatibility, some wrapper methods are provided in this file. # As the compatibility methods (and Bio::Features) will soon be removed, # Please change your code not to use Bio::Features. # # Note that Bio::Feature is different from the Bio::Features. # Bio::Feature still exists to store DDBJ/GenBank/EMBL feature information. require 'bio/location' module Bio # = DESCRIPTION # # This class is OBSOLETED, and will soon be removed. # Instead of this class, an array is to be used. # # # Container for a list of Feature objects. 
# # = USAGE # # First, create some Bio::Feature objects # feature1 = Bio::Feature.new('intron','3627..4059') # feature2 = Bio::Feature.new('exon','4060..4236') # feature3 = Bio::Feature.new('intron','4237..4426') # feature4 = Bio::Feature.new('CDS','join(2538..3626,4060..4236)', # [ Bio::Feature::Qualifier.new('gene', 'CYP2D6'), # Bio::Feature::Qualifier.new('translation','MGXXTVMHLL...') # ]) # # # And create a container for them # feature_container = Bio::Features.new([ feature1, feature2, feature3, feature4 ]) # # # Iterate over all features and print # feature_container.each do |feature| # puts feature.feature + "\t" + feature.position # feature.each do |qualifier| # puts "- " + qualifier.qualifier + ": " + qualifier.value # end # end # # # Iterate only over CDS features and extract translated amino acid sequences # features.each("CDS") do |feature| # hash = feature.to_hash # name = hash["gene"] || hash["product"] || hash["note"] # aaseq = hash["translation"] # pos = feature.position # if name and seq # puts ">#{gene} #{feature.position}" # puts aaseq # end # end class Features # module to keep backward compatibility with obsoleted Bio::Features module BackwardCompatibility #:nodoc: # Backward compatibility with Bio::Features#features. # Now, features are stored in an array, and # you should change your code not to use this method. def features warn 'Bio::Features is obsoleted. Now, features are stored in an array.' self end # Backward compatibility with Bio::Features#append. # Now, references are stored in an array, and # you should change your code not to use this method. def append(feature) warn 'Bio::Features is obsoleted. Now, features are stored in an array.' self.push(feature) if feature.is_a? Feature self end end #module BackwardCompatibility # This method should not be used. # Only for backward compatibility of existing code. 
# # Since Bio::Features is obsoleted, # Bio::Features.new not returns Bio::Features object, # but modifies given _ary_ and returns the _ary_. # # *Arguments*: # * (optional) __: Array of Bio::Feature objects # *Returns*:: the given array def self.new(ary = []) warn 'Bio::Feature is obsoleted. Some methods are added to given array to keep backward compatibility.' ary.extend(BackwardCompatibility) ary end # Normally, users can not call this method. # # Create a new Bio::Features object. # # *Arguments*: # * (optional) _list of features_: list of Bio::Feature objects # *Returns*:: Bio::Features object def initialize(ary = []) @features = ary end # Returns an Array of Feature objects. attr_accessor :features # Appends a Feature object to Features. # # *Arguments*: # * (required) _feature_: Bio::Feature object # *Returns*:: Bio::Features object def append(a) @features.push(a) if a.is_a? Feature return self end # Iterates on each feature object. # # *Arguments*: # * (optional) _key_: if specified, only iterates over features with this key def each(arg = nil) @features.each do |x| next if arg and x.feature != arg yield x end end # Short cut for the Features#features[n] def [](*arg) @features[*arg] end # Short cut for the Features#features.first def first @features.first end # Short cut for the Features#features.last def last @features.last end end # Features end # Bio From ngoto at dev.open-bio.org Tue Mar 4 05:12:24 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:12:24 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio feature.rb,1.13,1.13.2.1 Message-ID: <200803041012.m24ACO7D007378@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv7351/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 feature.rb Log Message: Bio::Features is moved to lib/bio/compat/features.rb, and a module to keep backward compatibility (Bio::Features::BackwardCompatibility) is added. 
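The log message above describes keeping old container-style calls working by extending a plain Array with a compatibility module. A minimal sketch of that idea follows. It does not use BioRuby itself: `Feature`, `FeaturesCompat` and `compat_features` are hypothetical stand-ins invented for illustration, and only the general technique (`Object#extend` on an Array instance) is taken from the commit.

```ruby
# Hypothetical stand-in for Bio::Feature; not the BioRuby class.
Feature = Struct.new(:feature, :position)

# Hypothetical stand-in for a BackwardCompatibility module.
# It adds the old container methods to a single Array instance.
module FeaturesCompat
  # Old accessor: the array is now the container, so return self.
  def features
    warn 'Features container is obsolete; use the array directly.'
    self
  end

  # Old append: push only Feature objects, return self for chaining.
  def append(feature)
    warn 'Features container is obsolete; use Array#push instead.'
    push(feature) if feature.is_a?(Feature)
    self
  end
end

# Extends the given array in place and returns that same array.
def compat_features(ary = [])
  ary.extend(FeaturesCompat)
end

list = compat_features([Feature.new('exon', '4060..4236')])
list.append(Feature.new('CDS', '1..99')).append('not a feature')
puts list.features.size # 2 (the String was ignored)
puts list.is_a?(Array)  # true
```

Because only the one instance is extended, other arrays in the program are unaffected, and new code can treat the result as an ordinary Array while old code still finds `features` and `append`.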
Index: feature.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/feature.rb,v retrieving revision 1.13 retrieving revision 1.13.2.1 diff -C2 -d -r1.13 -r1.13.2.1 *** feature.rb 5 Apr 2007 23:35:39 -0000 1.13 --- feature.rb 4 Mar 2008 10:12:22 -0000 1.13.2.1 *************** *** 136,226 **** end #Feature - - # = DESCRIPTION - # Container for a list of Feature objects. - # - # = USAGE - # # First, create some Bio::Feature objects - # feature1 = Bio::Feature.new('intron','3627..4059') - # feature2 = Bio::Feature.new('exon','4060..4236') - # feature3 = Bio::Feature.new('intron','4237..4426') - # feature4 = Bio::Feature.new('CDS','join(2538..3626,4060..4236)', - # [ Bio::Feature::Qualifier.new('gene', 'CYP2D6'), - # Bio::Feature::Qualifier.new('translation','MGXXTVMHLL...') - # ]) - # - # # And create a container for them - # feature_container = Bio::Features.new([ feature1, feature2, feature3, feature4 ]) - # - # # Iterate over all features and print - # feature_container.each do |feature| - # puts feature.feature + "\t" + feature.position - # feature.each do |qualifier| - # puts "- " + qualifier.qualifier + ": " + qualifier.value - # end - # end - # - # # Iterate only over CDS features and extract translated amino acid sequences - # features.each("CDS") do |feature| - # hash = feature.to_hash - # name = hash["gene"] || hash["product"] || hash["note"] - # aaseq = hash["translation"] - # pos = feature.position - # if name and seq - # puts ">#{gene} #{feature.position}" - # puts aaseq - # end - # end - class Features - # Create a new Bio::Features object. - # - # *Arguments*: - # * (optional) _list of features_: list of Bio::Feature objects - # *Returns*:: Bio::Features object - def initialize(ary = []) - @features = ary - end - - # Returns an Array of Feature objects. - attr_accessor :features - - # Appends a Feature object to Features. 
- # - # *Arguments*: - # * (required) _feature_: Bio::Feature object - # *Returns*:: Bio::Features object - def append(a) - @features.push(a) if a.is_a? Feature - return self - end - - # Iterates on each feature object. - # - # *Arguments*: - # * (optional) _key_: if specified, only iterates over features with this key - def each(arg = nil) - @features.each do |x| - next if arg and x.feature != arg - yield x - end - end - - # Short cut for the Features#features[n] - def [](*arg) - @features[*arg] - end - - # Short cut for the Features#features.first - def first - @features.first - end - - # Short cut for the Features#features.last - def last - @features.last - end - - end # Features - end # Bio --- 136,139 ---- From ngoto at dev.open-bio.org Tue Mar 4 05:32:57 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:32:57 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank common.rb, 1.11.2.2, 1.11.2.3 Message-ID: <200803041032.m24AWvnU007490@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7470/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: Changed not to use Bio::References and Bio::Features. To keep backward compatibility, BackwardCompatibility modules is used to extend an array. Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/common.rb,v retrieving revision 1.11.2.2 retrieving revision 1.11.2.3 diff -C2 -d -r1.11.2.2 -r1.11.2.3 *** common.rb 3 Mar 2008 18:30:50 -0000 1.11.2.2 --- common.rb 4 Mar 2008 10:32:55 -0000 1.11.2.3 *************** *** 179,183 **** ary.push(Reference.new(hash)) end ! @data['REFERENCE'] = References.new(ary) end if block_given? --- 179,183 ---- ary.push(Reference.new(hash)) end ! @data['REFERENCE'] = ary.extend(Bio::References::BackwardCompatibility) end if block_given? *************** *** 197,202 **** ! 
# FEATURES -- Returns contents of the FEATURES record as a Bio::Features ! # object. def features unless @data['FEATURES'] --- 197,202 ---- ! # FEATURES -- Returns contents of the FEATURES record as an array of ! # Bio::Feature objects. def features unless @data['FEATURES'] *************** *** 240,244 **** end ! @data['FEATURES'] = Features.new(ary) end if block_given? --- 240,244 ---- end ! @data['FEATURES'] = ary.extend(Bio::Features::BackwardCompatibility) end if block_given? From ngoto at dev.open-bio.org Tue Mar 4 05:56:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:56:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.2,1.29.2.3 Message-ID: <200803041056.m24AuiM8007583@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7563/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: In Bio::EMBL#ft(), added "extend Bio::Features::BackwardCompatibility" to keep backward compatibility. Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.2 retrieving revision 1.29.2.3 diff -C2 -d -r1.29.2.2 -r1.29.2.3 *** embl.rb 20 Feb 2008 09:56:22 -0000 1.29.2.2 --- embl.rb 4 Mar 2008 10:56:42 -0000 1.29.2.3 *************** *** 257,261 **** def ft unless @data['FT'] ! @data['FT'] = Array.new in_quote = false @orig['FT'].each_line do |line| --- 257,261 ---- def ft unless @data['FT'] ! ary = Array.new in_quote = false @orig['FT'].each_line do |line| *************** *** 265,271 **** body = line[20,60].chomp # feature value (position, /qualifier=) if line =~ /^FT {3}(\S+)/ ! @data['FT'].push([ $1, body ]) # [ feature, position, /q="data", ... ] elsif body =~ /^ \// and not in_quote ! 
@data['FT'].last.push(body) # /q="data..., /q=data, /q if body =~ /=" / and body !~ /"$/ --- 265,271 ---- body = line[20,60].chomp # feature value (position, /qualifier=) if line =~ /^FT {3}(\S+)/ ! ary.push([ $1, body ]) # [ feature, position, /q="data", ... ] elsif body =~ /^ \// and not in_quote ! ary.last.push(body) # /q="data..., /q=data, /q if body =~ /=" / and body !~ /"$/ *************** *** 274,278 **** else ! @data['FT'].last.last << body # ...data..., ...data..." if body =~ /"$/ --- 274,278 ---- else ! ary.last.last << body # ...data..., ...data..." if body =~ /"$/ *************** *** 282,289 **** end ! @data['FT'].map! do |subary| parse_qualifiers(subary) end end if block_given? --- 282,290 ---- end ! ary.map! do |subary| parse_qualifiers(subary) end + @data['FT'] = ary.extend(Bio::Features::BackwardCompatibility) end if block_given? *************** *** 445,447 **** puts entry.to_biosequence.output(:embl) end ! end \ No newline at end of file --- 446,448 ---- puts entry.to_biosequence.output(:embl) end ! end From ngoto at dev.open-bio.org Tue Mar 4 06:10:30 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:10:30 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/sequence format.rb,1.4.2.6,1.4.2.7 Message-ID: <200803041110.m24BAU07007698@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv7656/lib/bio/sequence Modified Files: Tag: BRANCH-biohackathon2008 format.rb Log Message: * lib/bio/sequence.rb Bio::Sequence#output is moved to lib/bio/sequence/format.rb. * lib/bio/sequence/format.rb * Bio::Sequence#output is changed not to directly read erb file. * Bio::Sequence::Format::FormatterBase class, a base class of formatter, is newly added. * Bio::Sequence::Format::Formatter, NucFormatter, AminoFormatter are newly added to store formatter classes. * Bio::Sequence#list_output_formats is added. 
* (The names of above classes/modules/methods might be changed if more appropriate names are given.) Index: format.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence/format.rb,v retrieving revision 1.4.2.6 retrieving revision 1.4.2.7 diff -C2 -d -r1.4.2.6 -r1.4.2.7 *** format.rb 22 Feb 2008 14:30:44 -0000 1.4.2.6 --- format.rb 4 Mar 2008 11:10:28 -0000 1.4.2.7 *************** *** 2,9 **** # = bio/sequence/format.rb - various output format of the biological sequence # ! # Copyright:: Copyright (C) 2006 # Toshiaki Katayama , # Naohisa Goto , ! # Ryan Raaum # License:: The Ruby License # --- 2,10 ---- # = bio/sequence/format.rb - various output format of the biological sequence # ! # Copyright:: Copyright (C) 2006-2008 # Toshiaki Katayama , # Naohisa Goto , ! # Ryan Raaum , ! # Jan Aerts # License:: The Ruby License # *************** *** 15,18 **** --- 16,20 ---- # + require 'erb' module Bio *************** *** 32,62 **** module Format ! # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any ! # case, it would be difficult to successfully call this method outside ! # its expected context). ! # ! # Output the FASTA format string of the sequence. ! # ! # UNFORTUNATLY, the current implementation of Bio::Sequence is incapable of ! # using either the header or width arguments. So something needs to be ! # changed... # ! # Currently, this method is used in Bio::Sequence#output like so, # # s = Bio::Sequence.new('atgc') # puts s.output(:fasta) #=> "> \natgc\n" # --- ! # *Arguments*: ! # * (optional) _header_: String (default nil) ! # * (optional) _width_: Fixnum (default nil) # *Returns*:: String object ! def format_fasta(header = nil, width = nil) ! header ||= "#{@entry_id} #{@definition}" ! ">#{header}\n" + ! if width ! @seq.to_s.gsub(Regexp.new(".{1,#{width}}"), "\\0\n") else ! @seq.to_s + "\n" end end --- 34,181 ---- module Format ! 
# Repository of generic (or both nucleotide and protein) sequence ! # formatter classes ! module Formatter ! ! # Raw format generatar ! autoload :Raw, 'bio/sequence/format_raw' ! ! # Fasta format generater ! autoload :Fasta, 'bio/db/fasta/format_fasta' ! ! # NCBI-style Fasta format generatar ! # (resemble to EMBOSS "ncbi" format) ! autoload :Fasta_ncbi, 'bio/db/fasta/format_fasta' ! ! end #module Formatter ! ! # Repository of nucleotide sequence formatter classes ! module NucFormatter ! ! # GenBank format generater ! # Note that the name is 'Genbank' and NOT 'GenBank' ! autoload :Genbank, 'bio/db/genbank/format_genbank' ! ! # EMBL format generater ! # Note that the name is 'Embl' and NOT 'EMBL' ! autoload :Embl, 'bio/db/embl/format_embl' ! ! end #module NucFormatter ! ! # Repository of protein sequence formatter classes ! module AminoFormatter ! # currently no formats available ! end #module AminoFormatter ! ! # Formatter base class. ! # Any formatter class should inherit this class. ! class FormatterBase ! ! # Returns a formatterd string of the given sequence ! # --- ! # *Arguments*: ! # * (required) _sequence_: Bio::Sequence object ! # * (optional) _options_: a Hash object ! # *Returns*:: String object ! def self.output(sequence, options = {}) ! self.new(sequence, options).output ! end ! ! # register new Erb template ! def self.erb_template(str) ! erb = ERB.new(str) ! erb.def_method(self, 'output') ! true ! end ! private_class_method :erb_template ! ! # generates output data ! # --- ! # *Returns*:: String object ! def output ! raise NotImplementedError, 'should be implemented in subclass' ! end ! ! # creates a new formatter object for output ! def initialize(sequence, options = {}) ! @sequence = sequence ! @options = options ! end ! ! private ! ! # any unknown methods are delegated to the sequence object ! def method_missing(sym, *args, &block) #:nodoc: ! begin ! @sequence.__send__(sym, *args, &block) ! rescue NoMethodError => evar ! lineno = __LINE__ - 2 ! 
file = __FILE__ ! bt_here = [ "#{file}:#{lineno}:in \`__send__\'", ! "#{file}:#{lineno}:in \`method_missing\'" ! ] ! if bt_here == evar.backtrace[0, 2] then ! bt = evar.backtrace[2..-1] ! evar = evar.class.new("undefined method \`#{sym.to_s}\' for #{self.inspect}") ! evar.set_backtrace(bt) ! end ! raise(evar) ! end ! end ! end #class FormatterBase ! ! # Using Bio::Sequence::Format, return a String with the Bio::Sequence ! # object formatted in the given style. # ! # Formats currently implemented are: 'fasta', 'genbank', and 'embl' # # s = Bio::Sequence.new('atgc') # puts s.output(:fasta) #=> "> \natgc\n" + # + # The style argument is given as a Ruby + # Symbol(http://www.ruby-doc.org/core/classes/Symbol.html) # --- ! # *Arguments*: ! # * (required) _format_: :fasta, :genbank, *or* :embl # *Returns*:: String object ! def output(format = :fasta, options = {}) ! formatter_const = format.to_s.capitalize.intern ! formatter_class = nil ! get_formatter_repositories.each do |mod| ! begin ! formatter_class = mod.const_get(formatter_const) ! rescue NameError ! end ! break if formatter_class ! end ! unless formatter_class then ! raise "unknown format name #{format.inspect}" ! end ! ! formatter_class.output(self, options) ! end ! ! # Returns a list of available output formats for the sequence ! # --- ! # *Arguments*: ! # *Returns*:: Array of Symbols ! def list_output_formats ! a = get_formatter_repositories.collect { |mod| mod.constants } ! a.flatten! ! a.collect! { |x| x.to_s.downcase.intern } ! a ! end ! ! private ! ! # returns formatter repository modules ! def get_formatter_repositories ! if self.moltype == Bio::Sequence::NA then ! [ NucFormatter, Formatter ] ! elsif self.moltype == Bio::Sequence::AA then ! [ AminoFormatter, Formatter ] else ! [ NucFormatter, AminoFormatter, Formatter ] end end *************** *** 72,90 **** #end # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. 
(And in any # case, it would be difficult to successfully call this method outside # its expected context). # ! # Output the Genbank format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! #def format_genbank ! # prefix = ' ' * 5 ! # indent = prefix + ' ' * 16 ! # fwidth = 79 - indent.length ! # ! # format_features(prefix, indent, fwidth) ! #end # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any --- 191,215 ---- #end + #+++ + + # Formatting helper methods for INSD (NCBI, EMBL, DDBJ) feature table + module INSDFeatureHelper + private + # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any # case, it would be difficult to successfully call this method outside # its expected context). # ! # Output the Genbank feature format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! def format_features_genbank(features) ! prefix = ' ' * 5 ! indent = prefix + ' ' * 16 ! fwidth = 79 - indent.length ! ! format_features(features, prefix, indent, fwidth) ! end # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any *************** *** 92,130 **** # its expected context). # ! # Output the EMBL format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! #def format_embl ! # prefix = 'FT ' ! # indent = prefix + ' ' * 16 ! # fwidth = 80 - indent.length ! # ! # format_features(prefix, indent, fwidth) ! #end ! ! #+++ ! ! private ! def format_features(prefix, indent, width) ! result = '' ! @features.each do |feature| ! result << prefix + sprintf("%-16s", feature.feature) ! position = feature.position ! #position = feature.locations.to_s ! head = '' ! wrap(position, width).each_line do |line| ! result << head << line ! head = indent ! end ! result << format_qualifiers(feature.qualifiers, indent, width) ! 
end return result end def format_qualifiers(qualifiers, indent, width) qualifiers.collect do |qualifier| --- 217,255 ---- # its expected context). # ! # Output the EMBL feature format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! def format_features_embl(features) ! prefix = 'FT ' ! indent = prefix + ' ' * 16 ! fwidth = 80 - indent.length ! ! format_features(features, prefix, indent, fwidth) ! end ! # format INSD featurs ! def format_features(features, prefix, indent, width) ! result = [] ! features.each do |feature| ! result.push format_feature(feature, prefix, indent, width) ! end ! return result.join('') ! end ! # format an INSD feature ! def format_feature(feature, prefix, indent, width) ! result = prefix + sprintf("%-16s", feature.feature) ! position = feature.position ! #position = feature.locations.to_s ! result << wrap_and_split_lines(position, width).join("\n" + indent) ! result << "\n" ! result << format_qualifiers(feature.qualifiers, indent, width) return result end + # format qualifiers def format_qualifiers(qualifiers, indent, width) qualifiers.collect do |qualifier| *************** *** 133,137 **** if v == true ! lines = wrap('/' + q, width) elsif q == 'translation' lines = fold("/#{q}=\"#{v}\"", width) --- 258,262 ---- if v == true ! lines = wrap_with_newline('/' + q, width) elsif q == 'translation' lines = fold("/#{q}=\"#{v}\"", width) *************** *** 142,146 **** v = '"' + v + '"' end ! lines = wrap('/' + q + '=' + v, width) end --- 267,271 ---- v = '"' + v + '"' end ! lines = wrap_with_newline('/' + q + '=' + v, width) end *************** *** 154,158 **** end ! def wrap(str, width) result = [] left = str.dup --- 279,287 ---- end ! def fold_and_split_lines(str, width) ! str.scan(Regexp.new(".{1,#{width}}")) ! end ! ! def wrap_and_split_lines(str, width) result = [] left = str.dup *************** *** 172,176 **** result << line end ! 
result << left if left result_string = result.join("\n") result_string << "\n" unless result_string.empty? --- 301,310 ---- result << line end ! result << left if left and !(left.to_s.empty?) ! return result ! end ! ! def wrap_with_newline(str, width) ! result = wrap_and_split_lines(str, width) result_string = result.join("\n") result_string << "\n" unless result_string.empty? *************** *** 178,185 **** end ! end # Format ! end # Sequence ! end # Bio --- 312,329 ---- end ! def wrap(str, width = 80, prefix = '') ! actual_width = width - prefix.length ! result = wrap_and_split_lines(str, actual_width) ! result_string = result.join("\n#{prefix}") ! result_string = prefix + result_string unless result_string.empty? ! return result_string ! end ! end #module INSDFeatureHelper ! end #module Format ! ! end #class Sequence ! ! end #module Bio From ngoto at dev.open-bio.org Tue Mar 4 06:10:30 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:10:30 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.7,0.58.2.8 Message-ID: <200803041110.m24BAUBl007703@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv7656/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: * lib/bio/sequence.rb Bio::Sequence#output is moved to lib/bio/sequence/format.rb. * lib/bio/sequence/format.rb * Bio::Sequence#output is changed not to directly read erb file. * Bio::Sequence::Format::FormatterBase class, a base class of formatter, is newly added. * Bio::Sequence::Format::Formatter, NucFormatter, AminoFormatter are newly added to store formatter classes. * Bio::Sequence#list_output_formats is added. * (The names of above classes/modules/methods might be changed if more appropriate names are given.) 
Index: sequence.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v retrieving revision 0.58.2.7 retrieving revision 0.58.2.8 diff -C2 -d -r0.58.2.7 -r0.58.2.8 *** sequence.rb 20 Feb 2008 09:56:22 -0000 0.58.2.7 --- sequence.rb 4 Mar 2008 11:10:28 -0000 0.58.2.8 *************** *** 13,17 **** # - require 'erb' require 'bio/sequence/compat' --- 13,16 ---- *************** *** 156,178 **** attr_accessor :seq - # Using Bio::Sequence::Format, return a String with the Bio::Sequence - # object formatted in the given style. - # - # Formats currently implemented are: 'fasta', 'genbank', and 'embl' - # - # s = Bio::Sequence.new('atgc') - # puts s.output(:fasta) #=> "> \natgc\n" - # - # The style argument is given as a Ruby - # Symbol(http://www.ruby-doc.org/core/classes/Symbol.html) - # --- - # *Arguments*: - # * (required) _format_: :fasta, :genbank, *or* :embl - # *Returns*:: String object - def output(format = :fasta) - record_template = ERB.new(File.read(File.dirname(__FILE__) + "/db/#{format.to_s}/format.erb")) - record_template.result(binding) - end - # Guess the type of sequence, Amino Acid or Nucleic Acid, and create a # new sequence object (Bio::Sequence::AA or Bio::Sequence::NA) on the basis --- 155,158 ---- From ngoto at dev.open-bio.org Tue Mar 4 06:14:05 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:14:05 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/sequence common.rb,1.6.2.1,1.6.2.2 Message-ID: <200803041114.m24BE5Oh007773@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv7753/lib/bio/sequence Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: format_embl is moved to lib/bio/embl/format_embl.rb Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence/common.rb,v retrieving 
revision 1.6.2.1 retrieving revision 1.6.2.2 diff -C2 -d -r1.6.2.1 -r1.6.2.2 *** common.rb 20 Feb 2008 09:56:22 -0000 1.6.2.1 --- common.rb 4 Mar 2008 11:14:03 -0000 1.6.2.2 *************** *** 67,87 **** end - def format_embl - output_lines = Array.new - counter = 0 - remainder = self.window_search(60,60) do |subseq| - counter += 60 - subseq.gsub!(/(.{10})/, '\1 ') - output_lines.push(' '*5 + subseq + counter.to_s.rjust(9)) - end - counter += remainder.length - remainder = (remainder.to_s + ' '*(60-remainder.length)) - remainder.gsub!(/(.{10})/, '\1 ') - output_lines.push(' '*5 + remainder + counter.to_s.rjust(9)) - return output_lines.join("\n") - end - - - # Normalize the current sequence, removing all whitespace and # transforming all positions to uppercase if the sequence is AA or --- 67,70 ---- From ngoto at dev.open-bio.org Tue Mar 4 06:16:59 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:16:59 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb,NONE,1.1.2.1 Message-ID: <200803041116.m24BGxSm007801@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7781/lib/bio/db/embl Added Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: EMBL formatter class, internally used by Bio::Sequence, is newly added. --- NEW FILE: format_embl.rb --- # # = bio/db/embl/format_embl.rb - EMBL format generater # # Copyright:: Copyright (C) 2008 Jan Aerts # License:: The Ruby License # # $Id: format_embl.rb,v 1.1.2.1 2008/03/04 11:16:57 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::NucFormatter # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # Embl format output class for Bio::Sequence. 
class Embl < Bio::Sequence::Format::FormatterBase # helper methods include Bio::Sequence::Format::INSDFeatureHelper private def embl_wrap(prefix, str) wrap(str.to_s, 80, prefix) end def seq_format_embl(seq) output_lines = Array.new counter = 0 remainder = seq.window_search(60,60) do |subseq| counter += 60 subseq.gsub!(/(.{10})/, '\1 ') output_lines.push(' '*5 + subseq + counter.to_s.rjust(9)) end counter += remainder.length remainder = (remainder.to_s + ' '*(60-remainder.length)) remainder.gsub!(/(.{10})/, '\1 ') output_lines.push(' '*5 + remainder + counter.to_s.rjust(9)) return output_lines.join("\n") end # Erb template of EMBL format for Bio::Sequence erb_template <<'__END_OF_TEMPLATE__' ID <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP. XX <%= embl_wrap('AC ', accessions.reject{|a| a.nil?}.join('; ') + ';') %> XX DT <%= date_created %> DT <%= date_modified %> XX <%= embl_wrap('DE ', definition) %> XX <%= embl_wrap('KW ', keywords.join('; ') + '.') %> XX OS <%= species %> <%= embl_wrap('OC ', classification.join('; ') + '.') %> XX <%= references.collect{|ref| ref.format('embl')}.join("\n") %> XX FH Key Location/Qualifiers FH <%= format_features_embl(features) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> // __END_OF_TEMPLATE__ end #class Embl end #module Bio::Sequence::Format::NucFormatter From ngoto at dev.open-bio.org Tue Mar 4 06:19:18 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:19:18 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, NONE, 1.1.2.1 Message-ID: <200803041119.m24BJIvh007829@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7809/lib/bio/db/genbank Added Files: Tag: 
BRANCH-biohackathon2008 format_genbank.rb Log Message: Bio::Sequence::Format::NucFormatter::Genbank, GenBank sequence format generater class, is newly added. Note that this class is currently internal use only and users should not use it directly. --- NEW FILE: format_genbank.rb --- # # = bio/db/genbank/format_genbank.rb - GenBank format generater # # Copyright:: Copyright (C) 2008 Naohisa Goto # License:: The Ruby License # # $Id: format_genbank.rb,v 1.1.2.1 2008/03/04 11:19:16 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::NucFormatter # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # GenBank format output class for Bio::Sequence. class Genbank < Bio::Sequence::Format::FormatterBase # helper methods include Bio::Sequence::Format::INSDFeatureHelper private # string wrapper for GenBank format def genbank_wrap(str) wrap(str.to_s, 67).gsub(/\n/, "\n" + " " * 12) end # string wrap with adding a dot at the end of the string def genbank_wrap_dot(str) str = str.to_s str = str + '.' unless /\.\z/ =~ str genbank_wrap(str) end # formats sequence lines as GenBank def each_genbank_seqline(str) #:yields: counter, seqline i = 1 a = str.scan(/.{1,60}/) do |s| yield i, s.gsub(/(.{1,10})/, " \\1") i += 60 end end # Erb template of GenBank format for Bio::Sequence erb_template <<'__END_OF_TEMPLATE__' LOCUS <%= sprintf("%-16s", entry_id) %> <%= sprintf("%11d", length) %> bp <%= sprintf("%3s", '') %><%= sprintf("%-6s", molecule_type) %> <%= sprintf("%-8s", topology) %><%= sprintf("%4s", division) %> <%= sprintf("%-11s", date_modified) %> DEFINITION <%= genbank_wrap_dot(definition.to_s) %> ACCESSION <%= genbank_wrap(([ primary_accession ] + (secondary_accessions or [])).join(" ")) %> VERSION <%= primary_accession %>.<%= sequence_version %><% unless true or gi_number.to_s.empty? 
%>GI:<%= gi_number %><% end %> KEYWORDS <%= genbank_wrap_dot((keywords or []).join('; ')) %> SOURCE <%= genbank_wrap(species) %> ORGANISM <%= genbank_wrap(species) %> <%= genbank_wrap_dot((classification or []).join('; ')) %> <% n = 0 (references or []).each do |ref| n += 1 pos = ref.sequence_position.to_s.gsub(/\s/, '') pos.gsub!(/(\d+)\-(\d+)/, "\\1 to \\2") pos.gsub!(/\s*\,\s*/, '; ') if pos.empty? pos = '' else pos = " (bases #{pos})" end journal = ref.journal.to_s volissue = ref.volume.to_s volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? journal += " #{volissue}," unless volissue.empty? journal += " #{ref.pages}" unless ref.pages.to_s.empty? journal += " (#{ref.year})" unless ref.year.to_s.empty? alist = ref.authors.collect { |x| x.gsub(/\, /, ',') } lastauthor = alist.pop authorsline = alist.join(', ') authorsline.concat(" and ") unless alist.empty? authorsline.concat lastauthor.to_s %>REFERENCE <%= genbank_wrap(sprintf('%-2d%s', n, pos)) %> AUTHORS <%= genbank_wrap(authorsline) %> TITLE <%= genbank_wrap(ref.title.to_s) %> JOURNAL <%= genbank_wrap(journal) %> <% unless ref.pubmed.to_s.empty? %> PUBMED <%= ref.pubmed %> <% end end %>FEATURES Location/Qualifiers <%= format_features_genbank(features || []) %>ORIGIN <% each_genbank_seqline(seq) do |i, s| %><%= sprintf('%9d', i) %><%= s %> <% end %>// __END_OF_TEMPLATE__ end #class Genbank end #module Bio::Sequence::Format::NucFormatter From ngoto at dev.open-bio.org Tue Mar 4 06:27:01 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:27:01 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/fasta format_fasta.rb, NONE, 1.1.2.1 Message-ID: <200803041127.m24BR14P007878@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/fasta In directory dev.open-bio.org:/tmp/cvs-serv7858/lib/bio/db/fasta Added Files: Tag: BRANCH-biohackathon2008 format_fasta.rb Log Message: Bio::Sequence::Format::Formatter::Fasta and Fasta_ncbi are newly added. 
Both are FASTA sequence format generater classes. (Fasta_ncbi is experimental, and would be removed if we determine it is not needed.) Note that these classes are currently internal use only and users should not use them directly. --- NEW FILE: format_fasta.rb --- # # = bio/db/fasta/format_fasta.rb - Fasta format generater # # Copyright:: Copyright (C) 2006-2008 # Toshiaki Katayama , # Naohisa Goto , # Jan Aerts # License:: The Ruby License # # $Id: format_fasta.rb,v 1.1.2.1 2008/03/04 11:26:59 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::Formatter # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # Simple Fasta format output class for Bio::Sequence. class Fasta < Bio::Sequence::Format::FormatterBase # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. # # Creates a new Fasta format generater object from the sequence. # # --- # *Arguments*: # * _sequence_: Bio::Sequence object # * (optional) :header => _header_: String (default nil) # * (optional) :width => _width_: Fixnum (default 70) def initialize; end if false # dummy for RDoc # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. # # Output the FASTA format string of the sequence. # # Currently, this method is used in Bio::Sequence#output like so, # # s = Bio::Sequence.new('atgc') # puts s.output(:fasta) #=> "> \natgc\n" # --- # *Returns*:: String object def output header = @options[:header] width = @options.has_key?(:width) ? @options[:width] : 70 seq = @sequence.seq entry_id = @sequence.entry_id || "#{@sequence.primary_accession}.#{@sequence.sequence_version}" definition = @sequence.definition header ||= "#{entry_id} #{definition}" ">#{header}\n" + if width seq.to_s.gsub(Regexp.new(".{1,#{width}}"), "\\0\n") else seq.to_s + "\n" end end end #class Fasta # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # NCBI-Style Fasta format output class for Bio::Sequence. # (like "ncbi" format in EMBOSS) # # Note that this class is under construction. 
class Fasta_ncbi < Bio::Sequence::Format::FormatterBase # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. # # Output the FASTA format string of the sequence. # # Currently, this method is used in Bio::Sequence#output like so, # # s = Bio::Sequence.new('atgc') # puts s.output(:ncbi) #=> "> \natgc\n" # --- # *Returns*:: String object def output width = 70 seq = @sequence.seq #gi = @sequence.gi_number dbname = 'lcl' if @sequence.primary_accession.to_s.empty? then idstr = @sequence.entry_id else idstr = "#{@sequence.primary_accession}.#{@sequence.sequence_version}" end definition = @sequence.definition header = "#{dbname}|#{idstr} #{definition}" ">#{header}\n" + seq.to_s.gsub(Regexp.new(".{1,#{width}}"), "\\0\n") end end #class Ncbi end #module Bio::Sequence::Format::Formatter From ngoto at dev.open-bio.org Tue Mar 4 06:28:49 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:28:49 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/sequence format_raw.rb,NONE,1.1.2.1 Message-ID: <200803041128.m24BSnON007906@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv7886/lib/bio/sequence Added Files: Tag: BRANCH-biohackathon2008 format_raw.rb Log Message: Raw sequence format (sequence only; without any newline and white-spaces) formatter class is newly added. 
(Internal use only) --- NEW FILE: format_raw.rb --- # # = bio/sequence/format_raw.rb - Raw sequence formatter # # Copyright:: Copyright (C) 2008 Naohisa Goto # License:: The Ruby License # # $Id: format_raw.rb,v 1.1.2.1 2008/03/04 11:28:46 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::Formatter # Raw sequence output formatter class class Raw < Bio::Sequence::Format::FormatterBase # output raw sequence data def output "#{@sequence.seq}" end end #class Raw end #module Bio::Sequence::Format::Formatter From ngoto at dev.open-bio.org Tue Mar 4 06:29:38 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:29:38 +0000 Subject: [BioRuby-cvs] bioruby/lib bio.rb,1.89.2.3,1.89.2.4 Message-ID: <200803041129.m24BTcRt007955@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib In directory dev.open-bio.org:/tmp/cvs-serv7935/lib Modified Files: Tag: BRANCH-biohackathon2008 bio.rb Log Message: changed autoload file path of Bio::References and Bio::Features Index: bio.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio.rb,v retrieving revision 1.89.2.3 retrieving revision 1.89.2.4 diff -C2 -d -r1.89.2.3 -r1.89.2.4 *** bio.rb 22 Feb 2008 14:26:16 -0000 1.89.2.3 --- bio.rb 4 Mar 2008 11:29:36 -0000 1.89.2.4 *************** *** 27,36 **** autoload :Feature, 'bio/feature' ! autoload :Features, 'bio/feature' ## References/Reference autoload :Reference, 'bio/reference' ! autoload :References, 'bio/reference' ## Pathway/Relation --- 27,36 ---- autoload :Feature, 'bio/feature' ! autoload :Features, 'bio/compat/features' ## References/Reference autoload :Reference, 'bio/reference' ! 
autoload :References, 'bio/compat/references' ## Pathway/Relation From ngoto at dev.open-bio.org Tue Mar 4 06:31:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:31:47 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.4,1.24.2.5 Message-ID: <200803041131.m24BVloU008025@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv8005/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 reference.rb Log Message: changed to use Bio::Sequence::Format::INSDFeatureHelper#wrap(). Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.24.2.4 retrieving revision 1.24.2.5 diff -C2 -d -r1.24.2.4 -r1.24.2.5 *** reference.rb 4 Mar 2008 10:07:49 -0000 1.24.2.4 --- reference.rb 4 Mar 2008 11:31:45 -0000 1.24.2.5 *************** *** 42,45 **** --- 42,47 ---- class Reference + include Bio::Sequence::Format::INSDFeatureHelper + # Author names in an Array, [ "Hoge, J.P.", "Fuga, F.B." ]. attr_reader :authors *************** *** 288,294 **** end end ! lines << @authors.join(', ').wrap(80, 'RA ') + ';' unless @authors.nil? ! lines << (@title == '' ? 'RT ;' : ('"' + @title + '"').wrap(80, 'RT ') + ';') ! lines << @journal.wrap(80, 'RL ') unless @journal == '' lines << "XX" return lines.join("\n") --- 290,296 ---- end end ! lines << wrap(@authors.join(', '), 80, 'RA ') + ';' unless @authors.nil? ! lines << (@title == '' ? 'RT ;' : wrap('"' + @title + '"', 80, 'RT ') + ';') ! 
lines << wrap(@journal, 80, 'RL ') unless @journal == '' lines << "XX" return lines.join("\n") From ngoto at dev.open-bio.org Mon Mar 10 09:42:28 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 10 Mar 2008 13:42:28 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat features.rb,1.1.2.1,1.1.2.2 Message-ID: <200803101342.m2ADgSYs009554@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory dev.open-bio.org:/tmp/cvs-serv9534/lib/bio/compat Modified Files: Tag: BRANCH-biohackathon2008 features.rb Log Message: fixed typo in warning message Index: features.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/compat/Attic/features.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** features.rb 4 Mar 2008 10:12:22 -0000 1.1.2.1 --- features.rb 10 Mar 2008 13:42:26 -0000 1.1.2.2 *************** *** 97,101 **** # *Returns*:: the given array def self.new(ary = []) ! warn 'Bio::Feature is obsoleted. Some methods are added to given array to keep backward compatibility.' ary.extend(BackwardCompatibility) ary --- 97,101 ---- # *Returns*:: the given array def self.new(ary = []) ! warn 'Bio::Features is obsoleted. Some methods are added to given array to keep backward compatibility.' 
ary.extend(BackwardCompatibility) ary From ngoto at dev.open-bio.org Fri Mar 21 02:24:45 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 21 Mar 2008 06:24:45 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.3,1.29.2.4 Message-ID: <200803210624.m2L6OjlR031776@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv31756/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: "require 'bio/compat/features'" and "require 'bio/compat/references'" are added, and example code in the bottom of the file is removed to avoid possible confusion with unit tests. Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.3 retrieving revision 1.29.2.4 diff -C2 -d -r1.29.2.3 -r1.29.2.4 *** embl.rb 4 Mar 2008 10:56:42 -0000 1.29.2.3 --- embl.rb 21 Mar 2008 06:24:42 -0000 1.29.2.4 *************** *** 34,37 **** --- 34,39 ---- require 'bio/db' require 'bio/db/embl/common' + require 'bio/compat/features' + require 'bio/compat/references' module Bio *************** *** 432,448 **** end # module Bio - if __FILE__ == $0 - require '../../../bio' - require 'yaml' - - prefix = 'FT ' - indent = prefix + ' ' * 16 - fwidth = 80 - indent.length - - # parser = Bio::FlatFile.auto('/home/aertsj/LocalDocuments/bioruby_biohackathon/bioruby/test/data/embl/AB090716.embl') - parser = Bio::FlatFile.auto('/home/aertsj/LocalDocuments/hackathon/aj224122.embl') - parser.each do |entry| - # entry.ref - puts entry.to_biosequence.output(:embl) - end - end --- 434,435 ---- From ngoto at dev.open-bio.org Wed Mar 26 07:34:08 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 26 Mar 2008 11:34:08 +0000 Subject: [BioRuby-cvs] bioruby .project,1.1.2.1,NONE Message-ID: <200803261134.m2QBY7Im016555@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In 
directory dev.open-bio.org:/tmp/cvs-serv16535 Removed Files: Tag: BRANCH-biohackathon2008 .project Log Message: Removed mistakenly added file .project. --- .project DELETED --- From ngoto at dev.open-bio.org Thu Mar 27 09:07:21 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:07:21 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.8,0.58.2.9 Message-ID: <200803271307.m2RD7LcR020772@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv20752/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: Added documents for attributes added during Biohackathon2008. Index: sequence.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v retrieving revision 0.58.2.8 retrieving revision 0.58.2.9 diff -C2 -d -r0.58.2.8 -r0.58.2.9 *** sequence.rb 4 Mar 2008 11:10:28 -0000 0.58.2.8 --- sequence.rb 27 Mar 2008 13:07:19 -0000 0.58.2.9 *************** *** 73,78 **** include Format - attr_accessor :sequence_version, :topology, :molecule_type, :data_class, :division, :primary_accession, :secondary_accessions, :date_created, :date_modified, :species, :classification - # Create a new Bio::Sequence object # --- 73,76 ---- *************** *** 154,158 **** --- 152,196 ---- # but could be a simple String attr_accessor :seq + + #--- + # Attributes below have been added during BioHackathon2008 + #+++ + # Version number of the sequence (String). + attr_accessor :sequence_version + + # Topology (String). "circular" or "linear". + attr_accessor :topology + + # molecular type (String). "DNA" or "RNA" for nucleotide sequence. 
+ attr_accessor :molecule_type + + # Data Class defined by EMBL (String) + # See http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html#3_1 + attr_accessor :data_class + + # Taxonomic Division defined by EMBL/GenBank/DDBJ (String) + # See http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html#3_2 + attr_accessor :division + + # Primary accession number (String) + attr_accessor :primary_accession + + # Secondary accession numbers (Array of String) + attr_accessor :secondary_accessions + + # Created date of the sequence entry (String) + attr_accessor :date_created + + # Last modified date of the sequence entry (String) + attr_accessor :date_modified + + # Organism species (String). For example, "Escherichia coli". + attr_accessor :species + + # Organism classification, taxonomic classification of the source organism. + # (Array of String) + attr_accessor :classification + # Guess the type of sequence, Amino Acid or Nucleic Acid, and create a # new sequence object (Bio::Sequence::AA or Bio::Sequence::NA) on the basis From ngoto at dev.open-bio.org Thu Mar 27 09:32:30 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:32:30 +0000 Subject: [BioRuby-cvs] bioruby/test/functional/bio/sequence - New directory Message-ID: <200803271332.m2RDWUOj020821@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/functional/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv20801/test/functional/bio/sequence Log Message: Directory /home/repository/bioruby/bioruby/test/functional/bio/sequence added to the repository --> Using per-directory sticky tag `BRANCH-biohackathon2008' From ngoto at dev.open-bio.org Thu Mar 27 09:38:33 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:38:33 +0000 Subject: [BioRuby-cvs] bioruby/test/functional/bio/sequence test_output_embl.rb, NONE, 1.1.2.1 Message-ID: <200803271338.m2RDcX8k020926@dev.open-bio.org> Update of 
/home/repository/bioruby/bioruby/test/functional/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv20870/test/functional/bio/sequence Added Files: Tag: BRANCH-biohackathon2008 test_output_embl.rb Log Message: Example code in sequence.rb written by Jan Aerts is moved to test/functional/bio/sequence/test_output_embl.rb. Fixed a bug in lib/bio/db/embl/format_embl.rb: failed to output when features or references are nil. This bug is found by above test code. --- NEW FILE: test_output_embl.rb --- # # test/functional/bio/sequence/test_output_embl.rb - Functional test for Bio::Sequence#output(:embl) # # Copyright:: Copyright (C) 2008 # Jan Aerts # License:: The Ruby License # # $Id: test_output_embl.rb,v 1.1.2.1 2008/03/27 13:38:31 ngoto Exp $ # require 'pathname' libpath = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 4, 'lib')).cleanpath.to_s $:.unshift(libpath) unless $:.include?(libpath) require 'test/unit' require 'bio' module Bio class FuncTestSequenceOutputEMBL < Test::Unit::TestCase def setup @seq = 
Bio::Sequence.auto('aattaaaacgccacgcaaggcgattctaggaaatcaaaacgacacgaaatgtggggtgggtgtttgggtaggaaagacagttgtcaacatcagggatttggattgaatcaaaaaaaaagtccttagatttcataaaagctaatcacgcctcaaaactggggcctatctcttcttttttgtcgcttcctgtcggtccttctctatttcttctccaacccctcatttttgaatatttacataacaaaccgttttactttctttggtcaaaattagacccaaaattctatattagtttaagatatgtggtctgtaatttattgttgtattgatataaaaattagttataagcgattatatttttatgctcaagtaactggtgttagttaactatattccaccacgataacctgattacataaaatatgattttaatcattttagtaaaccatatcgcacgttggatgattaattttaacggtttaataacacgtgattaaattatttttagaatgattatttacaaacggaaaagctatatgtgacacaataactcgtgcagtattgttagtttgaaaagtgtatttggtttcttatatttggcctcgattttcagtttatgtgctttttacaaagttttattttcgttatctgtttaacgcgacatttgttgtatggctttaccgatttgagaataaaatcatattacctttatgtagccatgtgtggtgtaatatataataatggtccttctacgaaaaaagcagatcacaattgaaataaagggtgaaatttggtgtcccttttcttcgtcgaaataacagaactaaataaaagaaagtgttatagtatattacgtccgaagaataatccatattcctgaaatacagtcaacatattatatatttagtactttatataaagttaggaattaaatcatatgttttatcgaccatattaagt! cacaactttatcataaattaatctgtaattagaattccaagttcgccaccgaatttcgtaacctaatctacatataatagataaaatatatatatgtagagtaattatgatatctatgtatgtagtcatggtatatgaattttgaaattggcaaggtaacattgacggatcgtaacccaacaaataatattaattacaaaatgggtgggcgggaatagtatacaactcataattccactcactttttgtattattaggatatgaaataagagtaatcaacatgcataataaagatgtataatttcttcatcttaaaaaacataactacatggtttaatacacaattttaccttttatcaaaaaagtatttcacaattcactcgcaaattacgaaatgatggctagtgcttcaactccaaatttcgaatattttaaatcacgatgtgtagaaccttttatttactggatactaatcactagtttattgagccaaccaattagttaaatagaacaatcaatattatagccagatattttttcctttaaaaatatttaaaagaggggccagaaaagaaccagagagggaggccatgagacattattatcactagtcaaaaacaacaaaccctccttttgctttttcatataaattattatattttattttgcaggtttcttctcttcttcttcttcttcttcttcttcttcctcttggctgctttctttcatcatccataaagtgaaagctaacgcatagagagagccatatcgtcccaaaaaaagcaaaagtccaaaaaaaaacaactccaaaacattctctcttagctctttactctttagtttctctctctctctctgcctttctctttgttgaagttcatggatgctacgaagtggactcaggtacgtaaaaagatatctctctgctatatctgtttgtttgtagcttctccccgactctcacgctctctctctctctctctctctc! 
tttgtgtatctctctactcacataaatatatacatgtgtgtgtatgcatgtttatatgtatgtatgaaac cagtagtggttatacagatagtctatatagagatatcaatatgatgtgttttaatttagactttttatatatccgtttgaaacttccgaagttctcgaatggagttaaggaagttttgttctctacaagttcaatttttcttgtcattaattataaaactctgataactaatggataaaaaaggtatgctttgttagttaccttttgttcttggtgctcaggtcttaccatttttttcctaaattttaattagtctcctttctttaattaattttatgttaacgcactgacgatttaacgttaacaaaaaaacctagattctttttcttttcaatagagcataattattacttcaatttcatttatctcacactaaaccctaatcttggcgaaattccttttatatatataaatttaattaatttttccacaatcttggcggaattcaggactcggttttgcttgttattgttctctcttttaatttgacatggttagggaatacttaaagtatgtcttaattttatagggttttcaagaaatgataaacgtaaagccaatggagcaaatgatttctagcaccaacaacaacacaccgcaacaacaaccaacattcatcgccaccaacacaaggccaaacgccaccgcatccaatggtggctccggaggaaataccaacaacacggctacgatggaaactagaaaggcgaggccacaagagaaagtaaattgtccaagatgcaactcaacaaacacaaagttctgttattacaacaactacagtctcacgcaaccaagatacttctgcaaaggttgtcgaaggtattggaccgaaggtggctctcttcgtaacgtcccagtcggaggtagctcaagaaagaacaagagatcctctacacctttagcttcaccttctaatcccaaacttccagatctaaacccaccgattcttttctcaagccaaatccctaataagtcaaataaagatc! 
tcaacttgctatctttcccggtcatgcaagatcatcatcatcatggtatgtctcatttttttcatatgcccaagatagagaacaacaatacttcatcctcaatctatgcttcatcatctcctgtctcagctcttgagcttctaagatccaatggagtctcttcaagaggcatgaacacgttcttgcctggtcaaatgatggattcaaactcagtcctgtactcatctttagggtttccaacaatgcctgattacaaacagagtaataacaacctttcattctccattgatcatcatcaagggattggacataacaccatcaacagtaaccaaagagctcaagataacaatgatgacatgaatggagcaagtagggttttgttccctttttcagacatgaaagagctttcaagcacaacccaagagaagagtcatggtaataatacatattggaatgggatgttcagtaatacaggaggatcttcatggtgaaaaaaggttaaaaagagctcatgaactatcagctttcttctctttttctgtttttttctcctattttattatagtttttactttgatgatcttttgttttttctcacatggggaactttacttaaagttgtcagaacttagtttacagattgtctttttattccttctttctggttttccttttttcctttttttatcagtctttttaaaatatgtatttcataattgggtttgatcattcatatttattagtatcaaaatagagtctatgttcatgagggagtgttaaggggtgtgagggtagaagaataagtgaatacgggggcccg') @seq.entry_id = 'AJ224122' @seq.sequence_version = 3 @seq.topology = 'linear' @seq.molecule_type = 'genomic DNA' @seq.data_class = 'STD' @seq.division = 'PLN' @seq.primary_accession = 'AJ224122' @seq.secondary_accessions = [] @seq.date_created = '27-FEB-1998 (Rel. 54, Created)' @seq.date_modified = '14-NOV-2006 (Rel. 
89, Last updated, Version 6)' @seq.definition = 'Arabidopsis thaliana DAG1 gene' @seq.keywords = ['BBFa gene', 'transcription factor'] @seq.species = 'Arabidopsis thaliana (thale cress)' @seq.classification = ['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', 'Spermatophyta', 'Magnoliophyta', 'eudicotyledons', 'core eudicotyledons', 'rosids', 'eurosids II', 'Brassicales', 'Brassicaceae', 'Arabidopsis'] end def test_output_embl assert_nothing_raised { puts @seq.output(:embl) } end def test_output_fasta assert_nothing_raised { @seq.output(:fasta) } end end #class FuncTestSequenceOutputEMBL end #module Bio From ngoto at dev.open-bio.org Thu Mar 27 09:38:33 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:38:33 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.1, 1.1.2.2 Message-ID: <200803271338.m2RDcXg2020921@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv20870/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: Example code in sequence.rb written by Jan Aerts is moved to test/functional/bio/sequence/test_output_embl.rb. Fixed a bug in lib/bio/db/embl/format_embl.rb: failed to output when features or references are nil. This bug is found by above test code. Index: format_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** format_embl.rb 4 Mar 2008 11:16:57 -0000 1.1.2.1 --- format_embl.rb 27 Mar 2008 13:38:31 -0000 1.1.2.2 *************** *** 56,64 **** <%= embl_wrap('OC ', classification.join('; ') + '.') %> XX ! <%= references.collect{|ref| ref.format('embl')}.join("\n") %> XX FH Key Location/Qualifiers FH ! 
<%= format_features_embl(features) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> --- 56,64 ---- <%= embl_wrap('OC ', classification.join('; ') + '.') %> XX ! <%= (references || []).collect{|ref| ref.format('embl')}.join("\n") %> XX FH Key Location/Qualifiers FH ! <%= format_features_embl(features || []) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> From ngoto at dev.open-bio.org Thu Mar 27 09:38:33 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:38:33 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.9,0.58.2.10 Message-ID: <200803271338.m2RDcXox020916@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv20870/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: Example code in sequence.rb written by Jan Aerts is moved to test/functional/bio/sequence/test_output_embl.rb. Fixed a bug in lib/bio/db/embl/format_embl.rb: failed to output when features or references are nil. This bug is found by above test code. 
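The nil guard described in this log message can be sketched in isolation. This is a minimal illustration, assuming a cut-down template; `render_references` and the template text are hypothetical and are not the code in format_embl.rb. Only the `(references || [])` idiom is taken from the diff.

```ruby
require 'erb'

# Hypothetical helper: renders a tiny EMBL-like fragment.
# Without the "|| []" fallback, a nil references value would raise
# NoMethodError on collect; with it, an empty block is emitted instead.
def render_references(references)
  template = ERB.new(<<~'EOS')
    XX
    <%= (references || []).collect { |ref| ref.to_s }.join("\n") %>
    XX
  EOS
  template.result(binding)
end

puts render_references(nil)                      # no exception for nil
puts render_references(['RN   [1]', 'RN   [2]']) # one line per reference
```

The same fallback is applied to `features` in the diff, so a sequence object built by hand, without features or references, can still be formatted.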
Index: sequence.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v retrieving revision 0.58.2.9 retrieving revision 0.58.2.10 diff -C2 -d -r0.58.2.9 -r0.58.2.10 *** sequence.rb 27 Mar 2008 13:07:19 -0000 0.58.2.9 --- sequence.rb 27 Mar 2008 13:38:31 -0000 0.58.2.10 *************** *** 395,424 **** end # Bio - - if __FILE__ == $0 - - require 'bio' - seq = Bio::Sequence.new('aattaaaacgccacgcaaggcgattctaggaaatcaaaacgacacgaaatgtggggtgggtgtttgggtaggaaagacagttgtcaacatcagggatttggattgaatcaaaaaaaaagtccttagatttcataaaagctaatcacgcctcaaaactggggcctatctcttcttttttgtcgcttcctgtcggtccttctctatttcttctccaacccctcatttttgaatatttacataacaaaccgttttactttctttggtcaaaattagacccaaaattctatattagtttaagatatgtggtctgtaatttattgttgtattgatataaaaattagttataagcgattatatttttatgctcaagtaactggtgttagttaactatattccaccacgataacctgattacataaaatatgattttaatcattttagtaaaccatatcgcacgttggatgattaattttaacggtttaataacacgtgattaaattatttttagaatgattatttacaaacggaaaagctatatgtgacacaataactcgtgcagtattgttagtttgaaaagtgtatttggtttcttatatttggcctcgattttcagtttatgtgctttttacaaagttttattttcgttatctgtttaacgcgacatttgttgtatggctttaccgatttgagaataaaatcatattacctttatgtagccatgtgtggtgtaatatataataatggtccttctacgaaaaaagcagatcacaattgaaataaagggtgaaatttggtgtcccttttcttcgtcgaaataacagaactaaataaaagaaagtgttatagtatattacgtccgaagaataatccatattcctgaaatacagtcaacatattatatatttagtactttatataaagttaggaattaaatcatatgttttatcgaccatattaagt! 
cacaactttatcataaattaatctgtaattagaattccaagttcgccaccgaatttcgtaacctaatctacatataatagataaaatatatatatgtagagtaattatgatatctatgtatgtagtcatggtatatgaattttgaaattggcaaggtaacattgacggatcgtaacccaacaaataatattaattacaaaatgggtgggcgggaatagtatacaactcataattccactcactttttgtattattaggatatgaaataagagtaatcaacatgcataataaagatgtataatttcttcatcttaaaaaacataactacatggtttaatacacaattttaccttttatcaaaaaagtatttcacaattcactcgcaaattacgaaatgatggctagtgcttcaactccaaatttcgaatattttaaatcacgatgtgtagaaccttttatttactggatactaatcactagtttattgagccaaccaattagttaaatagaacaatcaatattatagccagatattttttcctttaaaaatatttaaaagaggggccagaaaagaaccagagagggaggccatgagacattattatcactagtcaaaaacaacaaaccctccttttgctttttcatataaattattatattttattttgcaggtttcttctcttcttcttcttcttcttcttcttcttcctcttggctgctttctttcatcatccataaagtgaaagctaacgcatagagagagccatatcgtcccaaaaaaagcaaaagtccaaaaaaaaacaactccaaaacattctctcttagctctttactctttagtttctctctctctctctgcctttctctttgttgaagttcatggatgctacgaagtggactcaggtacgtaaaaagatatctctctgctatatctgtttgtttgtagcttctccccgactctcacgctctctctctctctctctctctc! tttgtgtatctctctactcacataaatatatacatgtgtgtgtatgcatgtttatatgtatgtatgaaac 
cagtagtggttatacagatagtctatatagagatatcaatatgatgtgttttaatttagactttttatatatccgtttgaaacttccgaagttctcgaatggagttaaggaagttttgttctctacaagttcaatttttcttgtcattaattataaaactctgataactaatggataaaaaaggtatgctttgttagttaccttttgttcttggtgctcaggtcttaccatttttttcctaaattttaattagtctcctttctttaattaattttatgttaacgcactgacgatttaacgttaacaaaaaaacctagattctttttcttttcaatagagcataattattacttcaatttcatttatctcacactaaaccctaatcttggcgaaattccttttatatatataaatttaattaatttttccacaatcttggcggaattcaggactcggttttgcttgttattgttctctcttttaatttgacatggttagggaatacttaaagtatgtcttaattttatagggttttcaagaaatgataaacgtaaagccaatggagcaaatgatttctagcaccaacaacaacacaccgcaacaacaaccaacattcatcgccaccaacacaaggccaaacgccaccgcatccaatggtggctccggaggaaataccaacaacacggctacgatggaaactagaaaggcgaggccacaagagaaagtaaattgtccaagatgcaactcaacaaacacaaagttctgttattacaacaactacagtctcacgcaaccaagatacttctgcaaaggttgtcgaaggtattggaccgaaggtggctctcttcgtaacgtcccagtcggaggtagctcaagaaagaacaagagatcctctacacctttagcttcaccttctaatcccaaacttccagatctaaacccaccgattcttttctcaagccaaatccctaataagtcaaataaagatc! tcaacttgctatctttcccggtcatgcaagatcatcatcatcatggtatgtctcatttttttcatatgcccaagatagagaacaacaatacttcatcctcaatctatgcttcatcatctcctgtctcagctcttgagcttctaagatccaatggagtctcttcaagaggcatgaacacgttcttgcctggtcaaatgatggattcaaactcagtcctgtactcatctttagggtttccaacaatgcctgattacaaacagagtaataacaacctttcattctccattgatcatcatcaagggattggacataacaccatcaacagtaaccaaagagctcaagataacaatgatgacatgaatggagcaagtagggttttgttccctttttcagacatgaaagagctttcaagcacaacccaagagaagagtcatggtaataatacatattggaatgggatgttcagtaatacaggaggatcttcatggtgaaaaaaggttaaaaagagctcatgaactatcagctttcttctctttttctgtttttttctcctattttattatagtttttactttgatgatcttttgttttttctcacatggggaactttacttaaagttgtcagaacttagtttacagattgtctttttattccttctttctggttttccttttttcctttttttatcagtctttttaaaatatgtatttcataattgggtttgatcattcatatttattagtatcaaaatagagtctatgttcatgagggagtgttaaggggtgtgagggtagaagaataagtgaatacgggggcccg') - seq.entry_id = 'AJ224122' - seq.sequence_version = 3 - seq.topology = 'linear' - seq.molecule_type = 'genomic DNA' - seq.data_class = 'STD' - seq.division = 'PLN' - 
seq.primary_accession = 'AJ224122' - seq.secondary_accessions = [] - seq.date_created = '27-FEB-1998 (Rel. 54, Created)' - seq.date_modified = '14-NOV-2006 (Rel. 89, Last updated, Version 6)' - seq.definition = 'Arabidopsis thaliana DAG1 gene' - seq.keywords = ['BBFa gene', 'transcription factor'] - seq.species = 'Arabidopsis thaliana (thale cress)' - seq.classification = ['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', - 'Spermatophyta', 'Magnoliophyta', 'eudicotyledons', 'core eudicotyledons', 'rosids', - 'eurosids II', 'Brassicales', 'Brassicaceae', 'Arabidopsis'] - - # puts seq.output(:embl) - puts seq.output(:fasta) - - end - - --- 395,396 ---- From ngoto at dev.open-bio.org Thu Mar 27 20:56:29 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 28 Mar 2008 00:56:29 +0000 Subject: [BioRuby-cvs] bioruby/test/functional/bio/sequence test_output_embl.rb, 1.1.2.1, 1.1.2.2 Message-ID: <200803280056.m2S0uTTt022850@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/functional/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv22830/test/functional/bio/sequence Modified Files: Tag: BRANCH-biohackathon2008 test_output_embl.rb Log Message: removed unwanted puts Index: test_output_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/functional/bio/sequence/Attic/test_output_embl.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** test_output_embl.rb 27 Mar 2008 13:38:31 -0000 1.1.2.1 --- test_output_embl.rb 28 Mar 2008 00:56:27 -0000 1.1.2.2 *************** *** 39,43 **** def test_output_embl ! assert_nothing_raised { puts @seq.output(:embl) } end --- 39,43 ---- def test_output_embl ! 
      assert_nothing_raised { @seq.output(:embl) }
    end

From helios at dev.open-bio.org  Tue Mar 25 11:46:37 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Tue, 25 Mar 2008 15:46:37 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql/config - New directory
Message-ID: <200803251546.m2PFkTuN013246@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql/config
In directory dev.open-bio.org:/tmp/cvs-serv13217/lib/bio/io/biosql/config

Log Message:
Directory /home/repository/bioruby/bioruby/lib/bio/io/biosql/config added to the repository

From helios at dev.open-bio.org  Tue Mar 25 11:46:38 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Tue, 25 Mar 2008 15:46:38 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql - New directory
Message-ID: <200803251546.m2PFkS7R013241@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql
In directory dev.open-bio.org:/tmp/cvs-serv13217/lib/bio/db/biosql

Log Message:
Directory /home/repository/bioruby/bioruby/lib/bio/db/biosql added to the repository
--> Using per-directory sticky tag `BRANCH-biohackathon2008'

From helios at dev.open-bio.org  Tue Mar 25 11:46:40 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Tue, 25 Mar 2008 15:46:40 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql - New directory
Message-ID: <200803251546.m2PFkSDD013243@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql
In directory dev.open-bio.org:/tmp/cvs-serv13217/lib/bio/io/biosql

Log Message:
Directory /home/repository/bioruby/bioruby/lib/bio/io/biosql added to the repository
--> Using per-directory sticky tag `BRANCH-biohackathon2008'

From helios at dev.open-bio.org  Tue Mar 25 11:46:59 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Tue, 25 Mar 2008 15:46:59 -0000
Subject: [BioRuby-cvs] bioruby .project,NONE,1.1.2.1
Message-ID: <200803251546.m2PFkYHd013326@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby
In directory dev.open-bio.org:/tmp/cvs-serv13290

Added Files:
      Tag: BRANCH-biohackathon2008
	.project
Log Message:
BioSQL release "MIFI". biosql->biosequence, biosequence->biosql.
Supported formats: Embl, Genbank; support sql transactions creating new sequences on biosql; does not support references and comments for genbank and embl.
Fasta->biosequence->biosql doesn't work.

--- NEW FILE: .project ---
bioruby org.rubypeople.rdt.core.rubybuilder org.rubypeople.rdt.core.rubynature

From helios at dev.open-bio.org  Tue Mar 25 11:47:01 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Tue, 25 Mar 2008 15:47:01 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io sql.rb,1.8,1.8.2.1
Message-ID: <200803251546.m2PFkYcK013334@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io
In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/io

Modified Files:
      Tag: BRANCH-biohackathon2008
	sql.rb
Log Message:
BioSQL release "MIFI". biosql->biosequence, biosequence->biosql.
Supported formats: Embl, Genbank; support sql transactions creating new sequences on biosql; does not support references and comments for genbank and embl.
Fasta->biosequence->biosql doesn't work.
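The rewritten sql.rb in this commit takes a YAML-style configuration hash in `Bio::SQL.establish_connection` and checks its keys with ActiveSupport's `assert_valid_keys`. As a rough sketch of what that check accepts, the following plain-Ruby version needs no Rails libraries. `validate_configurations` is a hypothetical name, and raising `ArgumentError` for a missing environment is an assumption of this sketch, not behaviour taken from the commit.

```ruby
# Hypothetical stand-in for the key checks done in establish_connection.
# Allowed environment names and connection keys are copied from the diff.
def validate_configurations(configurations, env)
  environments    = %w[development production test]
  connection_keys = %w[database adapter username password]

  unknown_envs = configurations.keys - environments
  unless unknown_envs.empty?
    raise ArgumentError, "unknown environment(s): #{unknown_envs.join(', ')}"
  end

  settings = configurations[env]
  raise ArgumentError, "no configuration for #{env.inspect}" if settings.nil?

  unknown_keys = settings.keys - connection_keys
  unless unknown_keys.empty?
    raise ArgumentError, "unknown key(s): #{unknown_keys.join(', ')}"
  end

  settings
end

# Same hash shape as the example call at the end of sql.rb.
config = {
  'development' => {
    'database' => 'biorails_development',
    'adapter'  => 'postgresql',
    'username' => 'rails',
    'password' => nil
  }
}
settings = validate_configurations(config, 'development')
puts settings['adapter']
```

Note that the keys are strings, as YAML loading would produce, even though the comment in sql.rb shows a hash with symbol keys.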
Index: sql.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/io/sql.rb,v retrieving revision 1.8 retrieving revision 1.8.2.1 diff -C2 -d -r1.8 -r1.8.2.1 *** sql.rb 5 Apr 2007 23:35:41 -0000 1.8 --- sql.rb 25 Mar 2008 15:46:32 -0000 1.8.2.1 *************** *** 1,365 **** - # - # = bio/io/sql.rb - BioSQL access module - # - # Copyright:: Copyright (C) 2002 Toshiaki Katayama - # Copyright:: Copyright (C) 2006 Raoul Jean Pierre Bonnal - # License:: The Ruby License - # - # $Id$ - # - - begin - require 'dbi' - rescue LoadError - end - require 'bio/sequence' - require 'bio/feature' - - - module Bio - - class SQL - - def initialize(db = 'dbi:Mysql:biosql', user = nil, pass = nil) - @dbh = DBI.connect(db, user, pass) - end - - def close - @dbh.disconnect - end - - # Returns Bio::SQL::Sequence object. - def fetch(accession) # or display_id for fall back - query = "select * from bioentry where accession = ?" - entry = @dbh.execute(query, accession).fetch - return Sequence.new(@dbh, entry) if entry - - query = "select * from bioentry where display_id = ?" - entry = @dbh.execute(query, accession).fetch - return Sequence.new(@dbh, entry) if entry - end - alias get_by_id fetch - - - # for lazy fetching - - class Sequence - - def initialize(dbh, entry) - @dbh = dbh - @bioentry_id = entry['bioentry_id'] - @database_id = entry['biodatabase_id'] - @entry_id = entry['display_id'] - @accession = entry['accession'] - @version = entry['entry_version'] - @division = entry['division'] - end - attr_reader :accession, :division, :entry_id, :version - - - def to_fasta - if seq = seq - return seq.to_fasta(@accession) - end - end - - # Returns Bio::Sequence::NA or AA object. - def seq - query = "select * from biosequence where bioentry_id = ?" 
- row = @dbh.execute(query, @bioentry_id).fetch - return unless row - - mol = row['alphabet'] - seq = row['seq'] - - case mol - when /.na/i # 'dna' or 'rna' - Bio::Sequence::NA.new(seq) - else # 'protein' - Bio::Sequence::AA.new(seq) - end - end - - # Returns Bio::Sequence::NA or AA object (by lazy fetching). - def subseq(from, to) - length = to - from + 1 - query = "select alphabet, substring(seq, ?, ?) as subseq" + - " from biosequence where bioentry_id = ?" - row = @dbh.execute(query, from, length, @bioentry_id).fetch - return unless row - - mol = row['alphabet'] - seq = row['subseq'] - - case mol - when /.na/i # 'dna' or 'rna' - Bio::Sequence::NA.new(seq) - else # 'protein' - Bio::Sequence::AA.new(seq) - end - end - - - # Returns Bio::Features object. - def features - array = [] - query = "select * from seqfeature where bioentry_id = ?" - @dbh.execute(query, @bioentry_id).fetch_all.each do |row| - next unless row - - f_id = row['seqfeature_id'] - k_id = row['type_term_id'] - s_id = row['source_term_id'] - rank = row['rank'].to_i - 1 ! # key : type (gene, CDS, ...) ! type = feature_key(k_id) ! ! # source : database (EMBL/GenBank/SwissProt) ! database = feature_source(s_id) ! ! # location : position ! locations = feature_locations(f_id) ! ! # qualifier ! qualifiers = feature_qualifiers(f_id) ! ! # rank ! array[rank] = Bio::Feature.new(type, locations, qualifiers) ! end ! return Bio::Features.new(array) ! end ! ! ! # Returns reference informations in Array of Hash (not Bio::Reference). ! def references ! array = [] ! query = <<-END ! select * from bioentry_reference, reference ! where bioentry_id = ? and ! bioentry_reference.reference_id = reference.reference_id ! END ! @dbh.execute(query, @bioentry_id).fetch_all.each do |row| ! next unless row ! ! hash = { ! 'start' => row['start_pos'], ! 'end' => row['end_pos'], ! 'journal' => row['location'], ! 'title' => row['title'], ! 'authors' => row['authors'], ! 'medline' => row['crc'] ! } ! hash.default = '' ! ! 
rank = row['rank'].to_i - 1 ! array[rank] = hash ! end ! return array ! end ! ! ! # Returns the first comment. For complete comments, use comments method. ! def comment ! query = "select * from comment where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['comment_text'] : '' ! end ! ! # Returns comments in an Array of Strings. ! def comments ! array = [] ! query = "select * from comment where bioentry_id = ?" ! @dbh.execute(query, @bioentry_id).fetch_all.each do |row| ! next unless row ! rank = row['rank'].to_i - 1 ! array[rank] = row['comment_text'] ! end ! return array ! end ! ! def database ! query = "select * from biodatabase where biodatabase_id = ?" ! row = @dbh.execute(query, @database_id).fetch ! row ? row['name'] : '' ! end ! ! def date ! query = "select * from bioentry_date where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['date'] : '' ! end ! ! def dblink ! query = "select * from bioentry_direct_links where source_bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? [row['dbname'], row['accession']] : [] ! end ! ! def definition ! query = "select * from bioentry_description where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['description'] : '' ! end ! ! def keyword ! query = "select * from bioentry_keywords where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['keywords'] : '' ! end ! ! # Use lineage, common_name, ncbi_taxa_id methods to extract in detail. ! def taxonomy ! query = <<-END ! select taxon_name.name, taxon.ncbi_taxon_id from bioentry ! join taxon_name using(taxon_id) join taxon using (taxon_id) ! where bioentry_id = ? ! END ! row = @dbh.execute(query, @bioentry_id).fetch ! # @lineage = row ? row['full_lineage'] : '' ! @common_name = row ? row['name'] : '' ! @ncbi_taxa_id = row ? row['ncbi_taxon_id'] : '' ! row ? [@lineage, @common_name, @ncbi_taxa_id] : [] ! end ! def lineage ! 
taxonomy unless @lineage ! return @lineage ! end - def common_name - taxonomy unless @common_name - return @common_name - end ! def ncbi_taxa_id ! taxonomy unless @ncbi_taxa_id ! return @ncbi_taxa_id end ! ! ! private ! ! def feature_key(k_id) ! query = "select * from term where term_id= ?" ! row = @dbh.execute(query, k_id).fetch ! row ? row['name'] : '' end ! ! def feature_source(s_id) ! query = "select * from term where term_id = ?" ! row = @dbh.execute(query, s_id).fetch ! row ? row['name'] : '' end ! ! def feature_locations(f_id) ! locations = [] ! query = "select * from location where seqfeature_id = ?" ! @dbh.execute(query, f_id).fetch_all.each do |row| ! next unless row ! ! location = Bio::Location.new ! location.strand = row['strand'] ! location.from = row['start_pos'] ! location.to = row['end_pos'] ! ! xref = feature_locations_remote(row['dbxref_if']) ! location.xref_id = xref.shift unless xref.empty? ! ! # just omit fuzzy location for now... ! #feature_locations_qv(row['seqfeature_location_id']) ! ! rank = row['rank'].to_i - 1 ! locations[rank] = location ! end ! return Bio::Locations.new(locations) end ! ! def feature_locations_remote(l_id) ! query = "select * from dbxref where dbxref_id = ?" ! row = @dbh.execute(query, l_id).fetch ! row ? [row['accession'], row['version']] : [] end ! ! def feature_locations_qv(l_id) ! query = "select * from location_qualifier_value where location_id = ?" ! row = @dbh.execute(query, l_id).fetch ! row ? [row['value'], row['int_value']] : [] end ! ! def feature_qualifiers(f_id) ! qualifiers = [] ! query = "select * from seqfeature_qualifier_value where seqfeature_id = ?" ! @dbh.execute(query, f_id).fetch_all.each do |row| ! next unless row ! ! key = feature_qualifiers_key(row['seqfeature_id']) ! value = row['value'] ! qualifier = Bio::Feature::Qualifier.new(key, value) ! ! rank = row['rank'].to_i - 1 ! qualifiers[rank] = qualifier ! end ! return qualifiers.compact # .compact is nasty hack for a while end ! ! 
def feature_qualifiers_key(q_id) ! query = <<-END ! select * from seqfeature_qualifier_value ! join term using(term_id) where seqfeature_id = ? ! END ! row = @dbh.execute(query, q_id).fetch ! row ? row['name'] : '' end ! end ! ! end # SQL ! ! end # Bio ! if __FILE__ == $0 ! begin ! require 'pp' ! alias p pp ! rescue LoadError end ! ! db = ARGV.empty? ? 'dbi:Mysql:database=biosql;host=localhost' : ARGV.shift ! serv = Bio::SQL.new(db, 'root') ! ! ent0 = serv.fetch('X76706') ! ent0 = serv.fetch('A15H9FIB') ! ent1 = serv.fetch('J01902') ! ent2 = serv.fetch('X04311') ! ! pp ent0.features ! pp ent0.references ! ! pp ent1.seq ! pp ent1.seq.translate ! pp ent1.seq.gc ! pp ent1.subseq(1,20) ! ! pp ent2.accession ! pp ent2.comment ! pp ent2.comments ! pp ent2.common_name ! pp ent2.database ! pp ent2.date ! pp ent2.dblink ! pp ent2.definition ! pp ent2.division ! pp ent2.entry_id ! pp ent2.features ! pp ent2.keyword ! pp ent2.lineage ! pp ent2.ncbi_taxa_id ! pp ent2.references ! pp ent2.seq ! pp ent2.subseq(1,10) ! pp ent2.taxonomy ! pp ent2.version ! end - --- 1,145 ---- ! require 'rubygems' ! require 'erb' ! require 'composite_primary_keys' ! # BiosqlPlug ! =begin ! Ok Hilmar gives to me some clarification ! 1) "EMBL/GenBank/SwissProt" name in term table, is only a convention assuming data loaded by genbank embl ans swissprot formats. ! If your features come from others ways for example blast or alignment ... whatever.. the user as to take care about the source. ! =end ! =begin ! TODO: ! 1) source_term_id => surce_term and check before if the source term is present or not and the level, the root should always be something "EMBL/GenBank/SwissProt" or contestualized. ! 2) Into DummyBase class delete connection there and use Bio::ArSQL.establish_connection which reads info from a yml file. ! 3) Chk Locations in Biofeatures ArSQL ! =end ! module Bio ! class SQL ! #no check is made ! def self.establish_connection(configurations, env) ! 
#configurations is an hash similar what YAML returns. ! #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil} ! configurations.assert_valid_keys('development', 'production','test') ! configurations[env].assert_valid_keys('database','adapter','username','password') ! DummyBase.configurations = configurations ! DummyBase.establish_connection "#{env}" end ! ! def self.fetch_id(id) ! Bio::SQL::Bioentry.find(id) end ! ! def self.fetch_accession(accession) ! accession.upcase! ! Bio::SQL::Bioentry.exists?(:accession => accession) ? Bio::SQL::Sequence.new(:entry=>Bio::SQL::Bioentry.find_by_accession(accession)) : nil end ! ! def self.exists_accession(accession) ! Bio::SQL::Bioentry.find_by_accession(accession.upcase).nil? ? false : true end ! ! def self.list_entries ! Bio::SQL::Bioentry.find(:all).collect{|entry| ! {:id=>entry.bioentry_id, :accession=>entry.accession} ! } end ! ! def self.list_databases ! Bio::SQL::Biodatabase.find(:all).collect{|entry| ! {:id=>entry.biodatabase_id, :name => entry.name} ! } end ! ! def self.delete_entry_id(id) ! Bioentry.delete(id) end ! ! def self.delete_entry_accession(accession) ! Bioentry.delete(Bioentry.find_by_accession(accession)) end ! ! ! class DummyBase < ActiveRecord::Base ! #NOTE: Using postgresql, not setting sequence name, system will discover the name by default. ! #NOTE: this class will not establish the connection automatically ! self.abstract_class = true ! self.pluralize_table_names = false ! #prepend table name to the usual id, avoid to specify primary id for every table ! self.primary_key_prefix_type = :table_name_with_underscore ! #biosql_configurations=YAML::load(ERB.new(IO.read(File.join(File.dirname(__FILE__),'../config', 'database.yml'))).result) ! #self.configurations=biosql_configurations ! #self.establish_connection "development" ! end #DummyBase ! ! autoload :Biodatabase, 'bio/io/biosql/biodatabase' ! autoload :Bioentry, 'bio/io/biosql/bioentry' ! 
autoload :BioentryDbxref, 'bio/io/biosql/bioentry_dbxref' ! autoload :BioentryPath, 'bio/io/biosql/bioentry_path' ! autoload :BioentryQualifierValue, 'bio/io/biosql/bioentry_qualifier_value' ! autoload :BioentryReference, 'bio/io/biosql/bioentry_reference' ! autoload :BioentryRelationship, 'bio/io/biosql/bioentry_relationship' ! autoload :Biosequence, 'bio/io/biosql/biosequence' ! autoload :Comment, 'bio/io/biosql/comment' ! autoload :Dbxref, 'bio/io/biosql/dbxref' ! autoload :DbxrefQualifierValue, 'bio/io/biosql/dbxref_qualifier_value' ! autoload :Location, 'bio/io/biosql/location' ! autoload :LocationQualifierValue, 'bio/io/biosql/location_qualifier_value' ! autoload :Ontology, 'bio/io/biosql/ontology' ! autoload :Reference, 'bio/io/biosql/reference' ! autoload :Seqfeature, 'bio/io/biosql/seqfeature' ! autoload :SeqfeatureDbxref, 'bio/io/biosql/seqfeature_dbxref' ! autoload :SeqfeaturePath, 'bio/io/biosql/seqfeature_path' ! autoload :SeqfeatureQualifierValue, 'bio/io/biosql/seqfeature_qualifier_value' ! autoload :SeqfeatureRelationship, 'bio/io/biosql/seqfeature_relationship' ! autoload :Taxon, 'bio/io/biosql/taxon' ! autoload :TaxonName, 'bio/io/biosql/taxon_name' ! autoload :Term, 'bio/io/biosql/term' ! autoload :TermDbxref, 'bio/io/biosql/term_dbxref' ! autoload :TermPath, 'bio/io/biosql/term_path' ! autoload :TermRelationship, 'bio/io/biosql/term_relationship' ! autoload :TermRelationshipTerm, 'bio/io/biosql/term_relationship_term' ! autoload :Sequence, 'bio/db/biosql/sequence' ! end #biosql ! ! end #Bio if __FILE__ == $0 ! require 'rubygems' ! require 'composite_primary_keys' ! require 'bio' ! require 'pp' ! ! # pp connection = Bio::SQL.establish_connection('bio/io/biosql/config/database.yml','development') ! pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development') ! 
#pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result) ! ! if nil ! pp Bio::SQL.list_entries ! bioseq = Bio::SQL.fetch_accession('AJ224122') ! pp bioseq ! pp bioseq.entry_id ! #TODO create a test only for tables not sequence here ! pp bioseq.molecule_type ! #pp bioseq.molecule_type.class ! #bioseq.molecule_type_update('dna', 1) ! pp Bio::SQL::Taxon.find(8121).taxon_names end ! #pp bioseq.molecule_type ! #term = Bio::SQL::Term.find_by_name('mol_type') ! #pp term ! #pp bioseq.entry.bioentry_qualifier_values.create(:term=>term, :rank=>2, :value=>'pippo') ! #pp bioseq.entry.bioentry_qualifier_values.inspect ! #pp bioseq.entry.bioentry_qualifier_values.find_all_by_term_id(26) ! #pp primo.class ! # pp primo.value='dna' ! # pp primo.save ! #pp bioseq.molecule_type= 'prova' ! ! #Bio::SQL::BioentryQualifierValue.delete(delete.bioentry_id,delete.term_id,delete.rank) ! ! end From helios at dev.open-bio.org Tue Mar 25 11:47:04 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:47:04 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql sequence.rb,NONE,1.1.2.1 Message-ID: <200803251546.m2PFkYrY013322@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/db/biosql Added Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: Embl, Genbank; support sql stransactions creating new sequences on biosql; does not support references and comments for genbank and embl. Fasta->biosequence->biosql dosn't work. 
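The sequence.rb file added in this commit builds its qualifier accessors with a class-level macro, `bioentry_qualifier_anchor`, which defines a reader, a `name=` writer that appends with the next rank, and a `name_update(value, rank)` method. The sketch below shows the same pattern in plain Ruby. `QualifierStore`, `qualifier_anchor`, and the in-memory `@rows` array are hypothetical stand-ins for the ActiveRecord models and the bioentry_qualifier_value table.

```ruby
# Plain-Ruby sketch of the accessor-generating macro; no database involved.
class QualifierStore
  def self.qualifier_anchor(sym, options = {})
    term = (options[:synonym] || sym).to_s

    # Reader: every value stored under the term, ordered by rank.
    define_method(sym) do
      rows_for(term).map { |row| row[:value] }
    end

    # Writer: append a value, using rank 1 first and rank.succ afterwards.
    define_method("#{sym}=") do |value|
      rows = rows_for(term)
      rank = rows.empty? ? 1 : rows.last[:rank].succ
      @rows << { term: term, rank: rank, value: value }
    end

    # Updater: overwrite the value at an existing rank, otherwise append.
    define_method("#{sym}_update") do |value, rank|
      row = rows_for(term).find { |r| r[:rank] == rank }
      if row
        row[:value] = value
      else
        send("#{sym}=", value)
      end
    end
  end

  def initialize
    @rows = []
  end

  qualifier_anchor :keywords, synonym: 'keyword'
  qualifier_anchor :topology

  private

  def rows_for(term)
    @rows.select { |row| row[:term] == term }.sort_by { |row| row[:rank] }
  end
end

store = QualifierStore.new
store.keywords = 'BBFa gene'
store.keywords = 'transcription factor'
store.keywords_update('DAG1', 2)
p store.keywords
```

Because the writer appends rather than replaces, `to_biosql` in the commit can store an array such as `bs.keywords` by assigning each element in turn.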
--- NEW FILE: sequence.rb --- #TODO save on db reading from a genbank or embl object module Bio class SQL class Sequence private # example # bioentry_qualifier_anchor :molecule_type, :synonym=>'mol_type' # this function creates other 3 functions, molecule_type, molecule_type=, molecule_type_update #molecule_type => return an array of strings, where each string is the value associated with the qualifier, ordered by rank. #molecule_type=value add a bioentry_qualifier value to the table #molecule_type_update(value, rank) update an entry of the table with an existing rank #the method inferr the qualifier term from the name of the first symbol, or you can specify a synonym to use #creating an object with to_biosql is transaction safe. #TODO: implement setting for more than a qualifier-vale. def self.bioentry_qualifier_anchor(sym, *args) options = args.first || Hash.new #options.assert_valid_keys(:rank,:synonym,:multi) method_reader = sym.to_s.to_sym method_writer_operator = (sym.to_s+"=").to_sym method_writer_modder = (sym.to_s+"_update").to_sym synonym = options[:synonym].nil? ? sym.to_s : options[:synonym] #Bio::SQL::Term.create(:name=>synonym, :ontology=> Bio::SQL::Ontology.find_by_name('Annotation Tags')) unless Bio::SQL::Term.exists?(:name =>synonym) send :define_method, method_reader do #return an array of bioentry_qualifier_values begin term = Term.find_or_create_by_name(:name => synonym, :ontology=> Ontology.find_by_name('Annotation Tags')) bioentry_qualifier_values = @entry.bioentry_qualifier_values.find_all_by_term_id(term) bioentry_qualifier_values.map{|row| row.value} unless bioentry_qualifier_values.nil? 
rescue Exception => e puts "Reader Error: #{synonym} #{e.message}" end end send :define_method, method_writer_operator do |value| begin term = Term.find_or_create_by_name(:name => synonym, :ontology=> Ontology.find_by_name('Annotation Tags')) datas = @entry.bioentry_qualifier_values.find_all_by_term_id(term.term_id) #add an element incrementing the rank or setting the first to 1 @entry.bioentry_qualifier_values.create(:term_id=>term.term_id, :rank=>datas.empty? ? 1 : datas.last.rank.succ, :value=>value) rescue Exception => e puts "WriterOperator= Error: #{synonym} #{e.message}" end end send :define_method, method_writer_modder do |value, rank| begin term = Term.find_or_create_by_name(:name => synonym, :ontology=> Ontology.find_by_name('Annotation Tags')) data = @entry.bioentry_qualifier_values.find_by_term_id_and_rank(term.term_id, rank) if data.nil? send method_writer_operator, value else data.value=value data.save! end rescue Exception => e puts "WriterModder Error: #{synonym} #{e.message}" end end end public attr_reader :entry def delete @entry.destroy end def get_seqfeature(sf) #in seqfeature BioSQL class locations_str = sf.locations.map{|loc| loc.to_s}.join(',') #pp sf.locations.inspect locations_str = "join(#{locations_str})" if sf.locations.count>1 Bio::Feature.new(sf.type_term.name, locations_str,sf.seqfeature_qualifier_values.collect{|sfqv| Bio::Feature::Qualifier.new(sfqv.term.name,sfqv.value)}) end def length=(len) @entry.biosequence.length=len end def initialize(options={}) options.assert_valid_keys(:entry, :biodatabase_id,:biosequence) return @entry = options[:entry] unless options[:entry].nil? return to_biosql(options[:biosequence], options[:biodatabase_id]) unless options[:biosequence].nil? or options[:biodatabase_id].nil? end def to_biosql(bs,biodatabase_id) #Transcaction works greatly!!! 
# begin Bioentry.transaction do @entry = Bioentry.new(:biodatabase_id=>biodatabase_id, :name=>bs.entry_id) # pp "primary" self.primary_accession = bs.primary_accession # pp "def" self.definition = bs.definition unless bs.definition.nil? # pp "seqver" self.sequence_version = bs.sequence_version # pp "divi" self.division = bs.division unless bs.division.nil? @entry.save! # pp "secacc" bs.secondary_accessions.each do |sa| #write as qualifier every secondary accession into the array self.secondary_accessions = sa end #to create the sequence entry needs to exists # pp "seq" self.seq = bs.seq unless bs.seq.nil? # pp "mol" self.molecule_type = bs.molecule_type unless bs.molecule_type.nil? # pp "dc" self.data_class = bs.data_class unless bs.data_class.nil? # pp "top" self.topology = bs.topology unless bs.topology.nil? # pp "datec" self.date_created = bs.date_created unless bs.date_created.nil? # pp "datemod" self.date_modified = bs.date_modified unless bs.date_modified.nil? # pp "key" bs.keywords.each do |kw| #write as qualifier every secondary accessions into the array self.keywords = kw end #FIX: problem settinf texon_name: embl has "Arabidopsis thaliana (thale cress)" but in taxon_name table there isn't this name. I must check if there is a new version of the table #pp "spec" self.species = bs.species unless bs.species.nil? # pp "Debug: #{bs.species}" # pp "feat" bs.features.each do |feat| self.feature=feat end #TODO: add comments and references end #transaction return self rescue Exception => e pp "to_biosql exception: #{e}" end end #to_biosql def name @entry.name end alias entry_id name def name=(value) @entry.name=value end alias entry_id= name= def primary_accession @entry.accession end def primary_accession=(value) @entry.accession=value end #TODO def secondary_accession # @entry.bioentry_qualifier_values # end def organism @entry.taxon.nil? ? 
"" : @entry.taxon.taxon_scientific_name.name end alias species organism def organism=(value) taxon_name=TaxonName.find_by_name_and_name_class(value,'scientific name') if taxon_name.nil? puts "Error value doesn't exists in taxon_name table with scientific name constraint." else @entry.taxon_id=taxon_name.taxon_id @entry.save! end end alias species= organism= def database @entry.biodatabase.name end def database_desc @entry.biodatabase.description end def version @entry.version end alias sequence_version version def version=(value) @entry.version=value end alias sequence_version= version= def division @entry.division end def division=(value) @entry.division=value end def description @entry.description end alias definition description def description=(value) @entry.description=value end alias definition= description= def identifier @entry.identifier end def identifier=(value) @entry.identifier=value end bioentry_qualifier_anchor :data_class bioentry_qualifier_anchor :molecule_type, :synonym=>'mol_type' bioentry_qualifier_anchor :topology bioentry_qualifier_anchor :date_created bioentry_qualifier_anchor :date_modified, :synonym=>'date_changed' bioentry_qualifier_anchor :keywords, :synonym=>'keyword' bioentry_qualifier_anchor :secondary_accessions, :synonym=>'secondary_accession' def features Bio::Features.new(@entry.seqfeatures.collect {|sf| self.get_seqfeature(sf)}) end def feature=(feat) #TODO: fix ontology_id and source_term_id type_term = Term.find_or_create_by_name(:name=>feat.feature, :ontology_id=>1) seqfeature = Seqfeature.new(:bioentry=>@entry, :source_term_id=>2, :type_term=>type_term, :rank=>@entry.seqfeatures.count.succ, :display_name=>'') seqfeature.save! feat.locations.each do |loc| location = Location.new(:seqfeature=>seqfeature, :start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand, :rank=>seqfeature.locations.count.succ) location.save! 
end feat.each do |qualifier| qual_term = Term.find_or_create_by_name(:name=>qualifier.qualifier, :ontology_id=>3) qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>qual_term, :value=>qualifier.value, :rank=>seqfeature.seqfeature_qualifier_values.count.succ) qual.save! end end def seq Bio::Sequence.auto(@entry.biosequence.seq) unless @entry.biosequence.nil? end def seq=(value) #check which type of alphabet it is, NU/NA/nil #could value be nil? I think not. if @entry.biosequence.nil? @entry.biosequence = Biosequence.new(:seq=>value) @entry.biosequence.save! else @entry.biosequence.seq=value end self.length=value.length end def taxonomy tax = [] taxon = @entry.taxon while taxon and taxon.taxon_id != taxon.parent_taxon_id tax << taxon.taxon_scientific_name.name #Note: I don't like this call very much, correct with a relationship in the ref class. taxon = Taxon.find(taxon.parent_taxon_id) end tax.reverse end def length @entry.biosequence.length end def references #returns an array of hashes; each hash has these keys ["title", "dbxref_id", "reference_id", "authors", "crc", "location"] #it would probably be better to write a reference class to collect this information @entry.bioentry_references.collect do |ref| hash = Hash.new hash['authors'] = ref.reference.authors hash['title'] = ref.reference.title hash['embl_gb_record_number'] = ref.reference.rank #about location/journal take a look at Hilmar's schema overview. #TODO: solve the problem with specific comment per reference. #TODO: get dbxref hash['journal'] = ref.reference.location hash['xrefs'] = "#{ref.reference.dbxref.dbname}; #{ref.reference.dbxref.accession}." 
Bio::Reference.new(hash) end end def comments @entry.comments.map do |comment| comment.comment_text end end def save #I should add checks for SQL errors @entry.biosequence.save @entry.save end def to_fasta #these used to be 2 prints to stdout; better to return a string so that the caller can do whatever they want with it #print ">" + accession + "\n" #print seq.gsub(Regexp.new(".{1,#{60}}"), "\\0\n") ">" + accession + "\n" + seq.gsub(Regexp.new(".{1,#{60}}"), "\\0\n") end def to_fasta_reverse_complememt ">" + accession + "\n" + seq.reverse_complement.gsub(Regexp.new(".{1,#{60}}"), "\\0\n") end # converts Bio::SQL::Sequence to Bio::Sequence # --- # *Arguments*: # *Returns*:: Bio::Sequence object #TODO: def to_biosequence # sequence = Bio::Sequence.new(seq) # sequence.entry_id = entry_id # # sequence.primary_accession = accession # sequence.secondary_accessions = accession # # sequence.molecule_type = natype # sequence.division = division # sequence.topology = circular # # sequence.sequence_version = version # #sequence.date_created = nil #???? # sequence.date_modified = date # # sequence.definition = definition # sequence.keywords = keywords # sequence.species = organism # sequence.classification = self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/) # #sequence.organnella = nil # not used # sequence.comments = comment # sequence.references = references # sequence.features = features # return sequence # end # # def load_fasta(entry, biodatabase) # result=nil # # if !entry.accession.nil? then # ## pp biodatabase # begin # Bioentry.transaction do # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.accession, :accession=>entry.accession, \ # :description=>entry.definition, :version=>0) # # # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.accession, :accession=>entry.accession, \ # # :description=>entry.definition, :version=>entry.acc_version.split(/\./).last, :identifier=>entry.gi) # ## pp bioentry # bioentry.save! 
# result=bioentry # begin # Biosequence.transaction do # bioentry.biosequence = Biosequence.new(:seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>'') # bioentry.biosequence.save! # end #Bioseqence.transaction # rescue Exception => exc # puts "Error Biosequence: #{exc.message}" # end #Rescue Biosequence # end #Bioentry.transaction # rescue ActiveRecord::RecordInvalid => e # puts "Error: Transaction Aborted on class #{e.record.class}, table #{e.record.class.table_name} due to:" # e.record.errors.each{|att, msg| # puts "#{att} => #{msg}" # } # rescue Exception => exc # puts "Errore Bioentry: #{exc.message}" # end #Resce Bioentry # # end #entry chk # return result # end #load_fasta # # def load_gb(entry, biodatabase) # ## pp biodatabase # result=nil # # begin # Bioentry.transaction do # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division, \ # :description=>entry.definition, :version=>entry.version, :identifier=>entry.gi.split(/:/).last.to_i) # ## pp bioentry # bioentry.save! # # result=bioentry # # # end #Bioentry.transaction # ##debug pp ["Bioentry", [:name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division, # ## :description=>entry.definition, :version=>entry.version, :identifier=>entry.gi.split(/:/).last.to_i]] # # #delete biodatabase.bioentries << bioentry # #note Alphabet not defined # # begin # rank_comment=1 # Comment.transaction do # if !entry.comment.empty? then # bioentry.comment = Comment.new(:comment_text=>entry.comment, :rank=>rank_comment) # bioentry.comment.save! # rank_comment=rank_comment.next # end # end #Comment.transaction # rescue Exception => exc # puts "Error Comment: #{exc.message}" # end #Rescue Command # #debug pp "Comment" # ##debug pp ["Comment", [:comment_text=>entry.comment]] if !entry.comment.empty? 
# begin # Biosequence.transaction do # bioentry.biosequence = Biosequence.new(:seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>'') # bioentry.biosequence.save! # end #Bioseqence.transaction # rescue Exception => exc # puts "Error Biosequence: #{exc.message}" # end #Rescue Biosequence # #debug pp "Biosequence" # ##debug pp ["Biosequence", :seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>''] # begin # rank_seqfeature=1 # Seqfeature.transaction do # entry.features.each do |feature| # #note Rank default to ZERO, display_name String empty # #note Chek if exists term name ##delete puts "Feature #{feature.inspect}" ##delete puts "FeatureFeature #{feature.feature.inspect}" # # type_term = Term.exists?(:name=>feature.feature) ? Term.find_by_name(feature.feature) : Term.create!(:name=>feature.feature, :ontology_id=>1) # # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>rank_seqfeature, :display_name=>'') ##delete puts "Type Term #{type_term.inspect}" # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :type_term=>type_term, :rank=>rank_seqfeature, :display_name=>'') ##delete puts "Seqfeature #{seqfeature.inspect}" # seqfeature.save! # ##debug pp ["Seqfeature", [:source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>0, :display_name=>'']] # begin # Location.transaction do # feature.locations.each do |loc| # location = Location.new(:seqfeature=>seqfeature, :start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand) # location.save! 
# ##debug pp ["Location",[:start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand]] # end #locations # end #Location.transaction # rescue Exception => exc # puts "Error Location: #{exc.message}" # end #Rescue Location # #debug pp "Locations" # #delete bioentry.seqfeatures << seqfeature ##delete if nil # begin # rank_seqfeaturequalifiervalue=0 # rank_qual_qualifier="" # SeqfeatureQualifierValue.transaction do # feature.each do |qual| # # #gestisce il livello dei qualificatori... # if (rank_qual_qualifier==qual.qualifier) then # rank_seqfeaturequalifiervalue=rank_seqfeaturequalifiervalue.next # else # rank_seqfeaturequalifiervalue=1 # rank_qual_qualifier=qual.qualifier # end # # ##debug pp ["SeqfeatureQualifierValue", qual.qualifier, [ :term=>Term.find_by_name(qual.qualifier), :value=>qual.value]] # term = Term.exists?(:name=>qual.qualifier) ? Term.find_by_name(qual.qualifier) : Term.create!(:name=>qual.qualifier, :ontology_id=>3) # # # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>Term.find_by_name(qual.qualifier), :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>term, :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # qual.save! 
# end #qualifiers # end #SeqfeatureQualifierValue.transaction # rescue Exception => exc # puts "Error SeqfeatureQualifierValue: #{exc.message}" # end #Rescue SeqfeatureQualifierValue ###delete end #debug if nil # #debug pp "SeqfeatureQualifierValue" # rank_seqfeature=rank_seqfeature.next # end #features # end #Seqfeature.transaction # rescue Exception => exc # puts "Error Seqfeature: #{exc.message}" # end #Rescue Seqfeature # # end #Bioentry.transaction # rescue ActiveRecord::RecordInvalid => e # puts "Error: Transaction Aborted on class #{e.record.class}, table #{e.record.class.table_name} due to:" # e.record.errors.each{|att, msg| # puts "#{att} => #{msg}" # } # rescue Exception => exc # puts "Errore Bioentry: #{exc.message}" # end #Resce Bioentry # return result # end #load_gb # # def load_embl(entry, biodatabase) # # # puts biodatabase # result=nil # # begin # Bioentry.transaction do # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division, \ # :description=>entry.definition, :version=>entry.version, :identifier=>entry.entry_id) # # puts bioentry # bioentry.save! # result=bioentry # # # end #Bioentry.transaction # # puts ["Bioentry", [:name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division,\ # # :description=>entry.definition, :version=>entry.version, :identifier=>entry.entry_id]] # # #delete biodatabase.bioentries << bioentry # #note Alphabet not defined # begin # rank_comment=1 # #qui potrebbero essercene di pi?? # Comment.transaction do # if !entry.cc.empty? # bioentry.comment = Comment.new(:comment_text=>entry.cc, :rank=>rank_comment) # bioentry.comment.save! # rank_comment=rank_comment.next # end # end #Comment.transaction # rescue Exception => exc # puts "Error Comment: #{exc.message}" # end #Rescue Command # # puts "Comment" # # puts ["Comment", [:comment_text=>entry.cc]] if !entry.cc.empty? 
# begin # Biosequence.transaction do # bioentry.biosequence = Biosequence.new(:seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>entry.molecule_type) # bioentry.biosequence.save! # end #Bioseqence.transaction # rescue Exception => exc # puts "Error Biosequence: #{exc.message}" # end #Rescue Biosequence # #debug pp "Biosequence" # ##debug pp ["Biosequence", :seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>''] # begin # rank_seqfeature=1 # Seqfeature.transaction do # entry.features.each do |feature| # #note Rank default to ZERO, display_name String empty # #note Chek if exists term name # type_term = Term.exists?(:name=>feature.feature) ? Term.find_by_name(feature.feature) : Term.create!(:name=>feature.feature, :ontology_id=>1) # # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>rank_seqfeature, :display_name=>'') # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :type_term=>type_term, :rank=>rank_seqfeature, :display_name=>'') # seqfeature.save! # ##debug pp ["Seqfeature", [:source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>0, :display_name=>'']] # begin # Location.transaction do # feature.locations.each do |loc| # location = Location.new(:seqfeature=>seqfeature, :start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand) # location.save! # ##debug pp ["Location",[:start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand]] # end #locations # end #Location.transaction # rescue Exception => exc # puts "Error Location: #{exc.message}" # end #Rescue Location # #debug pp "Locations" # #delete bioentry.seqfeatures << seqfeature # begin # rank_seqfeaturequalifiervalue=0 # rank_qual_qualifier="" # SeqfeatureQualifierValue.transaction do # feature.each do |qual| # #gestisce il livello dei qualificatori... 
# if (rank_qual_qualifier==qual.qualifier) then # rank_seqfeaturequalifiervalue=rank_seqfeaturequalifiervalue.next # else # rank_seqfeaturequalifiervalue=1 # rank_qual_qualifier=qual.qualifier # end # # ##debug pp ["SeqfeatureQualifierValue", qual.qualifier, [ :term=>Term.find_by_name(qual.qualifier), :value=>qual.value]] # term = Term.exists?(:name=>qual.qualifier) ? Term.find_by_name(qual.qualifier) : Term.create!(:name=>qual.qualifier, :ontology_id=>3) # # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>Term.find_by_name(qual.qualifier), :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>term, :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # # qual.save! # end #qualifiers # end #SeqfeatureQualifierValue.transaction # rescue Exception => exc # puts "Error SeqfeatureQualifierValue: #{exc.message}" # end #Rescue SeqfeatureQualifierValue # #debug pp "SeqfeatureQualifierValue" # rank_seqfeature=rank_seqfeature.next # end #features # end #Seqfeature.transaction # rescue Exception => exc # puts "Error Seqfeature: #{exc.message}" # end #Rescue Seqfeature # end #Bioentry.transaction # rescue ActiveRecord::RecordInvalid => e # puts "Error: Transaction Aborted on class #{e.record.class}, table #{e.record.class.table_name} due to:" # e.record.errors.each{|att, msg| # puts "#{att} => #{msg}" # } # rescue Exception => exc # puts "Errore Bioentry: #{exc.message}" # end #Resce Bioentry # # return result # end #load_embl def to_biosequence bio_seq = Bio::Sequence.new(seq) bio_seq.entry_id = entry_id bio_seq.primary_accession = primary_accession bio_seq.secondary_accessions = secondary_accessions bio_seq.molecule_type = molecule_type #TODO: identify where is stored data_class in biosql bio_seq.data_class = data_class bio_seq.definition = description bio_seq.topology = topology bio_seq.date_created = date_created bio_seq.date_modified = date_modified bio_seq.division = 
division bio_seq.sequence_version = sequence_version bio_seq.keywords = keywords bio_seq.species = species bio_seq.classification = taxonomy bio_seq.references = references bio_seq.features = features return bio_seq end end #Sequence #gb=Bio::FlatFile.open(Bio::GenBank, "/Development/Projects/Cocco/Data/Riferimenti/Genomi/NC_003098_Cocco_R6.gb") #db=Biodatabase.find_by_name('fake') #gb.each_entry {|entry| Sequence.new(:entry=>entry, :biodatabase=>db)} end #SQL end #Bio #TODO create tests for sequence object, roundtrip of information if __FILE__ == $0 require 'bio' require 'bio/io/sql' require 'pp' # connection = Bio::SQL.establish_connection('bio/io/biosql/config/database.yml','development') connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development') databases = Bio::SQL.list_databases # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl') # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb') parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta') parser.each do |entry| biosequence = entry.to_biosequence result = Bio::SQL::Sequence.new(:biosequence=>biosequence,:biodatabase_id=>databases.first[:id]) unless Bio::SQL.exists_accession(biosequence.primary_accession) if result.nil? pp "The sequence is already present in the biosql database" else # pp "Sequence" puts result.to_biosequence.output(:genbank) #:embl end end #NOTE: I fixed the features and the locations; the references and the comments are still missing. After that I think everything is in place. if false sqlseq = Bio::SQL.fetch_accession('AJ224122') #only output tests. 
pp "Connection" pp connection pp "Seq in dbs" pp Bio::SQL.list_entries #; NC_003098 #pp sqlseq pp sqlseq.entry.inspect pp "sequence" #pp Bio::Sequence.auto(sqlseq.seq) pp "entry_id" pp sqlseq.entry_id pp "primary" pp sqlseq.accession pp "secondary_accessions" pp sqlseq.secondary_accessions pp "molecule type" pp sqlseq.molecule_type pp "data_class" pp sqlseq.data_class pp "division" pp sqlseq.division # NOTE : Topology is not represented in biosql? pp "topology" #TODO: CIRCULAR this at present maps to bioentry_qualifier_value, though there are plans to make it a column in table biosequence. pp sqlseq.topology pp "version" pp sqlseq.version #sequence.date_created = nil #???? pp "date modified" pp sqlseq.date_modified pp "definition" pp sqlseq.definition pp "keywords" pp sqlseq.keywords pp "species" pp sqlseq.organism #sequence.classification = self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/)" pp "classification" pp sqlseq.taxonomy #sequence.organnella = nil # not used pp "comments" pp sqlseq.comments pp "references" pp sqlseq.references pp "features" pp sqlseq.features puts sqlseq.to_biosequence.output(:embl) end ## end From helios at dev.open-bio.org Tue Mar 25 11:47:05 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:47:05 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql/config database.yml, NONE, 1.1.2.1 Message-ID: <200803251546.m2PFkYSk013330@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql/config In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/io/biosql/config Added Files: Tag: BRANCH-biohackathon2008 database.yml Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: Embl, Genbank; supports sql transactions when creating new sequences on biosql; does not support references and comments for genbank and embl. Fasta->biosequence->biosql doesn't work. 
--- NEW FILE: database.yml --- #This is the database configuration specific for BioSQL #User can configure it's db here development: adapter: postgresql database: biorails_development username: rails password: test: adapter: postgresql database: biorails_test username: rails password: production: adapter: postgresql database: biorails_production username: rails password: From helios at dev.open-bio.org Tue Mar 25 11:47:07 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:47:07 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql ontology.rb, NONE, 1.1.2.1 reference.rb, NONE, 1.1.2.1 term_path.rb, NONE, 1.1.2.1 bioentry_dbxref.rb, NONE, 1.1.2.1 biodatabase.rb, NONE, 1.1.2.1 seqfeature.rb, NONE, 1.1.2.1 term_relationship.rb, NONE, 1.1.2.1 location.rb, NONE, 1.1.2.1 seqfeature_path.rb, NONE, 1.1.2.1 bioentry_relationship.rb, NONE, 1.1.2.1 dbxref_qualifier_value.rb, NONE, 1.1.2.1 dbxref.rb, NONE, 1.1.2.1 term_relationship_term.rb, NONE, 1.1.2.1 bioentry_reference.rb, NONE, 1.1.2.1 taxon_name.rb, NONE, 1.1.2.1 bioentry_path.rb, NONE, 1.1.2.1 biosequence.rb, NONE, 1.1.2.1 term.rb, NONE, 1.1.2.1 term_dbxref.rb, NONE, 1.1.2.1 seqfeature_qualifier_value.rb, NONE, 1.1.2.1 bioentry_qualifier_value.rb, NONE, 1.1.2.1 seqfeature_dbxref.rb, NONE, 1.1.2.1 location_qualifier_value.rb, NONE, 1.1.2.1 seqfeature_relationship.rb, NONE, 1.1.2.1 bioentry.rb, NONE, 1.1.2.1 taxon.rb, NONE, 1.1.2.1 comment.rb, NONE, 1.1.2.1 term_synonym.rb, NONE, 1.1.2.1 Message-ID: <200803251546.m2PFkY03013318@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/io/biosql Added Files: Tag: BRANCH-biohackathon2008 ontology.rb reference.rb term_path.rb bioentry_dbxref.rb biodatabase.rb seqfeature.rb term_relationship.rb location.rb seqfeature_path.rb bioentry_relationship.rb dbxref_qualifier_value.rb dbxref.rb term_relationship_term.rb bioentry_reference.rb taxon_name.rb 
bioentry_path.rb biosequence.rb term.rb term_dbxref.rb seqfeature_qualifier_value.rb bioentry_qualifier_value.rb seqfeature_dbxref.rb location_qualifier_value.rb seqfeature_relationship.rb bioentry.rb taxon.rb comment.rb term_synonym.rb Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: Embl, Genbank; supports sql transactions when creating new sequences on biosql; does not support references and comments for genbank and embl. Fasta->biosequence->biosql doesn't work. --- NEW FILE: location.rb --- module Bio class SQL class Location < DummyBase #set_sequence_name "location_pk_seq" belongs_to :seqfeature belongs_to :dbxref belongs_to :term has_many :location_qualifier_values def to_s if strand==-1 str="complement("+start_pos.to_s+".."+end_pos.to_s+")" else str=start_pos.to_s+".."+end_pos.to_s end return str end end end #SQL end #Bio --- NEW FILE: bioentry_reference.rb --- module Bio class SQL class BioentryReference < DummyBase set_primary_key :bioentry_reference_id belongs_to :bioentry belongs_to :reference end end #SQL end #Bio --- NEW FILE: bioentry_qualifier_value.rb --- module Bio class SQL class BioentryQualifierValue < DummyBase #NOTE: added rank to primary_keys, now it's finished. 
set_primary_keys :bioentry_id, :term_id, :rank belongs_to :bioentry belongs_to :term end #BioentryQualifierValue end #SQL end #Bio --- NEW FILE: biosequence.rb --- module Bio class SQL class Biosequence < DummyBase set_primary_key "bioentry_id" #delete set_sequence_name "biosequence_pk_seq" belongs_to :bioentry end end #SQL end #Bio --- NEW FILE: term.rb --- module Bio class SQL class Term < DummyBase set_sequence_name "term_pk_seq" belongs_to :ontology has_many :seqfeature_qualifier_values, :class_name => "SeqfeatureQualifierValue" has_many :dbxref_qualifier_values, :class_name => "DbxrefQualifierValue" has_many :bioentry_qualifier_values, :class_name => "BioentryQualifierValue" has_many :bioentries, :through=>:bioentry_qualifier_values has_many :locations, :class_name => "Location" has_many :seqfeature_relationships, :class_name => "SeqfeatureRelationship" has_many :term_dbxrefs, :class_name => "TermDbxref" has_many :term_relationship_terms, :class_name => "TermRelationshipTerm" has_many :term_synonyms, :class_name => "TermSynonym" has_many :location_qualifier_values, :class_name => "LocationQualifierValue" has_many :seqfeature_types, :class_name => "Seqfeature", :foreign_key => "type_term_id" has_many :seqfeature_sources, :class_name => "Seqfeature", :foreign_key => "source_term_id" has_many :term_path_subjects, :class_name => "TermPath", :foreign_key => "subject_term_id" has_many :term_path_predicates, :class_name => "TermPath", :foreign_key => "predicate_term_id" has_many :term_path_objects, :class_name => "TermPath", :foreign_key => "object_term_id" has_many :term_relationship_subjects, :class_name => "TermRelationship", :foreign_key =>"subject_term_id" has_many :term_relationship_predicates, :class_name => "TermRelationship", :foreign_key =>"predicate_term_id" has_many :term_relationship_objects, :class_name => "TermRelationship", :foreign_key =>"object_term_id" end end #SQL end #Bio --- NEW FILE: bioentry_relationship.rb --- module Bio class SQL class 
BioentryRelationship < DummyBase #delete set_primary_key "bioentry_relationship_id" set_sequence_name "bioentry_relationship_pk_seq" belongs_to :object_bioentry, :class_name => "Bioentry" belongs_to :subject_bioentry, :class_name => "Bioentry" end end #SQL end #Bio --- NEW FILE: dbxref.rb --- module Bio class SQL class Dbxref < DummyBase #delete set_primary_key "dbxref_id" set_sequence_name "dbxref_pk_seq" has_many :dbxref_qualifier_values, :class_name => "DbxrefQualifierValue" has_many :locations, :class_name => "Location" has_many :references, :class_name=>"Reference" has_many :term_dbxrefs, :class_name => "TermDbxref" has_many :bioentry_dbxrefs, :class_name => "BioentryDbxref" #TODO: check if there is a has_and_belongs_to_many relationship with bioentry, as specified in the schema overview. end end #SQL end #Bio --- NEW FILE: bioentry_path.rb --- module Bio class SQL class BioentryPath < DummyBase set_primary_key nil #delete set_sequence_name nil belongs_to :term #needs to be fixed before proceeding. 
belongs_to :object_bioentry, :class_name=>"Bioentry" belongs_to :subject_bioentry, :class_name=>"Bioentry" end #BioentryPath end #SQL end #Bio --- NEW FILE: term_dbxref.rb --- module Bio class SQL class TermDbxref < DummyBase set_primary_key nil #term_id, dbxref_id #delete set_sequence_name nil belongs_to :term belongs_to :dbxref end end #SQL end #Bio --- NEW FILE: dbxref_qualifier_value.rb --- module Bio class SQL class DbxrefQualifierValue < DummyBase #consider using a composite primary key set_primary_key nil #dbxref_id, term_id, rank #delete set_sequence_name nil belongs_to :dbxref belongs_to :term end end #SQL end #Bio --- NEW FILE: seqfeature_dbxref.rb --- module Bio class SQL class SeqfeatureDbxref < DummyBase set_primary_key nil #seqfeature_id, dbxref_id #delete set_sequence_name nil belongs_to :seqfeature belongs_to :dbxref end end #SQL end #Bio --- NEW FILE: term_relationship_term.rb --- module Bio class SQL class TermRelationshipTerm < DummyBase #delete set_sequence_name nil set_primary_key :term_relationship_id belongs_to :term_relationship belongs_to :term end end #SQL end #Bio --- NEW FILE: location_qualifier_value.rb --- module Bio class SQL class LocationQualifierValue < DummyBase set_primary_key nil #location_id, term_id #delete set_sequence_name nil belongs_to :location belongs_to :term end end #SQL end #Bio --- NEW FILE: taxon_name.rb --- module Bio class SQL class TaxonName < DummyBase set_primary_keys :taxon_id, :name, :name_class belongs_to :taxon end end #SQL end #Bio --- NEW FILE: seqfeature_relationship.rb --- module Bio class SQL class SeqfeatureRelationship < DummyBase [...] belongs_to :object_seqfeature, :class_name => "Seqfeature" belongs_to :subject_seqfeature, :class_name => "Seqfeature" end end #SQL end #Bio --- NEW FILE: term_path.rb --- module Bio class SQL class TermPath < DummyBase set_sequence_name "term_path_pk_seq" belongs_to :ontology belongs_to :subject_term, :class_name => "Term" belongs_to :object_term, :class_name => "Term" belongs_to :predicate_term, :class_name => "Term" end end #SQL 
end #Bio --- NEW FILE: ontology.rb --- module Bio class SQL class Ontology < DummyBase #delete set_primary_key "ontology_id" set_sequence_name "ontology_pk_seq" has_many :terms has_many :term_paths has_many :term_relationships end end #SQL end #Bio --- NEW FILE: term_synonym.rb --- module Bio class SQL class TermSynonym < DummyBase #delete set_sequence_name nil set_primary_key nil belongs_to :term end end #SQL end #Bio --- NEW FILE: seqfeature_qualifier_value.rb --- module Bio class SQL class SeqfeatureQualifierValue < DummyBase set_primary_keys :seqfeature_id, :term_id, :rank set_sequence_name nil belongs_to :seqfeature belongs_to :term end end #SQL end #Bio --- NEW FILE: bioentry.rb --- module Bio class SQL class Bioentry < DummyBase # set_sequence_name "bioentry_pk_seq" belongs_to :biodatabase belongs_to :taxon has_one :biosequence has_many :comments, :class_name =>"Comment", :order =>'rank' has_many :seqfeatures, :order=>'rank' has_many :bioentry_references, :class_name=>"BioentryReference" #, :foreign_key => "bioentry_id" has_many :bioentry_dbxrefs has_many :object_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"object_bioentry_id" #I am not convinced by this; I think it does not work correctly has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id" #I am not convinced by this; I think it does not work correctly has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["terms.name='CDS'"], :include=>"typeterm" has_many :terms, :through=>:bioentry_qualifier_values #NOTE: added order_by for multiple hit and manage ranks correctly has_many :bioentry_qualifier_values, :order=>"bioentry_id,term_id,rank" #required for creation: #name, accession, version # validates_uniqueness_of :accession, :scope=>[:biodatabase_id] # validates_uniqueness_of :name, :scope=>[:biodatabase_id] # validates_uniqueness_of :identifier, :scope=>[:biodatabase_id] end end 
#SQL end #Bio --- NEW FILE: reference.rb --- module Bio class SQL class Reference < DummyBase belongs_to :dbxref has_many :bioentry_references, :class_name=>"BioentryReference" end end #SQL end #Bio --- NEW FILE: seqfeature.rb --- module Bio class SQL class Seqfeature < DummyBase [...] belongs_to :type_term, :class_name => "Term", :foreign_key => "type_term_id" belongs_to :source_term, :class_name => "Term", :foreign_key =>"source_term_id" has_many :seqfeature_dbxrefs has_many :dbxrefs has_many :seqfeature_qualifier_values, :order=>'rank' has_many :locations, :order=>'rank' has_many :object_seqfeature_paths, :class_name => "SeqfeaturePath", :foreign_key => "object_seqfeature_id" has_many :subject_seqfeature_paths, :class_name => "SeqfeaturePath", :foreign_key => "subject_seqfeature_id" has_many :object_seqfeature_relationships, :class_name => "SeqfeatureRelationship", :foreign_key => "object_seqfeature_id" has_many :subject_seqfeature_relationships, :class_name => "SeqfeatureRelationship", :foreign_key => "subject_seqfeature_id" end end #SQL end #Bio --- NEW FILE: comment.rb --- module Bio class SQL class Comment < DummyBase #delete set_primary_key "comment_id" set_sequence_name "comment_pk_seq" belongs_to :bioentry end end #SQL end #Bio --- NEW FILE: seqfeature_path.rb --- module Bio class SQL class SeqfeaturePath < DummyBase set_primary_key nil set_sequence_name nil belongs_to :object_seqfeature, :class_name => "Seqfeature" belongs_to :subject_seqfeature, :class_name => "Seqfeature" end end #SQL end #Bio --- NEW FILE: bioentry_dbxref.rb --- module Bio class SQL class BioentryDbxref < DummyBase #delete set_sequence_name nil set_primary_key nil #bioentry_id,dbxref_id belongs_to :bioentry belongs_to :dbxref end end #SQL end #Bio --- NEW FILE: term_relationship.rb --- module Bio class SQL class TermRelationship < DummyBase set_sequence_name "term_relationship_pk_seq" belongs_to :ontology belongs_to :subject_term, :class_name => "Term" belongs_to :predicate_term, :class_name => "Term" belongs_to :object_term, :class_name => 
"Term" has_one :term_relationship_term end end #SQL end #Bio --- NEW FILE: taxon.rb --- module Bio class SQL class Taxon < DummyBase set_sequence_name "taxon_pk_seq" has_many :taxon_names, :class_name => "TaxonName" has_one :taxon_scientific_name, :class_name => "TaxonName", :conditions=>"name_class = 'scientific name'" has_one :bioentry end end #SQL end #Bio --- NEW FILE: biodatabase.rb --- module Bio class SQL class Biodatabase < DummyBase #delete set_primary_key "biodatabase_id" set_sequence_name "biodatabase_pk_seq" has_many :bioentries, :class_name =>"Bioentry", :foreign_key => "biodatabase_id" validates_uniqueness_of :name end end #SQL end #Bio From ngoto at dev.open-bio.org Mon Mar 3 18:30:53 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 03 Mar 2008 18:30:53 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank common.rb, 1.11.2.1, 1.11.2.2 genbank.rb, 0.40.2.1, 0.40.2.2 Message-ID: <200803031830.m23IUrSe005148@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv5128/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 common.rb genbank.rb Log Message: * lib/bio/db/genbank/common.rb * accessions method was broken * fixed a bug about embl_gb_record_number and sequence_position in references * lib/bio/db/genbank/genbank.rb * fixed some mistaken variable names in to_biosequence() Index: genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/genbank.rb,v retrieving revision 0.40.2.1 retrieving revision 0.40.2.2 diff -C2 -d -r0.40.2.1 -r0.40.2.2 *** genbank.rb 14 Feb 2008 08:51:45 -0000 0.40.2.1 --- genbank.rb 3 Mar 2008 18:30:50 -0000 0.40.2.2 *************** *** 142,154 **** sequence.sequence_version = self.version ! seq.date_created = nil #???? sequence.date_modified = self.date sequence.keywords = self.keywords sequence.species = self.organism ! 
sequence.classification = self.taxonomy ! sequence.organnella = nil # not used sequence.comments = self.comment sequence.references = self.references return sequence end --- 142,155 ---- sequence.sequence_version = self.version ! #sequence.date_created = nil #???? sequence.date_modified = self.date sequence.keywords = self.keywords sequence.species = self.organism ! sequence.classification = self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/) ! #sequence.organnella = nil # not used sequence.comments = self.comment sequence.references = self.references + sequence.features = self.features return sequence end Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/common.rb,v retrieving revision 1.11.2.1 retrieving revision 1.11.2.2 diff -C2 -d -r1.11.2.1 -r1.11.2.2 *** common.rb 28 Feb 2008 05:54:51 -0000 1.11.2.1 --- common.rb 3 Mar 2008 18:30:50 -0000 1.11.2.2 *************** *** 45,49 **** # ACCESSION -- Returns contents of the ACCESSION record as an Array. def accessions ! accession.split(/\s+/) end --- 45,49 ---- # ACCESSION -- Returns contents of the ACCESSION record as an Array. def accessions ! field_fetch('ACCESSION').strip.split(/\s+/) end *************** *** 141,148 **** subtag2array(ref).each do |field| case tag_get(field) ! when /^\s*REFERENCE\s+(\d+)(\s+\(bases\s+(\d+)\s+to\s+(\d+)\))?/ ! hash['embl_gb_record_number'] = $1.to_i ! if $2 then ! hash['sequence_position'] = "#{$3}-#{$4}" end when /AUTHORS/ --- 141,154 ---- subtag2array(ref).each do |field| case tag_get(field) ! when /REFERENCE/ ! if /(\d+)(\s*\((.+)\))?/m =~ tag_cut(field) then ! hash['embl_gb_record_number'] = $1.to_i ! if $3 and $3 != 'sites' then ! seqpos = $3 ! seqpos.sub!(/\A\s*bases\s+/, '') ! seqpos.gsub!(/(\d+)\s+to\s+(\d+)/, "\\1-\\2") ! seqpos.gsub!(/\s*\;\s*/, ', ') ! hash['sequence_position'] = seqpos ! 
end end when /AUTHORS/ From ngoto at dev.open-bio.org Tue Mar 4 09:22:38 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 09:22:38 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank genbank.rb, 0.40.2.2, 0.40.2.3 Message-ID: <200803040922.m249McN4007026@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7006/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 genbank.rb Log Message: in to_biosequence(), conversion of definition was missing. Index: genbank.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/genbank.rb,v retrieving revision 0.40.2.2 retrieving revision 0.40.2.3 diff -C2 -d -r0.40.2.2 -r0.40.2.3 *** genbank.rb 3 Mar 2008 18:30:50 -0000 0.40.2.2 --- genbank.rb 4 Mar 2008 09:22:35 -0000 0.40.2.3 *************** *** 145,148 **** --- 145,149 ---- sequence.date_modified = self.date + sequence.definition = self.definition sequence.keywords = self.keywords sequence.species = self.organism From ngoto at dev.open-bio.org Tue Mar 4 09:46:12 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 09:46:12 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat - New directory Message-ID: <200803040946.m249kCjw007182@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory dev.open-bio.org:/tmp/cvs-serv7162/lib/bio/compat Log Message: Directory /home/repository/bioruby/bioruby/lib/bio/compat added to the repository --> Using per-directory sticky tag `BRANCH-biohackathon2008' From ngoto at dev.open-bio.org Tue Mar 4 10:07:51 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:07:51 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat references.rb,NONE,1.1.2.1 Message-ID: <200803041007.m24A7p8X007317@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory 
dev.open-bio.org:/tmp/cvs-serv7295/lib/bio/compat Added Files: Tag: BRANCH-biohackathon2008 references.rb Log Message: Bio::References and backward-compatibility module (renamed to Bio::References::BackwardCompatibility) is moved to lib/bio/compat/references.rb --- NEW FILE: references.rb --- # # = bio/compat/references.rb - Obsoleted References class # # Copyright:: Copyright (C) 2008 # Toshiaki Katayama , # Ryan Raaum , # Jan Aerts , # Naohisa Goto # License:: The Ruby License # # $Id: references.rb,v 1.1.2.1 2008/03/04 10:07:49 ngoto Exp $ # # == Description # # The Bio::References class was obsoleted after BioRuby 1.2.1. # To keep compatibility, some wrapper methods are provided in this file. # As the compatibility methods (and Bio::References) will soon be removed, # Please change your code not to use Bio::References. # # Note that Bio::Reference is different from Bio::References. # Bio::Reference still exists for storing a reference information # in sequence entries. module Bio # = DESCRIPTION # # This class is OBSOLETED, and will soon be removed. # Instead of this class, an array is to be used. # # # A container class for Bio::Reference objects. # # = USAGE # # This class should NOT be used. # # refs = Bio::References.new # refs.append(Bio::Reference.new(hash)) # refs.each do |reference| # ... # end # class References # module to keep backward compatibility with obsoleted Bio::References module BackwardCompatibility #:nodoc: # Backward compatibility with Bio::References#references. # Now, references are stored in an array, and # you should change your code not to use this method. def references warn 'Bio::References is obsoleted. Now, references are stored in an array.' self end # Backward compatibility with Bio::References#append. # Now, references are stored in an array, and # you should change your code not to use this method. def append(reference) warn 'Bio::References is obsoleted. Now, references are stored in an array.' 
self.push(reference) if reference.is_a? Reference self end end #module BackwardCompatibility # This method should not be used. # Only for backward compatibility of existing code. # # Since Bio::References is obsoleted, # Bio::References.new not returns Bio::References object, # but modifies given _ary_ and returns the _ary_. # # *Arguments*: # * (optional) __: Array of Bio::Reference objects # *Returns*:: the given array def self.new(ary = []) warn 'Bio::References is obsoleted. Some methods are added to given array to keep backward compatibility.' ary.extend(BackwardCompatibility) ary end # Array of Bio::Reference objects attr_accessor :references # Normally, users can not call this method. # # Create a new Bio::References object # # refs = Bio::References.new # --- # *Arguments*: # * (optional) __: Array of Bio::Reference objects # *Returns*:: Bio::References object def initialize(ary = []) @references = ary end # Add a Bio::Reference object to the container. # # refs.append(reference) # --- # *Arguments*: # * (required) _reference_: Bio::Reference object # *Returns*:: current Bio::References object def append(reference) @references.push(reference) if reference.is_a? Reference return self end # Iterate through Bio::Reference objects. # # refs.each do |reference| # ... 
# end # --- # *Block*:: yields each Bio::Reference object def each @references.each do |reference| yield reference end end end #class References end #module Bio From ngoto at dev.open-bio.org Tue Mar 4 10:07:51 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:07:51 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.3,1.24.2.4 Message-ID: <200803041007.m24A7pT0007322@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv7295/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 reference.rb Log Message: Bio::References and backward-compatibility module (renamed to Bio::References::BackwardCompatibility) is moved to lib/bio/compat/references.rb Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.24.2.3 retrieving revision 1.24.2.4 diff -C2 -d -r1.24.2.3 -r1.24.2.4 *** reference.rb 28 Feb 2008 05:51:03 -0000 1.24.2.3 --- reference.rb 4 Mar 2008 10:07:49 -0000 1.24.2.4 *************** *** 578,680 **** end - # = DESCRIPTION - # - # This class is OBSOLETED, and will soon be removed. - # Instead of this class, an array is to be used. - # - # - # A container class for Bio::Reference objects. - # - # = USAGE - # - # This class should NOT be used. - # - # refs = Bio::References.new - # refs.append(Bio::Reference.new(hash)) - # refs.each do |reference| - # ... - # end - # - class References - - # module to keep backward compatibility with obsoleted Bio::References - module BackwardCompatibilityForBioReferences #:nodoc: - - # Backward compatibility with Bio::References#references. - # Now, references are stored in an array, and - # you should change your code not to use this method. - def references - warn 'Bio::References is obsoleted. Now, references are stored in an array.' - self - end - - # Backward compatibility with Bio::References#append. 
- # Now, references are stored in an array, and - # you should change your code not to use this method. - def append(reference) - warn 'Bio::References is obsoleted. Now, references are stored in an array.' - self.push(reference) if reference.is_a? Reference - self - end - end #module BackwardCompatibilityForBioReferences - - # This method should not be used. - # Only for backward compatibility of existing code. - # - # Since Bio::References is obsoleted, - # Bio::References.new not returns Bio::References object, - # but modifies given _ary_ and returns the _ary_. - # - # *Arguments*: - # * (optional) __: Array of Bio::Reference objects - # *Returns*:: the given array - def self.new(ary = []) - warn 'Bio::References is obsoleted. Some methods are added to given array to keep backward compatibility.' - ary.extend(BackwardCompatibilityForBioReferences) - ary - end - - # Array of Bio::Reference objects - attr_accessor :references - - # Create a new Bio::References object - # - # refs = Bio::References.new - # --- - # *Arguments*: - # * (optional) __: Array of Bio::Reference objects - # *Returns*:: Bio::References object - def initialize(ary = []) - @references = ary - end - - - # Add a Bio::Reference object to the container. - # - # refs.append(reference) - # --- - # *Arguments*: - # * (required) _reference_: Bio::Reference object - # *Returns*:: current Bio::References object - def append(reference) - @references.push(reference) if reference.is_a? Reference - return self - end - - # Iterate through Bio::Reference objects. - # - # refs.each do |reference| - # ... 
- # end - # --- - # *Block*:: yields each Bio::Reference object - def each - @references.each do |reference| - yield reference - end - end - - end - end --- 578,581 ---- From ngoto at dev.open-bio.org Tue Mar 4 10:12:24 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:12:24 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat features.rb,NONE,1.1.2.1 Message-ID: <200803041012.m24ACObW007373@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory dev.open-bio.org:/tmp/cvs-serv7351/lib/bio/compat Added Files: Tag: BRANCH-biohackathon2008 features.rb Log Message: Bio::Features is moved to lib/bio/compat/features.rb, and a module to keep backward compatibility (Bio::Features::BackwardCompatibility) is added. --- NEW FILE: features.rb --- # # = bio/compat/features.rb - Obsoleted Features class # # Copyright:: Copyright (c) 2002, 2005 Toshiaki Katayama # 2006 Jan Aerts # 2008 Naohisa Goto # License:: The Ruby License # # $Id: features.rb,v 1.1.2.1 2008/03/04 10:12:22 ngoto Exp $ # # == Description # # The Bio::Features class was obsoleted after BioRuby 1.2.1. # To keep compatibility, some wrapper methods are provided in this file. # As the compatibility methods (and Bio::Features) will soon be removed, # Please change your code not to use Bio::Features. # # Note that Bio::Feature is different from the Bio::Features. # Bio::Feature still exists to store DDBJ/GenBank/EMBL feature information. require 'bio/location' module Bio # = DESCRIPTION # # This class is OBSOLETED, and will soon be removed. # Instead of this class, an array is to be used. # # # Container for a list of Feature objects. 
# # = USAGE # # First, create some Bio::Feature objects # feature1 = Bio::Feature.new('intron','3627..4059') # feature2 = Bio::Feature.new('exon','4060..4236') # feature3 = Bio::Feature.new('intron','4237..4426') # feature4 = Bio::Feature.new('CDS','join(2538..3626,4060..4236)', # [ Bio::Feature::Qualifier.new('gene', 'CYP2D6'), # Bio::Feature::Qualifier.new('translation','MGXXTVMHLL...') # ]) # # # And create a container for them # feature_container = Bio::Features.new([ feature1, feature2, feature3, feature4 ]) # # # Iterate over all features and print # feature_container.each do |feature| # puts feature.feature + "\t" + feature.position # feature.each do |qualifier| # puts "- " + qualifier.qualifier + ": " + qualifier.value # end # end # # # Iterate only over CDS features and extract translated amino acid sequences # features.each("CDS") do |feature| # hash = feature.to_hash # name = hash["gene"] || hash["product"] || hash["note"] # aaseq = hash["translation"] # pos = feature.position # if name and seq # puts ">#{gene} #{feature.position}" # puts aaseq # end # end class Features # module to keep backward compatibility with obsoleted Bio::Features module BackwardCompatibility #:nodoc: # Backward compatibility with Bio::Features#features. # Now, features are stored in an array, and # you should change your code not to use this method. def features warn 'Bio::Features is obsoleted. Now, features are stored in an array.' self end # Backward compatibility with Bio::Features#append. # Now, references are stored in an array, and # you should change your code not to use this method. def append(feature) warn 'Bio::Features is obsoleted. Now, features are stored in an array.' self.push(feature) if feature.is_a? Feature self end end #module BackwardCompatibility # This method should not be used. # Only for backward compatibility of existing code. 
# # Since Bio::Features is obsoleted, # Bio::Features.new not returns Bio::Features object, # but modifies given _ary_ and returns the _ary_. # # *Arguments*: # * (optional) __: Array of Bio::Feature objects # *Returns*:: the given array def self.new(ary = []) warn 'Bio::Feature is obsoleted. Some methods are added to given array to keep backward compatibility.' ary.extend(BackwardCompatibility) ary end # Normally, users can not call this method. # # Create a new Bio::Features object. # # *Arguments*: # * (optional) _list of features_: list of Bio::Feature objects # *Returns*:: Bio::Features object def initialize(ary = []) @features = ary end # Returns an Array of Feature objects. attr_accessor :features # Appends a Feature object to Features. # # *Arguments*: # * (required) _feature_: Bio::Feature object # *Returns*:: Bio::Features object def append(a) @features.push(a) if a.is_a? Feature return self end # Iterates on each feature object. # # *Arguments*: # * (optional) _key_: if specified, only iterates over features with this key def each(arg = nil) @features.each do |x| next if arg and x.feature != arg yield x end end # Short cut for the Features#features[n] def [](*arg) @features[*arg] end # Short cut for the Features#features.first def first @features.first end # Short cut for the Features#features.last def last @features.last end end # Features end # Bio From ngoto at dev.open-bio.org Tue Mar 4 10:12:24 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:12:24 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio feature.rb,1.13,1.13.2.1 Message-ID: <200803041012.m24ACO7D007378@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv7351/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 feature.rb Log Message: Bio::Features is moved to lib/bio/compat/features.rb, and a module to keep backward compatibility (Bio::Features::BackwardCompatibility) is added. 
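The backward-compatibility approach described in this log message (extending a plain array with a module that supplies the old container methods) can be sketched roughly as follows. This is only an illustrative sketch, not the BioRuby code itself: Feature and LegacyListCompat are hypothetical stand-ins for Bio::Feature and Bio::Features::BackwardCompatibility, and the warning texts are made up.

```ruby
# Illustrative sketch only: Feature and LegacyListCompat are hypothetical
# stand-ins, not the actual BioRuby classes.
Feature = Struct.new(:feature, :position)

module LegacyListCompat
  # Old container accessor: the array now acts as its own container.
  def features
    warn 'container class is obsolete; features are stored in a plain array'
    self
  end

  # Old append API: push only Feature objects and return the container.
  def append(item)
    warn 'container class is obsolete; use Array#push instead'
    push(item) if item.is_a?(Feature)
    self
  end
end

ary = [Feature.new('exon', '4060..4236')]
ary.extend(LegacyListCompat)   # only this one array gains the old methods

ary.append(Feature.new('intron', '4237..4426'))
ary.append('not a feature')    # ignored, because it is not a Feature

puts ary.length                # the plain Array API still works
puts ary.features.equal?(ary)  # the old accessor returns the array itself
```

Because Object#extend affects a single object, other arrays in the program are left untouched, and callers written against the old container API keep working (with a warning on stderr) while new code can treat the value as an ordinary Array.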
Index: feature.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/feature.rb,v retrieving revision 1.13 retrieving revision 1.13.2.1 diff -C2 -d -r1.13 -r1.13.2.1 *** feature.rb 5 Apr 2007 23:35:39 -0000 1.13 --- feature.rb 4 Mar 2008 10:12:22 -0000 1.13.2.1 *************** *** 136,226 **** end #Feature - - # = DESCRIPTION - # Container for a list of Feature objects. - # - # = USAGE - # # First, create some Bio::Feature objects - # feature1 = Bio::Feature.new('intron','3627..4059') - # feature2 = Bio::Feature.new('exon','4060..4236') - # feature3 = Bio::Feature.new('intron','4237..4426') - # feature4 = Bio::Feature.new('CDS','join(2538..3626,4060..4236)', - # [ Bio::Feature::Qualifier.new('gene', 'CYP2D6'), - # Bio::Feature::Qualifier.new('translation','MGXXTVMHLL...') - # ]) - # - # # And create a container for them - # feature_container = Bio::Features.new([ feature1, feature2, feature3, feature4 ]) - # - # # Iterate over all features and print - # feature_container.each do |feature| - # puts feature.feature + "\t" + feature.position - # feature.each do |qualifier| - # puts "- " + qualifier.qualifier + ": " + qualifier.value - # end - # end - # - # # Iterate only over CDS features and extract translated amino acid sequences - # features.each("CDS") do |feature| - # hash = feature.to_hash - # name = hash["gene"] || hash["product"] || hash["note"] - # aaseq = hash["translation"] - # pos = feature.position - # if name and seq - # puts ">#{gene} #{feature.position}" - # puts aaseq - # end - # end - class Features - # Create a new Bio::Features object. - # - # *Arguments*: - # * (optional) _list of features_: list of Bio::Feature objects - # *Returns*:: Bio::Features object - def initialize(ary = []) - @features = ary - end - - # Returns an Array of Feature objects. - attr_accessor :features - - # Appends a Feature object to Features. 
- # - # *Arguments*: - # * (required) _feature_: Bio::Feature object - # *Returns*:: Bio::Features object - def append(a) - @features.push(a) if a.is_a? Feature - return self - end - - # Iterates on each feature object. - # - # *Arguments*: - # * (optional) _key_: if specified, only iterates over features with this key - def each(arg = nil) - @features.each do |x| - next if arg and x.feature != arg - yield x - end - end - - # Short cut for the Features#features[n] - def [](*arg) - @features[*arg] - end - - # Short cut for the Features#features.first - def first - @features.first - end - - # Short cut for the Features#features.last - def last - @features.last - end - - end # Features - end # Bio --- 136,139 ---- From ngoto at dev.open-bio.org Tue Mar 4 10:32:57 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:32:57 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank common.rb, 1.11.2.2, 1.11.2.3 Message-ID: <200803041032.m24AWvnU007490@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7470/lib/bio/db/genbank Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: Changed not to use Bio::References and Bio::Features. To keep backward compatibility, BackwardCompatibility modules is used to extend an array. Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/genbank/common.rb,v retrieving revision 1.11.2.2 retrieving revision 1.11.2.3 diff -C2 -d -r1.11.2.2 -r1.11.2.3 *** common.rb 3 Mar 2008 18:30:50 -0000 1.11.2.2 --- common.rb 4 Mar 2008 10:32:55 -0000 1.11.2.3 *************** *** 179,183 **** ary.push(Reference.new(hash)) end ! @data['REFERENCE'] = References.new(ary) end if block_given? --- 179,183 ---- ary.push(Reference.new(hash)) end ! @data['REFERENCE'] = ary.extend(Bio::References::BackwardCompatibility) end if block_given? *************** *** 197,202 **** ! 
# FEATURES -- Returns contents of the FEATURES record as a Bio::Features ! # object. def features unless @data['FEATURES'] --- 197,202 ---- ! # FEATURES -- Returns contents of the FEATURES record as an array of ! # Bio::Feature objects. def features unless @data['FEATURES'] *************** *** 240,244 **** end ! @data['FEATURES'] = Features.new(ary) end if block_given? --- 240,244 ---- end ! @data['FEATURES'] = ary.extend(Bio::Features::BackwardCompatibility) end if block_given? From ngoto at dev.open-bio.org Tue Mar 4 10:56:44 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 10:56:44 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.2,1.29.2.3 Message-ID: <200803041056.m24AuiM8007583@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7563/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: In Bio::EMBL#ft(), added "extend Bio::Features::BackwardCompatibility" to keep backward compatibility. Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.2 retrieving revision 1.29.2.3 diff -C2 -d -r1.29.2.2 -r1.29.2.3 *** embl.rb 20 Feb 2008 09:56:22 -0000 1.29.2.2 --- embl.rb 4 Mar 2008 10:56:42 -0000 1.29.2.3 *************** *** 257,261 **** def ft unless @data['FT'] ! @data['FT'] = Array.new in_quote = false @orig['FT'].each_line do |line| --- 257,261 ---- def ft unless @data['FT'] ! ary = Array.new in_quote = false @orig['FT'].each_line do |line| *************** *** 265,271 **** body = line[20,60].chomp # feature value (position, /qualifier=) if line =~ /^FT {3}(\S+)/ ! @data['FT'].push([ $1, body ]) # [ feature, position, /q="data", ... ] elsif body =~ /^ \// and not in_quote ! 
@data['FT'].last.push(body) # /q="data..., /q=data, /q if body =~ /=" / and body !~ /"$/ --- 265,271 ---- body = line[20,60].chomp # feature value (position, /qualifier=) if line =~ /^FT {3}(\S+)/ ! ary.push([ $1, body ]) # [ feature, position, /q="data", ... ] elsif body =~ /^ \// and not in_quote ! ary.last.push(body) # /q="data..., /q=data, /q if body =~ /=" / and body !~ /"$/ *************** *** 274,278 **** else ! @data['FT'].last.last << body # ...data..., ...data..." if body =~ /"$/ --- 274,278 ---- else ! ary.last.last << body # ...data..., ...data..." if body =~ /"$/ *************** *** 282,289 **** end ! @data['FT'].map! do |subary| parse_qualifiers(subary) end end if block_given? --- 282,290 ---- end ! ary.map! do |subary| parse_qualifiers(subary) end + @data['FT'] = ary.extend(Bio::Features::BackwardCompatibility) end if block_given? *************** *** 445,447 **** puts entry.to_biosequence.output(:embl) end ! end \ No newline at end of file --- 446,448 ---- puts entry.to_biosequence.output(:embl) end ! end From ngoto at dev.open-bio.org Tue Mar 4 11:10:30 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:10:30 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/sequence format.rb,1.4.2.6,1.4.2.7 Message-ID: <200803041110.m24BAU07007698@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv7656/lib/bio/sequence Modified Files: Tag: BRANCH-biohackathon2008 format.rb Log Message: * lib/bio/sequence.rb Bio::Sequence#output is moved to lib/bio/sequence/format.rb. * lib/bio/sequence/format.rb * Bio::Sequence#output is changed not to directly read erb file. * Bio::Sequence::Format::FormatterBase class, a base class of formatter, is newly added. * Bio::Sequence::Format::Formatter, NucFormatter, AminoFormatter are newly added to store formatter classes. * Bio::Sequence#list_output_formats is added. 
* (The names of above classes/modules/methods might be changed if more appropriate names are given.) Index: format.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence/format.rb,v retrieving revision 1.4.2.6 retrieving revision 1.4.2.7 diff -C2 -d -r1.4.2.6 -r1.4.2.7 *** format.rb 22 Feb 2008 14:30:44 -0000 1.4.2.6 --- format.rb 4 Mar 2008 11:10:28 -0000 1.4.2.7 *************** *** 2,9 **** # = bio/sequence/format.rb - various output format of the biological sequence # ! # Copyright:: Copyright (C) 2006 # Toshiaki Katayama , # Naohisa Goto , ! # Ryan Raaum # License:: The Ruby License # --- 2,10 ---- # = bio/sequence/format.rb - various output format of the biological sequence # ! # Copyright:: Copyright (C) 2006-2008 # Toshiaki Katayama , # Naohisa Goto , ! # Ryan Raaum , ! # Jan Aerts # License:: The Ruby License # *************** *** 15,18 **** --- 16,20 ---- # + require 'erb' module Bio *************** *** 32,62 **** module Format ! # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any ! # case, it would be difficult to successfully call this method outside ! # its expected context). ! # ! # Output the FASTA format string of the sequence. ! # ! # UNFORTUNATLY, the current implementation of Bio::Sequence is incapable of ! # using either the header or width arguments. So something needs to be ! # changed... # ! # Currently, this method is used in Bio::Sequence#output like so, # # s = Bio::Sequence.new('atgc') # puts s.output(:fasta) #=> "> \natgc\n" # --- ! # *Arguments*: ! # * (optional) _header_: String (default nil) ! # * (optional) _width_: Fixnum (default nil) # *Returns*:: String object ! def format_fasta(header = nil, width = nil) ! header ||= "#{@entry_id} #{@definition}" ! ">#{header}\n" + ! if width ! @seq.to_s.gsub(Regexp.new(".{1,#{width}}"), "\\0\n") else ! @seq.to_s + "\n" end end --- 34,181 ---- module Format ! 
# Repository of generic (or both nucleotide and protein) sequence ! # formatter classes ! module Formatter ! ! # Raw format generatar ! autoload :Raw, 'bio/sequence/format_raw' ! ! # Fasta format generater ! autoload :Fasta, 'bio/db/fasta/format_fasta' ! ! # NCBI-style Fasta format generatar ! # (resemble to EMBOSS "ncbi" format) ! autoload :Fasta_ncbi, 'bio/db/fasta/format_fasta' ! ! end #module Formatter ! ! # Repository of nucleotide sequence formatter classes ! module NucFormatter ! ! # GenBank format generater ! # Note that the name is 'Genbank' and NOT 'GenBank' ! autoload :Genbank, 'bio/db/genbank/format_genbank' ! ! # EMBL format generater ! # Note that the name is 'Embl' and NOT 'EMBL' ! autoload :Embl, 'bio/db/embl/format_embl' ! ! end #module NucFormatter ! ! # Repository of protein sequence formatter classes ! module AminoFormatter ! # currently no formats available ! end #module AminoFormatter ! ! # Formatter base class. ! # Any formatter class should inherit this class. ! class FormatterBase ! ! # Returns a formatterd string of the given sequence ! # --- ! # *Arguments*: ! # * (required) _sequence_: Bio::Sequence object ! # * (optional) _options_: a Hash object ! # *Returns*:: String object ! def self.output(sequence, options = {}) ! self.new(sequence, options).output ! end ! ! # register new Erb template ! def self.erb_template(str) ! erb = ERB.new(str) ! erb.def_method(self, 'output') ! true ! end ! private_class_method :erb_template ! ! # generates output data ! # --- ! # *Returns*:: String object ! def output ! raise NotImplementedError, 'should be implemented in subclass' ! end ! ! # creates a new formatter object for output ! def initialize(sequence, options = {}) ! @sequence = sequence ! @options = options ! end ! ! private ! ! # any unknown methods are delegated to the sequence object ! def method_missing(sym, *args, &block) #:nodoc: ! begin ! @sequence.__send__(sym, *args, &block) ! rescue NoMethodError => evar ! lineno = __LINE__ - 2 ! 
file = __FILE__ ! bt_here = [ "#{file}:#{lineno}:in \`__send__\'", ! "#{file}:#{lineno}:in \`method_missing\'" ! ] ! if bt_here == evar.backtrace[0, 2] then ! bt = evar.backtrace[2..-1] ! evar = evar.class.new("undefined method \`#{sym.to_s}\' for #{self.inspect}") ! evar.set_backtrace(bt) ! end ! raise(evar) ! end ! end ! end #class FormatterBase ! ! # Using Bio::Sequence::Format, return a String with the Bio::Sequence ! # object formatted in the given style. # ! # Formats currently implemented are: 'fasta', 'genbank', and 'embl' # # s = Bio::Sequence.new('atgc') # puts s.output(:fasta) #=> "> \natgc\n" + # + # The style argument is given as a Ruby + # Symbol(http://www.ruby-doc.org/core/classes/Symbol.html) # --- ! # *Arguments*: ! # * (required) _format_: :fasta, :genbank, *or* :embl # *Returns*:: String object ! def output(format = :fasta, options = {}) ! formatter_const = format.to_s.capitalize.intern ! formatter_class = nil ! get_formatter_repositories.each do |mod| ! begin ! formatter_class = mod.const_get(formatter_const) ! rescue NameError ! end ! break if formatter_class ! end ! unless formatter_class then ! raise "unknown format name #{format.inspect}" ! end ! ! formatter_class.output(self, options) ! end ! ! # Returns a list of available output formats for the sequence ! # --- ! # *Arguments*: ! # *Returns*:: Array of Symbols ! def list_output_formats ! a = get_formatter_repositories.collect { |mod| mod.constants } ! a.flatten! ! a.collect! { |x| x.to_s.downcase.intern } ! a ! end ! ! private ! ! # returns formatter repository modules ! def get_formatter_repositories ! if self.moltype == Bio::Sequence::NA then ! [ NucFormatter, Formatter ] ! elsif self.moltype == Bio::Sequence::AA then ! [ AminoFormatter, Formatter ] else ! [ NucFormatter, AminoFormatter, Formatter ] end end *************** *** 72,90 **** #end # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. 
(And in any # case, it would be difficult to successfully call this method outside # its expected context). # ! # Output the Genbank format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! #def format_genbank ! # prefix = ' ' * 5 ! # indent = prefix + ' ' * 16 ! # fwidth = 79 - indent.length ! # ! # format_features(prefix, indent, fwidth) ! #end # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any --- 191,215 ---- #end + #+++ + + # Formatting helper methods for INSD (NCBI, EMBL, DDBJ) feature table + module INSDFeatureHelper + private + # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any # case, it would be difficult to successfully call this method outside # its expected context). # ! # Output the Genbank feature format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! def format_features_genbank(features) ! prefix = ' ' * 5 ! indent = prefix + ' ' * 16 ! fwidth = 79 - indent.length ! ! format_features(features, prefix, indent, fwidth) ! end # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. (And in any *************** *** 92,130 **** # its expected context). # ! # Output the EMBL format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! #def format_embl ! # prefix = 'FT ' ! # indent = prefix + ' ' * 16 ! # fwidth = 80 - indent.length ! # ! # format_features(prefix, indent, fwidth) ! #end ! ! #+++ ! ! private ! def format_features(prefix, indent, width) ! result = '' ! @features.each do |feature| ! result << prefix + sprintf("%-16s", feature.feature) ! position = feature.position ! #position = feature.locations.to_s ! head = '' ! wrap(position, width).each_line do |line| ! result << head << line ! head = indent ! end ! result << format_qualifiers(feature.qualifiers, indent, width) ! 
end return result end def format_qualifiers(qualifiers, indent, width) qualifiers.collect do |qualifier| --- 217,255 ---- # its expected context). # ! # Output the EMBL feature format string of the sequence. # Used in Bio::Sequence#output. # --- # *Returns*:: String object ! def format_features_embl(features) ! prefix = 'FT ' ! indent = prefix + ' ' * 16 ! fwidth = 80 - indent.length ! ! format_features(features, prefix, indent, fwidth) ! end ! # format INSD featurs ! def format_features(features, prefix, indent, width) ! result = [] ! features.each do |feature| ! result.push format_feature(feature, prefix, indent, width) ! end ! return result.join('') ! end ! # format an INSD feature ! def format_feature(feature, prefix, indent, width) ! result = prefix + sprintf("%-16s", feature.feature) ! position = feature.position ! #position = feature.locations.to_s ! result << wrap_and_split_lines(position, width).join("\n" + indent) ! result << "\n" ! result << format_qualifiers(feature.qualifiers, indent, width) return result end + # format qualifiers def format_qualifiers(qualifiers, indent, width) qualifiers.collect do |qualifier| *************** *** 133,137 **** if v == true ! lines = wrap('/' + q, width) elsif q == 'translation' lines = fold("/#{q}=\"#{v}\"", width) --- 258,262 ---- if v == true ! lines = wrap_with_newline('/' + q, width) elsif q == 'translation' lines = fold("/#{q}=\"#{v}\"", width) *************** *** 142,146 **** v = '"' + v + '"' end ! lines = wrap('/' + q + '=' + v, width) end --- 267,271 ---- v = '"' + v + '"' end ! lines = wrap_with_newline('/' + q + '=' + v, width) end *************** *** 154,158 **** end ! def wrap(str, width) result = [] left = str.dup --- 279,287 ---- end ! def fold_and_split_lines(str, width) ! str.scan(Regexp.new(".{1,#{width}}")) ! end ! ! def wrap_and_split_lines(str, width) result = [] left = str.dup *************** *** 172,176 **** result << line end ! 
result << left if left result_string = result.join("\n") result_string << "\n" unless result_string.empty? --- 301,310 ---- result << line end ! result << left if left and !(left.to_s.empty?) ! return result ! end ! ! def wrap_with_newline(str, width) ! result = wrap_and_split_lines(str, width) result_string = result.join("\n") result_string << "\n" unless result_string.empty? *************** *** 178,185 **** end ! end # Format ! end # Sequence ! end # Bio --- 312,329 ---- end ! def wrap(str, width = 80, prefix = '') ! actual_width = width - prefix.length ! result = wrap_and_split_lines(str, actual_width) ! result_string = result.join("\n#{prefix}") ! result_string = prefix + result_string unless result_string.empty? ! return result_string ! end ! end #module INSDFeatureHelper ! end #module Format ! ! end #class Sequence ! ! end #module Bio From ngoto at dev.open-bio.org Tue Mar 4 11:10:30 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:10:30 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.7,0.58.2.8 Message-ID: <200803041110.m24BAUBl007703@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv7656/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: * lib/bio/sequence.rb Bio::Sequence#output is moved to lib/bio/sequence/format.rb. * lib/bio/sequence/format.rb * Bio::Sequence#output is changed not to directly read erb file. * Bio::Sequence::Format::FormatterBase class, a base class of formatter, is newly added. * Bio::Sequence::Format::Formatter, NucFormatter, AminoFormatter are newly added to store formatter classes. * Bio::Sequence#list_output_formats is added. * (The names of above classes/modules/methods might be changed if more appropriate names are given.) 
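The log message above describes a design in which Bio::Sequence#output no longer reads an erb file directly but delegates to formatter classes derived from a common base class. The following is only a minimal sketch of that idea, not the BioRuby implementation: SketchSequence, SketchFormat, the Tiny formatter and the lookup by capitalized symbol name are all hypothetical names chosen for illustration. The only library call relied on is ERB#def_method from the Ruby standard library.

```ruby
require 'erb'

# Hypothetical stand-in for a sequence object; this is NOT Bio::Sequence.
SketchSequence = Struct.new(:entry_id, :definition, :seq)

# Hypothetical namespace; the names here are assumptions for illustration.
module SketchFormat
  # Base class: stores the sequence and options; subclasses provide #output.
  class FormatterBase
    def initialize(sequence, options = {})
      @sequence = sequence
      @options = options
    end

    # Compile an ERB template string into an instance method named "output".
    def self.erb_template(str)
      ERB.new(str).def_method(self, 'output')
    end

    private

    # Let templates call sequence attributes (entry_id, seq, ...) directly.
    def method_missing(name, *args, &block)
      if @sequence.respond_to?(name)
        @sequence.__send__(name, *args, &block)
      else
        super
      end
    end

    def respond_to_missing?(name, include_private = false)
      @sequence.respond_to?(name) || super
    end
  end

  # Module used purely as a container (registry) of formatter classes.
  module Formatter
    # Formatter written as plain Ruby.
    class Raw < FormatterBase
      def output
        @sequence.seq.to_s
      end
    end

    # Formatter whose #output is generated from an ERB template.
    class Tiny < FormatterBase
      erb_template "><%= entry_id %> <%= definition %>\n<%= seq %>\n"
    end
  end

  # Look up a formatter class by symbol (:raw -> Formatter::Raw) and run it.
  # The second argument (false) restricts the lookup to Formatter itself.
  def self.output(sequence, format, options = {})
    klass = Formatter.const_get(format.to_s.capitalize, false)
    klass.new(sequence, options).output
  end

  # List the symbols accepted by SketchFormat.output.
  def self.list_output_formats
    Formatter.constants.map { |c| c.to_s.downcase.to_sym }.sort
  end
end

s = SketchSequence.new('X1', 'demo', 'atgc')
puts SketchFormat.output(s, :tiny)
p SketchFormat.list_output_formats
```

Keeping each template next to its helper methods inside one class is what would let a formatter be loaded with a plain require rather than by reading an .erb file relative to the library path at run time.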
Index: sequence.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v retrieving revision 0.58.2.7 retrieving revision 0.58.2.8 diff -C2 -d -r0.58.2.7 -r0.58.2.8 *** sequence.rb 20 Feb 2008 09:56:22 -0000 0.58.2.7 --- sequence.rb 4 Mar 2008 11:10:28 -0000 0.58.2.8 *************** *** 13,17 **** # - require 'erb' require 'bio/sequence/compat' --- 13,16 ---- *************** *** 156,178 **** attr_accessor :seq - # Using Bio::Sequence::Format, return a String with the Bio::Sequence - # object formatted in the given style. - # - # Formats currently implemented are: 'fasta', 'genbank', and 'embl' - # - # s = Bio::Sequence.new('atgc') - # puts s.output(:fasta) #=> "> \natgc\n" - # - # The style argument is given as a Ruby - # Symbol(http://www.ruby-doc.org/core/classes/Symbol.html) - # --- - # *Arguments*: - # * (required) _format_: :fasta, :genbank, *or* :embl - # *Returns*:: String object - def output(format = :fasta) - record_template = ERB.new(File.read(File.dirname(__FILE__) + "/db/#{format.to_s}/format.erb")) - record_template.result(binding) - end - # Guess the type of sequence, Amino Acid or Nucleic Acid, and create a # new sequence object (Bio::Sequence::AA or Bio::Sequence::NA) on the basis --- 155,158 ---- From ngoto at dev.open-bio.org Tue Mar 4 11:14:05 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:14:05 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/sequence common.rb,1.6.2.1,1.6.2.2 Message-ID: <200803041114.m24BE5Oh007773@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv7753/lib/bio/sequence Modified Files: Tag: BRANCH-biohackathon2008 common.rb Log Message: format_embl is moved to lib/bio/embl/format_embl.rb Index: common.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence/common.rb,v retrieving 
revision 1.6.2.1 retrieving revision 1.6.2.2 diff -C2 -d -r1.6.2.1 -r1.6.2.2 *** common.rb 20 Feb 2008 09:56:22 -0000 1.6.2.1 --- common.rb 4 Mar 2008 11:14:03 -0000 1.6.2.2 *************** *** 67,87 **** end - def format_embl - output_lines = Array.new - counter = 0 - remainder = self.window_search(60,60) do |subseq| - counter += 60 - subseq.gsub!(/(.{10})/, '\1 ') - output_lines.push(' '*5 + subseq + counter.to_s.rjust(9)) - end - counter += remainder.length - remainder = (remainder.to_s + ' '*(60-remainder.length)) - remainder.gsub!(/(.{10})/, '\1 ') - output_lines.push(' '*5 + remainder + counter.to_s.rjust(9)) - return output_lines.join("\n") - end - - - # Normalize the current sequence, removing all whitespace and # transforming all positions to uppercase if the sequence is AA or --- 67,70 ---- From ngoto at dev.open-bio.org Tue Mar 4 11:16:59 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:16:59 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb,NONE,1.1.2.1 Message-ID: <200803041116.m24BGxSm007801@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv7781/lib/bio/db/embl Added Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: EMBL formatter class, internally used by Bio::Sequence, is newly added. --- NEW FILE: format_embl.rb --- # # = bio/db/embl/format_embl.rb - EMBL format generater # # Copyright:: Copyright (C) 2008 Jan Aerts # License:: The Ruby License # # $Id: format_embl.rb,v 1.1.2.1 2008/03/04 11:16:57 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::NucFormatter # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # Embl format output class for Bio::Sequence. 
class Embl < Bio::Sequence::Format::FormatterBase # helper methods include Bio::Sequence::Format::INSDFeatureHelper private def embl_wrap(prefix, str) wrap(str.to_s, 80, prefix) end def seq_format_embl(seq) output_lines = Array.new counter = 0 remainder = seq.window_search(60,60) do |subseq| counter += 60 subseq.gsub!(/(.{10})/, '\1 ') output_lines.push(' '*5 + subseq + counter.to_s.rjust(9)) end counter += remainder.length remainder = (remainder.to_s + ' '*(60-remainder.length)) remainder.gsub!(/(.{10})/, '\1 ') output_lines.push(' '*5 + remainder + counter.to_s.rjust(9)) return output_lines.join("\n") end # Erb template of EMBL format for Bio::Sequence erb_template <<'__END_OF_TEMPLATE__' ID <%= entry_id %>; SV <%= sequence_version %>; <%= topology %>; <%= molecule_type %>; <%= data_class %>; <%= division %>; <%= seq.length %> BP. XX <%= embl_wrap('AC ', accessions.reject{|a| a.nil?}.join('; ') + ';') %> XX DT <%= date_created %> DT <%= date_modified %> XX <%= embl_wrap('DE ', definition) %> XX <%= embl_wrap('KW ', keywords.join('; ') + '.') %> XX OS <%= species %> <%= embl_wrap('OC ', classification.join('; ') + '.') %> XX <%= references.collect{|ref| ref.format('embl')}.join("\n") %> XX FH Key Location/Qualifiers FH <%= format_features_embl(features) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> // __END_OF_TEMPLATE__ end #class Embl end #module Bio::Sequence::Format::NucFormatter From ngoto at dev.open-bio.org Tue Mar 4 11:19:18 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:19:18 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/genbank format_genbank.rb, NONE, 1.1.2.1 Message-ID: <200803041119.m24BJIvh007829@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/genbank In directory dev.open-bio.org:/tmp/cvs-serv7809/lib/bio/db/genbank Added Files: Tag: 
BRANCH-biohackathon2008 format_genbank.rb Log Message: Bio::Sequence::Format::NucFormatter::Genbank, GenBank sequence format generater class, is newly added. Note that this class is currently internal use only and users should not use it directly. --- NEW FILE: format_genbank.rb --- # # = bio/db/genbank/format_genbank.rb - GenBank format generater # # Copyright:: Copyright (C) 2008 Naohisa Goto # License:: The Ruby License # # $Id: format_genbank.rb,v 1.1.2.1 2008/03/04 11:19:16 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::NucFormatter # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # GenBank format output class for Bio::Sequence. class Genbank < Bio::Sequence::Format::FormatterBase # helper methods include Bio::Sequence::Format::INSDFeatureHelper private # string wrapper for GenBank format def genbank_wrap(str) wrap(str.to_s, 67).gsub(/\n/, "\n" + " " * 12) end # string wrap with adding a dot at the end of the string def genbank_wrap_dot(str) str = str.to_s str = str + '.' unless /\.\z/ =~ str genbank_wrap(str) end # formats sequence lines as GenBank def each_genbank_seqline(str) #:yields: counter, seqline i = 1 a = str.scan(/.{1,60}/) do |s| yield i, s.gsub(/(.{1,10})/, " \\1") i += 60 end end # Erb template of GenBank format for Bio::Sequence erb_template <<'__END_OF_TEMPLATE__' LOCUS <%= sprintf("%-16s", entry_id) %> <%= sprintf("%11d", length) %> bp <%= sprintf("%3s", '') %><%= sprintf("%-6s", molecule_type) %> <%= sprintf("%-8s", topology) %><%= sprintf("%4s", division) %> <%= sprintf("%-11s", date_modified) %> DEFINITION <%= genbank_wrap_dot(definition.to_s) %> ACCESSION <%= genbank_wrap(([ primary_accession ] + (secondary_accessions or [])).join(" ")) %> VERSION <%= primary_accession %>.<%= sequence_version %><% unless true or gi_number.to_s.empty? 
%>GI:<%= gi_number %><% end %> KEYWORDS <%= genbank_wrap_dot((keywords or []).join('; ')) %> SOURCE <%= genbank_wrap(species) %> ORGANISM <%= genbank_wrap(species) %> <%= genbank_wrap_dot((classification or []).join('; ')) %> <% n = 0 (references or []).each do |ref| n += 1 pos = ref.sequence_position.to_s.gsub(/\s/, '') pos.gsub!(/(\d+)\-(\d+)/, "\\1 to \\2") pos.gsub!(/\s*\,\s*/, '; ') if pos.empty? pos = '' else pos = " (bases #{pos})" end journal = ref.journal.to_s volissue = ref.volume.to_s volissue += " (#{ref.issue})" unless ref.issue.to_s.empty? journal += " #{volissue}," unless volissue.empty? journal += " #{ref.pages}" unless ref.pages.to_s.empty? journal += " (#{ref.year})" unless ref.year.to_s.empty? alist = ref.authors.collect { |x| x.gsub(/\, /, ',') } lastauthor = alist.pop authorsline = alist.join(', ') authorsline.concat(" and ") unless alist.empty? authorsline.concat lastauthor.to_s %>REFERENCE <%= genbank_wrap(sprintf('%-2d%s', n, pos)) %> AUTHORS <%= genbank_wrap(authorsline) %> TITLE <%= genbank_wrap(ref.title.to_s) %> JOURNAL <%= genbank_wrap(journal) %> <% unless ref.pubmed.to_s.empty? %> PUBMED <%= ref.pubmed %> <% end end %>FEATURES Location/Qualifiers <%= format_features_genbank(features || []) %>ORIGIN <% each_genbank_seqline(seq) do |i, s| %><%= sprintf('%9d', i) %><%= s %> <% end %>// __END_OF_TEMPLATE__ end #class Genbank end #module Bio::Sequence::Format::NucFormatter From ngoto at dev.open-bio.org Tue Mar 4 11:27:01 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:27:01 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/fasta format_fasta.rb, NONE, 1.1.2.1 Message-ID: <200803041127.m24BR14P007878@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/fasta In directory dev.open-bio.org:/tmp/cvs-serv7858/lib/bio/db/fasta Added Files: Tag: BRANCH-biohackathon2008 format_fasta.rb Log Message: Bio::Sequence::Format::Formatter::Fasta and Fasta_ncbi are newly added. 
Both are FASTA sequence format generater classes. (Fasta_ncbi is experimental, and would be removed if we determine it is not needed.) Note that these classes are currently internal use only and users should not use them directly. --- NEW FILE: format_fasta.rb --- # # = bio/db/fasta/format_fasta.rb - Fasta format generater # # Copyright:: Copyright (C) 2006-2008 # Toshiaki Katayama , # Naohisa Goto , # Jan Aerts # License:: The Ruby License # # $Id: format_fasta.rb,v 1.1.2.1 2008/03/04 11:26:59 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::Formatter # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # Simple Fasta format output class for Bio::Sequence. class Fasta < Bio::Sequence::Format::FormatterBase # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. # # Creates a new Fasta format generater object from the sequence. # # --- # *Arguments*: # * _sequence_: Bio::Sequence object # * (optional) :header => _header_: String (default nil) # * (optional) :width => _width_: Fixnum (default 70) def initialize; end if false # dummy for RDoc # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. # # Output the FASTA format string of the sequence. # # Currently, this method is used in Bio::Sequence#output like so, # # s = Bio::Sequence.new('atgc') # puts s.output(:fasta) #=> "> \natgc\n" # --- # *Returns*:: String object def output header = @options[:header] width = @options.has_key?(:width) ? @options[:width] : 70 seq = @sequence.seq entry_id = @sequence.entry_id || "#{@sequence.primary_accession}.#{@sequence.sequence_version}" definition = @sequence.definition header ||= "#{entry_id} #{definition}" ">#{header}\n" + if width seq.to_s.gsub(Regexp.new(".{1,#{width}}"), "\\0\n") else seq.to_s + "\n" end end end #class Fasta # INTERNAL USE ONLY, YOU SHOULD NOT USE THIS CLASS. # NCBI-Style Fasta format output class for Bio::Sequence. # (like "ncbi" format in EMBOSS) # # Note that this class is under construction. 
class Fasta_ncbi < Bio::Sequence::Format::FormatterBase # INTERNAL USE ONLY, YOU SHOULD NOT CALL THIS METHOD. # # Output the FASTA format string of the sequence. # # Currently, this method is used in Bio::Sequence#output like so, # # s = Bio::Sequence.new('atgc') # puts s.output(:ncbi) #=> "> \natgc\n" # --- # *Returns*:: String object def output width = 70 seq = @sequence.seq #gi = @sequence.gi_number dbname = 'lcl' if @sequence.primary_accession.to_s.empty? then idstr = @sequence.entry_id else idstr = "#{@sequence.primary_accession}.#{@sequence.sequence_version}" end definition = @sequence.definition header = "#{dbname}|#{idstr} #{definition}" ">#{header}\n" + seq.to_s.gsub(Regexp.new(".{1,#{width}}"), "\\0\n") end end #class Ncbi end #module Bio::Sequence::Format::Formatter From ngoto at dev.open-bio.org Tue Mar 4 11:28:49 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:28:49 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/sequence format_raw.rb,NONE,1.1.2.1 Message-ID: <200803041128.m24BSnON007906@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv7886/lib/bio/sequence Added Files: Tag: BRANCH-biohackathon2008 format_raw.rb Log Message: Raw sequence format (sequence only; without any newline and white-spaces) formatter class is newly added. 
(Internal use only) --- NEW FILE: format_raw.rb --- # # = bio/sequence/format_raw.rb - Raw sequence formatter # # Copyright:: Copyright (C) 2008 Naohisa Goto # License:: The Ruby License # # $Id: format_raw.rb,v 1.1.2.1 2008/03/04 11:28:46 ngoto Exp $ # require 'bio/sequence/format' module Bio::Sequence::Format::Formatter # Raw sequence output formatter class class Raw < Bio::Sequence::Format::FormatterBase # output raw sequence data def output "#{@sequence.seq}" end end #class Raw end #module Bio::Sequence::Format::Formatter From ngoto at dev.open-bio.org Tue Mar 4 11:29:38 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:29:38 +0000 Subject: [BioRuby-cvs] bioruby/lib bio.rb,1.89.2.3,1.89.2.4 Message-ID: <200803041129.m24BTcRt007955@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib In directory dev.open-bio.org:/tmp/cvs-serv7935/lib Modified Files: Tag: BRANCH-biohackathon2008 bio.rb Log Message: changed autoload file path of Bio::References and Bio::Features Index: bio.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio.rb,v retrieving revision 1.89.2.3 retrieving revision 1.89.2.4 diff -C2 -d -r1.89.2.3 -r1.89.2.4 *** bio.rb 22 Feb 2008 14:26:16 -0000 1.89.2.3 --- bio.rb 4 Mar 2008 11:29:36 -0000 1.89.2.4 *************** *** 27,36 **** autoload :Feature, 'bio/feature' ! autoload :Features, 'bio/feature' ## References/Reference autoload :Reference, 'bio/reference' ! autoload :References, 'bio/reference' ## Pathway/Relation --- 27,36 ---- autoload :Feature, 'bio/feature' ! autoload :Features, 'bio/compat/features' ## References/Reference autoload :Reference, 'bio/reference' ! 
autoload :References, 'bio/compat/references' ## Pathway/Relation From ngoto at dev.open-bio.org Tue Mar 4 11:31:47 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Tue, 04 Mar 2008 11:31:47 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio reference.rb,1.24.2.4,1.24.2.5 Message-ID: <200803041131.m24BVloU008025@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv8005/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 reference.rb Log Message: changed to use Bio::Sequence::Format::INSDFeatureHelper#wrap(). Index: reference.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/reference.rb,v retrieving revision 1.24.2.4 retrieving revision 1.24.2.5 diff -C2 -d -r1.24.2.4 -r1.24.2.5 *** reference.rb 4 Mar 2008 10:07:49 -0000 1.24.2.4 --- reference.rb 4 Mar 2008 11:31:45 -0000 1.24.2.5 *************** *** 42,45 **** --- 42,47 ---- class Reference + include Bio::Sequence::Format::INSDFeatureHelper + # Author names in an Array, [ "Hoge, J.P.", "Fuga, F.B." ]. attr_reader :authors *************** *** 288,294 **** end end ! lines << @authors.join(', ').wrap(80, 'RA ') + ';' unless @authors.nil? ! lines << (@title == '' ? 'RT ;' : ('"' + @title + '"').wrap(80, 'RT ') + ';') ! lines << @journal.wrap(80, 'RL ') unless @journal == '' lines << "XX" return lines.join("\n") --- 290,296 ---- end end ! lines << wrap(@authors.join(', '), 80, 'RA ') + ';' unless @authors.nil? ! lines << (@title == '' ? 'RT ;' : wrap('"' + @title + '"', 80, 'RT ') + ';') ! 
lines << wrap(@journal, 80, 'RL ') unless @journal == '' lines << "XX" return lines.join("\n") From ngoto at dev.open-bio.org Mon Mar 10 13:42:28 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Mon, 10 Mar 2008 13:42:28 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/compat features.rb,1.1.2.1,1.1.2.2 Message-ID: <200803101342.m2ADgSYs009554@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/compat In directory dev.open-bio.org:/tmp/cvs-serv9534/lib/bio/compat Modified Files: Tag: BRANCH-biohackathon2008 features.rb Log Message: fixed typo in warning message Index: features.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/compat/Attic/features.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** features.rb 4 Mar 2008 10:12:22 -0000 1.1.2.1 --- features.rb 10 Mar 2008 13:42:26 -0000 1.1.2.2 *************** *** 97,101 **** # *Returns*:: the given array def self.new(ary = []) ! warn 'Bio::Feature is obsoleted. Some methods are added to given array to keep backward compatibility.' ary.extend(BackwardCompatibility) ary --- 97,101 ---- # *Returns*:: the given array def self.new(ary = []) ! warn 'Bio::Features is obsoleted. Some methods are added to given array to keep backward compatibility.' 
ary.extend(BackwardCompatibility) ary From ngoto at dev.open-bio.org Fri Mar 21 06:24:45 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 21 Mar 2008 06:24:45 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl embl.rb,1.29.2.3,1.29.2.4 Message-ID: <200803210624.m2L6OjlR031776@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv31756/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 embl.rb Log Message: "require 'bio/compat/features'" and "require 'bio/compat/references'" are added, and example code in the bottom of the file is removed to avoid possible confusion with unit tests. Index: embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/embl.rb,v retrieving revision 1.29.2.3 retrieving revision 1.29.2.4 diff -C2 -d -r1.29.2.3 -r1.29.2.4 *** embl.rb 4 Mar 2008 10:56:42 -0000 1.29.2.3 --- embl.rb 21 Mar 2008 06:24:42 -0000 1.29.2.4 *************** *** 34,37 **** --- 34,39 ---- require 'bio/db' require 'bio/db/embl/common' + require 'bio/compat/features' + require 'bio/compat/references' module Bio *************** *** 432,448 **** end # module Bio - if __FILE__ == $0 - require '../../../bio' - require 'yaml' - - prefix = 'FT ' - indent = prefix + ' ' * 16 - fwidth = 80 - indent.length - - # parser = Bio::FlatFile.auto('/home/aertsj/LocalDocuments/bioruby_biohackathon/bioruby/test/data/embl/AB090716.embl') - parser = Bio::FlatFile.auto('/home/aertsj/LocalDocuments/hackathon/aj224122.embl') - parser.each do |entry| - # entry.ref - puts entry.to_biosequence.output(:embl) - end - end --- 434,435 ---- From ngoto at dev.open-bio.org Wed Mar 26 11:34:08 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Wed, 26 Mar 2008 11:34:08 +0000 Subject: [BioRuby-cvs] bioruby .project,1.1.2.1,NONE Message-ID: <200803261134.m2QBY7Im016555@dev.open-bio.org> Update of /home/repository/bioruby/bioruby In 
directory dev.open-bio.org:/tmp/cvs-serv16535 Removed Files: Tag: BRANCH-biohackathon2008 .project Log Message: Removed mistakenly added file .project. --- .project DELETED --- From ngoto at dev.open-bio.org Thu Mar 27 13:07:21 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:07:21 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.8,0.58.2.9 Message-ID: <200803271307.m2RD7LcR020772@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv20752/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: Added documents for attributes added during Biohackathon2008. Index: sequence.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v retrieving revision 0.58.2.8 retrieving revision 0.58.2.9 diff -C2 -d -r0.58.2.8 -r0.58.2.9 *** sequence.rb 4 Mar 2008 11:10:28 -0000 0.58.2.8 --- sequence.rb 27 Mar 2008 13:07:19 -0000 0.58.2.9 *************** *** 73,78 **** include Format - attr_accessor :sequence_version, :topology, :molecule_type, :data_class, :division, :primary_accession, :secondary_accessions, :date_created, :date_modified, :species, :classification - # Create a new Bio::Sequence object # --- 73,76 ---- *************** *** 154,158 **** --- 152,196 ---- # but could be a simple String attr_accessor :seq + + #--- + # Attributes below have been added during BioHackathon2008 + #+++ + # Version number of the sequence (String). + attr_accessor :sequence_version + + # Topology (String). "circular" or "linear". + attr_accessor :topology + + # molecular type (String). "DNA" or "RNA" for nucleotide sequence. 
+ attr_accessor :molecule_type + + # Data Class defined by EMBL (String) + # See http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html#3_1 + attr_accessor :data_class + + # Taxonomic Division defined by EMBL/GenBank/DDBJ (String) + # See http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html#3_2 + attr_accessor :division + + # Primary accession number (String) + attr_accessor :primary_accession + + # Secondary accession numbers (Array of String) + attr_accessor :secondary_accessions + + # Created date of the sequence entry (String) + attr_accessor :date_created + + # Last modified date of the sequence entry (String) + attr_accessor :date_modified + + # Organism species (String). For example, "Escherichia coli". + attr_accessor :species + + # Organism classification, taxonomic classification of the source organism. + # (Array of String) + attr_accessor :classification + # Guess the type of sequence, Amino Acid or Nucleic Acid, and create a # new sequence object (Bio::Sequence::AA or Bio::Sequence::NA) on the basis From ngoto at dev.open-bio.org Thu Mar 27 13:32:30 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:32:30 +0000 Subject: [BioRuby-cvs] bioruby/test/functional/bio/sequence - New directory Message-ID: <200803271332.m2RDWUOj020821@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/functional/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv20801/test/functional/bio/sequence Log Message: Directory /home/repository/bioruby/bioruby/test/functional/bio/sequence added to the repository --> Using per-directory sticky tag `BRANCH-biohackathon2008' From ngoto at dev.open-bio.org Thu Mar 27 13:38:33 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:38:33 +0000 Subject: [BioRuby-cvs] bioruby/test/functional/bio/sequence test_output_embl.rb, NONE, 1.1.2.1 Message-ID: <200803271338.m2RDcX8k020926@dev.open-bio.org> Update of 
/home/repository/bioruby/bioruby/test/functional/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv20870/test/functional/bio/sequence Added Files: Tag: BRANCH-biohackathon2008 test_output_embl.rb Log Message: Example code in sequence.rb written by Jan Aerts is moved to test/functional/bio/sequence/test_output_embl.rb. Fixed a bug in lib/bio/db/embl/format_embl.rb: failed to output when features or references are nil. This bug is found by above test code. --- NEW FILE: test_output_embl.rb --- # # test/functional/bio/sequence/test_output_embl.rb - Functional test for Bio::Sequence#output(:embl) # # Copyright:: Copyright (C) 2008 # Jan Aerts # License:: The Ruby License # # $Id: test_output_embl.rb,v 1.1.2.1 2008/03/27 13:38:31 ngoto Exp $ # require 'pathname' libpath = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 4, 'lib')).cleanpath.to_s $:.unshift(libpath) unless $:.include?(libpath) require 'test/unit' require 'bio' module Bio class FuncTestSequenceOutputEMBL < Test::Unit::TestCase def setup @seq = 
Bio::Sequence.auto('aattaaaacgccacgcaaggcgattctaggaaatcaaaacgacacgaaatgtggggtgggtgtttgggtaggaaagacagttgtcaacatcagggatttggattgaatcaaaaaaaaagtccttagatttcataaaagctaatcacgcctcaaaactggggcctatctcttcttttttgtcgcttcctgtcggtccttctctatttcttctccaacccctcatttttgaatatttacataacaaaccgttttactttctttggtcaaaattagacccaaaattctatattagtttaagatatgtggtctgtaatttattgttgtattgatataaaaattagttataagcgattatatttttatgctcaagtaactggtgttagttaactatattccaccacgataacctgattacataaaatatgattttaatcattttagtaaaccatatcgcacgttggatgattaattttaacggtttaataacacgtgattaaattatttttagaatgattatttacaaacggaaaagctatatgtgacacaataactcgtgcagtattgttagtttgaaaagtgtatttggtttcttatatttggcctcgattttcagtttatgtgctttttacaaagttttattttcgttatctgtttaacgcgacatttgttgtatggctttaccgatttgagaataaaatcatattacctttatgtagccatgtgtggtgtaatatataataatggtccttctacgaaaaaagcagatcacaattgaaataaagggtgaaatttggtgtcccttttcttcgtcgaaataacagaactaaataaaagaaagtgttatagtatattacgtccgaagaataatccatattcctgaaatacagtcaacatattatatatttagtactttatataaagttaggaattaaatcatatgttttatcgaccatattaagt! cacaactttatcataaattaatctgtaattagaattccaagttcgccaccgaatttcgtaacctaatctacatataatagataaaatatatatatgtagagtaattatgatatctatgtatgtagtcatggtatatgaattttgaaattggcaaggtaacattgacggatcgtaacccaacaaataatattaattacaaaatgggtgggcgggaatagtatacaactcataattccactcactttttgtattattaggatatgaaataagagtaatcaacatgcataataaagatgtataatttcttcatcttaaaaaacataactacatggtttaatacacaattttaccttttatcaaaaaagtatttcacaattcactcgcaaattacgaaatgatggctagtgcttcaactccaaatttcgaatattttaaatcacgatgtgtagaaccttttatttactggatactaatcactagtttattgagccaaccaattagttaaatagaacaatcaatattatagccagatattttttcctttaaaaatatttaaaagaggggccagaaaagaaccagagagggaggccatgagacattattatcactagtcaaaaacaacaaaccctccttttgctttttcatataaattattatattttattttgcaggtttcttctcttcttcttcttcttcttcttcttcttcctcttggctgctttctttcatcatccataaagtgaaagctaacgcatagagagagccatatcgtcccaaaaaaagcaaaagtccaaaaaaaaacaactccaaaacattctctcttagctctttactctttagtttctctctctctctctgcctttctctttgttgaagttcatggatgctacgaagtggactcaggtacgtaaaaagatatctctctgctatatctgtttgtttgtagcttctccccgactctcacgctctctctctctctctctctctc! 
tttgtgtatctctctactcacataaatatatacatgtgtgtgtatgcatgtttatatgtatgtatgaaac cagtagtggttatacagatagtctatatagagatatcaatatgatgtgttttaatttagactttttatatatccgtttgaaacttccgaagttctcgaatggagttaaggaagttttgttctctacaagttcaatttttcttgtcattaattataaaactctgataactaatggataaaaaaggtatgctttgttagttaccttttgttcttggtgctcaggtcttaccatttttttcctaaattttaattagtctcctttctttaattaattttatgttaacgcactgacgatttaacgttaacaaaaaaacctagattctttttcttttcaatagagcataattattacttcaatttcatttatctcacactaaaccctaatcttggcgaaattccttttatatatataaatttaattaatttttccacaatcttggcggaattcaggactcggttttgcttgttattgttctctcttttaatttgacatggttagggaatacttaaagtatgtcttaattttatagggttttcaagaaatgataaacgtaaagccaatggagcaaatgatttctagcaccaacaacaacacaccgcaacaacaaccaacattcatcgccaccaacacaaggccaaacgccaccgcatccaatggtggctccggaggaaataccaacaacacggctacgatggaaactagaaaggcgaggccacaagagaaagtaaattgtccaagatgcaactcaacaaacacaaagttctgttattacaacaactacagtctcacgcaaccaagatacttctgcaaaggttgtcgaaggtattggaccgaaggtggctctcttcgtaacgtcccagtcggaggtagctcaagaaagaacaagagatcctctacacctttagcttcaccttctaatcccaaacttccagatctaaacccaccgattcttttctcaagccaaatccctaataagtcaaataaagatc! 
tcaacttgctatctttcccggtcatgcaagatcatcatcatcatggtatgtctcatttttttcatatgcccaagatagagaacaacaatacttcatcctcaatctatgcttcatcatctcctgtctcagctcttgagcttctaagatccaatggagtctcttcaagaggcatgaacacgttcttgcctggtcaaatgatggattcaaactcagtcctgtactcatctttagggtttccaacaatgcctgattacaaacagagtaataacaacctttcattctccattgatcatcatcaagggattggacataacaccatcaacagtaaccaaagagctcaagataacaatgatgacatgaatggagcaagtagggttttgttccctttttcagacatgaaagagctttcaagcacaacccaagagaagagtcatggtaataatacatattggaatgggatgttcagtaatacaggaggatcttcatggtgaaaaaaggttaaaaagagctcatgaactatcagctttcttctctttttctgtttttttctcctattttattatagtttttactttgatgatcttttgttttttctcacatggggaactttacttaaagttgtcagaacttagtttacagattgtctttttattccttctttctggttttccttttttcctttttttatcagtctttttaaaatatgtatttcataattgggtttgatcattcatatttattagtatcaaaatagagtctatgttcatgagggagtgttaaggggtgtgagggtagaagaataagtgaatacgggggcccg') @seq.entry_id = 'AJ224122' @seq.sequence_version = 3 @seq.topology = 'linear' @seq.molecule_type = 'genomic DNA' @seq.data_class = 'STD' @seq.division = 'PLN' @seq.primary_accession = 'AJ224122' @seq.secondary_accessions = [] @seq.date_created = '27-FEB-1998 (Rel. 54, Created)' @seq.date_modified = '14-NOV-2006 (Rel. 
89, Last updated, Version 6)' @seq.definition = 'Arabidopsis thaliana DAG1 gene' @seq.keywords = ['BBFa gene', 'transcription factor'] @seq.species = 'Arabidopsis thaliana (thale cress)' @seq.classification = ['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', 'Spermatophyta', 'Magnoliophyta', 'eudicotyledons', 'core eudicotyledons', 'rosids', 'eurosids II', 'Brassicales', 'Brassicaceae', 'Arabidopsis'] end def test_output_embl assert_nothing_raised { puts @seq.output(:embl) } end def test_output_fasta assert_nothing_raised { @seq.output(:fasta) } end end #class FuncTestSequenceOutputEMBL end #module Bio From ngoto at dev.open-bio.org Thu Mar 27 13:38:33 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:38:33 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/embl format_embl.rb, 1.1.2.1, 1.1.2.2 Message-ID: <200803271338.m2RDcXg2020921@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/embl In directory dev.open-bio.org:/tmp/cvs-serv20870/lib/bio/db/embl Modified Files: Tag: BRANCH-biohackathon2008 format_embl.rb Log Message: Example code in sequence.rb written by Jan Aerts is moved to test/functional/bio/sequence/test_output_embl.rb. Fixed a bug in lib/bio/db/embl/format_embl.rb: failed to output when features or references are nil. This bug is found by above test code. Index: format_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/db/embl/Attic/format_embl.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** format_embl.rb 4 Mar 2008 11:16:57 -0000 1.1.2.1 --- format_embl.rb 27 Mar 2008 13:38:31 -0000 1.1.2.2 *************** *** 56,64 **** <%= embl_wrap('OC ', classification.join('; ') + '.') %> XX ! <%= references.collect{|ref| ref.format('embl')}.join("\n") %> XX FH Key Location/Qualifiers FH ! 
<%= format_features_embl(features) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> --- 56,64 ---- <%= embl_wrap('OC ', classification.join('; ') + '.') %> XX ! <%= (references || []).collect{|ref| ref.format('embl')}.join("\n") %> XX FH Key Location/Qualifiers FH ! <%= format_features_embl(features || []) %>XX SQ Sequence <%= seq.length %> BP; <%= seq.composition.collect{|k,v| "#{v} #{k.upcase}"}.join('; ') + '; ' + (seq.gsub(/[ACTGactg]/, '').length.to_s ) + ' other;' %> <%= seq_format_embl(seq) %> From ngoto at dev.open-bio.org Thu Mar 27 13:38:33 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Thu, 27 Mar 2008 13:38:33 +0000 Subject: [BioRuby-cvs] bioruby/lib/bio sequence.rb,0.58.2.9,0.58.2.10 Message-ID: <200803271338.m2RDcXox020916@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio In directory dev.open-bio.org:/tmp/cvs-serv20870/lib/bio Modified Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: Example code in sequence.rb written by Jan Aerts is moved to test/functional/bio/sequence/test_output_embl.rb. Fixed a bug in lib/bio/db/embl/format_embl.rb: failed to output when features or references are nil. This bug is found by above test code. 
Index: sequence.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/sequence.rb,v retrieving revision 0.58.2.9 retrieving revision 0.58.2.10 diff -C2 -d -r0.58.2.9 -r0.58.2.10 *** sequence.rb 27 Mar 2008 13:07:19 -0000 0.58.2.9 --- sequence.rb 27 Mar 2008 13:38:31 -0000 0.58.2.10 *************** *** 395,424 **** end # Bio - - if __FILE__ == $0 - - require 'bio' - seq = Bio::Sequence.new('aattaaaacgccacgcaaggcgattctaggaaatcaaaacgacacgaaatgtggggtgggtgtttgggtaggaaagacagttgtcaacatcagggatttggattgaatcaaaaaaaaagtccttagatttcataaaagctaatcacgcctcaaaactggggcctatctcttcttttttgtcgcttcctgtcggtccttctctatttcttctccaacccctcatttttgaatatttacataacaaaccgttttactttctttggtcaaaattagacccaaaattctatattagtttaagatatgtggtctgtaatttattgttgtattgatataaaaattagttataagcgattatatttttatgctcaagtaactggtgttagttaactatattccaccacgataacctgattacataaaatatgattttaatcattttagtaaaccatatcgcacgttggatgattaattttaacggtttaataacacgtgattaaattatttttagaatgattatttacaaacggaaaagctatatgtgacacaataactcgtgcagtattgttagtttgaaaagtgtatttggtttcttatatttggcctcgattttcagtttatgtgctttttacaaagttttattttcgttatctgtttaacgcgacatttgttgtatggctttaccgatttgagaataaaatcatattacctttatgtagccatgtgtggtgtaatatataataatggtccttctacgaaaaaagcagatcacaattgaaataaagggtgaaatttggtgtcccttttcttcgtcgaaataacagaactaaataaaagaaagtgttatagtatattacgtccgaagaataatccatattcctgaaatacagtcaacatattatatatttagtactttatataaagttaggaattaaatcatatgttttatcgaccatattaagt! 
cacaactttatcataaattaatctgtaattagaattccaagttcgccaccgaatttcgtaacctaatctacatataatagataaaatatatatatgtagagtaattatgatatctatgtatgtagtcatggtatatgaattttgaaattggcaaggtaacattgacggatcgtaacccaacaaataatattaattacaaaatgggtgggcgggaatagtatacaactcataattccactcactttttgtattattaggatatgaaataagagtaatcaacatgcataataaagatgtataatttcttcatcttaaaaaacataactacatggtttaatacacaattttaccttttatcaaaaaagtatttcacaattcactcgcaaattacgaaatgatggctagtgcttcaactccaaatttcgaatattttaaatcacgatgtgtagaaccttttatttactggatactaatcactagtttattgagccaaccaattagttaaatagaacaatcaatattatagccagatattttttcctttaaaaatatttaaaagaggggccagaaaagaaccagagagggaggccatgagacattattatcactagtcaaaaacaacaaaccctccttttgctttttcatataaattattatattttattttgcaggtttcttctcttcttcttcttcttcttcttcttcttcctcttggctgctttctttcatcatccataaagtgaaagctaacgcatagagagagccatatcgtcccaaaaaaagcaaaagtccaaaaaaaaacaactccaaaacattctctcttagctctttactctttagtttctctctctctctctgcctttctctttgttgaagttcatggatgctacgaagtggactcaggtacgtaaaaagatatctctctgctatatctgtttgtttgtagcttctccccgactctcacgctctctctctctctctctctctc! tttgtgtatctctctactcacataaatatatacatgtgtgtgtatgcatgtttatatgtatgtatgaaac 
cagtagtggttatacagatagtctatatagagatatcaatatgatgtgttttaatttagactttttatatatccgtttgaaacttccgaagttctcgaatggagttaaggaagttttgttctctacaagttcaatttttcttgtcattaattataaaactctgataactaatggataaaaaaggtatgctttgttagttaccttttgttcttggtgctcaggtcttaccatttttttcctaaattttaattagtctcctttctttaattaattttatgttaacgcactgacgatttaacgttaacaaaaaaacctagattctttttcttttcaatagagcataattattacttcaatttcatttatctcacactaaaccctaatcttggcgaaattccttttatatatataaatttaattaatttttccacaatcttggcggaattcaggactcggttttgcttgttattgttctctcttttaatttgacatggttagggaatacttaaagtatgtcttaattttatagggttttcaagaaatgataaacgtaaagccaatggagcaaatgatttctagcaccaacaacaacacaccgcaacaacaaccaacattcatcgccaccaacacaaggccaaacgccaccgcatccaatggtggctccggaggaaataccaacaacacggctacgatggaaactagaaaggcgaggccacaagagaaagtaaattgtccaagatgcaactcaacaaacacaaagttctgttattacaacaactacagtctcacgcaaccaagatacttctgcaaaggttgtcgaaggtattggaccgaaggtggctctcttcgtaacgtcccagtcggaggtagctcaagaaagaacaagagatcctctacacctttagcttcaccttctaatcccaaacttccagatctaaacccaccgattcttttctcaagccaaatccctaataagtcaaataaagatc! tcaacttgctatctttcccggtcatgcaagatcatcatcatcatggtatgtctcatttttttcatatgcccaagatagagaacaacaatacttcatcctcaatctatgcttcatcatctcctgtctcagctcttgagcttctaagatccaatggagtctcttcaagaggcatgaacacgttcttgcctggtcaaatgatggattcaaactcagtcctgtactcatctttagggtttccaacaatgcctgattacaaacagagtaataacaacctttcattctccattgatcatcatcaagggattggacataacaccatcaacagtaaccaaagagctcaagataacaatgatgacatgaatggagcaagtagggttttgttccctttttcagacatgaaagagctttcaagcacaacccaagagaagagtcatggtaataatacatattggaatgggatgttcagtaatacaggaggatcttcatggtgaaaaaaggttaaaaagagctcatgaactatcagctttcttctctttttctgtttttttctcctattttattatagtttttactttgatgatcttttgttttttctcacatggggaactttacttaaagttgtcagaacttagtttacagattgtctttttattccttctttctggttttccttttttcctttttttatcagtctttttaaaatatgtatttcataattgggtttgatcattcatatttattagtatcaaaatagagtctatgttcatgagggagtgttaaggggtgtgagggtagaagaataagtgaatacgggggcccg') - seq.entry_id = 'AJ224122' - seq.sequence_version = 3 - seq.topology = 'linear' - seq.molecule_type = 'genomic DNA' - seq.data_class = 'STD' - seq.division = 'PLN' - 
seq.primary_accession = 'AJ224122' - seq.secondary_accessions = [] - seq.date_created = '27-FEB-1998 (Rel. 54, Created)' - seq.date_modified = '14-NOV-2006 (Rel. 89, Last updated, Version 6)' - seq.definition = 'Arabidopsis thaliana DAG1 gene' - seq.keywords = ['BBFa gene', 'transcription factor'] - seq.species = 'Arabidopsis thaliana (thale cress)' - seq.classification = ['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', - 'Spermatophyta', 'Magnoliophyta', 'eudicotyledons', 'core eudicotyledons', 'rosids', - 'eurosids II', 'Brassicales', 'Brassicaceae', 'Arabidopsis'] - - # puts seq.output(:embl) - puts seq.output(:fasta) - - end - - --- 395,396 ---- From ngoto at dev.open-bio.org Fri Mar 28 00:56:29 2008 From: ngoto at dev.open-bio.org (Naohisa Goto) Date: Fri, 28 Mar 2008 00:56:29 +0000 Subject: [BioRuby-cvs] bioruby/test/functional/bio/sequence test_output_embl.rb, 1.1.2.1, 1.1.2.2 Message-ID: <200803280056.m2S0uTTt022850@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/test/functional/bio/sequence In directory dev.open-bio.org:/tmp/cvs-serv22830/test/functional/bio/sequence Modified Files: Tag: BRANCH-biohackathon2008 test_output_embl.rb Log Message: removed unwanted puts Index: test_output_embl.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/test/functional/bio/sequence/Attic/test_output_embl.rb,v retrieving revision 1.1.2.1 retrieving revision 1.1.2.2 diff -C2 -d -r1.1.2.1 -r1.1.2.2 *** test_output_embl.rb 27 Mar 2008 13:38:31 -0000 1.1.2.1 --- test_output_embl.rb 28 Mar 2008 00:56:27 -0000 1.1.2.2 *************** *** 39,43 **** def test_output_embl ! assert_nothing_raised { puts @seq.output(:embl) } end --- 39,43 ---- def test_output_embl ! 
assert_nothing_raised { @seq.output(:embl) } end From helios at dev.open-bio.org Tue Mar 25 15:46:37 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:46:37 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql/config - New directory Message-ID: <200803251546.m2PFkTuN013246@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql/config In directory dev.open-bio.org:/tmp/cvs-serv13217/lib/bio/io/biosql/config Log Message: Directory /home/repository/bioruby/bioruby/lib/bio/io/biosql/config added to the repository From helios at dev.open-bio.org Tue Mar 25 15:46:38 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:46:38 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql - New directory Message-ID: <200803251546.m2PFkS7R013241@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql In directory dev.open-bio.org:/tmp/cvs-serv13217/lib/bio/db/biosql Log Message: Directory /home/repository/bioruby/bioruby/lib/bio/db/biosql added to the repository --> Using per-directory sticky tag `BRANCH-biohackathon2008' From helios at dev.open-bio.org Tue Mar 25 15:46:40 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:46:40 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql - New directory Message-ID: <200803251546.m2PFkSDD013243@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql In directory dev.open-bio.org:/tmp/cvs-serv13217/lib/bio/io/biosql Log Message: Directory /home/repository/bioruby/bioruby/lib/bio/io/biosql added to the repository --> Using per-directory sticky tag `BRANCH-biohackathon2008' From helios at dev.open-bio.org Tue Mar 25 15:46:59 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:46:59 -0000 Subject: [BioRuby-cvs] bioruby .project,NONE,1.1.2.1 Message-ID: <200803251546.m2PFkYHd013326@dev.open-bio.org> Update of 
/home/repository/bioruby/bioruby In directory dev.open-bio.org:/tmp/cvs-serv13290 Added Files: Tag: BRANCH-biohackathon2008 .project Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: EMBL, GenBank; supports SQL transactions when creating new sequences on biosql; does not support references and comments for GenBank and EMBL. Fasta->biosequence->biosql doesn't work. --- NEW FILE: .project --- bioruby org.rubypeople.rdt.core.rubybuilder org.rubypeople.rdt.core.rubynature From helios at dev.open-bio.org Tue Mar 25 15:47:01 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:47:01 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io sql.rb,1.8,1.8.2.1 Message-ID: <200803251546.m2PFkYcK013334@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/io Modified Files: Tag: BRANCH-biohackathon2008 sql.rb Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: EMBL, GenBank; supports SQL transactions when creating new sequences on biosql; does not support references and comments for GenBank and EMBL. Fasta->biosequence->biosql doesn't work. 
Index: sql.rb =================================================================== RCS file: /home/repository/bioruby/bioruby/lib/bio/io/sql.rb,v retrieving revision 1.8 retrieving revision 1.8.2.1 diff -C2 -d -r1.8 -r1.8.2.1 *** sql.rb 5 Apr 2007 23:35:41 -0000 1.8 --- sql.rb 25 Mar 2008 15:46:32 -0000 1.8.2.1 *************** *** 1,365 **** - # - # = bio/io/sql.rb - BioSQL access module - # - # Copyright:: Copyright (C) 2002 Toshiaki Katayama - # Copyright:: Copyright (C) 2006 Raoul Jean Pierre Bonnal - # License:: The Ruby License - # - # $Id$ - # - - begin - require 'dbi' - rescue LoadError - end - require 'bio/sequence' - require 'bio/feature' - - - module Bio - - class SQL - - def initialize(db = 'dbi:Mysql:biosql', user = nil, pass = nil) - @dbh = DBI.connect(db, user, pass) - end - - def close - @dbh.disconnect - end - - # Returns Bio::SQL::Sequence object. - def fetch(accession) # or display_id for fall back - query = "select * from bioentry where accession = ?" - entry = @dbh.execute(query, accession).fetch - return Sequence.new(@dbh, entry) if entry - - query = "select * from bioentry where display_id = ?" - entry = @dbh.execute(query, accession).fetch - return Sequence.new(@dbh, entry) if entry - end - alias get_by_id fetch - - - # for lazy fetching - - class Sequence - - def initialize(dbh, entry) - @dbh = dbh - @bioentry_id = entry['bioentry_id'] - @database_id = entry['biodatabase_id'] - @entry_id = entry['display_id'] - @accession = entry['accession'] - @version = entry['entry_version'] - @division = entry['division'] - end - attr_reader :accession, :division, :entry_id, :version - - - def to_fasta - if seq = seq - return seq.to_fasta(@accession) - end - end - - # Returns Bio::Sequence::NA or AA object. - def seq - query = "select * from biosequence where bioentry_id = ?" 
- row = @dbh.execute(query, @bioentry_id).fetch - return unless row - - mol = row['alphabet'] - seq = row['seq'] - - case mol - when /.na/i # 'dna' or 'rna' - Bio::Sequence::NA.new(seq) - else # 'protein' - Bio::Sequence::AA.new(seq) - end - end - - # Returns Bio::Sequence::NA or AA object (by lazy fetching). - def subseq(from, to) - length = to - from + 1 - query = "select alphabet, substring(seq, ?, ?) as subseq" + - " from biosequence where bioentry_id = ?" - row = @dbh.execute(query, from, length, @bioentry_id).fetch - return unless row - - mol = row['alphabet'] - seq = row['subseq'] - - case mol - when /.na/i # 'dna' or 'rna' - Bio::Sequence::NA.new(seq) - else # 'protein' - Bio::Sequence::AA.new(seq) - end - end - - - # Returns Bio::Features object. - def features - array = [] - query = "select * from seqfeature where bioentry_id = ?" - @dbh.execute(query, @bioentry_id).fetch_all.each do |row| - next unless row - - f_id = row['seqfeature_id'] - k_id = row['type_term_id'] - s_id = row['source_term_id'] - rank = row['rank'].to_i - 1 ! # key : type (gene, CDS, ...) ! type = feature_key(k_id) ! ! # source : database (EMBL/GenBank/SwissProt) ! database = feature_source(s_id) ! ! # location : position ! locations = feature_locations(f_id) ! ! # qualifier ! qualifiers = feature_qualifiers(f_id) ! ! # rank ! array[rank] = Bio::Feature.new(type, locations, qualifiers) ! end ! return Bio::Features.new(array) ! end ! ! ! # Returns reference informations in Array of Hash (not Bio::Reference). ! def references ! array = [] ! query = <<-END ! select * from bioentry_reference, reference ! where bioentry_id = ? and ! bioentry_reference.reference_id = reference.reference_id ! END ! @dbh.execute(query, @bioentry_id).fetch_all.each do |row| ! next unless row ! ! hash = { ! 'start' => row['start_pos'], ! 'end' => row['end_pos'], ! 'journal' => row['location'], ! 'title' => row['title'], ! 'authors' => row['authors'], ! 'medline' => row['crc'] ! } ! hash.default = '' ! ! 
rank = row['rank'].to_i - 1 ! array[rank] = hash ! end ! return array ! end ! ! ! # Returns the first comment. For complete comments, use comments method. ! def comment ! query = "select * from comment where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['comment_text'] : '' ! end ! ! # Returns comments in an Array of Strings. ! def comments ! array = [] ! query = "select * from comment where bioentry_id = ?" ! @dbh.execute(query, @bioentry_id).fetch_all.each do |row| ! next unless row ! rank = row['rank'].to_i - 1 ! array[rank] = row['comment_text'] ! end ! return array ! end ! ! def database ! query = "select * from biodatabase where biodatabase_id = ?" ! row = @dbh.execute(query, @database_id).fetch ! row ? row['name'] : '' ! end ! ! def date ! query = "select * from bioentry_date where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['date'] : '' ! end ! ! def dblink ! query = "select * from bioentry_direct_links where source_bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? [row['dbname'], row['accession']] : [] ! end ! ! def definition ! query = "select * from bioentry_description where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['description'] : '' ! end ! ! def keyword ! query = "select * from bioentry_keywords where bioentry_id = ?" ! row = @dbh.execute(query, @bioentry_id).fetch ! row ? row['keywords'] : '' ! end ! ! # Use lineage, common_name, ncbi_taxa_id methods to extract in detail. ! def taxonomy ! query = <<-END ! select taxon_name.name, taxon.ncbi_taxon_id from bioentry ! join taxon_name using(taxon_id) join taxon using (taxon_id) ! where bioentry_id = ? ! END ! row = @dbh.execute(query, @bioentry_id).fetch ! # @lineage = row ? row['full_lineage'] : '' ! @common_name = row ? row['name'] : '' ! @ncbi_taxa_id = row ? row['ncbi_taxon_id'] : '' ! row ? [@lineage, @common_name, @ncbi_taxa_id] : [] ! end ! def lineage ! 
taxonomy unless @lineage ! return @lineage ! end - def common_name - taxonomy unless @common_name - return @common_name - end ! def ncbi_taxa_id ! taxonomy unless @ncbi_taxa_id ! return @ncbi_taxa_id end ! ! ! private ! ! def feature_key(k_id) ! query = "select * from term where term_id= ?" ! row = @dbh.execute(query, k_id).fetch ! row ? row['name'] : '' end ! ! def feature_source(s_id) ! query = "select * from term where term_id = ?" ! row = @dbh.execute(query, s_id).fetch ! row ? row['name'] : '' end ! ! def feature_locations(f_id) ! locations = [] ! query = "select * from location where seqfeature_id = ?" ! @dbh.execute(query, f_id).fetch_all.each do |row| ! next unless row ! ! location = Bio::Location.new ! location.strand = row['strand'] ! location.from = row['start_pos'] ! location.to = row['end_pos'] ! ! xref = feature_locations_remote(row['dbxref_if']) ! location.xref_id = xref.shift unless xref.empty? ! ! # just omit fuzzy location for now... ! #feature_locations_qv(row['seqfeature_location_id']) ! ! rank = row['rank'].to_i - 1 ! locations[rank] = location ! end ! return Bio::Locations.new(locations) end ! ! def feature_locations_remote(l_id) ! query = "select * from dbxref where dbxref_id = ?" ! row = @dbh.execute(query, l_id).fetch ! row ? [row['accession'], row['version']] : [] end ! ! def feature_locations_qv(l_id) ! query = "select * from location_qualifier_value where location_id = ?" ! row = @dbh.execute(query, l_id).fetch ! row ? [row['value'], row['int_value']] : [] end ! ! def feature_qualifiers(f_id) ! qualifiers = [] ! query = "select * from seqfeature_qualifier_value where seqfeature_id = ?" ! @dbh.execute(query, f_id).fetch_all.each do |row| ! next unless row ! ! key = feature_qualifiers_key(row['seqfeature_id']) ! value = row['value'] ! qualifier = Bio::Feature::Qualifier.new(key, value) ! ! rank = row['rank'].to_i - 1 ! qualifiers[rank] = qualifier ! end ! return qualifiers.compact # .compact is nasty hack for a while end ! ! 
def feature_qualifiers_key(q_id) ! query = <<-END ! select * from seqfeature_qualifier_value ! join term using(term_id) where seqfeature_id = ? ! END ! row = @dbh.execute(query, q_id).fetch ! row ? row['name'] : '' end ! end ! ! end # SQL ! ! end # Bio ! if __FILE__ == $0 ! begin ! require 'pp' ! alias p pp ! rescue LoadError end ! ! db = ARGV.empty? ? 'dbi:Mysql:database=biosql;host=localhost' : ARGV.shift ! serv = Bio::SQL.new(db, 'root') ! ! ent0 = serv.fetch('X76706') ! ent0 = serv.fetch('A15H9FIB') ! ent1 = serv.fetch('J01902') ! ent2 = serv.fetch('X04311') ! ! pp ent0.features ! pp ent0.references ! ! pp ent1.seq ! pp ent1.seq.translate ! pp ent1.seq.gc ! pp ent1.subseq(1,20) ! ! pp ent2.accession ! pp ent2.comment ! pp ent2.comments ! pp ent2.common_name ! pp ent2.database ! pp ent2.date ! pp ent2.dblink ! pp ent2.definition ! pp ent2.division ! pp ent2.entry_id ! pp ent2.features ! pp ent2.keyword ! pp ent2.lineage ! pp ent2.ncbi_taxa_id ! pp ent2.references ! pp ent2.seq ! pp ent2.subseq(1,10) ! pp ent2.taxonomy ! pp ent2.version ! end - --- 1,145 ---- ! require 'rubygems' ! require 'erb' ! require 'composite_primary_keys' ! # BiosqlPlug ! =begin ! Ok Hilmar gives to me some clarification ! 1) "EMBL/GenBank/SwissProt" name in term table, is only a convention assuming data loaded by genbank embl ans swissprot formats. ! If your features come from others ways for example blast or alignment ... whatever.. the user as to take care about the source. ! =end ! =begin ! TODO: ! 1) source_term_id => surce_term and check before if the source term is present or not and the level, the root should always be something "EMBL/GenBank/SwissProt" or contestualized. ! 2) Into DummyBase class delete connection there and use Bio::ArSQL.establish_connection which reads info from a yml file. ! 3) Chk Locations in Biofeatures ArSQL ! =end ! module Bio ! class SQL ! #no check is made ! def self.establish_connection(configurations, env) ! 
#configurations is an hash similar what YAML returns. ! #{:database=>"biorails_development", :adapter=>"postgresql", :username=>"rails", :password=>nil} ! configurations.assert_valid_keys('development', 'production','test') ! configurations[env].assert_valid_keys('database','adapter','username','password') ! DummyBase.configurations = configurations ! DummyBase.establish_connection "#{env}" end ! ! def self.fetch_id(id) ! Bio::SQL::Bioentry.find(id) end ! ! def self.fetch_accession(accession) ! accession.upcase! ! Bio::SQL::Bioentry.exists?(:accession => accession) ? Bio::SQL::Sequence.new(:entry=>Bio::SQL::Bioentry.find_by_accession(accession)) : nil end ! ! def self.exists_accession(accession) ! Bio::SQL::Bioentry.find_by_accession(accession.upcase).nil? ? false : true end ! ! def self.list_entries ! Bio::SQL::Bioentry.find(:all).collect{|entry| ! {:id=>entry.bioentry_id, :accession=>entry.accession} ! } end ! ! def self.list_databases ! Bio::SQL::Biodatabase.find(:all).collect{|entry| ! {:id=>entry.biodatabase_id, :name => entry.name} ! } end ! ! def self.delete_entry_id(id) ! Bioentry.delete(id) end ! ! def self.delete_entry_accession(accession) ! Bioentry.delete(Bioentry.find_by_accession(accession)) end ! ! ! class DummyBase < ActiveRecord::Base ! #NOTE: Using postgresql, not setting sequence name, system will discover the name by default. ! #NOTE: this class will not establish the connection automatically ! self.abstract_class = true ! self.pluralize_table_names = false ! #prepend table name to the usual id, avoid to specify primary id for every table ! self.primary_key_prefix_type = :table_name_with_underscore ! #biosql_configurations=YAML::load(ERB.new(IO.read(File.join(File.dirname(__FILE__),'../config', 'database.yml'))).result) ! #self.configurations=biosql_configurations ! #self.establish_connection "development" ! end #DummyBase ! ! autoload :Biodatabase, 'bio/io/biosql/biodatabase' ! autoload :Bioentry, 'bio/io/biosql/bioentry' ! 
autoload :BioentryDbxref, 'bio/io/biosql/bioentry_dbxref' ! autoload :BioentryPath, 'bio/io/biosql/bioentry_path' ! autoload :BioentryQualifierValue, 'bio/io/biosql/bioentry_qualifier_value' ! autoload :BioentryReference, 'bio/io/biosql/bioentry_reference' ! autoload :BioentryRelationship, 'bio/io/biosql/bioentry_relationship' ! autoload :Biosequence, 'bio/io/biosql/biosequence' ! autoload :Comment, 'bio/io/biosql/comment' ! autoload :Dbxref, 'bio/io/biosql/dbxref' ! autoload :DbxrefQualifierValue, 'bio/io/biosql/dbxref_qualifier_value' ! autoload :Location, 'bio/io/biosql/location' ! autoload :LocationQualifierValue, 'bio/io/biosql/location_qualifier_value' ! autoload :Ontology, 'bio/io/biosql/ontology' ! autoload :Reference, 'bio/io/biosql/reference' ! autoload :Seqfeature, 'bio/io/biosql/seqfeature' ! autoload :SeqfeatureDbxref, 'bio/io/biosql/seqfeature_dbxref' ! autoload :SeqfeaturePath, 'bio/io/biosql/seqfeature_path' ! autoload :SeqfeatureQualifierValue, 'bio/io/biosql/seqfeature_qualifier_value' ! autoload :SeqfeatureRelationship, 'bio/io/biosql/seqfeature_relationship' ! autoload :Taxon, 'bio/io/biosql/taxon' ! autoload :TaxonName, 'bio/io/biosql/taxon_name' ! autoload :Term, 'bio/io/biosql/term' ! autoload :TermDbxref, 'bio/io/biosql/term_dbxref' ! autoload :TermPath, 'bio/io/biosql/term_path' ! autoload :TermRelationship, 'bio/io/biosql/term_relationship' ! autoload :TermRelationshipTerm, 'bio/io/biosql/term_relationship_term' ! autoload :Sequence, 'bio/db/biosql/sequence' ! end #biosql ! ! end #Bio if __FILE__ == $0 ! require 'rubygems' ! require 'composite_primary_keys' ! require 'bio' ! require 'pp' ! ! # pp connection = Bio::SQL.establish_connection('bio/io/biosql/config/database.yml','development') ! pp connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development') ! 
#pp YAML::load(ERB.new(IO.read('bio/io/biosql/config/database.yml')).result) ! ! if nil ! pp Bio::SQL.list_entries ! bioseq = Bio::SQL.fetch_accession('AJ224122') ! pp bioseq ! pp bioseq.entry_id ! #TODO create a test only for tables not sequence here ! pp bioseq.molecule_type ! #pp bioseq.molecule_type.class ! #bioseq.molecule_type_update('dna', 1) ! pp Bio::SQL::Taxon.find(8121).taxon_names end ! #pp bioseq.molecule_type ! #term = Bio::SQL::Term.find_by_name('mol_type') ! #pp term ! #pp bioseq.entry.bioentry_qualifier_values.create(:term=>term, :rank=>2, :value=>'pippo') ! #pp bioseq.entry.bioentry_qualifier_values.inspect ! #pp bioseq.entry.bioentry_qualifier_values.find_all_by_term_id(26) ! #pp primo.class ! # pp primo.value='dna' ! # pp primo.save ! #pp bioseq.molecule_type= 'prova' ! ! #Bio::SQL::BioentryQualifierValue.delete(delete.bioentry_id,delete.term_id,delete.rank) ! ! end From helios at dev.open-bio.org Tue Mar 25 15:47:04 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:47:04 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/db/biosql sequence.rb,NONE,1.1.2.1 Message-ID: <200803251546.m2PFkYrY013322@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/db/biosql In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/db/biosql Added Files: Tag: BRANCH-biohackathon2008 sequence.rb Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: EMBL, GenBank; supports SQL transactions when creating new sequences on biosql; does not support references and comments for GenBank and EMBL. Fasta->biosequence->biosql doesn't work. 
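Note on the sql.rb diff above: Bio::SQL.establish_connection validates its configuration hash with assert_valid_keys before passing it to ActiveRecord. The following is only a minimal sketch of that kind of check in plain Ruby, with no ActiveRecord or ActiveSupport dependency. The helper name validate_biosql_config and the two constants are hypothetical and are not part of BioRuby; the environment names and setting keys are the ones listed in the diff.

```ruby
# Hypothetical helper (not part of BioRuby): validates a configuration hash
# shaped like the one passed to Bio::SQL.establish_connection in the diff.
ALLOWED_ENVS = %w[development production test].freeze
ALLOWED_KEYS = %w[database adapter username password].freeze

def validate_biosql_config(configurations, env)
  # Reject environment names other than the three accepted in the diff.
  unknown_envs = configurations.keys - ALLOWED_ENVS
  unless unknown_envs.empty?
    raise ArgumentError, "unknown environment(s): #{unknown_envs.join(', ')}"
  end

  # The requested environment must actually be present.
  settings = configurations.fetch(env) do
    raise ArgumentError, "no settings for environment #{env.inspect}"
  end

  # Reject setting keys other than the four accepted in the diff.
  unknown_keys = settings.keys - ALLOWED_KEYS
  unless unknown_keys.empty?
    raise ArgumentError, "unknown key(s): #{unknown_keys.join(', ')}"
  end

  settings
end

# Same sample values as the self-test block at the end of sql.rb.
config = {
  'development' => {
    'database' => 'biorails_development',
    'adapter'  => 'postgresql',
    'username' => 'rails',
    'password' => nil
  }
}

settings = validate_biosql_config(config, 'development')
puts settings['adapter']
```

Unlike assert_valid_keys, this sketch also fails when the requested environment is missing from the hash, which may or may not be desirable in the real method.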
--- NEW FILE: sequence.rb --- #TODO save on db reading from a genbank or embl object module Bio class SQL class Sequence private # example # bioentry_qualifier_anchor :molecule_type, :synonym=>'mol_type' # this function creates other 3 functions, molecule_type, molecule_type=, molecule_type_update #molecule_type => return an array of strings, where each string is the value associated with the qualifier, ordered by rank. #molecule_type=value add a bioentry_qualifier value to the table #molecule_type_update(value, rank) update an entry of the table with an existing rank #the method inferr the qualifier term from the name of the first symbol, or you can specify a synonym to use #creating an object with to_biosql is transaction safe. #TODO: implement setting for more than a qualifier-vale. def self.bioentry_qualifier_anchor(sym, *args) options = args.first || Hash.new #options.assert_valid_keys(:rank,:synonym,:multi) method_reader = sym.to_s.to_sym method_writer_operator = (sym.to_s+"=").to_sym method_writer_modder = (sym.to_s+"_update").to_sym synonym = options[:synonym].nil? ? sym.to_s : options[:synonym] #Bio::SQL::Term.create(:name=>synonym, :ontology=> Bio::SQL::Ontology.find_by_name('Annotation Tags')) unless Bio::SQL::Term.exists?(:name =>synonym) send :define_method, method_reader do #return an array of bioentry_qualifier_values begin term = Term.find_or_create_by_name(:name => synonym, :ontology=> Ontology.find_by_name('Annotation Tags')) bioentry_qualifier_values = @entry.bioentry_qualifier_values.find_all_by_term_id(term) bioentry_qualifier_values.map{|row| row.value} unless bioentry_qualifier_values.nil? 
rescue Exception => e puts "Reader Error: #{synonym} #{e.message}" end end send :define_method, method_writer_operator do |value| begin term = Term.find_or_create_by_name(:name => synonym, :ontology=> Ontology.find_by_name('Annotation Tags')) datas = @entry.bioentry_qualifier_values.find_all_by_term_id(term.term_id) #add an element incrementing the rank or setting the first to 1 @entry.bioentry_qualifier_values.create(:term_id=>term.term_id, :rank=>datas.empty? ? 1 : datas.last.rank.succ, :value=>value) rescue Exception => e puts "WriterOperator= Error: #{synonym} #{e.message}" end end send :define_method, method_writer_modder do |value, rank| begin term = Term.find_or_create_by_name(:name => synonym, :ontology=> Ontology.find_by_name('Annotation Tags')) data = @entry.bioentry_qualifier_values.find_by_term_id_and_rank(term.term_id, rank) if data.nil? send method_writer_operator, value else data.value=value data.save! end rescue Exception => e puts "WriterModder Error: #{synonym} #{e.message}" end end end public attr_reader :entry def delete @entry.destroy end def get_seqfeature(sf) #in seqfeature BioSQL class locations_str = sf.locations.map{|loc| loc.to_s}.join(',') #pp sf.locations.inspect locations_str = "join(#{locations_str})" if sf.locations.count>1 Bio::Feature.new(sf.type_term.name, locations_str,sf.seqfeature_qualifier_values.collect{|sfqv| Bio::Feature::Qualifier.new(sfqv.term.name,sfqv.value)}) end def length=(len) @entry.biosequence.length=len end def initialize(options={}) options.assert_valid_keys(:entry, :biodatabase_id,:biosequence) return @entry = options[:entry] unless options[:entry].nil? return to_biosql(options[:biosequence], options[:biodatabase_id]) unless options[:biosequence].nil? or options[:biodatabase_id].nil? end def to_biosql(bs,biodatabase_id) #Transcaction works greatly!!! 
# begin Bioentry.transaction do @entry = Bioentry.new(:biodatabase_id=>biodatabase_id, :name=>bs.entry_id) # pp "primary" self.primary_accession = bs.primary_accession # pp "def" self.definition = bs.definition unless bs.definition.nil? # pp "seqver" self.sequence_version = bs.sequence_version # pp "divi" self.division = bs.division unless bs.division.nil? @entry.save! # pp "secacc" bs.secondary_accessions.each do |sa| #write as qualifier every secondary accession into the array self.secondary_accessions = sa end #to create the sequence entry needs to exists # pp "seq" self.seq = bs.seq unless bs.seq.nil? # pp "mol" self.molecule_type = bs.molecule_type unless bs.molecule_type.nil? # pp "dc" self.data_class = bs.data_class unless bs.data_class.nil? # pp "top" self.topology = bs.topology unless bs.topology.nil? # pp "datec" self.date_created = bs.date_created unless bs.date_created.nil? # pp "datemod" self.date_modified = bs.date_modified unless bs.date_modified.nil? # pp "key" bs.keywords.each do |kw| #write as qualifier every secondary accessions into the array self.keywords = kw end #FIX: problem settinf texon_name: embl has "Arabidopsis thaliana (thale cress)" but in taxon_name table there isn't this name. I must check if there is a new version of the table #pp "spec" self.species = bs.species unless bs.species.nil? # pp "Debug: #{bs.species}" # pp "feat" bs.features.each do |feat| self.feature=feat end #TODO: add comments and references end #transaction return self rescue Exception => e pp "to_biosql exception: #{e}" end end #to_biosql def name @entry.name end alias entry_id name def name=(value) @entry.name=value end alias entry_id= name= def primary_accession @entry.accession end def primary_accession=(value) @entry.accession=value end #TODO def secondary_accession # @entry.bioentry_qualifier_values # end def organism @entry.taxon.nil? ? 
"" : @entry.taxon.taxon_scientific_name.name end alias species organism def organism=(value) taxon_name=TaxonName.find_by_name_and_name_class(value,'scientific name') if taxon_name.nil? puts "Error value doesn't exists in taxon_name table with scientific name constraint." else @entry.taxon_id=taxon_name.taxon_id @entry.save! end end alias species= organism= def database @entry.biodatabase.name end def database_desc @entry.biodatabase.description end def version @entry.version end alias sequence_version version def version=(value) @entry.version=value end alias sequence_version= version= def division @entry.division end def division=(value) @entry.division=value end def description @entry.description end alias definition description def description=(value) @entry.description=value end alias definition= description= def identifier @entry.identifier end def identifier=(value) @entry.identifier=value end bioentry_qualifier_anchor :data_class bioentry_qualifier_anchor :molecule_type, :synonym=>'mol_type' bioentry_qualifier_anchor :topology bioentry_qualifier_anchor :date_created bioentry_qualifier_anchor :date_modified, :synonym=>'date_changed' bioentry_qualifier_anchor :keywords, :synonym=>'keyword' bioentry_qualifier_anchor :secondary_accessions, :synonym=>'secondary_accession' def features Bio::Features.new(@entry.seqfeatures.collect {|sf| self.get_seqfeature(sf)}) end def feature=(feat) #TODO: fix ontology_id and source_term_id type_term = Term.find_or_create_by_name(:name=>feat.feature, :ontology_id=>1) seqfeature = Seqfeature.new(:bioentry=>@entry, :source_term_id=>2, :type_term=>type_term, :rank=>@entry.seqfeatures.count.succ, :display_name=>'') seqfeature.save! feat.locations.each do |loc| location = Location.new(:seqfeature=>seqfeature, :start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand, :rank=>seqfeature.locations.count.succ) location.save! 
        end
        feat.each do |qualifier|
          qual_term = Term.find_or_create_by_name(:name=>qualifier.qualifier, :ontology_id=>3)
          qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>qual_term, :value=>qualifier.value, :rank=>seqfeature.seqfeature_qualifier_values.count.succ)
          qual.save!
        end
      end

      def seq
        Bio::Sequence.auto(@entry.biosequence.seq) unless @entry.biosequence.nil?
      end

      def seq=(value)
        #check which type of alphabet it is, NU/NA/nil
        #could value be nil? I think not.
        if @entry.biosequence.nil?
          @entry.biosequence = Biosequence.new(:seq=>value)
          @entry.biosequence.save!
        else
          @entry.biosequence.seq=value
        end
        self.length=value.length
      end

      def taxonomy
        tax = []
        taxon = @entry.taxon
        while taxon and taxon.taxon_id != taxon.parent_taxon_id
          tax << taxon.taxon_scientific_name.name
          #Note: I don't like this call very much; replace it with a relationship in the ref class.
          taxon = Taxon.find(taxon.parent_taxon_id)
        end
        tax.reverse
      end

      def length
        @entry.biosequence.length
      end

      def references
        #returns an array of hashes; each hash has these keys ["title", "dbxref_id", "reference_id", "authors", "crc", "location"]
        #it would probably be better to define a reference class to collect this information
        @entry.bioentry_references.collect do |ref|
          hash = Hash.new
          hash['authors'] = ref.reference.authors
          hash['title'] = ref.reference.title
          hash['embl_gb_record_number'] = ref.reference.rank
          #about location/journal, take a look at Hilmar's schema overview.
          #TODO: solve the problem with specific comment per reference.
          #TODO: get dbxref
          hash['journal'] = ref.reference.location
          hash['xrefs'] = "#{ref.reference.dbxref.dbname}; #{ref.reference.dbxref.accession}."
Bio::Reference.new(hash) end end def comments @entry.comments.map do |comment| comment.comment_text end end def save #I should add chks for SQL errors @entry.biosequence.save @entry.save end def to_fasta #prima erano 2 print in stdout, meglio ritornare una stringa in modo che poi ci si possa fare quello che si vuole #print ">" + accession + "\n" #print seq.gsub(Regexp.new(".{1,#{60}}"), "\\0\n") ">" + accession + "\n" + seq.gsub(Regexp.new(".{1,#{60}}"), "\\0\n") end def to_fasta_reverse_complememt ">" + accession + "\n" + seq.reverse_complement.gsub(Regexp.new(".{1,#{60}}"), "\\0\n") end # converts Bio::SQL::Sequence to Bio::Sequence # --- # *Arguments*: # *Returns*:: Bio::Sequence object #TODO: def to_biosequence # sequence = Bio::Sequence.new(seq) # sequence.entry_id = entry_id # # sequence.primary_accession = accession # sequence.secondary_accessions = accession # # sequence.molecule_type = natype # sequence.division = division # sequence.topology = circular # # sequence.sequence_version = version # #sequence.date_created = nil #???? # sequence.date_modified = date # # sequence.definition = definition # sequence.keywords = keywords # sequence.species = organism # sequence.classification = self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/) # #sequence.organnella = nil # not used # sequence.comments = comment # sequence.references = references # sequence.features = features # return sequence # end # # def load_fasta(entry, biodatabase) # result=nil # # if !entry.accession.nil? then # ## pp biodatabase # begin # Bioentry.transaction do # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.accession, :accession=>entry.accession, \ # :description=>entry.definition, :version=>0) # # # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.accession, :accession=>entry.accession, \ # # :description=>entry.definition, :version=>entry.acc_version.split(/\./).last, :identifier=>entry.gi) # ## pp bioentry # bioentry.save! 
# result=bioentry # begin # Biosequence.transaction do # bioentry.biosequence = Biosequence.new(:seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>'') # bioentry.biosequence.save! # end #Bioseqence.transaction # rescue Exception => exc # puts "Error Biosequence: #{exc.message}" # end #Rescue Biosequence # end #Bioentry.transaction # rescue ActiveRecord::RecordInvalid => e # puts "Error: Transaction Aborted on class #{e.record.class}, table #{e.record.class.table_name} due to:" # e.record.errors.each{|att, msg| # puts "#{att} => #{msg}" # } # rescue Exception => exc # puts "Errore Bioentry: #{exc.message}" # end #Resce Bioentry # # end #entry chk # return result # end #load_fasta # # def load_gb(entry, biodatabase) # ## pp biodatabase # result=nil # # begin # Bioentry.transaction do # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division, \ # :description=>entry.definition, :version=>entry.version, :identifier=>entry.gi.split(/:/).last.to_i) # ## pp bioentry # bioentry.save! # # result=bioentry # # # end #Bioentry.transaction # ##debug pp ["Bioentry", [:name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division, # ## :description=>entry.definition, :version=>entry.version, :identifier=>entry.gi.split(/:/).last.to_i]] # # #delete biodatabase.bioentries << bioentry # #note Alphabet not defined # # begin # rank_comment=1 # Comment.transaction do # if !entry.comment.empty? then # bioentry.comment = Comment.new(:comment_text=>entry.comment, :rank=>rank_comment) # bioentry.comment.save! # rank_comment=rank_comment.next # end # end #Comment.transaction # rescue Exception => exc # puts "Error Comment: #{exc.message}" # end #Rescue Command # #debug pp "Comment" # ##debug pp ["Comment", [:comment_text=>entry.comment]] if !entry.comment.empty? 
# begin # Biosequence.transaction do # bioentry.biosequence = Biosequence.new(:seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>'') # bioentry.biosequence.save! # end #Bioseqence.transaction # rescue Exception => exc # puts "Error Biosequence: #{exc.message}" # end #Rescue Biosequence # #debug pp "Biosequence" # ##debug pp ["Biosequence", :seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>''] # begin # rank_seqfeature=1 # Seqfeature.transaction do # entry.features.each do |feature| # #note Rank default to ZERO, display_name String empty # #note Chek if exists term name ##delete puts "Feature #{feature.inspect}" ##delete puts "FeatureFeature #{feature.feature.inspect}" # # type_term = Term.exists?(:name=>feature.feature) ? Term.find_by_name(feature.feature) : Term.create!(:name=>feature.feature, :ontology_id=>1) # # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>rank_seqfeature, :display_name=>'') ##delete puts "Type Term #{type_term.inspect}" # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :type_term=>type_term, :rank=>rank_seqfeature, :display_name=>'') ##delete puts "Seqfeature #{seqfeature.inspect}" # seqfeature.save! # ##debug pp ["Seqfeature", [:source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>0, :display_name=>'']] # begin # Location.transaction do # feature.locations.each do |loc| # location = Location.new(:seqfeature=>seqfeature, :start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand) # location.save! 
# ##debug pp ["Location",[:start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand]] # end #locations # end #Location.transaction # rescue Exception => exc # puts "Error Location: #{exc.message}" # end #Rescue Location # #debug pp "Locations" # #delete bioentry.seqfeatures << seqfeature ##delete if nil # begin # rank_seqfeaturequalifiervalue=0 # rank_qual_qualifier="" # SeqfeatureQualifierValue.transaction do # feature.each do |qual| # # #gestisce il livello dei qualificatori... # if (rank_qual_qualifier==qual.qualifier) then # rank_seqfeaturequalifiervalue=rank_seqfeaturequalifiervalue.next # else # rank_seqfeaturequalifiervalue=1 # rank_qual_qualifier=qual.qualifier # end # # ##debug pp ["SeqfeatureQualifierValue", qual.qualifier, [ :term=>Term.find_by_name(qual.qualifier), :value=>qual.value]] # term = Term.exists?(:name=>qual.qualifier) ? Term.find_by_name(qual.qualifier) : Term.create!(:name=>qual.qualifier, :ontology_id=>3) # # # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>Term.find_by_name(qual.qualifier), :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>term, :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # qual.save! 
# end #qualifiers # end #SeqfeatureQualifierValue.transaction # rescue Exception => exc # puts "Error SeqfeatureQualifierValue: #{exc.message}" # end #Rescue SeqfeatureQualifierValue ###delete end #debug if nil # #debug pp "SeqfeatureQualifierValue" # rank_seqfeature=rank_seqfeature.next # end #features # end #Seqfeature.transaction # rescue Exception => exc # puts "Error Seqfeature: #{exc.message}" # end #Rescue Seqfeature # # end #Bioentry.transaction # rescue ActiveRecord::RecordInvalid => e # puts "Error: Transaction Aborted on class #{e.record.class}, table #{e.record.class.table_name} due to:" # e.record.errors.each{|att, msg| # puts "#{att} => #{msg}" # } # rescue Exception => exc # puts "Errore Bioentry: #{exc.message}" # end #Resce Bioentry # return result # end #load_gb # # def load_embl(entry, biodatabase) # # # puts biodatabase # result=nil # # begin # Bioentry.transaction do # bioentry=Bioentry.new(:biodatabase=>biodatabase, :name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division, \ # :description=>entry.definition, :version=>entry.version, :identifier=>entry.entry_id) # # puts bioentry # bioentry.save! # result=bioentry # # # end #Bioentry.transaction # # puts ["Bioentry", [:name=>entry.entry_id, :accession=>entry.entry_id, :division=>entry.division,\ # # :description=>entry.definition, :version=>entry.version, :identifier=>entry.entry_id]] # # #delete biodatabase.bioentries << bioentry # #note Alphabet not defined # begin # rank_comment=1 # #qui potrebbero essercene di pi?? # Comment.transaction do # if !entry.cc.empty? # bioentry.comment = Comment.new(:comment_text=>entry.cc, :rank=>rank_comment) # bioentry.comment.save! # rank_comment=rank_comment.next # end # end #Comment.transaction # rescue Exception => exc # puts "Error Comment: #{exc.message}" # end #Rescue Command # # puts "Comment" # # puts ["Comment", [:comment_text=>entry.cc]] if !entry.cc.empty? 
# begin # Biosequence.transaction do # bioentry.biosequence = Biosequence.new(:seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>entry.molecule_type) # bioentry.biosequence.save! # end #Bioseqence.transaction # rescue Exception => exc # puts "Error Biosequence: #{exc.message}" # end #Rescue Biosequence # #debug pp "Biosequence" # ##debug pp ["Biosequence", :seq=>entry.seq, :version=>0, :length=>entry.seq.length, :alphabet=>''] # begin # rank_seqfeature=1 # Seqfeature.transaction do # entry.features.each do |feature| # #note Rank default to ZERO, display_name String empty # #note Chek if exists term name # type_term = Term.exists?(:name=>feature.feature) ? Term.find_by_name(feature.feature) : Term.create!(:name=>feature.feature, :ontology_id=>1) # # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>rank_seqfeature, :display_name=>'') # seqfeature = Seqfeature.new(:bioentry=>bioentry, :source_term_id=>2, :type_term=>type_term, :rank=>rank_seqfeature, :display_name=>'') # seqfeature.save! # ##debug pp ["Seqfeature", [:source_term_id=>2, :typeterm=>Term.find_by_name(feature.feature), :rank=>0, :display_name=>'']] # begin # Location.transaction do # feature.locations.each do |loc| # location = Location.new(:seqfeature=>seqfeature, :start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand) # location.save! # ##debug pp ["Location",[:start_pos=>loc.from, :end_pos=>loc.to, :strand=>loc.strand]] # end #locations # end #Location.transaction # rescue Exception => exc # puts "Error Location: #{exc.message}" # end #Rescue Location # #debug pp "Locations" # #delete bioentry.seqfeatures << seqfeature # begin # rank_seqfeaturequalifiervalue=0 # rank_qual_qualifier="" # SeqfeatureQualifierValue.transaction do # feature.each do |qual| # #gestisce il livello dei qualificatori... 
# if (rank_qual_qualifier==qual.qualifier) then # rank_seqfeaturequalifiervalue=rank_seqfeaturequalifiervalue.next # else # rank_seqfeaturequalifiervalue=1 # rank_qual_qualifier=qual.qualifier # end # # ##debug pp ["SeqfeatureQualifierValue", qual.qualifier, [ :term=>Term.find_by_name(qual.qualifier), :value=>qual.value]] # term = Term.exists?(:name=>qual.qualifier) ? Term.find_by_name(qual.qualifier) : Term.create!(:name=>qual.qualifier, :ontology_id=>3) # # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>Term.find_by_name(qual.qualifier), :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # qual = SeqfeatureQualifierValue.new(:seqfeature=>seqfeature, :term=>term, :value=>qual.value, :rank=>rank_seqfeaturequalifiervalue) # # qual.save! # end #qualifiers # end #SeqfeatureQualifierValue.transaction # rescue Exception => exc # puts "Error SeqfeatureQualifierValue: #{exc.message}" # end #Rescue SeqfeatureQualifierValue # #debug pp "SeqfeatureQualifierValue" # rank_seqfeature=rank_seqfeature.next # end #features # end #Seqfeature.transaction # rescue Exception => exc # puts "Error Seqfeature: #{exc.message}" # end #Rescue Seqfeature # end #Bioentry.transaction # rescue ActiveRecord::RecordInvalid => e # puts "Error: Transaction Aborted on class #{e.record.class}, table #{e.record.class.table_name} due to:" # e.record.errors.each{|att, msg| # puts "#{att} => #{msg}" # } # rescue Exception => exc # puts "Errore Bioentry: #{exc.message}" # end #Resce Bioentry # # return result # end #load_embl def to_biosequence bio_seq = Bio::Sequence.new(seq) bio_seq.entry_id = entry_id bio_seq.primary_accession = primary_accession bio_seq.secondary_accessions = secondary_accessions bio_seq.molecule_type = molecule_type #TODO: identify where is stored data_class in biosql bio_seq.data_class = data_class bio_seq.definition = description bio_seq.topology = topology bio_seq.date_created = date_created bio_seq.date_modified = date_modified bio_seq.division = 
          division
        bio_seq.sequence_version = sequence_version
        bio_seq.keywords = keywords
        bio_seq.species = species
        bio_seq.classification = taxonomy
        bio_seq.references = references
        bio_seq.features = features
        return bio_seq
      end
    end #Sequence

    #gb=Bio::FlatFile.open(Bio::GenBank, "/Development/Projects/Cocco/Data/Riferimenti/Genomi/NC_003098_Cocco_R6.gb")
    #db=Biodatabase.find_by_name('fake')
    #gb.each_entry {|entry| Sequence.new(:entry=>entry, :biodatabase=>db)}
  end #SQL
end #Bio

#TODO create tests for the sequence object, round trip of information
if __FILE__ == $0
  require 'bio'
  require 'bio/io/sql'
  require 'pp'

  # connection = Bio::SQL.establish_connection('bio/io/biosql/config/database.yml','development')
  connection = Bio::SQL.establish_connection({'development'=>{'database'=>"biorails_development", 'adapter'=>"postgresql", 'username'=>"rails", 'password'=>nil}},'development')
  databases = Bio::SQL.list_databases

  # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.embl')
  # parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.gb')
  parser = Bio::FlatFile.auto('/home/febo/Desktop/aj224122.fasta')

  parser.each do |entry|
    biosequence = entry.to_biosequence
    result = Bio::SQL::Sequence.new(:biosequence=>biosequence,:biodatabase_id=>databases.first[:id]) unless Bio::SQL.exists_accession(biosequence.primary_accession)
    if result.nil?
      pp "The sequence is already present into the biosql database"
    else
      # pp "Sequence"
      puts result.to_biosequence.output(:genbank) #:embl
    end
  end

  #NOTE: I have fixed the features and the locations; the references and the comments are still missing. After that I think everything is in place.
  if false
    sqlseq = Bio::SQL.fetch_accession('AJ224122')
    #only output tests.
pp "Connection" pp connection pp "Seq in dbs" pp Bio::SQL.list_entries #; NC_003098 #pp sqlseq pp sqlseq.entry.inspect pp "sequence" #pp Bio::Sequence.auto(sqlseq.seq) pp "entry_id" pp sqlseq.entry_id pp "primary" pp sqlseq.accession pp "secondary_accessions" pp sqlseq.secondary_accessions pp "molecule type" pp sqlseq.molecule_type pp "data_class" pp sqlseq.data_class pp "division" pp sqlseq.division # NOTE : Topology is not represented in biosql? pp "topology" #TODO: CIRCULAR this at present maps to bioentry_qualifier_value, though there are plans to make it a column in table biosequence. pp sqlseq.topology pp "version" pp sqlseq.version #sequence.date_created = nil #???? pp "date modified" pp sqlseq.date_modified pp "definition" pp sqlseq.definition pp "keywords" pp sqlseq.keywords pp "species" pp sqlseq.organism #sequence.classification = self.taxonomy.to_s.sub(/\.\z/, '').split(/\s*\;\s*/)" pp "classification" pp sqlseq.taxonomy #sequence.organnella = nil # not used pp "comments" pp sqlseq.comments pp "references" pp sqlseq.references pp "features" pp sqlseq.features puts sqlseq.to_biosequence.output(:embl) end ## end From helios at dev.open-bio.org Tue Mar 25 15:47:05 2008 From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal) Date: Tue, 25 Mar 2008 15:47:05 -0000 Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql/config database.yml, NONE, 1.1.2.1 Message-ID: <200803251546.m2PFkYSk013330@dev.open-bio.org> Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql/config In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/io/biosql/config Added Files: Tag: BRANCH-biohackathon2008 database.yml Log Message: BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: Embl, Genbank; support sql stransactions creating new sequences on biosql; does not support references and comments for genbank and embl. Fasta->biosequence->biosql dosn't work. 
--- NEW FILE: database.yml ---
#This is the database configuration specific for BioSQL
#Users can configure their database here
development:
  adapter: postgresql
  database: biorails_development
  username: rails
  password:

test:
  adapter: postgresql
  database: biorails_test
  username: rails
  password:

production:
  adapter: postgresql
  database: biorails_production
  username: rails
  password:

From helios at dev.open-bio.org  Tue Mar 25 15:47:07 2008
From: helios at dev.open-bio.org (Raoul Jean Pierre Bonnal)
Date: Tue, 25 Mar 2008 15:47:07 -0000
Subject: [BioRuby-cvs] bioruby/lib/bio/io/biosql ontology.rb, NONE, 1.1.2.1
 reference.rb, NONE, 1.1.2.1 term_path.rb, NONE, 1.1.2.1
 bioentry_dbxref.rb, NONE, 1.1.2.1 biodatabase.rb, NONE, 1.1.2.1
 seqfeature.rb, NONE, 1.1.2.1 term_relationship.rb, NONE, 1.1.2.1
 location.rb, NONE, 1.1.2.1 seqfeature_path.rb, NONE, 1.1.2.1
 bioentry_relationship.rb, NONE, 1.1.2.1
 dbxref_qualifier_value.rb, NONE, 1.1.2.1 dbxref.rb, NONE, 1.1.2.1
 term_relationship_term.rb, NONE, 1.1.2.1
 bioentry_reference.rb, NONE, 1.1.2.1 taxon_name.rb, NONE, 1.1.2.1
 bioentry_path.rb, NONE, 1.1.2.1 biosequence.rb, NONE, 1.1.2.1
 term.rb, NONE, 1.1.2.1 term_dbxref.rb, NONE, 1.1.2.1
 seqfeature_qualifier_value.rb, NONE, 1.1.2.1
 bioentry_qualifier_value.rb, NONE, 1.1.2.1
 seqfeature_dbxref.rb, NONE, 1.1.2.1
 location_qualifier_value.rb, NONE, 1.1.2.1
 seqfeature_relationship.rb, NONE, 1.1.2.1 bioentry.rb, NONE, 1.1.2.1
 taxon.rb, NONE, 1.1.2.1 comment.rb, NONE, 1.1.2.1
 term_synonym.rb, NONE, 1.1.2.1
Message-ID: <200803251546.m2PFkY03013318@dev.open-bio.org>

Update of /home/repository/bioruby/bioruby/lib/bio/io/biosql
In directory dev.open-bio.org:/tmp/cvs-serv13290/lib/bio/io/biosql

Added Files:
      Tag: BRANCH-biohackathon2008
        ontology.rb reference.rb term_path.rb bioentry_dbxref.rb
        biodatabase.rb seqfeature.rb term_relationship.rb location.rb
        seqfeature_path.rb bioentry_relationship.rb
        dbxref_qualifier_value.rb dbxref.rb term_relationship_term.rb
        bioentry_reference.rb taxon_name.rb
        bioentry_path.rb biosequence.rb term.rb term_dbxref.rb
        seqfeature_qualifier_value.rb bioentry_qualifier_value.rb
        seqfeature_dbxref.rb location_qualifier_value.rb
        seqfeature_relationship.rb bioentry.rb taxon.rb comment.rb
        term_synonym.rb
Log Message:
BioSQL release "MIFI". biosql->biosequence, biosequence->biosql. Supported formats: Embl, Genbank; supports SQL transactions when creating new sequences on biosql; does not support references and comments for genbank and embl. Fasta->biosequence->biosql doesn't work.

--- NEW FILE: location.rb ---
module Bio
  class SQL
    class Location < DummyBase
      #set_sequence_name "location_pk_seq"
      belongs_to :seqfeature
      belongs_to :dbxref
      belongs_to :term
      has_many :location_qualifier_values

      def to_s
        if strand==-1
          str="complement("+start_pos.to_s+".."+end_pos.to_s+")"
        else
          str=start_pos.to_s+".."+end_pos.to_s
        end
        return str
      end
    end
  end #SQL
end #Bio

--- NEW FILE: bioentry_reference.rb ---
module Bio
  class SQL
    class BioentryReference < DummyBase
      set_primary_key :bioentry_reference_id
      belongs_to :bioentry
      belongs_to :reference
    end
  end #SQL
end #Bio

--- NEW FILE: bioentry_qualifier_value.rb ---
module Bio
  class SQL
    class BioentryQualifierValue < DummyBase
      #NOTE: added rank to primary_keys, now it's finished.
      set_primary_keys :bioentry_id, :term_id, :rank
      belongs_to :bioentry
      belongs_to :term
    end #BioentryQualifierValue
  end #SQL
end #Bio

--- NEW FILE: biosequence.rb ---
module Bio
  class SQL
    class Biosequence < DummyBase
      set_primary_key "bioentry_id"
      #delete set_sequence_name "biosequence_pk_seq"
      belongs_to :bioentry
    end
  end #SQL
end #Bio

--- NEW FILE: term.rb ---
module Bio
  class SQL
    class Term < DummyBase
      set_sequence_name "term_pk_seq"
      belongs_to :ontology
      has_many :seqfeature_qualifier_values, :class_name => "SeqfeatureQualifierValue"
      has_many :dbxref_qualifier_values, :class_name => "DbxrefQualifierValue"
      has_many :bioentry_qualifier_values, :class_name => "BioentryQualifierValue"
      has_many :bioentries, :through=>:bioentry_qualifier_values
      has_many :locations, :class_name => "Location"
      has_many :seqfeature_relationships, :class_name => "SeqfeatureRelationship"
      has_many :term_dbxrefs, :class_name => "TermDbxref"
      has_many :term_relationship_terms, :class_name => "TermRelationshipTerm"
      has_many :term_synonyms, :class_name => "TermSynonym"
      has_many :location_qualifier_values, :class_name => "LocationQualifierValue"
      has_many :seqfeature_types, :class_name => "Seqfeature", :foreign_key => "type_term_id"
      has_many :seqfeature_sources, :class_name => "Seqfeature", :foreign_key => "source_term_id"
      has_many :term_path_subjects, :class_name => "TermPath", :foreign_key => "subject_term_id"
      has_many :term_path_predicates, :class_name => "TermPath", :foreign_key => "predicate_term_id"
      has_many :term_path_objects, :class_name => "TermPath", :foreign_key => "object_term_id"
      has_many :term_relationship_subjects, :class_name => "TermRelationship", :foreign_key =>"subject_term_id"
      has_many :term_relationship_predicates, :class_name => "TermRelationship", :foreign_key =>"predicate_term_id"
      has_many :term_relationship_objects, :class_name => "TermRelationship", :foreign_key =>"object_term_id"
    end
  end #SQL
end #Bio

--- NEW FILE: bioentry_relationship.rb ---
module Bio
  class SQL
    class BioentryRelationship < DummyBase
      #delete set_primary_key "bioentry_relationship_id"
      set_sequence_name "bioentry_relationship_pk_seq"
      belongs_to :object_bioentry, :class_name => "Bioentry"
      belongs_to :subject_bioentry, :class_name => "Bioentry"
    end
  end #SQL
end #Bio

--- NEW FILE: dbxref.rb ---
module Bio
  class SQL
    class Dbxref < DummyBase
      #delete set_primary_key "dbxref_id"
      set_sequence_name "dbxref_pk_seq"
      has_many :dbxref_qualifier_values, :class_name => "DbxrefQualifierValue"
      has_many :locations, :class_name => "Location"
      has_many :references, :class_name=>"Reference"
      has_many :term_dbxrefs, :class_name => "TermDbxref"
      has_many :bioentry_dbxrefs, :class_name => "BioentryDbxref"
      #TODO: check whether there is a has_and_belongs_to_many relationship with bioentry, as specified in the schema overview.
    end
  end #SQL
end #Bio

--- NEW FILE: bioentry_path.rb ---
module Bio
  class SQL
    class BioentryPath < DummyBase
      set_primary_key nil
      #delete set_sequence_name nil
      belongs_to :term
      #to be fixed in order to proceed.
      belongs_to :object_bioentry, :class_name=>"Bioentry"
      belongs_to :subject_bioentry, :class_name=>"Bioentry"
    end #BioentryPath
  end #SQL
end #Bio

--- NEW FILE: term_dbxref.rb ---
module Bio
  class SQL
    class TermDbxref < DummyBase
      set_primary_key nil #term_id, dbxref_id
      #delete set_sequence_name nil
      belongs_to :term
      belongs_to :dbxref
    end
  end #SQL
end #Bio

--- NEW FILE: dbxref_qualifier_value.rb ---
module Bio
  class SQL
    class DbxrefQualifierValue < DummyBase
      #think to use composite primary key
      set_primary_key nil #dbxref_id, term_id, rank
      #delete set_sequence_name nil
      belongs_to :dbxref
      belongs_to :term
    end
  end #SQL
end #Bio

--- NEW FILE: seqfeature_dbxref.rb ---
module Bio
  class SQL
    class SeqfeatureDbxref < DummyBase
      set_primary_key nil #seqfeature_id, dbxref_id
      #delete set_sequence_name nil
      belongs_to :seqfeature
      belongs_to :dbxref
    end
  end #SQL
end #Bio

--- NEW FILE: term_relationship_term.rb ---
module Bio
  class SQL
    class TermRelationshipTerm < DummyBase
      #delete set_sequence_name nil
      set_primary_key :term_relationship_id
      belongs_to :term_relationship
      belongs_to :term
    end
  end #SQL
end #Bio

--- NEW FILE: location_qualifier_value.rb ---
module Bio
  class SQL
    class LocationQualifierValue < DummyBase
      set_primary_key nil #location_id, term_id
      #delete set_sequence_name nil
      belongs_to :location
      belongs_to :term
    end
  end #SQL
end #Bio

--- NEW FILE: taxon_name.rb ---
module Bio
  class SQL
    class TaxonName < DummyBase
      set_primary_keys :taxon_id, :name, :name_class
      belongs_to :taxon
    end
  end #SQL
end #Bio

--- NEW FILE: seqfeature_relationship.rb ---
module Bio
  class SQL
    class SeqfeatureRelationship "Seqfeature"
      belongs_to :subject_seqfeature, :class_name => "Seqfeature"
    end
  end #SQL
end #Bio

--- NEW FILE: term_path.rb ---
module Bio
  class SQL
    class TermPath < DummyBase
      set_sequence_name "term_path_pk_seq"
      belongs_to :ontology
      belongs_to :subject_term, :class_name => "Term"
      belongs_to :object_term, :class_name => "Term"
      belongs_to :predicate_term, :class_name => "Term"
    end
  end #SQL
end #Bio

--- NEW FILE: ontology.rb ---
module Bio
  class SQL
    class Ontology < DummyBase
      #delete set_primary_key "ontology_id"
      set_sequence_name "ontology_pk_seq"
      has_many :terms
      has_many :term_paths
      has_many :term_relationships
    end
  end #SQL
end #Bio

--- NEW FILE: term_synonym.rb ---
module Bio
  class SQL
    class TermSynonym < DummyBase
      #delete set_sequence_name nil
      set_primary_key nil
      belongs_to :term
    end
  end #SQL
end #Bio

--- NEW FILE: seqfeature_qualifier_value.rb ---
module Bio
  class SQL
    class SeqfeatureQualifierValue < DummyBase
      set_primary_keys :seqfeature_id, :term_id, :rank
      set_sequence_name nil
      belongs_to :seqfeature
      belongs_to :term
    end
  end #SQL
end #Bio

--- NEW FILE: bioentry.rb ---
module Bio
  class SQL
    class Bioentry < DummyBase
      # set_sequence_name "bioentry_pk_seq"
      belongs_to :biodatabase
      belongs_to :taxon
      has_one :biosequence
      has_many :comments, :class_name =>"Comment", :order =>'rank'
      has_many :seqfeatures, :order=>'rank'
      has_many :bioentry_references, :class_name=>"BioentryReference" #, :foreign_key => "bioentry_id"
      has_many :bioentry_dbxrefs
      has_many :object_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"object_bioentry_id" #I am not convinced by this; I don't think it works correctly
      has_many :subject_bioentry_relationships, :class_name=>"BioentryRelationship", :foreign_key=>"subject_bioentry_id" #I am not convinced by this; I don't think it works correctly
      has_many :cdsfeatures, :class_name=>"Seqfeature", :foreign_key =>"bioentry_id", :conditions=>["terms.name='CDS'"], :include=>"typeterm"
      has_many :terms, :through=>:bioentry_qualifier_values
      #NOTE: added order_by for multiple hit and manage ranks correctly
      has_many :bioentry_qualifier_values, :order=>"bioentry_id,term_id,rank"
      #required for creation:
      #name, accession, version
      # validates_uniqueness_of :accession, :scope=>[:biodatabase_id]
      # validates_uniqueness_of :name, :scope=>[:biodatabase_id]
      # validates_uniqueness_of :identifier, :scope=>[:biodatabase_id]
    end
  end #SQL
end #Bio

--- NEW FILE: reference.rb ---
module Bio
  class SQL
    class Reference < DummyBase
      belongs_to :dbxref
      has_many :bioentry_references, :class_name=>"BioentryReference"
    end
  end #SQL
end #Bio

--- NEW FILE: seqfeature.rb ---
module Bio
  class SQL
    class Seqfeature "Term", :foreign_key => "type_term_id"
      belongs_to :source_term, :class_name => "Term", :foreign_key =>"source_term_id"
      has_many :seqfeature_dbxrefs
      has_many :dbxrefs
      has_many :seqfeature_qualifier_values, :order=>'rank'
      has_many :locations, :order=>'rank'
      has_many :object_seqfeature_paths, :class_name => "SeqfeaturePath", :foreign_key => "object_seqfeature_id"
      has_many :subject_seqfeature_paths, :class_name => "SeqfeaturePath", :foreign_key => "subject_seqfeature_id"
      has_many :object_seqfeature_relationships, :class_name => "SeqfeatureRelationship", :foreign_key => "object_seqfeature_id"
      has_many :subject_seqfeature_relationships, :class_name => "SeqfeatureRelationship", :foreign_key => "subject_seqfeature_id"
    end
  end #SQL
end #Bio

--- NEW FILE: comment.rb ---
module Bio
  class SQL
    class Comment < DummyBase
      #delete set_primary_key "comment_id"
      set_sequence_name "comment_pk_seq"
      belongs_to :bioentry
    end
  end #SQL
end #Bio

--- NEW FILE: seqfeature_path.rb ---
module Bio
  class SQL
    class SeqfeaturePath < DummyBase
      set_primary_key nil
      set_sequence_name nil
      belongs_to :object_seqfeature, :class_name => "Seqfeature"
      belongs_to :subject_seqfeature, :class_name => "Seqfeature"
    end
  end #SQL
end #Bio

--- NEW FILE: bioentry_dbxref.rb ---
module Bio
  class SQL
    class BioentryDbxref < DummyBase
      #delete set_sequence_name nil
      set_primary_key nil #bioentry_id,dbxref_id
      belongs_to :bioentry
      belongs_to :dbxref
    end
  end #SQL
end #Bio

--- NEW FILE: term_relationship.rb ---
module Bio
  class SQL
    class TermRelationship < DummyBase
      set_sequence_name "term_relationship_pk_seq"
      belongs_to :ontology
      belongs_to :subject_term, :class_name => "Term"
      belongs_to :predicate_term, :class_name => "Term"
      belongs_to :object_term, :class_name => "Term"
      has_one :term_relationship_term
    end
  end #SQL
end #Bio

--- NEW FILE: taxon.rb ---
module Bio
  class SQL
    class Taxon < DummyBase
      set_sequence_name "taxon_pk_seq"
      has_many :taxon_names, :class_name => "TaxonName"
      has_one :taxon_scientific_name, :class_name => "TaxonName", :conditions=>"name_class = 'scientific name'"
      has_one :bioentry
    end
  end #SQL
end #Bio

--- NEW FILE: biodatabase.rb ---
module Bio
  class SQL
    class Biodatabase < DummyBase
      #delete set_primary_key "biodatabase_id"
      set_sequence_name "biodatabase_pk_seq"
      has_many :bioentries, :class_name =>"Bioentry", :foreign_key => "biodatabase_id"
      validates_uniqueness_of :name
    end
  end #SQL
end #Bio