[BioRuby-cvs] bioruby/doc Tutorial.rd,1.18,1.19

Mon Feb 11 07:08:56 UTC 2008

Update of /home/repository/bioruby/bioruby/doc
In directory dev.open-bio.org:/tmp/cvs-serv7263/doc

Modified Files:
	Tutorial.rd 
Log Message:
Expanding on the Tutorial

Index: Tutorial.rd
===================================================================
RCS file: /home/repository/bioruby/bioruby/doc/Tutorial.rd,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -d -r1.18 -r1.19
*** Tutorial.rd	5 Feb 2008 12:01:16 -0000	1.18
--- Tutorial.rd	11 Feb 2008 07:08:46 -0000	1.19
***************
*** 1,5 ****
  # This document is generated with a version of rd2html (part of Hiki)
  #
! # A possible test run could be from rdtool:
  #
  #   ruby -I lib ./bin/rd2 ~/cvs/opensource/bioruby/doc/Tutorial.rd
--- 1,5 ----
  # This document is generated with a version of rd2html (part of Hiki)
  #
! # A possible test run could be from rdtool (on Debian, the rdtool package):
  #
  #   ruby -I lib ./bin/rd2 ~/cvs/opensource/bioruby/doc/Tutorial.rd
***************
*** 10,14 ****
  ss=bioruby.css ~/cvs/opensource/bioruby/doc/Tutorial.rd > ~/bioruby.html
  #
! # A common problem is tabs in the text file!

  =begin
--- 10,23 ----
  ss=bioruby.css ~/cvs/opensource/bioruby/doc/Tutorial.rd > ~/bioruby.html
  #
! # in Debian:
! #
! #   rd2 -r rd/rd2html-lib  --with-css="/home/wrk/izip/cvs/opensource/bioruby/lib/bio/shell/rails/vendor/plugins/bioruby/generators/bioruby/templates/bioruby.css" Tutorial.rd > index.html
! #
! # A common problem is tabs in the text file! TABs are not allowed.
! #
! # To add tests run Toshiaki's bioruby shell and paste in the query plus
! # results.
! #
! # To run the embedded Ruby doctests you can get the doctest.rb from Pjotr.

  =begin
***************
*** 36,41 ****
  ((<here|URL:http://www.rubycentral.com/pickaxe/>)).

! For BioRuby you need to install
! Ruby and the BioRuby package on your computer.

  You can check whether Ruby is installed on your computer and what
--- 45,49 ----
  ((<here|URL:http://www.rubycentral.com/pickaxe/>)).

! For BioRuby you need to install Ruby and the BioRuby package on your computer.

  You can check whether Ruby is installed on your computer and what
***************
*** 80,83 ****
--- 88,95 ----
    ==> "ttttgcatgcat"

+ See the BioRuby shell section below for more tweaking. If you have trouble running
+ examples, also check the troubleshooting section below. You can also post a
+ question to the mailing list. BioRuby developers usually try to help.
+ 
  == Working with nucleic / amino acid sequences (Bio::Sequence class)

***************
*** 171,181 ****
  through a variable named +s+.

! * Shows average percentage of GC content for 20 bases (stepping the default one base at a time)

    bioruby> seq = Bio::Sequence::NA.new("atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa")
    ==> "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"

!   bioruby>  seq.window_search(20) { |s| print s.gc_percent,',' } 
!   30,35,40,40,35,35,35,30,25,30,30,30,35,35,35,35,35,40,45,45,45,45,40,35,40,40,40,40,40,35,35,35,30,30,30,  ==> ""

  Since the class of each subsequence is the same as original sequence
--- 183,195 ----
  through a variable named +s+.

! * Show average percentage of GC content for 20 bases (stepping the default one base at a time)

    bioruby> seq = Bio::Sequence::NA.new("atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa")
    ==> "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"

!   bioruby> a=[]; seq.window_search(20) { |s| a.push s.gc_percent } 
!   bioruby> a
!   ==> [30, 35, 40, 40, 35, 35, 35, 30, 25, 30, 30, 30, 35, 35, 35, 35, 35, 40, 45, 45, 45, 45, 40, 35, 40, 40, 40, 40, 40, 35, 35, 35, 30, 30, 30]
! 

  Since the class of each subsequence is the same as original sequence
***************
*** 185,191 ****
  * Shows translation results for 15 bases shifting a codon at a time

!     seq.window_search(15, 3) do |s|
!       puts s.translate
!     end

  Finally, the window_search method returns the last leftover
--- 199,209 ----
  * Shows translation results for 15 bases shifting a codon at a time

!   bioruby> a = []
!   bioruby> seq.window_search(15, 3) do |s|
!   bioruby>   a.push s.translate
!   bioruby> end
!   bioruby> a
!   ==> ["MHAIK", "HAIKL", "AIKLI", "IKLIP", "KLIPI", "LIPIR", "IPIRS", "PIRSS", "IRSSR", "RSSRS", "SSRSS", "SRSSK", "RSSKK", "SSKKK"]
! 

  Finally, the window_search method returns the last leftover
***************
*** 193,206 ****

  * Divide a genome sequence into sections of 10000bp and
!   output FASTA formatted sequences. The 1000bp at the start and end of
!   each subsequence overlapped. At the 3' end of the sequence the
!   leftover subsequence shorter than 10000bp is also added

      i = 1
      remainder = seq.window_search(10000, 9000) do |s|
!       puts s.to_fasta("segment #{i}", 60)
        i += 1
      end
!     puts remainder.to_fasta("segment #{i}", 60)

  If you don't want the overlapping window, set window size and stepping
--- 211,227 ----

  * Divide a genome sequence into sections of 10000bp and
!   output FASTA formatted sequences (line width 60 chars). Consecutive
!   subsequences overlap by 1000bp. At the 3' end of the sequence the
!   leftover is also added:

      i = 1
+     textwidth=60
      remainder = seq.window_search(10000, 9000) do |s|
!       puts s.to_fasta("segment #{i}", textwidth)
        i += 1
      end
!     if remainder
!       puts remainder.to_fasta("segment #{i}", textwidth) 
!     end

  If you don't want the overlapping window, set window size and stepping
***************
*** 211,224 ****
  * Count the codon usage

!     codon_usage = Hash.new(0)
!     seq.window_search(3, 3) do |s|
!       codon_usage[s] += 1
!     end

  * Calculate molecular weight for each 10-aa peptide (or 10-nt nucleic acid)

!     seq.window_search(10, 10) do |s|
!       puts s.molecular_weight
!     end

  In most cases, sequences are read from files or retrieved from databases.
--- 232,251 ----
  * Count the codon usage

!   bioruby> codon_usage = Hash.new(0)
!   bioruby> seq.window_search(3, 3) do |s|
!   bioruby>   codon_usage[s] += 1
!   bioruby> end
!   bioruby> codon_usage
!   ==> {"cat"=>1, "aaa"=>3, "cca"=>1, "att"=>2, "aga"=>1, "atc"=>1, "cta"=>1, "gca"=>1, "cga"=>1, "tca"=>3, "aag"=>1, "tcc"=>1, "atg"=>1}
! 

  * Calculate molecular weight for each 10-aa peptide (or 10-nt nucleic acid)

!   bioruby> a = []
!   bioruby> seq.window_search(10, 10) do |s|
!   bioruby>   a.push s.molecular_weight
!   bioruby> end
!   bioruby> a
!   ==> [3096.2062, 3086.1962, 3056.1762, 3023.1262, 3073.2262]

  In most cases, sequences are read from files or retrieved from databases.
***************
*** 246,249 ****
--- 273,280 ----
      % ruby na2aa.rb my_naseq.txt

+ or use a pipe!
+ 
+     % cat my_naseq.txt|ruby na2aa.rb
+ 
  Outputs

***************
*** 254,259 ****
      % ruby -r bio -e 'p Bio::Sequence::NA.new($<.read).translate' my_naseq.txt

! In the next section we will retrieve data from databases instead of
! using raw sequence files.

  == Parsing GenBank data (Bio::GenBank class)
--- 285,291 ----
      % ruby -r bio -e 'p Bio::Sequence::NA.new($<.read).translate' my_naseq.txt

! In the next section we will retrieve data from databases instead of using raw
! sequence files. One generic example of the above can be found in
! ./sample/na2aa.rb.

  == Parsing GenBank data (Bio::GenBank class)
***************
*** 460,474 ****
  Array and BioPerl's Bio::SimpleAlign.  A very simple example is:

!   require 'bio'
! 
!   seqs = [ 'atgca', 'aagca', 'acgca', 'acgcg' ]
!   seqs = seqs.collect{ |x| Bio::Sequence::NA.new(x) }
! 
    # creates alignment object
!   a = Bio::Alignment.new(seqs)
! 
!   # shows consensus sequence
!   p a.consensus             # ==> "a?gc?"
! 
    # shows IUPAC consensus
    p a.consensus_iupac       # ==> "ahgcr"
--- 492,501 ----
  Array and BioPerl's Bio::SimpleAlign.  A very simple example is:

!   bioruby> seqs = [ 'atgca', 'aagca', 'acgca', 'acgcg' ]
!   bioruby> seqs = seqs.collect{ |x| Bio::Sequence::NA.new(x) }
    # creates alignment object
!   bioruby> a = Bio::Alignment.new(seqs)
!   bioruby> a.consensus
!   ==> "a?gc?"
    # shows IUPAC consensus
    p a.consensus_iupac       # ==> "ahgcr"
***************
*** 1168,1179 ****
  == The BioRuby example programs

! Some sample programs are stored in samples/ directry.
! Some programs are obsolete. Since samples are not enough,
! practical and interesting samples are welcome.
! 
! to be written...

! (EDITOR's NOTE: I would like some examples automatically
! included - with output)

  == Unit testing and doctests
--- 1195,1201 ----
  == The BioRuby example programs

! Some sample programs are stored in the ./sample/ directory. Run for example:

!   ./sample/na2aa.rb test/data/fasta/example1.txt 

  == Unit testing and doctests
***************
*** 1195,1198 ****
--- 1217,1242 ----
  ((<URL:http://bioruby.org/rdoc/>)).

+ == BioRuby Shell
+ 
+ The BioRuby shell implementation can be found in ./lib/bio/shell. It is very interesting
+ as it uses IRB (the interactive Ruby interpreter), which is a powerful environment described in
+ ((<Programming Ruby's irb chapter|URL:http://ruby-doc.org/docs/ProgrammingRuby/html/irb.html>)). IRB commands can be typed directly in the shell, e.g.
+ 
+   bioruby!> IRB.conf[:PROMPT_MODE]
+   ==!> :PROMPT_C
+ 
+ Optionally, you may also want to install Ruby readline support
+ (on Debian, the libreadline-ruby package). To edit a previous line you may have to press
+ line down (arrow down) first.
+ 
+ = Helpful tools
+ 
+ Apart from rdoc you may also want to use rtags - which allows jumping around
+ source code by clicking on class and method names. 
+ 
+   cd bioruby/lib
+   rtags -R --vi
+ 
+ For a tutorial see ((<URL:http://rtags.rubyforge.org/>))

  = APPENDIX
***************
*** 1227,1230 ****
--- 1271,1283 ----
  carefully that come with each package.

+ == Troubleshooting
+ 
+ * Error: in `require': no such file to load -- bio (LoadError)
+ 
+ Ruby fails to find the BioRuby libraries. Add their location to the RUBYLIB path, or pass
+ it to the interpreter. For example:
+ 
+   ruby -I~/cvs/bioruby/lib yourprogram.rb
+ 
  == Modifying this page
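
Note on the window_search examples in the diff above: the sliding-window
behaviour they rely on can be tried without BioRuby installed. The sketch
below is plain Ruby and only an illustration. The helper names each_window
and gc_percent are hypothetical stand-ins written for this note, not
BioRuby's own code, and the integer GC percentage is an assumption chosen to
match the output shown in the tutorial.

```ruby
# Illustrative stand-in for a sliding-window iterator (hypothetical helper,
# NOT BioRuby's implementation). Yields each window of +size+ characters,
# advancing +step+ characters at a time, and returns whatever is left over
# after the last full window.
def each_window(str, size, step = 1)
  next_start = 0
  0.step(str.length - size, step) do |i|
    yield str[i, size]
    next_start = i + size
  end
  str[next_start..-1]
end

# Integer percentage of g/c characters in a lowercase string (assumption:
# integer division, as the tutorial output suggests).
def gc_percent(s)
  (s.count("gc") * 100) / s.length
end

seq = "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"

# GC content over 20-base windows, one base at a time
gc = []
each_window(seq, 20) { |s| gc.push gc_percent(s) }

# Codon usage: window of 3, step of 3
codon_usage = Hash.new(0)
leftover = each_window(seq, 3, 3) { |s| codon_usage[s] += 1 }

puts gc.first(3).inspect   # [30, 35, 40]
puts codon_usage["aaa"]    # 3
puts leftover.inspect      # "" (54 bases divide evenly into codons)
```

Because the helper returns the unconsumed tail, a call such as
each_window("abcdefgh", 3, 3) yields "abc" and "def" and returns "gh",
which mirrors how the FASTA example prints a final, shorter segment.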