[BioRuby-cvs] bioruby/doc Tutorial.rd,1.18,1.19
Pjotr Prins
pjotr at dev.open-bio.org
Mon Feb 11 07:08:56 UTC 2008
Update of /home/repository/bioruby/bioruby/doc
In directory dev.open-bio.org:/tmp/cvs-serv7263/doc
Modified Files:
Tutorial.rd
Log Message:
Expanding on the Tutorial
Index: Tutorial.rd
===================================================================
RCS file: /home/repository/bioruby/bioruby/doc/Tutorial.rd,v
retrieving revision 1.18
retrieving revision 1.19
diff -C2 -d -r1.18 -r1.19
*** Tutorial.rd 5 Feb 2008 12:01:16 -0000 1.18
--- Tutorial.rd 11 Feb 2008 07:08:46 -0000 1.19
***************
*** 1,5 ****
# This document is generated with a version of rd2html (part of Hiki)
#
! # A possible test run could be from rdtool:
#
# ruby -I lib ./bin/rd2 ~/cvs/opensource/bioruby/doc/Tutorial.rd
--- 1,5 ----
# This document is generated with a version of rd2html (part of Hiki)
#
! # A possible test run could be from rdtool (on Debian, the rdtool package)
#
# ruby -I lib ./bin/rd2 ~/cvs/opensource/bioruby/doc/Tutorial.rd
***************
*** 10,14 ****
ss=bioruby.css ~/cvs/opensource/bioruby/doc/Tutorial.rd > ~/bioruby.html
#
! # A common problem is tabs in the text file!
=begin
--- 10,23 ----
ss=bioruby.css ~/cvs/opensource/bioruby/doc/Tutorial.rd > ~/bioruby.html
#
! # in Debian:
! #
! # rd2 -r rd/rd2html-lib --with-css="/home/wrk/izip/cvs/opensource/bioruby/lib/bio/shell/rails/vendor/plugins/bioruby/generators/bioruby/templates/bioruby.css" Tutorial.rd > index.html
! #
! # A common problem is tabs in the text file! TABs are not allowed.
! #
! # To add tests run Toshiaki's bioruby shell and paste in the query plus
! # results.
! #
! # To run the embedded Ruby doctests you can get the doctest.rb from Pjotr.
=begin
***************
*** 36,41 ****
((<here|URL:http://www.rubycentral.com/pickaxe/>)).
! For BioRuby you need to install
! Ruby and the BioRuby package on your computer.
You can check whether Ruby is installed on your computer and what
--- 45,49 ----
((<here|URL:http://www.rubycentral.com/pickaxe/>)).
! For BioRuby you need to install Ruby and the BioRuby package on your computer.
You can check whether Ruby is installed on your computer and what
***************
*** 80,83 ****
--- 88,95 ----
==> "ttttgcatgcat"
+ See the BioRuby shell section below for more tweaking. If you have trouble running
+ examples, also check the section below on troubleshooting. You can also post a
+ question to the mailing list. BioRuby developers usually try to help.
+
== Working with nucleic / amino acid sequences (Bio::Sequence class)
***************
*** 171,181 ****
through a variable named +s+.
! * Shows average percentage of GC content for 20 bases (stepping the default one base at a time)
bioruby> seq = Bio::Sequence::NA.new("atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa")
==> "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"
! bioruby> seq.window_search(20) { |s| print s.gc_percent,',' }
! 30,35,40,40,35,35,35,30,25,30,30,30,35,35,35,35,35,40,45,45,45,45,40,35,40,40,40,40,40,35,35,35,30,30,30, ==> ""
Since the class of each subsequence is the same as original sequence
--- 183,195 ----
through a variable named +s+.
! * Show average percentage of GC content for 20 bases (stepping the default one base at a time)
bioruby> seq = Bio::Sequence::NA.new("atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa")
==> "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"
! bioruby> a=[]; seq.window_search(20) { |s| a.push s.gc_percent }
! bioruby> a
! ==> [30, 35, 40, 40, 35, 35, 35, 30, 25, 30, 30, 30, 35, 35, 35, 35, 35, 40, 45, 45, 45, 45, 40, 35, 40, 40, 40, 40, 40, 35, 35, 35, 30, 30, 30]
!
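The sliding-window idea can also be sketched in plain Ruby, without BioRuby loaded. The helpers each_window and gc_percent below are hypothetical names written for this illustration; they are not BioRuby's implementation, only an approximation of the behaviour the tutorial describes (yield each window, return whatever is left over).

```ruby
# Plain-Ruby sketch of a sliding-window scan (assumption: NOT BioRuby code).
# Yields each full window of `size` characters, advancing by `step`,
# and returns the part of the sequence after the last full window.
def each_window(seq, size, step = 1)
  last_start = nil
  0.step(seq.length - size, step) do |i|
    yield seq[i, size]
    last_start = i
  end
  # No full window fitted: the whole sequence is the leftover.
  last_start ? seq[(last_start + size)..-1] : seq
end

# Integer GC percentage of a nucleotide string (hypothetical helper).
def gc_percent(seq)
  gc = seq.count("gcGC")
  total = seq.count("atgcuATGCU")
  total.zero? ? 0 : 100 * gc / total
end

seq = "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"
percents = []
leftover = each_window(seq, 20) { |s| percents.push(gc_percent(s)) }
puts percents.join(",")
```

In this sketch the leftover is an empty string when the windows cover the sequence exactly, so a caller would test it with leftover.empty? rather than for nil.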
Since the class of each subsequence is the same as original sequence
***************
*** 185,191 ****
* Shows translation results for 15 bases shifting a codon at a time
! seq.window_search(15, 3) do |s|
! puts s.translate
! end
Finally, the window_search method returns the last leftover
--- 199,209 ----
* Shows translation results for 15 bases shifting a codon at a time
! bioruby> a = []
! bioruby> seq.window_search(15, 3) do |s|
! bioruby> a.push s.translate
! bioruby> end
! bioruby> a
! ==> ["MHAIK", "HAIKL", "AIKLI", "IKLIP", "KLIPI", "LIPIR", "IPIRS", "PIRSS", "IRSSR", "RSSRS", "SSRSS", "SRSSK", "RSSKK", "SSKKK"]
!
Finally, the window_search method returns the last leftover
***************
*** 193,206 ****
* Divide a genome sequence into sections of 10000bp and
! output FASTA formatted sequences. The 1000bp at the start and end of
! each subsequence overlapped. At the 3' end of the sequence the
! leftover subsequence shorter than 10000bp is also added
i = 1
remainder = seq.window_search(10000, 9000) do |s|
! puts s.to_fasta("segment #{i}", 60)
i += 1
end
! puts remainder.to_fasta("segment #{i}", 60)
If you don't want the overlapping window, set window size and stepping
--- 211,227 ----
* Divide a genome sequence into sections of 10000bp and
! output FASTA formatted sequences (line width 60 chars). The 1000bp at the
! start and end of each subsequence overlap. At the 3' end of the sequence
! the leftover is also added:
i = 1
+ textwidth=60
remainder = seq.window_search(10000, 9000) do |s|
! puts s.to_fasta("segment #{i}", textwidth)
i += 1
end
! if remainder
! puts remainder.to_fasta("segment #{i}", textwidth)
! end
If you don't want the overlapping window, set window size and stepping
***************
*** 211,224 ****
* Count the codon usage
! codon_usage = Hash.new(0)
! seq.window_search(3, 3) do |s|
! codon_usage[s] += 1
! end
* Calculate molecular weight for each 10-aa peptide (or 10-nt nucleic acid)
! seq.window_search(10, 10) do |s|
! puts s.molecular_weight
! end
In most cases, sequences are read from files or retrieved from databases.
--- 232,251 ----
* Count the codon usage
! bioruby> codon_usage = Hash.new(0)
! bioruby> seq.window_search(3, 3) do |s|
! bioruby> codon_usage[s] += 1
! bioruby> end
! bioruby> codon_usage
! ==> {"cat"=>1, "aaa"=>3, "cca"=>1, "att"=>2, "aga"=>1, "atc"=>1, "cta"=>1, "gca"=>1, "cga"=>1, "tca"=>3, "aag"=>1, "tcc"=>1, "atg"=>1}
!
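For comparison, the same non-overlapping codon count can be sketched with Ruby's String#scan alone. This is an illustration that assumes the sequence is a plain String read in frame from the first base; it does not use Bio::Sequence::NA or window_search.

```ruby
# Plain-Ruby sketch: count codons by scanning three characters at a time.
# Assumption: the sequence is an ordinary String, in frame from position 0.
seq = "atgcatgcaattaagctaatcccaattagatcatcccgatcatcaaaaaaaaaa"

codon_usage = Hash.new(0)          # unseen codons start at zero
seq.scan(/.{3}/) { |codon| codon_usage[codon] += 1 }

# Print the codons sorted by name for a stable listing.
codon_usage.sort.each { |codon, n| puts "#{codon}\t#{n}" }
```

Any trailing one or two bases that do not form a full codon are silently ignored by the regular expression.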
* Calculate molecular weight for each 10-aa peptide (or 10-nt nucleic acid)
! bioruby> a = []
! bioruby> seq.window_search(10, 10) do |s|
! bioruby> a.push s.molecular_weight
! bioruby> end
! bioruby> a
! ==> [3096.2062, 3086.1962, 3056.1762, 3023.1262, 3073.2262]
In most cases, sequences are read from files or retrieved from databases.
***************
*** 246,249 ****
--- 273,280 ----
% ruby na2aa.rb my_naseq.txt
+ or use a pipe!
+
+ % cat my_naseq.txt|ruby na2aa.rb
+
Outputs
***************
*** 254,259 ****
% ruby -r bio -e 'p Bio::Sequence::NA.new($<.read).translate' my_naseq.txt
! In the next section we will retrieve data from databases instead of
! using raw sequence files.
== Parsing GenBank data (Bio::GenBank class)
--- 285,291 ----
% ruby -r bio -e 'p Bio::Sequence::NA.new($<.read).translate' my_naseq.txt
! In the next section we will retrieve data from databases instead of using raw
! sequence files. One generic example of the above can be found in
! ./sample/na2aa.rb.
== Parsing GenBank data (Bio::GenBank class)
***************
*** 460,474 ****
Array and BioPerl's Bio::SimpleAlign. A very simple example is:
! require 'bio'
!
! seqs = [ 'atgca', 'aagca', 'acgca', 'acgcg' ]
! seqs = seqs.collect{ |x| Bio::Sequence::NA.new(x) }
!
# creates alignment object
! a = Bio::Alignment.new(seqs)
!
! # shows consensus sequence
! p a.consensus # ==> "a?gc?"
!
# shows IUPAC consensus
p a.consensus_iupac # ==> "ahgcr"
--- 492,501 ----
Array and BioPerl's Bio::SimpleAlign. A very simple example is:
! bioruby> seqs = [ 'atgca', 'aagca', 'acgca', 'acgcg' ]
! bioruby> seqs = seqs.collect{ |x| Bio::Sequence::NA.new(x) }
# creates alignment object
! bioruby> a = Bio::Alignment.new(seqs)
! bioruby> a.consensus
==> "a?gc?"
# shows IUPAC consensus
p a.consensus_iupac # ==> "ahgcr"
***************
*** 1168,1179 ****
== The BioRuby example programs
! Some sample programs are stored in samples/ directry.
! Some programs are obsolete. Since samples are not enough,
! practical and interesting samples are welcome.
!
! to be written...
! (EDITOR's NOTE: I would like some examples automatically
! included - with output)
== Unit testing and doctests
--- 1195,1201 ----
== The BioRuby example programs
! Some sample programs are stored in the ./sample/ directory. Run for example:
! ./sample/na2aa.rb test/data/fasta/example1.txt
== Unit testing and doctests
***************
*** 1195,1198 ****
--- 1217,1242 ----
((<URL:http://bioruby.org/rdoc/>)).
+ == BioRuby Shell
+
+ The BioRuby shell implementation can be found in ./lib/bio/shell. It is very interesting
+ as it uses IRB (the interactive Ruby interpreter), which is a powerful environment described in
+ ((<Programming Ruby's irb chapter|URL:http://ruby-doc.org/docs/ProgrammingRuby/html/irb.html>)). IRB commands can be typed directly in the shell, e.g.
+
+ bioruby!> IRB.conf[:PROMPT_MODE]
+ ==!> :PROMPT_C
+
+ Optionally, you may also want to install Ruby readline support
+ (on Debian, the libreadline-ruby package). To edit a previous line you may have to press
+ the down arrow first.
+
+ = Helpful tools
+
+ Apart from rdoc you may also want to use rtags, which allows jumping around
+ source code by clicking on class and method names.
+
+ cd bioruby/lib
+ rtags -R --vi
+
+ For a tutorial see ((<URL:http://rtags.rubyforge.org/>))
= APPENDIX
***************
*** 1227,1230 ****
--- 1271,1283 ----
carefully that come with each package.
+ == Troubleshooting
+
+ * Error: in `require': no such file to load -- bio (LoadError)
+
+ Ruby fails to find the BioRuby libraries. Add their directory to the RUBYLIB path, or pass
+ it to the interpreter. For example:
+
+ ruby -I~/cvs/bioruby/lib yourprogram.rb
+
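The same fix can be applied from inside a script by extending the load path before the require. The sketch below assumes the checkout location from the example above (~/cvs/bioruby/lib); adjust it to wherever BioRuby's lib directory actually lives. The bio_loaded flag is a hypothetical name used only in this illustration.

```ruby
# Sketch: make a local BioRuby checkout visible to `require`.
# Assumption: the library lives in ~/cvs/bioruby/lib (path from the example above).
bioruby_lib = File.expand_path("~/cvs/bioruby/lib")
$LOAD_PATH.unshift(bioruby_lib) unless $LOAD_PATH.include?(bioruby_lib)

begin
  require "bio"
  bio_loaded = true
rescue LoadError => e
  # Same failure as the error above, but reported instead of aborting.
  bio_loaded = false
  warn "BioRuby not found (#{e.message}); check RUBYLIB or the -I option"
end

puts "bio loaded: #{bio_loaded}"
```

Setting RUBYLIB or using -I keeps the path out of the source file, which is usually preferable for scripts that are shared with others.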
== Modifying this page