[BioRuby] Parsing MSF alignment file

Fredrik Johansson fredjoha at bioreg.kyushu-u.ac.jp
Mon Apr 13 04:16:00 UTC 2009


I tried to parse an alignment file in MSF format with Bio::GCG::Msf. 
It turned out that the file uses dots (.) as the gap character, which 
Bio::GCG::Msf cannot handle. So, for what it's worth, I made these 
changes to bio/appl/gcg/msf.rb:

$ diff msf.rb.old msf.rb.new

33,35c33,36
<         if /^\!\![A-Z]+\_MULTIPLE\_ALIGNMNENT/ =~ str[/.*/] then
<           @heading = str[/.*/] # '!!NA_MULTIPLE_ALIGNMENT 1.0' or like this
<           str.sub!(/.*/, '')
---
>         preamble, @data = str.split(/^\/\/$/)
>         if /^\!\![A-Z]+\_MULTIPLE\_ALIGNMNENT/ =~ preamble[/.*/] then
>           @heading = preamble[/.*/] # '!!NA_MULTIPLE_ALIGNMENT 1.0' or like this
>           preamble.sub!(/.*/, '')
37c38
<         str.sub!(/.*\.\.$/m, '')
---
>         preamble.sub!(/.*\.\.$/m, '')
48,49d48
<         str.sub!(/.*\/\/$/m, '')
<         a = $&.to_s.split(/^/)
51c50
<         a.each do |x|
---
>         preamble.split(/^/).each do |x|
59d57
<         @data = str
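
To illustrate why splitting first helps, here is a minimal standalone 
sketch (not BioRuby code). As far as the diff shows, the old pattern 
/.*\.\.$/m is greedy and, with the m flag, spans newlines, so it removes 
everything up to the last line that ends in "..". When the alignment 
uses dots as gaps, that can swallow sequence data. Splitting at the "//" 
divider first confines the pattern to the header. The sample input, the 
constant SAMPLE_MSF and both helper names are made up for illustration, 
and the limit of 2 passed to split is an addition that is not in the 
patch.

```ruby
# Hypothetical miniature MSF-like input, invented for this sketch.
# The last alignment line ends in dots used as gap characters.
SAMPLE_MSF = <<~MSF
  !!AA_MULTIPLE_ALIGNMENT 1.0

   example.msf  MSF: 10  Type: P  Check: 0  ..

   Name: seq1  Len: 10  Check: 0  Weight: 1.00
   Name: seq2  Len: 10  Check: 0  Weight: 1.00

  //

  seq1  MKV..LAAG.
  seq2  MKVTTLA...
MSF

# Old behaviour (simplified): strip the header from the whole string.
# The greedy, newline-spanning match runs to the LAST line ending in "..".
def strip_header_greedy(str)
  str.sub(/.*\.\.$/m, '')
end

# Patched behaviour (simplified): split at the "//" divider first, then
# strip the header from the preamble only. Returns [name_lines, data].
def strip_header_split(str)
  preamble, data = str.split(/^\/\/$/, 2)
  [preamble.sub(/.*\.\.$/m, ''), data]
end

# The greedy version loses the sequences; the split version keeps them.
p strip_header_greedy(SAMPLE_MSF).include?('seq1')
names, data = strip_header_split(SAMPLE_MSF)
puts names
puts data
```

With this input, the greedy version drops both sequence lines, while 
the split version leaves the Name: lines in the first element and the 
untouched alignment block in the second.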


Best regards,
Fredrik Johansson

-- 
***********************************
Fredrik Johansson, grad. student

Division of Bioinformatics
Medical Institute of Bioregulation
Kyushu University
3-1-1 Maidashi, Higashi-ku
Fukuoka 812-8582, Japan

fredjoha at bioreg.kyushu-u.ac.jp
***********************************



