[BioRuby] Calculation of Conserved residues.

Yonatan Gross gross_yonatan at yahoo.com
Sat Apr 7 23:31:39 UTC 2007


Hi all, first post to this group. I hope this question is legit.

I am trying to quantify the alignment of a few genes.
With MacVector I can get a "conserved identities" value that I can
use, but I have to run the ClustalW procedure manually.

I want to write a Ruby script to do the same.

Any suggestions?

I have the following:


#!/usr/bin/env ruby
require 'rubygems'
require 'bio'

#-----------------------------------------------------------------------
arabidopsis = <<END
>gi|9843639|emb|CAC03598.1| Rieske FeS protein [Arabidopsis thaliana]
MASSSLSPATQLGSSRSALMAMSSGLFVKPTKMNHQMVRKEKIGLRIACQASSIPADRVPDMEKRKTLNL
LLLGALSLPTGYMLVPYATFFVPPGTGGGGGGTPAKDALGNDVVAAEWLKTHGPGDRTLTQGLKGDPTYL
VVENDKTLATYGINAVCTHLGCVVPWNKAENKFLCPCHGSQYNAQGRVVRGPAPLSLALAHADIDEAGKV
LFVPWVETDFRTGDAPWWS
END


tobacco = <<END
>gi|19995|emb|CAA46808.1| Rieske FeS [Nicotiana tabacum]
MASSTLSPVTQLCSSKSGLSSVSQCLLVKPMKINSHGLGKDKRMKVKCMATSIPADDRVPDMEKRNLMNL
LLLGALSLPTAGMLVPYGTFFVPPGSGGGSGGTPAKDALGNDVIASEWLKTHPPGNRTLTQGLKGDPTYL
VVENDGTLATYGINAVCTHLGCVVPFNAAENKFICPCHGSQYNNQGRVVRGPAPLSLALAHADIDDGKVV
FVPWVETDFRTGEDPWWA
END

spinach = <<END
>gi|226151|prf||1412276A rieske FeS precursor protein [spinach]
MIISIFNQLHLTENSSLMASFTLSSATPSQLCSSKNGMFAPSLALAKAGRVNVLISKERIRGMKLTCQAT
SIPADNVPDMQKRETLNLLLLGALSLPTGYMLLPYASFFVPPGGGAGTGGTIAKDALGNDVIAAEWLKTH
APGDRTLTQGLKGDPTYLVVESDKTLATFGINAVCTHLGCVVPFNAAENKFICPCHGSQYNNQGRVVRGP
APLSLALAHCDVDDGKVVFVPWTETDFRTGEAPWWSA
END

rice = <<END
>gi|115472727|ref|NP_001059962.1| Os07g0556200 [Oryza sativa (japonica cultivar-group)]
MASTALSTASNPTQLCRSRASLGKPVKGLGFGRERVPRTATTITCQAASSIPADRVPDMGKRQLMNLLLL
GAISLPTVGMLVPYGAFFIPAGSGNAGGGQVAKDKLGNDVLAEEWLKTHGPNDRTLTQGLKGDPTYLVVE
ADKTLATYGINAVCTHLGCVVPWNAAENKFICPCHGSQYNNQGRVVRGPAPLSLALVHADVDDGKVLFVP
WVETDFRTGDNPWWA
END

potato = <<END
>gi|37222949|gb|AAQ90151.1| putative Rieske Fe-S protein precursor [Solanum tuberosum]
MASSTLSHVTPSQLCSSKSGVSSVSQALLVKPMKINGHGMGKDKRMKAKCMAASIPADDRVPDMEKRNLM
NLLLLGALALPTGGMLVPYATFFAPPGSGGGSSGTIAKDANGNDVVVTEWLKTHSPGTRTLTQGLKGDPT
YLVVENDGTLATYGINAVCTHLGCVVPWNTAENKFICPCHGSQYNNQGKVVRGPAPLSLALAHADIDDGK
VVFVPWVETDFRTGDSPWWA
END
# Parse each FASTA record first, so that the ">" header line is not
# treated as part of the amino acid sequence.
seqs = [arabidopsis, tobacco, spinach, rice, potato].map do |fasta|
  Bio::FastaFormat.new(fasta).aaseq
end

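# Run ClustalW on the sequences (the clustalw program must be on the PATH).
factory = Bio::ClustalW.new
report = factory.query_align(seqs)
# Print the match line of the resulting alignment.
puts report.alignment.match_line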

#-----------------------------------------------------------------------
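
What I am after is roughly the following sketch. It assumes that the match line uses the ClustalW convention, where "*" marks a column in which every sequence has the same residue, and that "conserved identities" means the share of such columns over the full alignment length (gap columns included). I do not know whether MacVector defines the value the same way. The helper name conserved_identities is my own and not part of BioRuby, and the example match line is made up.

```ruby
# Hypothetical helper (not part of BioRuby): summarise a ClustalW-style
# match line, assuming '*' marks a column where all sequences are identical.
def conserved_identities(match_line)
  total = match_line.length            # all alignment columns, gaps included
  identical = match_line.count('*')    # fully conserved columns
  percent = total.zero? ? 0.0 : 100.0 * identical / total
  { identical: identical, columns: total, percent: percent }
end

# Made-up match line for illustration, not real output for the sequences above.
stats = conserved_identities('**:. *  **')
printf("%d of %d columns identical (%.1f%%)\n",
       stats[:identical], stats[:columns], stats[:percent])
```

In the script above, the string passed in would be report.alignment.match_line. Dividing by the length of the shortest sequence instead of the alignment length would be another reasonable choice.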

Thanks in advance,
Yonatan Gross


