[Bioperl-l] Count or weight matrix in bioperl?
Sam Al-Droubi
saldroubi at yahoo.com
Fri Feb 17 17:49:40 UTC 2006
Torsten and all,
I don't think this will work for me, because it only generates statistics for a single sequence. What I need is a count matrix computed over a number of DNA sequences, position by position. In other words, if I pass 3 sequences to this function, it should return the count of each nucleotide at each position.
For example, if I pass an array of sequences, say ATC, CCC, TTT,
then I should get back a matrix with the counts of A, C, T and G at positions 1, 2 and 3, like this:
  1 2 3
A 1 0 0
C 1 1 2
T 1 2 1
G 0 0 0
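To make the request concrete, here is a minimal sketch in plain Perl of the count matrix described above. It assumes all sequences have the same length and counts only A, C, G and T; the name count_matrix is a hypothetical helper for illustration, not a BioPerl function.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): count A, C, G and T at each
# position across a list of equal-length DNA sequences.
sub count_matrix {
    my @seqs = @_;
    return {} unless @seqs;
    my $len = length $seqs[0];

    # One row per nucleotide, one zero-initialised column per position.
    my %matrix = map { $_ => [ (0) x $len ] } qw(A C G T);

    for my $seq (@seqs) {
        die "Sequences must all have length $len\n"
            unless length($seq) == $len;
        my @bases = split //, uc $seq;
        for my $pos (0 .. $len - 1) {
            my $base = $bases[$pos];
            # Assumption: ambiguity codes such as N are skipped, not counted.
            next unless exists $matrix{$base};
            $matrix{$base}[$pos]++;
        }
    }
    return \%matrix;
}

# Print the matrix for the three example sequences, rows in A, C, T, G order.
my $matrix = count_matrix(qw(ATC CCC TTT));
print join(' ', ' ', 1 .. 3), "\n";
for my $base (qw(A C T G)) {
    print join(' ', $base, @{ $matrix->{$base} }), "\n";
}
```

The result is a hash of array references keyed by nucleotide, so $matrix->{C}[2] holds the number of Cs at position 3.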
Any idea if this is already built somewhere in bioperl?
Thank you.
Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
> Say I have an array of nucleotide sequences of length N. I want to calculate the count matrix (weight matrix). That is, for each position 1..N, I want to know how many As, Cs, Ts and Gs there are. Is the code to build this matrix already written in bioperl, so that I can just pass it those strings?
> Please excuse my lack of knowledge, as I am a newcomer to bioinformatics.
Use the Bio::Tools::SeqStats module. The PDoc documentation even has an
example similar to what you want to do:
http://doc.bioperl.org/releases/bioperl-1.5.0-RC1/Bio/Tools/SeqStats.html
--Torsten Seemann
Sincerely,
Sam Al-Droubi, M.S.
saldroubi at yahoo.com