[Bioperl-l] Count or weight matrix in bioperl?
Cook, Malcolm
MEC at stowers-institute.org
Fri Feb 17 18:15:53 UTC 2006
http://forkhead.cgb.ki.se/TFBS/ provides the ability to generate a position
frequency matrix from a list of (presumably aligned) sequences, as follows:
#!/usr/bin/env perl
use TFBS::PatternGen::SimplePFM;
my @sequences = <>;
chomp @sequences;
print TFBS::PatternGen::SimplePFM->new(-seq_list => \@sequences)->pattern->rawprint;
exit 0;
The output when run on your example input shows that the order of the
nucleotide rows is not the one you expect (it is alphabetical: A, C, G, T):
1 0 0
1 1 2
0 0 0
1 2 1
Good luck,
TFBS installation requires significant dependencies, including bioperl
and PDL.
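If those dependencies are more than you want to take on, the counting itself
is simple enough to do in plain Perl. Below is a minimal sketch, assuming the
sequences are already aligned and all the same length. The name count_matrix
is a hypothetical helper of my own, not part of bioperl or TFBS, and the
A, C, T, G row order is chosen only to match the example in the question:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl or TFBS): count each nucleotide
# at each position of equal-length, already aligned sequences.
# Returns a hash ref mapping each base to an array ref of per-position counts.
sub count_matrix {
    my @seqs = map { uc($_) } @_;
    return {} unless @seqs;
    my $len = length $seqs[0];
    my %counts = map { $_ => [ (0) x $len ] } qw(A C G T);
    for my $seq (@seqs) {
        die "Sequence '$seq' is not $len bases long\n"
            unless length($seq) == $len;
        my @bases = split //, $seq;
        for my $pos (0 .. $#bases) {
            my $base = $bases[$pos];
            # Skip anything that is not A, C, G or T (e.g. N or a gap).
            next unless exists $counts{$base};
            $counts{$base}[$pos]++;
        }
    }
    return \%counts;
}

# Print the rows in whatever order is wanted; here A, C, T, G.
my $matrix = count_matrix(qw(ATC CCC TTT));
print join(' ', $_, @{ $matrix->{$_} }), "\n" for qw(A C T G);
```

Because the counts are kept in a hash keyed by base, the row order is decided
only at print time, so switching to alphabetical order is just a matter of
changing the list in the last line.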
Malcolm Cook
>-----Original Message-----
>From: bioperl-l-bounces at lists.open-bio.org
>[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sam
>Al-Droubi
>Sent: Friday, February 17, 2006 11:50 AM
>To: Torsten Seemann
>Cc: BioPerl list
>Subject: Re: [Bioperl-l] Count or weight matrix in bioperl?
>
>
>Torsten and all,
>
> I don't think this will work for me, because it only generates
>statistics for a single sequence. What I need is a count
>matrix for each position across a number of DNA sequences. In
>other words, if I pass three sequences to this function, then
>it returns the count of each nucleotide at each position.
>
> For example, if I pass an array of sequences, say ATC, CCC, TTT,
> then I should get back a matrix that has the counts at
>positions 1, 2, 3 for each of A, C, T, and G, like this:
>
>
>     1 2 3
>   A 1 0 0
>   C 1 1 2
>   T 1 2 1
>   G 0 0 0
>
> Any idea if this is already built somewhere in bioperl?
>
> Thank you.
>
>
> Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
>> Say I have an array of nucleotide sequences of
>> length N. I want to calculate the count matrix (weight
>> matrix). That is, for each position 1..N, I want to know how
>> many As, Cs, Ts and Gs there are. Is the code to do this
>> already written in bioperl to build this matrix if I pass it
>> those strings?
>> Please excuse my lack of knowledge, as I am a newcomer to
>> bioinformatics.
>
>Use the Bio::Tools::SeqStats module. The PDoc documentation
>even has an example similar to what you want to do:
>
>http://doc.bioperl.org/releases/bioperl-1.5.0-RC1/Bio/Tools/SeqStats.html
>
>--Torsten Seemann
>
>
>
>
>Sincerely,
>Sam Al-Droubi, M.S.
>saldroubi at yahoo.com
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l at lists.open-bio.org
>http://lists.open-bio.org/mailman/listinfo/bioperl-l
>