[Bioperl-l] Getting IC & Consensus with
Bio::Matrix::PSM::SiteMatrix
James Thompson
tex at biocompute.net
Sun Mar 13 12:35:23 EST 2005
Edward,
1. There is no code in SiteMatrix (or any of other other Bio::Matrix::PSM modules
as far as I know) that calculates information content for you. It's assumed to
provided as a parameter to the constructor rather than calculated by the
SiteMatrix object itself.
2. I don't know the exact reasoning behind this implementation for calculating
ambiguity, but here's the algorithm to calculate the consensus for an individual
position:
- Take the frequencies for a given position, multiply them all by ten and divide
by the total number of characters at that position. In your example for the third
position, we would transform these numbers:
{ A => 3, T => 6, C => 2, G => 1 }
into this set of numbers:
{ A => 2.5, T => 3, C => 1.667, G => 0.833 }
- If none of these numbers are above the threshold (which defaults to 5),
then return an N for this position.
This algorithm is in the _to_cons method of the Bio::Matrix::PSM::SiteMatrix module
if you'd like to take a peek.
I'll defer your other questions to Stefan and the rest of the list. :)
Cheers,
James Thompson
On Mon, 14 Mar 2005, Edward Wijaya wrote:
> Hi,
>
> Why my code below fails to return the IC values?
> I thought the method is able to do that.
> Is there anything I miss here?
>
> My second question is about"consensus" method.
> The consensus is generated by choosing the highest probability OR *N if
> prob is too low*
>
> 1. How do you define when the probability is *too low*?
> 2. What is the reasoning behind this implementation?
> e.g. Why my code below gives 'TANGTA' instead of "TATGTA"?
>
> I find this particular module is very very useful.
> I really wish I can make best use of it.
>
> Thanks so much for your time.
> Hope to hear from you again.
>
> ---
> Regards,
> Edward WIJAYA
> SINGAPORE
>
>
> __BEGIN__
>
> #!/usr/bin/perl -w
> use strict;
> use Data::Dumper;
> use Bio::Matrix::PSM::SiteMatrix;
>
> #Frequency matrix
> my @pA = (2,19,3,6,8,10);
> my @pT = (7,3,6,2,20,5);
> my @pC = (1,2,2,1,1,1);
> my @pG = (3,1,1,9,8,7);
>
>
> my %param =( -pA=>\@pA,-pC=>\@pC,-pG=>\@pG,-pT=>\@pT);
> my $site=new Bio::Matrix::PSM::SiteMatrix(%param);
>
> my $consensus = $site->consensus;
> my $ic = $site->IC; #Why it fails here?
>
>
> print Dumper $ic;
> print Dumper $consensus;
More information about the Bioperl-l
mailing list