[Bioperl-l] Fitch's Parsimony Algorithm with Perl
Gundala Viswanath
gundalav at gmail.com
Wed Sep 3 14:48:07 UTC 2008
Hi,
What's a correct way to implement Fitch's parsimony algorithm?
Especially to compute minimum substitiution rate per column
in the aligned sequence.
Is there a Bioperl module to do it?
For example
CGGCGGAAAACTGTCCTCCGTGC mouse
CGACGGAACATTCTCCTCCGCGC rat
CGACGGAATATTCCCCTCCGTGC human
CGACGGAAGACTCTCCTCCGTGC chimp
00100000302011000000100 -> number of subst per site (max parsimony)
My code below doesn't seem to do the job.
__BEGIN__
use Data::Dumper;
use List::MoreUtils qw(uniq);
# The related phylogenetic in Newick format tree is:
my $tree = ' (mouse,rat,(human,chimp))';
my $sites = [
'CGGCGGAAAACTGTCCTCCGTGC', # mouse
'CGACGGAACATTCTCCTCCGCGC', # rat
'CGACGGAATATTCCCCTCCGTGC', # human
'CGACGGAAGACTCTCCTCCGTGC', # chimp
];
my @val = my_parsimony($sites);
print Dumper \@val;
sub my_parsimony {
my $tfbs = shift;
my $mlen = length($tfbs->[0]);
my $sum_min = 0;
my @mincol;
foreach my $pos ( 0 .. $mlen-1 ) {
my @colbp = ();
foreach my $site ( @{$tfbs} ) {
my $bp = substr($site,$pos,1);
push @colbp, $bp;
}
# this heuristic seems to be faulty
# Column 11 it predicts 1 instead of 2
# Not sure how can I make use of the tree
my $min_mm = scalar( uniq(@colbp) ) - 1;
push @mincol, $min_mm;
}
return @mincol;
}
__END__
- Gundala Viswanath
Jakarta - Indonesia
More information about the Bioperl-l
mailing list