[Bioperl-l] How to get the entropy of each nucleotide of analigment?
Mark A. Jensen
maj at fortinbras.us
Sun Nov 23 14:30:24 UTC 2008
Cláudio -
If you have a Bio::SimpleAlign object prepared ( maybe from
$alnio = new Bio::AlignIO(-format=>'fasta', -file=>'your.fas');
$aln = $alnio->next_aln;
try the following function, as
$entropies = entropy_by_column( $aln )
(which also gives an example of how you (or I, anyway) might
manipulate
alignments on a per-column basis)
cheers, Mark
=head2 entropy_by_column
Title : entropy_by_column
Usage : entropy_by_column( $aln )
Function: returns the Shannon entropy for each column in an alignment
Example :
Returns : hashref of the form { $column_number => $entropy, ... }
Args : a Bio::SimpleAlign object
=cut
sub entropy_by_column {
my ($aln) = @_;
my (%ent);
foreach my $col (1..$aln->length) {
my %res;
foreach my $seq ($aln->each_seq) {
my $loc = $seq->location_from_column($col);
next if $loc->location_type eq 'IN-BETWEEN';
$res{$seq->subseq($loc)}++;
}
$ent{$col} = entropy(values %res);
}
return [%ent];
}
=head2 entropy
Title : entropy
Usage : entropy( @numbers )
Function: returns the Shannon entropy of an array of numbers,
each number represents the count of a unique category
in a collection of items
Example : entropy ( 1, 1, 1 ) # returns 1.09861228866811 = log(1/3)
Returns : Shannon entropy or undef if entropy undefined;
Args : an array
=cut
sub entropy {
@a = map {$_ || ()} @_;
return undef unless grep {$_>0} @a;
return undef if grep {$_<0} @a;
my $t = eval join('+', @a);
map {$_ /= $t} @a;
return eval(join('+', map { $_ ? -$_*log($_) : () } @a));
}
> ----- Original Message -----
> From: "Cláudio Sampaio" <patola at gmail.com>
> To: <bioperl-l at bioperl.org>
> Sent: Sunday, November 23, 2008 12:19 AM
> Subject: [Bioperl-l] How to get the entropy of each nucleotide of
> analigment?
>
>
> Hi all,
>
> I am still a newbie to bioperl, and while searching for a way to
> calculate
> the entropy score of an alignment I came to Matrix scoring -
> http://doc.bioperl.org/releases/bioperl-1.4/Bio/Matrix/Scoring.html
> - but
> couldn't figure out how it relate to the Bio::Align class and
> objects. Can
> someone more knowledgeable give me a clue on how to start?
>
> Best regards,
>
> Cláudio "Patola"
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
More information about the Bioperl-l
mailing list