[Bioperl-l] Information content of alignment
Shawn Hoon
shawnh at stanford.edu
Thu May 13 11:02:10 EDT 2004
On May 13, 2004, at 6:32 AM, martin wrote:
> Hi Malay,
>
> Not quite sure what you mean by 'information content'. You can extract
> a range of columns from an alignment using the slice() function:
>
> $aln2 = $aln->slice(20, 30);
>
> which returns another AlignI object. Passing the same start and end
> gives a single column, so something like:
>
> foreach (1..$aln->length){
> my $column=$aln->slice($_, $_);
> # $column is now an AlignI object
> # do something with it....
> }
>
I wrote something similar for Bio::Graphics::Pictogram, but I can't
think of anything in BioPerl right now that does this explicitly. It
might be a useful addition to SimpleAlign.
Continuing from the code above: once you have the slice for each column,
you can count the residue frequencies and then compute the per-column
entropy from them:
# $aln is assumed to be an alignment object (e.g. Bio::SimpleAlign)
my (%count, %total);

# slice() numbers columns from 1, so start at 1 rather than 0
foreach my $pos (1 .. $aln->length) {
    my $column = $aln->slice($pos, $pos);
    foreach my $seq ($column->each_seq) {
        $count{$pos}{ $seq->seq }++;   # residue seen in this column
        $total{$pos}++;                # kept separately from the counts
    }
}

# calculate entropy
foreach my $pos (sort { $a <=> $b } keys %count) {
    my $ent = 0;
    foreach my $base (keys %{ $count{$pos} }) {
        my $freq = $count{$pos}{$base} / $total{$pos};
        $ent -= $freq * log2($freq);
    }
    print "Position $pos, entropy: $ent bits\n";
}

sub log2 {
    my ($x) = @_;
    return 0 if $x == 0;
    return log($x) / log(2);
}
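
The loop above reports Shannon entropy, which is the uncertainty in each
column. If "information content" is meant in the sequence-logo sense, it
is commonly taken as log2(alphabet size) minus that entropy. Here is a
rough, BioPerl-free sketch of that step. It makes several assumptions
that are not from this thread: the alignment is a plain list of
equal-length strings, the alphabet is DNA (4 letters), '-' and '.' are
gaps and are skipped, and no small-sample correction is applied. The
helper name column_entropies is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): Shannon entropy in bits
# for every column of a list of equal-length aligned strings.
sub column_entropies {
    my @rows = @_;
    my $len  = length $rows[0];
    my @entropy;
    for my $i (0 .. $len - 1) {
        my %count;
        my $total = 0;
        for my $row (@rows) {
            my $c = uc substr($row, $i, 1);
            next if $c eq '-' or $c eq '.';   # assumption: ignore gaps
            $count{$c}++;
            $total++;
        }
        my $h = 0;
        if ($total) {
            for my $n (values %count) {
                my $p = $n / $total;
                $h -= $p * log($p) / log(2);
            }
        }
        push @entropy, $h;
    }
    return @entropy;
}

my @aln      = qw(ACGT ACGA ACTC AGTG);   # toy alignment, one string per sequence
my @h        = column_entropies(@aln);
my $max_bits = log(4) / log(2);           # assumption: DNA, so at most 2 bits

for my $i (0 .. $#h) {
    printf "Position %d: entropy %.3f bits, information %.3f bits\n",
        $i + 1, $h[$i], $max_bits - $h[$i];
}
```

With this definition a fully conserved column carries the maximum of 2
bits and a column with all four bases in equal proportion carries 0. For
protein alignments the maximum would be log2(20) instead.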
> you can get the documentation with
>
> % perldoc Bio::Align::AlignI
>
> If you let me know what you want to do with the column, maybe I can
> give some more advice.
>
> Cheers
>
> Martin
>
>
>
> On Wed, 2004-05-12 at 18:56, Malay wrote:
>> Hi Bioperlers:
>>
>> Pardon my ignorance. I could not clear my haze about the numerous
>> bioperl modules. I looked at the AlignI interface but could not find
>> the answer to my question:
>>
>> Is there any way to quickly calculate the information content of each
>> column of the alignment in bioperl?
>>
>> Any pointers or source code would be appreciated. Otherwise, I have to
>> get my hands dirty.
>>
>> Cheers,
>>
>> Malay
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> --
> Martin Jones
> The Nematode Genomics Lab
> Institute of Cell, Animal and Population Biology
> University of Edinburgh
> King's Buildings
> West Mains Road
> Edinburgh, EH9 3JT
>
>