[Bioperl-l] Per-column conservation of multiple alignment in Perl

Jun Yin jun.yin at ucd.ie
Wed Jun 15 12:53:25 UTC 2011


Hi, 

Bio::SimpleAlign has no function that calculates per-column
conservation, and I think it is a good idea to implement one. One
suggestion: you could also add a window-size parameter to return
per-window conservation, which would make the function more useful.
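
To illustrate the window idea, here is a minimal sketch that assumes the
per-column scores are already available as a plain list. The name
window_conservation and the choice of averaging over a sliding window are
assumptions made for illustration only; nothing like this exists in
Bio::SimpleAlign.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): mean conservation over each
# sliding window of $window consecutive columns.
sub window_conservation {
    my ($window, @scores) = @_;
    die "window must be a positive integer\n"
        unless defined $window && $window =~ /^[1-9]\d*$/;

    my @means;
    # The range is empty when the window is longer than the score list.
    for my $start (0 .. scalar(@scores) - $window) {
        my $sum = 0;
        $sum += $_ for @scores[ $start .. $start + $window - 1 ];
        push @means, $sum / $window;
    }
    return @means;
}

print join(' ', window_conservation(2, 0.75, 1, 1, 0.5)), "\n";
# prints: 0.875 1 0.75
```

A window of 1 returns the per-column scores unchanged, so one optional
parameter could cover both cases.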

Cheers,
Jun Yin
Ph.D. student in U.C.D.

Bioinformatics Laboratory
Conway Institute
University College Dublin


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chad Davis
Sent: Wednesday, June 15, 2011 12:21 PM
To: bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Per-column conservation of multiple alignment in Perl

I asked this on BioStar, but then started thinking a patch to
Bio::SimpleAlign would be easy, depending on what people here think
...

http://biostar.stackexchange.com/questions/9196/per-column-conservation-of-multiple-alignment-in-perl

Given a Bio::SimpleAlign, what is the best way to get per-column
conservation scores, e.g. as an array of values in [0, 1] whose
length equals $align->length? I can't find anything like this in
Bio::SimpleAlign. I'm looking for a function that allows:

use Bio::AlignIO;

my $io    = Bio::AlignIO->new(-file => $file);
my $align = $io->next_aln;
my @cons  = $align->percentage_identity_by_column(); # <- does this exist?
print "@cons";
# 0.75 1.0 1.0 1.0 0.64 ....

Or should I just take the gapped sequences, use substr() to extract
the characters of each column, count them with a hash, and return the
frequency of the most frequent character per column?
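
A minimal sketch of that substr()-and-hash approach, working on plain
gapped strings so that it runs without BioPerl. The name
column_conservation is hypothetical, and two choices here are assumptions
rather than anything settled in this thread: gap characters ('-' and '.')
never count as the conserved residue, and the denominator is the total
number of rows. With a Bio::SimpleAlign object the rows could be obtained
with map { $_->seq } $align->each_seq.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::SimpleAlign): takes equal-length
# gapped sequence strings and returns, for each column, the fraction of
# rows that carry the most frequent non-gap character.
sub column_conservation {
    my @rows = @_;
    return () unless @rows;

    my $len = length $rows[0];    # assumes all rows have this length
    my @scores;
    for my $col (0 .. $len - 1) {
        my %count;
        $count{ uc substr($_, $col, 1) }++ for @rows;

        # Assumption: gaps are not treated as a conserved character.
        delete @count{ '-', '.' };

        my ($max) = sort { $b <=> $a } values %count;
        push @scores, ($max // 0) / scalar(@rows);
    }
    return @scores;
}

my @cons = column_conservation(qw(ACGT ACGA AC-A TCGA));
print join(' ', @cons), "\n";
# prints: 0.75 1 0.75 0.75
```

An all-gap column scores 0 under these assumptions. Dividing by the number
of non-gap rows instead would be an equally defensible convention.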

It looks like the private method Bio::SimpleAlign::_consensus_aa()
already does most of this, but it returns the character rather than
the fraction, which is what I was looking for. Short of submitting a
patch for that, is there a better approach?

Would there be general interest in such a patch to get per-column
conservation of multiple alignments?

Chad
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l




