[Bioperl-l] get the sequence of a column in a multiple alignment

Mathieu Rouard mrouard at gmail.com
Wed Feb 14 11:23:47 UTC 2007


Dear all,

I am starting to use the BioPerl API to parse multiple alignments, and I am
wondering what the most efficient way is to extract every column from an
alignment (all the amino acids at position 1, position 2, etc.). I quickly
implemented the simple code below, but it becomes quite slow as the length of
the sequences increases.

use Bio::AlignIO;

# $inputfilename is assumed to hold the path to the Stockholm file
my $stream = Bio::AlignIO->new(-file   => $inputfilename,
                               -format => 'stockholm');

my $aln = $stream->next_aln();

my $length = $aln->length();
my %column;

for my $i (1 .. $length) {
    my $aa = '';
    foreach my $seq ($aln->each_seq()) {
        # trunc() returns a new one-residue sequence object for column $i
        my $obj = $seq->trunc($i, $i);
        $aa .= $obj->seq;
    }
    # track the column number and the sequence of the column
    $column{$i} = $aa;
}
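
One direction that might help, offered as an untested sketch rather than an
established BioPerl recipe: much of the cost in the loop above probably comes
from calling trunc() once per residue, since each call builds a new sequence
object. Fetching each sequence string once and walking it character by
character avoids that. In the sketch, columns_from_strings is a hypothetical
helper (not part of BioPerl), and the toy strings stand in for what
map { $_->seq } $aln->each_seq() would return.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): given the aligned sequence
# strings, return a hash reference mapping each 1-based column number to
# the residues found in that column, in sequence order.
sub columns_from_strings {
    my @rows = @_;
    my %column;
    for my $row (@rows) {
        my $i = 0;
        # Walk the string once and append each character to its column.
        $column{ ++$i } .= $_ for split //, $row;
    }
    return \%column;
}

# Toy input (an assumption) standing in for: map { $_->seq } $aln->each_seq()
my @rows   = ('MKV-L', 'MRVAL', 'M-VAL');
my $column = columns_from_strings(@rows);

# Print "column number <tab> column sequence" in column order.
print "$_\t$column->{$_}\n" for sort { $a <=> $b } keys %$column;
```

This does one pass over each sequence string instead of creating
length x number-of-sequences temporary objects; whether that is fast enough
for very long alignments would need to be measured.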

Do you have any other suggestions?

thanks
Mathieu


