[Bioperl-l] counting gaps in a column of alignment

Alexey Morozov alexeymorozov1991 at gmail.com
Thu Jul 25 05:54:07 UTC 2013


Why would you use a regexp (which by the way I cannot understand: what are
\Q and \E?) and not simply

elsif($col eq $gap_char)

$col is always one char long, so KISS.
Also I hope I'll not sound like your school teacher picking all minor bugs,
but you have just \ and not \t in header printing. And %count doubles your
five $count_foo variables.



2013/7/25 subha kalyanamoorthy <sksweety24 at gmail.com>

> Hi there,
>
> I am a new bioperl user.  I made a script to count the nucleotide
> composition in each column of the alignment. I am able to get the count for
> the nucleotides, but not for the gap characters from the following program.
> I would greatly appreciate any suggestions to correct this program.
>
> Thanks.
>
> #***************************My Program**********************************
>
> #!/bin/perl -w
> use strict;
> use warnings;
>
> use List::Util 'max';
> use Bio::SimpleAlign;
> use Bio::Align::AlignI;
> use Bio::AlignIO;
> use Bio::SeqIO;
>
> my $in= Bio::AlignIO->new( -file => "seq.fst", -format => "fasta");
>
> my $align = $in->next_aln();
>
> print "column\tA's\tT's\C's\G's\n";
>
> for (my $i = 1; $i <= $align->length; $i++) {
>
>  my %count;
>
>  my $seqs = $align->slice($i,$i);
>
> my $gap_char = $seqs->gap_char();
>
>     my $count_A=0;
>     my $count_C=0;
>     my $count_T=0;
>     my $count_G=0;
>     my $count_N=0;
>     my $count_gap=0;
>
>     foreach my $seq ($seqs->each_seq) {
>
>     my $col=$seq->seq;
>
>         if ($col eq 'A'){
>         $count_A++;
>         }elsif ($col eq 'C'){
>         $count_C++;
>         }elsif ($col eq 'T'){
>         $count_T++;
>         }elsif ($col eq 'G'){
>         $count_G++;
>         }elsif ( $col eq 'N'){
>         $count_N++;
>         }elsif ($col =~ m/^\Q$gap_char\E$/){
>         $count_gap++;
> }
>       $count{$seq->seq} += 1;
>  }
>
> print"$i\t$count_A\t$count_T\t$count_C\t$count_G\n";
>
> }
>
> #***********************************************************************`
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
Alexey Morozov,
LIN SB RAS, bioinformatics group.
Irkutsk, Russia.



More information about the Bioperl-l mailing list