[Bioperl-l] How do you add a consensus to AlignIO?

Wes Barris wes.barris at csiro.au
Thu Aug 28 18:40:31 EDT 2003


Brian Osborne wrote:
> Wes,
> 
> I'm puzzled. I'm looking at SimpleAlign::_consensus_aa and it certainly
> looks like this method should ignore the $gapchar when it calculates the
> consensus at any given position. Does your method look like this:

Hi Brian,

Thanks for responding.  No, my SimpleAlign.pm does not look like this.
I am using bioperl-1.2.2.  You must be using bioperl-live from CVS.
I added the two "gapchar" changes to my bioper-1.2.2 code and it works
like a charm.  Thanks!


> 
> sub _consensus_aa {
>     my $self = shift;
>     my $point = shift;
>     my $threshold_percent = shift || -1 ;
>     my ($seq,%hash,$count,$letter,$key);
>     my $gapchar = $self->gap_char;
>     foreach $seq ( $self->each_seq() ) {
> 	$letter = substr($seq->seq,$point,1);
> 	$self->throw("--$point-----------") if $letter eq '';
> 	($letter eq $gapchar || $letter =~ /\./) && next;
> 	# print "Looking at $letter\n";
> 	$hash{$letter}++;
>     }
>     my $number_of_sequences = $self->no_sequences();
>     my $threshold = $number_of_sequences * $threshold_percent / 100. ;
>     $count = -1;
>     $letter = '?';
> 
>     foreach $key ( sort keys %hash ) {
> 	# print "Now at $key $hash{$key}\n";
> 	if( $hash{$key} > $count && $hash{$key} >= $threshold) {
> 	    $letter = $key;
> 	    $count = $hash{$key};
> 	}
>     }
>     return $letter;
> }
> 
> ?
> 
> Brian O.
> 
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Wes Barris
> Sent: Wednesday, August 27, 2003 10:56 PM
> To: Bioperl Mailing List
> Subject: Re: [Bioperl-l] How do you add a consensus to AlignIO?
> 
> Jason Stajich wrote:
> 
> 
>>Try:
>> my $consensus = new Bio::LocatableSeq(-seq=>$aln->consensus_string(),
>>                                        -id=>'btcn1000');
> 
> 
> Thanks.  That did the trick almost (I had to add -start and -end to the
> argument list).  However, it is not producing the consensus in the way
> that I want.  Here is a portion of the resulting msf file:
> 
> AU278862              ---------- ---------C TCTACAGAAT CTGTGTTTAT TTTGTTTCAG
> AU278567              ---------- ---------- ---ACAGAAT CTGTGTTTAT TTTGTTTCAG
> AU277959              -------AGA TTTTGACATC TCTACAGAAT CTGTGTTTAT TTTGTTTCAG
> AU278008              CCTTCTTANA TTTTGACATC TCTACAGAAT CTGNGTTTAT TTTGTTTCAG
> AU278623              -------AAA TTTTGACATC TCTACANAA- CTGTGTTTAT TTTGTTTCAN
> AU278682              -------AAA TTTTGACATC TCTACANAAT CTGTGTTTAT TTTGTTTCAN
> BM031781              ---------- ---------- ---------- ---------- ----------
> consensus             -------A-A TTTTGACATC TCTACAGAAT CTGTGTTTAT TTTGTTTCAG
> 
> I need the consensus to span the entire alignment length like this:
> 
> consensus             CCTTCTTAAA TTTTGACATC TCTACAGAAT CTGTGTTTAT TTTGTTTCAG
> 
> i.e. I need it to ignore where the aligned sequences do not exist.  Is there
> a way to make it do that?
> 
> --
> Wes Barris
> E-Mail: Wes.Barris at csiro.au
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 


-- 
Wes Barris
E-Mail: Wes.Barris at csiro.au



More information about the Bioperl-l mailing list