[Bioperl-l] gap characters in SimpleAlign objects
Ewan Birney
birney at ebi.ac.uk
Wed Feb 18 08:54:30 EST 2004
On Wed, 18 Feb 2004, Nathan Haigh wrote:
> I've been using the clustalw module for creating alignments, and I've just
> realised that when you output the alignment the gap character is a "." not a
> "-".
> This is most annoying because I am adding support to this module for
> generating trees via clustalw, and clustalw removes these "." characters. Is
> there a method for changing these gap characters to "-"? I have seen the
> gap_char method in the SimpleAlign module, but this seems only to designate
> a particular character as a gap character, and does not actually change the
> characters.
>
> Any ideas on how to do this substitution, and where in BioPerl is this
> assignment made in the first place, since the default gap char for
> clustalw output is "-" not "."?
To fix (short term): loop over the sequences, building a new SimpleAlign
object from LocatableSeqs, and apply s/\./-/g to each sequence string.
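
The loop described above might look roughly like the sketch below. The helper names dots_to_dashes and realign_with_dashes are made up for illustration and are not part of BioPerl; the sketch assumes each_seq, add_seq and gap_char on Bio::SimpleAlign, and the usual -seq/-id/-start/-end arguments to Bio::LocatableSeq, behave as documented in your installed version.

```perl
use strict;
use warnings;

# Pure-string helper (hypothetical name): turn every "." gap into "-".
# tr/// rewrites all occurrences, so no /g flag is needed.
sub dots_to_dashes {
    my ($str) = @_;
    (my $fixed = $str) =~ tr/./-/;
    return $fixed;
}

# Hypothetical helper, not part of BioPerl: copy an alignment into a new
# SimpleAlign, rewriting the gap characters of each sequence on the way.
sub realign_with_dashes {
    my ($aln) = @_;

    # Loaded lazily so dots_to_dashes() stays usable without BioPerl installed.
    require Bio::SimpleAlign;
    require Bio::LocatableSeq;

    my $fixed = Bio::SimpleAlign->new();
    foreach my $seq ( $aln->each_seq ) {
        $fixed->add_seq(
            Bio::LocatableSeq->new(
                -seq   => dots_to_dashes( $seq->seq ),
                -id    => $seq->display_id,
                -start => $seq->start,
                -end   => $seq->end,
            )
        );
    }

    # Record "-" as the gap character of the new alignment.
    $fixed->gap_char('-');
    return $fixed;
}
```

Calling realign_with_dashes($aln) on the alignment you already have should hand back a copy whose sequence strings use "-" throughout. Bio::SimpleAlign also documents a map_chars method (usage: $aln->map_chars('\.', '-')) that may do the same substitution in place; it is worth checking whether your version has it before copying the alignment by hand.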

How are you reading in the Clustalw alignments? Bio::AlignIO::clustalw
doesn't touch the gap characters:

    foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments ) {
        if( $name =~ /(\S+):(\d+)-(\d+)/ ) {
            ($sname,$start,$end) = ($1,$2,$3);
        } else {
            ($sname, $start) = ($name,1);
            my $str = $alignments{$name};
            $str =~ s/[^A-Za-z]//g;
            $end = length($str);
        }
        my $seq = new Bio::LocatableSeq('-seq'   => $alignments{$name},
                                        '-id'    => $sname,
                                        '-start' => $start,
                                        '-end'   => $end);

($alignments{$name} has no regex applied to it earlier, either.)
>
> Thanks
> Nathan
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney at ebi.ac.uk>.
-----------------------------------------------------------------