[Bioperl-l] Bio::Tools::pSW stop codon bug?
Prachi Shah
prachi at stanford.edu
Wed Aug 9 00:18:08 UTC 2006
Hi,
I am trying to align very similar protein sequences with the
Bio::Tools::pSW module, but I am running into an issue that looks like a
bug. One of the two sequences is extended considerably with gaps so
that an amino acid residue matches the stop codon (*). I know there
should not be any internal stop codons, but we are working with a new
assembly of the Candida genome and we want to pick out such
inconsistent cases. In any case, the alignment should match the two
sequences (because they are the same) up until the stop codon is
encountered in the new sequence. Instead, it artificially extends the
old sequence and matches the alanine with the stop codon. Any help on
this is appreciated.
Thanks
Prachi
Here is an example set of two sequences I am trying to align:
>orf19.6264.3
MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQRVGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFNTSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHAALMEKPIETVFEEEV*V
>orf19.6264.3_old
MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQRVGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFNTSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHAALMEKPIETVFEEEV
Below is the part of the code that generates the alignments:
################
use Bio::Seq;
use Bio::Tools::pSW;
use Bio::AlignIO;

# wrap the translated protein strings in Bio::Seq objects
my $new_translatedSeqObj = Bio::Seq->new(-display_id => $gene,
                                         -seq        => $new_translatedSeq);
my $old_translatedSeqObj = Bio::Seq->new(-display_id => $gene . "_old",
                                         -seq        => $old_translatedSeq);

# do alignments (Smith-Waterman, BLOSUM62, gap open 12, gap extension 2)
my $align_factory = Bio::Tools::pSW->new(
    '-matrix' => '/tools/perl/5.8.8/lib/site_perl/5.8.8/Bio/Ext/Align/blosum62.bla',
    '-gap'    => 12,
    '-ext'    => 2,
);
my $aln = $align_factory->pairwise_alignment($old_translatedSeqObj,
                                             $new_translatedSeqObj);

# write the alignment to STDOUT in ClustalW format
my $alnout = Bio::AlignIO->new(-format => 'clustalw',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);
##################
The alignment --
CLUSTAL W(1.81) multiple sequence alignment


orf19.6264.3_old/1-162    MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQR
orf19.6264.3/1-177        MSNYLNLAQFSGVTDRFNLERIKSDFSSVQSTISKLRPPQEFFDFRRLSKPANFGEIQQR
                          ************************************************************

orf19.6264.3_old/1-162    VGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFN
orf19.6264.3/1-177        VGYNLGYFSANYITIVLGLSIYALITNFLLLFVTIFVLGGIYGINKLNGEDLVLPVGRFN
                          ************************************************************

orf19.6264.3_old/1-162    TSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHA---------------AL
orf19.6264.3/1-177        TSQLYTGLLIVAVPLGFLASPISTMMWLIGSSGVTVGAHAALMEKPIETVFEEEV*V
                          ****************************************                :
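
One way to pick out the internal-stop cases independently of the aligner is to check each translated sequence for a '*' before alignment. The following is only a minimal sketch in plain Perl, not a fix for the pSW behaviour itself. It assumes the translated sequences are plain strings, and the helper names split_at_first_stop and has_internal_stop are hypothetical (they are not part of BioPerl).

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): split a translated
# protein string at its first stop symbol '*'.
# Returns the residues before the stop and the 1-based position of the
# stop, or the whole string and 0 if there is no stop.
sub split_at_first_stop {
    my ($protein) = @_;
    my $idx = index($protein, '*');
    return ($protein, 0) if $idx < 0;
    return (substr($protein, 0, $idx), $idx + 1);
}

# Hypothetical helper: a stop is "internal" if at least one residue
# follows it. Returns 1 or 0.
sub has_internal_stop {
    my ($protein) = @_;
    my $idx = index($protein, '*');
    return ($idx >= 0 && $idx < length($protein) - 1) ? 1 : 0;
}

# Example on a short made-up string that ends like the new sequence above.
my ($prefix, $stop_pos) = split_at_first_stop('MEEEV*V');
print "prefix=$prefix stop_pos=$stop_pos\n";
```

The truncated prefix could then be passed as the -seq argument when building the Bio::Seq object, so that the '*' never reaches the aligner, while the reported position flags the sequence as an inconsistent case.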