[Bioperl-l] how to not count gaps in the multiple sequence alignment

wenbin mei wenbinmei at gmail.com
Wed Nov 2 04:25:32 UTC 2011


Hi,

I need some help with coding. I have a multiple sequence alignment that
contains gaps. The alignment includes a reference genome sequence for which I
know the coordinates of all the protein-coding genes. I want to extract the
alignment of each protein-coding gene from the big alignment. I am using
Bio::SimpleAlign, but because of the gaps, the reference coordinates are
shifted relative to the alignment columns. Is there a way to not count the
gaps and still extract the protein-coding gene alignments? One way would be to
remove the gaps from the reference first and then extract the sequence, but I
don't like this way ... Thank you for your help.
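
To make the question concrete, here is a minimal sketch of the mapping I am
after, written in plain Python rather than BioPerl. The function names and the
toy alignment are made up for illustration only, and I am assuming 1-based,
inclusive gene coordinates on the ungapped reference, with "-" or "." as gap
characters.

```python
def ungapped_to_column(aligned_ref, pos, gap_chars="-."):
    """Return the 1-based alignment column that holds the pos-th
    (1-based) non-gap residue of the aligned reference sequence."""
    seen = 0
    for col, ch in enumerate(aligned_ref, start=1):
        if ch not in gap_chars:
            seen += 1
            if seen == pos:
                return col
    raise ValueError(f"reference has only {seen} residues, asked for {pos}")


def extract_gene(alignment, ref_id, start, end):
    """Cut the columns spanning reference residues start..end
    (ungapped, 1-based, inclusive) out of every aligned sequence."""
    ref = alignment[ref_id]
    col_start = ungapped_to_column(ref, start)
    col_end = ungapped_to_column(ref, end)
    # Python slices are 0-based and end-exclusive, hence the -1 on the start.
    return {name: seq[col_start - 1:col_end] for name, seq in alignment.items()}


# Toy alignment (hypothetical data): the ungapped reference is ATGCCTA.
alignment = {
    "ref": "AT--GCC-TA",
    "s1":  "ATTTGC-GTA",
    "s2":  "A---GCCGTA",
}

# A "gene" at reference positions 3..6 (GCCT) maps to columns 5..9.
gene = extract_gene(alignment, "ref", 3, 6)
for name, seq in gene.items():
    print(name, seq)
```

The gaps stay in the reference; only the coordinates are translated, and the
same column range is then cut from every sequence in the alignment.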

-best,
wenbin


