[BioPython] help for local alignment

Peter biopython at maubp.freeserve.co.uk
Sat Jan 3 19:59:04 UTC 2009


Hi Chen,

I've copied the Biopython mailing list on my reply as I think this is
of general interest.

On 1/3/09, Chen Ku wrote:
> Dear Peter,
>                            it will be a great help of you if you can send me
> the exact code for this problem using Bio package. I think as you are expert
> in this you can write me and I think it will be few line code.
>
> My Problem: Given two protein sequence I have to perform Global alignment
> using Blosum 62 scoring scheme.
>
> v = NGPSTKDFGKISESREFDNQNGPSTKDFGKISESREFDNQ
> w = QNQLERSFGKINMRLEDALVQNQLERSFGKINMRLEDALV
> Scoring matrix: BLOSUM62
> Gap open: 10.0
> Gap extended: 0.5
>
> The answer I did manually will come 20 using BLOSUM matrix.
> ....
>
> Regards
> Chen

I'd already told Chen that Biopython provides the BLOSUM matrices in
Bio.SubsMat.MatrixInfo (as simple dictionaries) and pointed him at the
Bio.pairwise2 for doing pairwise alignments.  The information is all
there in the pairwise2 docstrings, but perhaps it could be clearer.

There are lots of global alignment functions named globalXX in
Bio.pairwise2.align, where the two letter code "XX" tells you the type
of parameters for matches (and mismatches), and the parameters for gap
penalties.  In this case we want to use the function globalds because
we want to use the BLOSUM62 matrix which we have as a dictionary (d
for dictionary), and the two sequences have the same gap parameters (s
for same).

from Bio import pairwise2
from Bio.SubsMat.MatrixInfo import blosum62
v = "NGPSTKDFGKISESREFDNQNGPSTKDFGKISESREFDNQ"
w = "QNQLERSFGKINMRLEDALVQNQLERSFGKINMRLEDALV"
open_penalty = -10
extend_penalty = -0.5
alignments = pairwise2.align.globalds(v, w, blosum62, open_penalty,
extend_penalty)
for align1, align2, score, begin, end in alignments :
    print pairwise2.format_alignment(align1, align2, score, begin, end)

This gives me two alignments back, both scoring twenty - as you had
calculated by hand Chen.  But do double check this is doing what you
expected!

Peter



More information about the Biopython mailing list