[Biopython] Calculating the Hamming distance
Michiel de Hoon
mjldehoon at yahoo.com
Thu Jun 27 09:08:52 UTC 2013
Hi Philipp,
Maybe the sequence alignment doesn't show up clearly in the email, but the two sequences do match very well. The Hamming distance is only 4 (i.e. 4 mismatches/insertions/deletions).
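As a rough check of that count, here is a minimal sketch that tallies the differing columns of the intended alignment. The helper name `count_differences` is hypothetical, and the 34 leading gap characters placed before the RNA are an assumption about where it sits under the DNA, chosen so that the columns agree with the description in the quoted message (one initial mismatch, two insertions, one gap at the end).

```python
def count_differences(aligned_a, aligned_b, gap="-"):
    """Count differing columns, skipping leading/trailing gaps in aligned_b."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Columns before the first and after the last residue of aligned_b are
    # end gaps in aligned_b; they are treated as free and not counted.
    start = len(aligned_b) - len(aligned_b.lstrip(gap))
    end = len(aligned_b.rstrip(gap))
    return sum(1 for x, y in zip(aligned_a[start:end], aligned_b[start:end]) if x != y)


# Intended alignment from the quoted message; the 34-column offset of the RNA
# is an assumption, since the indentation may not survive in the email.
dna_aligned = "AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGAT--TCCCGGTCAGGGAACCA-"
rna_aligned = "-" * 34 + "GGATGATCCCGGTCAGGGAACCAA"

print(count_differences(dna_aligned, rna_aligned))  # 4
```

Each gap column counts as one difference here, so the two-column insertion contributes 2.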
Best,
-Michiel.
________________________________
From: Philipp Schiffer <philipp.schiffer at gmail.com>
To: Michiel de Hoon <mjldehoon at yahoo.com>
Cc: "biopython at biopython.org" <biopython at biopython.org>
Sent: Thursday, June 27, 2013 4:47 PM
Subject: Re: [Biopython] Calculating the Hamming distance
Hi Michiel,
maybe I am being thick here (or lack the biological knowledge), but to me it looks as if your sequences just don't match. Thus the Bio.pairwise2 alignment is 'correct' in terms of alignment.
Cheers
Philipp
--
Philipp Schiffer
Sent with Sparrow
On Thursday, 27. June 2013 at 09:13, Michiel de Hoon wrote:
>Dear all,
>
>
>I am trying to align a small RNA sequence to a (shortish) DNA sequence.
>The alignment I am looking for is:
>
>
>
>
>AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGAT--TCCCGGTCAGGGAACCA-
>                                  GGATGATCCCGGTCAGGGAACCAA
>
>
>where the first sequence is the DNA and the second sequence is the RNA.
>The Hamming distance is 4 (the initial mismatch, the 2 insertions, and the gap at the end).
>
>
>If I try to calculate this alignment with Bio.pairwise2, I get the following if I use
>globalms(dna, rna, 0, -1, -1, -1, penalize_end_gaps=True):
>
>
>AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGATTCCCGGTCAGGGAACC-A
>-GGAT--G--------A---------------------TCCCGGTCAGGGAACCAA
>
>
>However, if I set penalize_end_gaps to False, I get
>
>
>-----------------------AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGATTCCCGGTCAGGGAACCA
>GGATGATCCCGGTCAGGGAACCAA------------------------------------------------------
>
>
>I guess the solution is to penalize end gaps in the DNA but not in the RNA.
>I could modify Bio.pairwise2 to allow for that possibility, but before I do so, I was wondering if there are any other ways to find the desired alignment with Biopython (preferably without using 3rd-party software).
>
>
>Thanks,
>-Michiel.
>_______________________________________________
>Biopython mailing list - Biopython at lists.open-bio.org
>http://lists.open-bio.org/mailman/listinfo/biopython
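The idea raised in the quoted message, penalizing end gaps in the DNA but not in the RNA, can be illustrated without Bio.pairwise2 by a small dynamic-programming sketch. This is not Biopython code: the function name `fitting_distance` is hypothetical, and the unit costs (1 per mismatch, insertion, or deletion, 0 per match) are an assumption that mirrors the 0, -1, -1, -1 scores used in the thread. Unaligned DNA on either side of the RNA is free, while every RNA base must be accounted for.

```python
def fitting_distance(long_seq, short_seq):
    """Fewest mismatches/insertions/deletions needed to align all of
    short_seq against some substring of long_seq."""
    n = len(long_seq)
    # Row 0: skipping any prefix of long_seq costs nothing.
    prev = [0] * (n + 1)
    for i, s in enumerate(short_seq, start=1):
        # Column 0: short_seq characters placed before long_seq cost 1 each.
        curr = [i] + [0] * n
        for j, c in enumerate(long_seq, start=1):
            curr[j] = min(
                prev[j - 1] + (s != c),  # match (0) or mismatch (1)
                prev[j] + 1,             # s aligned to a gap in long_seq
                curr[j - 1] + 1,         # c aligned to a gap in short_seq
            )
        prev = curr
    # Skipping any suffix of long_seq is free, so take the best end column.
    return min(prev)


dna = "AGGATTCGGCGCTCTCACCGCCGCGGCCCGGGTTCGATTCCCGGTCAGGGAACCA"
rna = "GGATGATCCCGGTCAGGGAACCAA"
print(fitting_distance(dna, rna))
```

For the two sequences in this thread the result should be no more than the 4 counted by hand, since the intended alignment is one of the candidates the recurrence considers. The sketch returns only the distance; recovering the alignment itself would need a traceback through the full matrix.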