[Biopython] Get all alignments of a sequence against another

Kevin Rue kevin.rue at ucdconnect.ie
Fri Mar 14 11:07:46 UTC 2014


Hi Mary,

Please let us know whether this solution suits you, or whether the
Levenshtein distance metric does not fit your needs.
The approach below gives you the number of matches (the length of the output
list), the start and end positions of each match (mind Python's 0-based
indexing), and the edit distance between each match and the sequence you
searched for. It should be a good starting point.
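
To make the indexing point concrete, here is a minimal sketch of how the
reported positions could be turned back into subsequences. It does not call
fuzzysearch at all: the Match namedtuple, the levenshtein() helper, the
describe() function and the short toy target string are all hypothetical
stand-ins of mine. It also assumes that "end" is exclusive, which is what the
quoted output suggests (end - start equals the length of the 10-residue
query).

```python
from collections import namedtuple

# Hypothetical stand-in for the Match objects in the quoted output;
# this is NOT fuzzysearch's own class.
Match = namedtuple("Match", ["start", "end", "dist"])


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def describe(matches, target, query):
    """Return (start_1based, end_1based, matched_text, distance) per match.

    Assumes 0-based start and exclusive end, so target[start:end] is the hit
    and the 1-based inclusive coordinates are (start + 1, end).
    """
    rows = []
    for m in matches:
        hit = target[m.start:m.end]
        rows.append((m.start + 1, m.end, hit, levenshtein(query, hit)))
    return rows


query = "GGGTTLTTSS"
# Toy target (an assumption, much shorter than the sequence in the thread).
target = "XXXX" + "GGGTTVTTSS" + "AAA" + "GGGTTLTTSS"
# Matches written by hand, as a search with at most one edit might report them.
matches = [Match(start=4, end=14, dist=1), Match(start=17, end=27, dist=0)]

rows = describe(matches, target, query)
for row in rows:
    print(row)
```

Recomputing the distance from the sliced text is just a sanity check on the
reported "dist" values; len(matches) is the number of matches.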

Best
Kevin


On 14 March 2014 10:53, Tal Einat <taleinat at gmail.com> wrote:

> On Fri, Mar 14, 2014 at 11:16 AM, Kevin Rue <kevin.rue at ucdconnect.ie>
> wrote:
> > >>> import fuzzysearch
> > >>> fuzzysearch.find_near_matches_with_ngrams("GGGTTLTTSS","XXXXXXXXXXXXXXXXXXXGGGTTVTTSSAAAAAAAAAAAAAGGGTTLTTSSAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBGGGTTLTTSS", 1)
> >
> > The output contains two matches:
> > Out[7]: [Match(start=89, end=99, dist=0), Match(start=89, end=99, dist=0)]
> >
> > BUG:
> > I noticed that the second match is reported twice. I assume this is a
> > bug in which the first match was somehow replaced by the second, which
> > is why I copied Tal (the developer of this package) on this email.
> >
> > Another example, where I added your sequence (with a mismatch) a third
> > time:
> >
> > >>> fuzzysearch.find_near_matches_with_ngrams("GGGTTLTTSS","XXXXXXXXXXXXXXXXXXXGGGTTVTTSSAAAAAAAAAAAAAGGGTTVTTSSAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBGGGTTLTTSS", 1)
> >
> > returns
> > Out[9]:
> > [Match(start=42, end=52, dist=1),
> >  Match(start=99, end=109, dist=0),
> >  Match(start=99, end=109, dist=0)]
> >
> > You can see three matches. One of the mismatched sequences was detected
> > correctly (edit distance of 1), but the bug seems to duplicate the last
> > match and overwrite the match before it with that duplicate.
> >
> > Tal, can you fix that? I will add the issue to your repository :)
>
> Thanks for bringing this to my attention! Fixed.
>
> Upgrade to version 0.2.1 and your example will work as expected.
>
> (To upgrade, run: pip install --upgrade fuzzysearch)
>
> - Tal Einat
>



-- 
Kévin RUE-ALBRECHT
Wellcome Trust Computational Infection Biology PhD Programme
University College Dublin
Ireland
http://fr.linkedin.com/pub/k%C3%A9vin-rue/28/a45/149/en



