[Biopython] allow ambiguities in sequence matching?
Christian Schaefer
schafer at rostlab.org
Fri Nov 20 16:55:58 UTC 2009
Hey Cedar,
I'm currently doing something similar on protein sequences. A simple
brute-force method could work like this:
Slide the short sequence 'underneath' the long sequence, one position
at a time. After each step, translate the current overlap into a
bit-string in which 1 indicates a match and 0 a mismatch. You can then
apply a regex to this bit-string to look for particular patterns, such
as 'at most n mismatches allowed'.
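A minimal sketch of the idea in plain Python, as one possible reading of
the description above. The function name find_with_mismatches and its
parameters are made up for illustration and are not part of Biopython.
It works on plain strings (for a SeqRecord you could pass
str(record.seq)) and only checks offsets where the short sequence lies
fully inside the long one:

```python
import re


def find_with_mismatches(query, target, max_mismatches):
    """Return the start offsets in target where query matches with at
    most max_mismatches mismatching positions (hypothetical helper)."""
    # A bit-string is accepted if it consists of 1s interrupted by at
    # most max_mismatches 0s.
    pattern = re.compile(r"1*(?:01*){0,%d}" % max_mismatches)
    hits = []
    # Slide the query along the target, one position at a time.
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        # 1 marks a match, 0 a mismatch, position by position.
        bits = "".join("1" if q == t else "0" for q, t in zip(query, window))
        if pattern.fullmatch(bits):
            hits.append(start)
    return hits


print(find_with_mismatches("ACGT", "TTACGATT", 1))
```

For a plain 'at most n mismatches' test, bits.count("0") <= n would do
the same job; the regex becomes useful once the pattern is more
specific, for example requiring the last few positions to match
exactly.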
Hope that helps.
Chris
Cedar McKay wrote:
> Hello all,
> Apologies if this is covered in the tutorial anywhere, if so I didn't
> see it.
>
> I am trying to test whether sequence A appears anywhere in sequence B.
> The catch is I want to allow n mismatches. Right now my code looks like:
>
> # record is a SeqRecord
> # query_seq is a string
> if query_seq in record.seq:
>     pass  # do something
>
>
> If I want query_seq to match despite n nucleotide mismatches, how should
> I do that? It seems like something that would be pretty common for
> people working with DNA probes. Is this even a biopython problem? Or is
> it just a general python problem?
>
> thanks,
> Cedar