[Bioperl-l] Allowing One error in Sequence matching

Mark A. Jensen maj at fortinbras.us
Wed Sep 16 22:33:00 UTC 2009


Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
MAJ
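
For the single-mismatch case in the quoted question, one option that avoids enumerating every variant is a pattern with one alternative per position, where that position is replaced by the class [ACGTN]. The sketch below is in Python for brevity and is only an illustration: the helper name one_mismatch_pattern is hypothetical, and this is not necessarily the approach described on the wiki page linked above. The same pattern string should work in a Perl regex.

```python
import re

def one_mismatch_pattern(query):
    """Build a regex matching `query` with at most one substituted base.

    Hypothetical helper: each alternative replaces exactly one position
    with the class [ACGTN], so the pattern has len(query) alternatives.
    """
    alternatives = [
        re.escape(query[:i]) + "[ACGTN]" + re.escape(query[i + 1:])
        for i in range(len(query))
    ]
    return re.compile("|".join(alternatives))

pattern = one_mismatch_pattern("AGCT")

# Keep only the tags that differ from AGCT at no more than one position.
tags = ["AGCT", "NGCT", "AGGT", "TTCT"]
hits = [tag for tag in tags if pattern.fullmatch(tag)]
```

The pattern grows linearly with the query length for one mismatch, but the number of alternatives climbs quickly if more mismatches are allowed, so this is mainly useful for the one-error case.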


>-----Original Message-----
>From: Abhishek Pratap [mailto:abhishek.vit at gmail.com]
>Sent: Wednesday, September 16, 2009 05:41 PM
>To: bioperl-l at lists.open-bio.org
>Subject: [Bioperl-l] Allowing One error in Sequence matching
>
>Hi All
>
>I am not able to think of a smart way to do sequence matching allowing
>a user-defined number of mismatches.
>
>For example:
>
>Given the sequence AGCT, a tag should be considered a match to the
>reference if at most one position (1, 2, 3, or 4) has a mismatch, where
>the mismatching base can be any of [ACGTN]. So the possible matches
>would be:
>
>This is for position 1.
>AGCT
>GGCT
>CGCT
>TGCT
>NGCT
>and likewise for each position.
>
>Is there a nice regular expression for this? One way that I could think
>of was to generate all the possible tags for a given sequence and then
>do the matching. That would be computationally expensive for a long
>dataset. Is there a neat method?
>
>Thanks,
>-Abhi
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l at lists.open-bio.org
>http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
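
For a user-defined number of mismatches, as asked in the quoted message, a plain alternative to regular expressions is to slide the query along the reference and count differing positions (the Hamming distance) in each window, giving up on a window as soon as the limit is exceeded. The following is a minimal Python sketch under stated assumptions: the function name find_with_mismatches and the example sequences are hypothetical, only substitutions are counted (no insertions or deletions), and N is treated as an ordinary character that mismatches any other base.

```python
def find_with_mismatches(reference, query, max_mismatches=1):
    """Return (start, mismatches) for each window of `reference`
    that differs from `query` at no more than `max_mismatches` positions."""
    hits = []
    width = len(query)
    for start in range(len(reference) - width + 1):
        mismatches = 0
        for ref_base, query_base in zip(reference[start:start + width], query):
            if ref_base != query_base:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many differences, abandon this window
        else:
            # Loop finished without break: the window is within the limit.
            hits.append((start, mismatches))
    return hits

# Made-up reference containing one exact and one single-mismatch occurrence.
result = find_with_mismatches("TTAGCTAAGGTC", "AGCT", max_mismatches=1)
```

Here result holds the 0-based start of each accepted window together with its mismatch count. The cost is proportional to reference length times query length in the worst case, and no list of variant tags has to be generated.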