[Bioperl-l] Allowing One error in Sequence matching

Wed Sep 16 22:30:50 UTC 2009

Hi Abhi -
Maybe Chris' scrap
http://www.bioperl.org/wiki/Tricking_the_perl_regex_engine_to_get_suboptimal_matches
is what you're after?
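
If a regex is not essential, a plain sliding-window mismatch count (Hamming
distance) avoids generating every variant of the tag. What follows is only a
minimal sketch, written in Python rather than Perl for brevity, and it is not
the technique from the wiki page above. The function names hamming_matches and
one_mismatch_regex are made up for illustration and are not part of BioPerl.
The second function builds the "one position may be any of [ACGTN]" pattern
from the question, assuming exactly that alphabet and at most one mismatch.

```python
import re


def hamming_matches(query, reference, max_mismatches=1):
    """Return (start, window, mismatches) for every window of `reference`
    that differs from `query` in at most `max_mismatches` positions."""
    n = len(query)
    hits = []
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        mismatches = 0
        for a, b in zip(query, window):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches, give up on this window
        else:
            # loop finished without break: window is within the limit
            hits.append((start, window, mismatches))
    return hits


def one_mismatch_regex(query, alphabet="ACGTN"):
    """Build a regex matching `query` with at most one position replaced
    by any character of `alphabet` (assumed alphabet, one mismatch only)."""
    wildcard = "[" + alphabet + "]"
    alternatives = [
        re.escape(query[:i]) + wildcard + re.escape(query[i + 1:])
        for i in range(len(query))
    ]
    # The lookahead lets finditer report overlapping hits as well.
    return re.compile("(?=(" + "|".join(alternatives) + "))")


def regex_matches(query, reference):
    """Return (start, matched_text) for every hit of the one-mismatch regex."""
    pattern = one_mismatch_regex(query)
    return [(m.start(), m.group(1)) for m in pattern.finditer(reference)]
```

Calling hamming_matches("AGCT", "TTAGCTAAGGTCC") reports the exact hit at
offset 2 and the one-mismatch hit AGGT at offset 7. The scan costs on the order
of len(reference) * len(query) comparisons whatever the mismatch limit, whereas
the regex grows by one alternative per position and only covers a single
mismatch.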
MAJ
----- Original Message ----- 
From: "Abhishek Pratap" <abhishek.vit at gmail.com>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, September 16, 2009 5:41 PM
Subject: [Bioperl-l] Allowing One error in Sequence matching


> Hi All
>
> I can't think of a smart way to do sequence matching that allows a
> user-defined number of mismatches.
>
> For example:
>
> Given the sequence AGCT, a reference should be considered a match if at
> most one base position (1, 2, 3 or 4) differs, where the differing base
> can be any of [ACGTN].
>
> So for position 1 the possible matches would be:
> AGCT
> GGCT
> CGCT
> TGCT
> NGCT
> and likewise for each position.
>
> Is there a nice regular expression for this? One way I could think of is
> to generate all the possible tags for a given sequence and then do the
> matching, but that would be computationally expensive for a large dataset.
> Is there a neater method?
>
> Thanks,
> -Abhi