[Bioperl-l] Pattern search with gap
Tamas Horvath
hotafin at gmail.com
Wed Jun 22 10:13:31 EDT 2005
Here's a much simpler approach:
#!/usr/bin/perl
use strict;
use warnings;
# Column ruler (1-based) for the characters of $seqstring:
#         10        20        30        40   45   50        60        7072
# 123456789012345678901234567890123456789012345678901234567890123456789012345678901
my $seqstring ="--------------------CAAAATAAATAGGTTATACAGAAACA---------------------AGATAAAAATTACA";
my $qseq = "CAAGATA";
my @qqq = split (//,$qseq);
my $pat = join('-*', @qqq);
my $pat_rege = qr/$pat/;
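# Stop if the pattern is absent; otherwise $` and $& are not set by this match.
$seqstring =~ /$pat_rege/ or die "Pattern not found\n";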
my $before = $`;
my $match_seq = $&;
my $before_length = length $before;
my $mseq_length = length $match_seq;
my $start = 1 + $before_length;
my $end = $before_length + $mseq_length;
print "Start:$start End:$end\n";
#Start:45 End:72
It should be quite fast. Try it out and let me know if it works well for you!
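
A possible variation, offered only as a sketch: the same gap-tolerant pattern can be wrapped in a sub that reads the match offsets from the built-in @- and @+ arrays instead of $` and $&. On the Perl versions current at the time of writing, mentioning $& or $` anywhere in a program slows down every regex match in it, which may matter with many large sequences. The sub name find_gapped is made up for illustration and is not part of BioPerl; the sequence is the same example, written with the x operator so the gap lengths (20 and 21) are explicit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): returns the 1-based start
# and end of the first occurrence of $query in $seq, allowing '-' gap
# characters between residues. Returns an empty list if there is no match.
sub find_gapped {
    my ($seq, $query) = @_;

    # Put '-*' between consecutive residues; quotemeta guards against
    # characters in the query that are special in a regex.
    my $pat = join '-*', map { quotemeta($_) } split //, $query;

    if ($seq =~ /$pat/) {
        # $-[0] is the 0-based offset where the match starts;
        # $+[0] is the 0-based offset just past the end of the match.
        return ($-[0] + 1, $+[0]);
    }
    return;
}

my $seqstring = ('-' x 20) . "CAAAATAAATAGGTTATACAGAAACA"
              . ('-' x 21) . "AGATAAAAATTACA";

my ($start, $end) = find_gapped($seqstring, "CAAGATA");
print "Start:$start End:$end\n" if defined $start;
# Start:45 End:72
```

Because the sub returns an empty list when nothing matches, the caller can tell a missing pattern apart from a real hit instead of printing stale positions.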
Hota
On 6/22/05, khoueiry <khoueiry at ibdm.univ-mrs.fr> wrote:
> Hello,
>
> I want to parse a gapped sequence and search for a pattern in it. What
> is important for me is to get the start and end positions of the pattern,
> taking gaps into account:
>
> i.e :
> my $seqstring =
> "--------------------CAAAATAAATAGGTTATACAGAAACA---------------------AGATAAAAATTACA";
> my $qseq = "CAAGATA";
>
> so the result should give me : start = 61 and End = 89
>
> I wrote a program to do that. It works well, but when working with very
> large sequences (and I have a lot of them), it takes a lot of time.
>
> In fact, my program parses the sequence with a sliding window equal to the
> length of the pattern.
>
> the while loop is attached :
>
> Any suggestion will be appreciated....
>
>
> Pierre
>
>
>
> --
> ==========================
> Pierre Khoueiry
> LGPD/IBDM
> Campus de Luminy, Case 907
> 13288 Marseille cedex 9, France
> Tel : +33 (0)4 91 82 94 18
> Fax : +33 (0)4 91 82 06 82
>
> ==========================