[Bioperl-l] Sequence matching problem!
Heikki Lehvaslaiho
heikki at sanbi.ac.za
Fri Feb 23 08:25:39 UTC 2007
Kurt,
There are a few things in your code to note:
- The regexp /C*T/ matches any T preceded by zero or more Cs,
so a bare T is enough. That is not what you meant: for a C, then
anything, then a T, you need /C.*T/.
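- $-[0] and $+[0] are elements of the special arrays @- and @+,
not functions. They work, but pos() and length() give you the
same information more readably. The really "expensive" variables
are $&, $` and $': using them once in your code can slow all
regexp matching down considerably, so they are worth not using
unless you have to. There is always another way.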
- Keep in mind what you want to use the match positions for:
Human-readable locations usually start counting at 1, but
Perl code uses 0 as the first position. The code below assumes
you want to print the locations out.
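To see the first point concretely, here is a minimal sketch (not part of
Kurt's program) that runs both patterns against the same sequence. The
helper name first_match is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text of the first match of $pattern
# in $seq, or undef when nothing matches.
sub first_match {
    my ($seq, $pattern) = @_;
    return $seq =~ /($pattern)/i ? $1 : undef;
}

my $seq = "GATCAAT";

# C*T: zero or more Cs followed by a T, so the lone T (third base) matches.
print "C*T  -> ", first_match($seq, 'C*T'), "\n";

# C.*T: a C, then any characters, then a T.
print "C.*T -> ", first_match($seq, 'C.*T'), "\n";
```

The first line should show just T, the second CAAT.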
Study my example code below.
Yours,
-Heikki
###################################################################
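#!/usr/bin/perl
use strict;
use warnings;

my $seq = "GATCAAT";
#my $pattern = 'C*T';   # matches a bare T
my $pattern = 'C.*T';   # C, then anything, then T

# /g in a while loop visits every match; /i ignores case
while ($seq =~ m/($pattern)/gi) {
    my $match = $1;
    my $end   = pos($seq);                    # offset just past the match
    my $start = $end - length($match) + 1;    # 1-based start
    print "$match : $start - $end\n";
}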
###################################################################
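If the positions are fed back into Perl (to substr(), for instance)
rather than printed, the 0-based start is the one to keep. Below is a
rough sketch that returns both conventions. The name match_positions and
the layout of its return value are invented here for illustration. The
last line also shows that a plain substring such as TC can be located
with index(), without any regexp.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect every match of $pattern in $seq.
# Each result is [matched text, 0-based start, 1-based start, 1-based end].
sub match_positions {
    my ($seq, $pattern) = @_;
    my @hits;
    while ($seq =~ /($pattern)/gi) {
        my $end    = pos($seq);            # offset just past the match
        my $start0 = $end - length($1);    # 0-based, usable with substr()
        push @hits, [$1, $start0, $start0 + 1, $end];
    }
    return @hits;
}

for my $hit (match_positions("GATCAAT", 'C.*T')) {
    my ($match, $start0, $start1, $end) = @$hit;
    print "$match : offset $start0, location $start1 - $end\n";
}

# Plain substring search: index() returns the 0-based offset of the
# first occurrence, or -1 if there is none.
print "TC found at offset ", index("GATCAAT", "TC"), "\n";
```

Note that pos() is a 0-based offset just past the match, which is the
same number as the 1-based position of the last matched character.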
On Thursday 22 February 2007 22:41:37 Kurt Gobain wrote:
> Hi every1..
> I m facing a great deal of problem in simple pattern matching between
> sequence & a pattern ..Program shod be designed such a way that it shod be
> able do two things 1) normal matching...For eg: GATCAAT....if TC is
> entered... output shod be 2...2) matching using spl character..In same
> example if C*T value is entered It shod give o/p as 3 & seq to b displayed
> is CAAT..I m easily getting 1st part...But in 2nd part Its giving sum
> problem..output I m gettin as 1 instead of 3...Code is really simple!
>
> #!/usr/bin/perl
> $alphabet = "GATCAAT";
> $pattern= "C*T ";
>
> $alphabet =~ /($pattern)/i;
>
> print "The entire '$pattern' match began at $-[0] and ended at $+[0]\n";
>
> ====================
> OUTPUT!
> The entire C*T match began at 1 and ended at 2
> ====================
>
> but the o/p shod be 3????
> & Is there n e chance I can get seq too..I mean instead of C*T'' i need
> 'CAAT'...????
>
> Well..Its not compulsion to use regex....But I find it quite simple..can
> there be n e other method??
>
> Thanx in advance!
> Kurt!
--
______ _/ _/_____________________________________________________
_/ _/
_/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za
_/_/_/_/_/ Associate Professor skype: heikki_lehvaslaiho
_/ _/ _/ SANBI, South African National Bioinformatics Institute
_/ _/ _/ University of Western Cape, South Africa
_/ Phone: +27 21 959 2096 FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________