[Bioperl-l] get regions

Steve Chervitz sac at bioperl.org
Tue May 15 01:46:55 UTC 2007


On 5/14/07, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:
> I do this in perl with the pos() function.  This requires the use of the
> match operator (m) like
>
> if ($gene =~ m/$pattern/gi)
> {
>         $start = pos($gene) - length($pattern) + 1;
> }
>
> pos() returns the location of the pointer where the regex left off after
> finding a match.

Cool. I hadn't known that was possible.
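
For anyone following along, here is a minimal sketch of that approach. The
sequence and pattern are made up for illustration. Note that pos() is only
set when the match uses /g in scalar context, and that it returns the
0-based offset just past the end of the match.

```perl
use strict;
use warnings;

# Hypothetical sequence and pattern, for illustration only.
my $gene    = "GGGTATAAACCC";
my $pattern = "TATA.A";    # fixed width: every match is 6 characters long

my $start;
if ($gene =~ m/$pattern/gi) {
    # pos() is the 0-based offset just past the end of the match, so
    # end - length + 1 gives the 1-based coordinate of the first base.
    $start = pos($gene) - length($pattern) + 1;
    print "match starts at $start\n";
}
```

This only works because the pattern has a fixed width, so length($pattern)
happens to equal the length of the matched text.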

> I remove the length of my pattern (which is just a
> string with a few placeholder (.) wildcards, so I know how long the
> match will always be).

To generalize your code so that it will work for any pattern, such as
one that can match strings of variable length like "A{5,10}", just
subtract the length of the actual string that was matched:

if ($gene =~ m/$pattern/gi)
{
    $start = pos($gene) - length($&) + 1;
}
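
If you want every match rather than just the first, one option (a sketch,
not tested against real data) is to loop with /g and read the offsets from
Perl's built-in @- and @+ arrays instead of $&, which can slow down regex
matching on older Perl versions. The sequence and the @hits array below are
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sequence and pattern, for illustration only.
my $gene    = "CCAAAAAAGGAAAAACC";
my $pattern = "A{5,10}";    # variable-length match

my @hits;
while ($gene =~ m/$pattern/gi) {
    my $start = $-[0] + 1;    # $-[0] is the 0-based offset where the match begins
    my $end   = $+[0];        # $+[0] is the offset just past the match (same as pos)
    push @hits, [$start, $end];
    print "matched '", substr($gene, $-[0], $+[0] - $-[0]), "' at $start..$end\n";
}
```

The coordinates are 1-based and inclusive, matching the "+ 1" convention
used above.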

Steve

> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org
> > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> > Jason Stajich
> > Sent: Monday, May 14, 2007 12:06 PM
> > To: Thiago Venancio
> > Cc: bioperl-l list
> > Subject: Re: [Bioperl-l] get regions
> >
> > I assume you are doing the matches on the string with =~ so
> > Bio::Seq doesn't really help you here I don't think.
> > See the $` variable in Perl for how to capture the position
> > of where a regexp matches.
> >
> > -jason
> > On May 14, 2007, at 11:54 AM, Thiago Venancio wrote:
> >
> > > Hi all,
> > >
> > > Using Bio::Seq, is there any easy way to get the
> > coordinates where a
> > > regular expression matches or should I build a sliding window?
> > >
> > > For example, looking for a given promoter region in a FASTA
> > file. If
> > > the region is found, I would like to recover exactly the
> > coordinates
> > > where it matches.
> > >
> > > Thanks in advance.
> > >
> > > Thiago
> > > --
> > > "Doubt is not a pleasant condition, but certainty is absurd."
> > >             Voltaire
> > >
> > > ========================
> > > Thiago Motta Venancio, MSc
> > > PhD student in Bioinformatics
> > > University of Sao Paulo
> > > ========================
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > --
> > Jason Stajich
> > jason at bioperl.org
> > http://jason.open-bio.org/