[Bioperl-l] Bio::Tools::SeqPattern
Mark, Terry
tmark@amgen.com
Mon, 17 Dec 2001 09:26:17 -0800
Hi all,
I recently took a look at this module. The documentation was pretty
sparse; as near as I can tell the module is supposed to generate strings to
be used with Perl's regexp facilities, and the syntax for the expressions is
thus exactly as with Perl. (I am a little uncertain about this, however, as
one of the POD examples has a pattern that contains '(GXX){3,2}', which
would generate an error if the corresponding regexp were run by itself in
Perl.)
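A minimal sketch for checking how Perl itself treats that quantifier. It assumes nothing about Bio::Tools::SeqPattern; the variable names are only illustrative. Depending on the Perl version, a minimum larger than the maximum may be reported as a fatal compile error or only as a warning, so the sketch captures both rather than assuming one:

```perl
use strict;
use warnings;

# The pattern text from the POD example; the minimum (3) exceeds the maximum (2).
my $re_text = '(GXX){3,2}';

my @warnings;
my $compiled = do {
    # Collect warnings raised while the regexp is compiled.
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    # eval traps a fatal compile error, if this Perl raises one.
    eval { qr/$re_text/ };
};
my $error = $@;

if (defined $compiled) {
    print "compiled without a fatal error\n";
}
else {
    print "failed to compile: $error";
}
print "warning: $_" for @warnings;
```

Either outcome suggests the POD example is not meant to be read as a literal Perl regexp.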
Anyway, I have found what appear to be bugs in the module.
Consider the following stub of code:
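use Bio::Tools::SeqPattern;
# Every base is wrapped in a (degenerate) one-letter character class.
my $pattern = '[C][C][C].{0,13}A';
print "original pattern is '$pattern'\n";
my $bioPat = Bio::Tools::SeqPattern->new(-SEQ  => $pattern,
                                         -TYPE => 'Dna');
print "forward: \n";
print $bioPat->expand . "\n";
# Reverse complement without expanding the pattern first.
print "reverse\n";
print $bioPat->revcom()->expand . "\n";
# Reverse complement with the pattern expanded first.
print "reverse (expanded)\n";
print $bioPat->revcom(1)->expand . "\n";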
Which generates the output:
[tmark@xena scripts]$ perl test.pl
original pattern is '[C][C][C].{0,13}A'
forward:
CCC.{0,13}A
reverse
T.G{0,13}GG
reverse (expanded)
T.{0,13}GGG
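Note that the plain reverse-complement output ('T.G{0,13}GG') differs from
the expanded-before-reverse-complement output ('T.{0,13}GGG'), even though
the underlying pattern contains no ambiguity codes, in which case, if I
understand correctly, both expanded strings should be the same. In the first
output the {0,13} quantifier has ended up attached to a G instead of the '.'.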
The problem seems to stem from the (here degenerate) use of character
classes, as the plain pattern 'CCC.{0,13}A' seems OK.
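To make the expected result concrete, here is a rough sketch of reverse-complementing such a pattern by hand. The revcom_pattern helper is hypothetical and is not part of Bio::Tools::SeqPattern. It assumes the pattern consists only of single characters or bracketed classes, each optionally followed by a {n}, {n,} or {n,m} quantifier, and it complements only the plain bases A, C, G and T (no ambiguity codes, no grouping):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Tools::SeqPattern.
sub revcom_pattern {
    my ($pattern) = @_;
    my @atoms;

    # Walk the pattern one atom at a time: a bracketed class or a single
    # character, together with any quantifier that follows it.
    while ($pattern =~ /\G( \[ [^\]]+ \] | [^\[\]{}] ) ( \{ \d+ (?: ,\d* )? \} )?/gcx) {
        my ($body, $quant) = ($1, $2 // '');
        # Complement the plain bases; '.' and brackets are left untouched.
        $body =~ tr/ACGTacgt/TGCAtgca/;
        # Prepending reverses the atom order while keeping each
        # quantifier attached to its own atom.
        unshift @atoms, $body . $quant;
    }

    # Refuse anything this simple tokenizer did not fully consume.
    die "cannot parse pattern '$pattern'\n"
        if (pos($pattern) // 0) != length $pattern;

    return join '', @atoms;
}

print revcom_pattern('[C][C][C].{0,13}A'), "\n";   # T.{0,13}[G][G][G]
print revcom_pattern('CCC.{0,13}A'), "\n";         # T.{0,13}GGG
```

Because the quantifier travels with its atom, the bracketed and unbracketed forms give equivalent results here, which is what I would expect from the module as well.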
Has anybody experienced problems with this module? I'm using 0.7.2.
Thanks,
terry