[Bioperl-l] seq_word and pattern counts
Torsten Seemann
torsten.seemann at infotech.monash.edu.au
Tue Feb 28 22:47:01 UTC 2006
Staffa, Nick (NIH/NIEHS) [C] wrote:
> The real problem is this:
> We want to count sites in a long sequence where a restriction enzyme would cut.
> This restriction enzyme, in the example I gave will recognize GGnnCC,
> that is two G separated by two of any bases followed by two C.
> The GCG program findpatterns will do this, but bioperl makes certain statistics easy.
> I'm sure there is some module somewhere for this purpose.
(Nick - please respond to me AND the bioperl-l at bioperl.org mailing list,
i.e. "Reply All", so others can benefit from the Q&A. I've already re-sent
your past responses.)
Perhaps this module?
http://doc.bioperl.org/bioperl-live/Bio/Tools/RestrictionEnzyme.html
With this code?
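use Bio::Tools::RestrictionEnzyme;

my $enz = "GGNNCC";
my $re = Bio::Tools::RestrictionEnzyme->new(-NAME => "NicksResEnz--$enz",
                                            -MAKE => 'custom');
# $seqobj is assumed to be a Bio::Seq object you have already loaded.
my @fragments = $re->cut_seq($seqobj);
# A linear sequence cut at n sites yields n + 1 fragments.
my $cuts = @fragments ? @fragments - 1 : 0;
print "$enz cuts ", $seqobj->display_id, " $cuts times.\n";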
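
If only the number of sites is needed, a plain regular expression is another option. The following is a minimal sketch in core Perl, not a BioPerl API: count_sites is a hypothetical helper name, $dna is a made-up test string, and N in the pattern is assumed to mean any of A, C, G or T. GGNNCC is its own reverse complement, so scanning one strand should be enough for this particular pattern.

```perl
use strict;
use warnings;

# Count occurrences of a pattern in which N stands for any base.
# count_sites is a hypothetical helper, not part of BioPerl.
sub count_sites {
    my ($dna, $pattern) = @_;
    # Turn GGNNCC into the regex GG[ACGT][ACGT]CC.
    (my $regex = uc $pattern) =~ s/N/[ACGT]/g;
    # The lookahead matches without consuming characters,
    # so overlapping sites are counted too.
    my $count = () = uc($dna) =~ /(?=$regex)/g;
    return $count;
}

my $dna = "AAGGATCCTTGGCCCCAA";   # illustrative sequence only
print "GGNNCC occurs ", count_sites($dna, "GGNNCC"), " times.\n";
```

Note that ambiguous bases in the sequence itself (for example a literal N) are not matched by this sketch; if you also need fragment positions or sizes, the module approach is probably the better fit.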
--
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/
Phone: +61 3 9905 9010