[Bioperl-l] Re: Motif finding

Boris Lenhard Boris.Lenhard@cgb.ki.se
20 Feb 2002 19:41:19 +0100

> Is there a way of finding motifs in a strand of DNA?
> The thing is, I want to find exact matches and some fuzzy ones (e.g. 80% exact). Is
> there a Perl module to do it?
> Thanks.
> Desmond

Try TFBS at http://forkhead.cgb.ki.se/TFBS/ .

But please wait a few hours; I am uploading the 0.3 release tonight.

TFBS represents DNA patterns as matrices, and it also has modules for
converting a set of DNA motifs to a matrix representation. In your case,
if you have, e.g., the motif "ACATTAGATTT", you would do:

   my $patterngen =

   my $frequency_matrix = $patterngen->pattern;
   my $weight_matrix = $frequency_matrix->to_PWM;

       # suppose you want to scan a sequence in a Bio::Seq object 
       # called $seqobj, with 80% score threshold

   my $binding_site_set = $weight_matrix->search_seq(-seqobj=>$seqobj,

       # to loop through the $binding_site_set, do

   my $iterator = $binding_site_set->iterator;
   while (my $binding_site = $iterator->next) {
	# do whatever you want with $binding_site;
	# $binding_site is a TFBS::Site object,
	# which is a subclass of Bio::SeqFeature::Generic
	# and has all its functionality
   }

There are other ways to go, too.
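
To illustrate the idea behind matrix scanning with a percentage threshold, here is a minimal plain-Perl sketch that does not use TFBS at all. The helper names (count_matrix, weight_matrix, scan) are made up for this example and are not part of the TFBS API. The uniform 0.25 background, the pseudocount of 1, and the reading of "80%" as a relative score, (score - min) / (max - min), are assumptions of the sketch rather than a description of what TFBS does internally. It only scans the forward strand.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; this is NOT the TFBS API.

# Build a position count matrix from one or more equal-length motif instances.
sub count_matrix {
    my @motifs = @_;
    my $len    = length $motifs[0];
    my @counts = map { +{ A => 0, C => 0, G => 0, T => 0 } } 1 .. $len;
    for my $motif (@motifs) {
        my @bases = split //, uc $motif;
        $counts[$_]{ $bases[$_] }++ for 0 .. $len - 1;
    }
    return \@counts;
}

# Convert counts to log2-odds weights. Assumptions: uniform 0.25 background,
# pseudocount (default 1) spread evenly over the four bases.
sub weight_matrix {
    my ( $counts, $pseudo ) = @_;
    $pseudo = 1 unless defined $pseudo;
    my @weights;
    for my $col (@$counts) {
        my $total = $col->{A} + $col->{C} + $col->{G} + $col->{T} + $pseudo;
        my %w;
        for my $base (qw(A C G T)) {
            my $p = ( $col->{$base} + $pseudo / 4 ) / $total;
            $w{$base} = log( $p / 0.25 ) / log(2);
        }
        push @weights, \%w;
    }
    return \@weights;
}

# Scan the forward strand and keep windows whose relative score
# (score - min) / (max - min) reaches $threshold (0.8 stands for "80%").
sub scan {
    my ( $weights, $seq, $threshold ) = @_;
    my ( $min, $max ) = ( 0, 0 );
    for my $col (@$weights) {
        my @v = sort { $a <=> $b } values %$col;
        $min += $v[0];
        $max += $v[-1];
    }
    my $len = @$weights;
    my @hits;
    $seq = uc $seq;
    for my $start ( 0 .. length($seq) - $len ) {
        my $window = substr $seq, $start, $len;
        next if $window =~ /[^ACGT]/;    # skip windows with ambiguous bases
        my $score = 0;
        $score += $weights->[$_]{ substr( $window, $_, 1 ) } for 0 .. $len - 1;
        my $rel = ( $score - $min ) / ( $max - $min );
        push @hits, { start => $start + 1, site => $window, rel => $rel }
            if $rel >= $threshold;
    }
    return @hits;
}

# Example: one motif instance, an exact copy and a one-mismatch copy in the target.
my $pwm  = weight_matrix( count_matrix('ACATTAGATTT') );
my @hits = scan( $pwm, 'GGACATTAGATTTCCACATTCGATTTGG', 0.8 );
printf "%d %s %.2f\n", @{$_}{qw(start site rel)} for @hits;
```

With a matrix built from a single motif instance, the relative score reduces to the fraction of matching positions, so the exact copy at position 3 scores 1.00 and the one-mismatch copy at position 16 scores 0.91; both pass the 0.8 cutoff. A matrix built from several aligned instances would weight positions differently.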



 Boris Lenhard, Ph.D.
 Center for Genomics and Bioinformatics
 Karolinska Institutet
 Berzelius väg 35, B322
 171 77 Stockholm, SWEDEN
 Phone: +46 (0)8 728 6142
 FAX: +46 (0)8 32 48 26
 E-mail: Boris.Lenhard@cgb.ki.se