[Bioperl-l] Sequence search

Heikki Lehvaslaiho heikki at ebi.ac.uk
Tue Mar 2 08:04:59 EST 2004


Pierre,

The closest we have is Bio::Tools::SeqPattern, but it does not do the 
pattern matching. That could be added once we decide exactly what to 
implement and how generic it should be.  Ideas, anyone?


The code below shows one simple approach to the problem. More advanced 
algorithms exist, but this one is simple to work with if you want to go 
ahead and solve your immediate problem.

Yours,
		-Heikki

##############################################
#!/usr/bin/perl -w

use strict;

my $s= "abjabjabjbkakjkjbkajkjanakjnakjnakjnakjnakjaankjankajakjna";

my $verbose = 0;

my $length = 7;
my $pattern = '[ak]';
my $threshold = 5;


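my $offset = 0;
while (1) {
    # Take the window of $length characters starting at $offset.
    my $subs = substr $s, $offset, $length;
    last unless length($subs) == $length;

    # Count the characters matching $pattern. tr/// does not interpolate
    # variables, so use a global match in list context instead.
    my $c = () = $subs =~ /$pattern/g;

    print "\t", $subs, " $c\n" if $verbose;
    print $subs, " $c\n" if $c >= $threshold;

    $offset++;
}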
##############################################
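
Applied to the question quoted below (a window of 12 residues with at least 
7 valines plus leucines), the same idea might look like the sketch that 
follows. The subroutine name rich_windows and the test sequence are made up 
for illustration; nothing here is part of BioPerl, and the character class 
'[VL]' assumes an upper-case one-letter protein sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl. Returns one array ref
# [offset, window, count] for every window of $length characters that
# holds at least $threshold characters matching the class $class.
sub rich_windows {
    my ($seq, $class, $length, $threshold) = @_;
    my @hits;
    # The range is empty when the sequence is shorter than the window.
    for my $offset (0 .. length($seq) - $length) {
        my $window = substr $seq, $offset, $length;
        # A global match in list context yields the number of matches.
        my $count = () = $window =~ /$class/g;
        push @hits, [$offset, $window, $count] if $count >= $threshold;
    }
    return @hits;
}

# Made-up sequence, for illustration only.
my $protein = "MKTAYIAKQRVLLVLAVLGVLSDEQ";
for my $hit (rich_windows($protein, '[VL]', 12, 7)) {
    # Report 1-based start positions, as is usual for protein sequences.
    printf "%d\t%s\t%d\n", $hit->[0] + 1, $hit->[1], $hit->[2];
}
```

Overlapping windows are reported separately, so a long V/L-rich stretch 
shows up as a run of consecutive hits; merging those into regions is left 
out to keep the sketch short.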

On Tuesday 02 Mar 2004 11:00, KHOUEIRY pierre wrote:
> Hi all,
> I'm searching in bioperl for methods that can detect, in a protein
> sequence, a subsequence rich in specific amino acids. In other words, I
> want to find, for example, whether there is a subsequence of 12 aa
> (sliding window) that contains at least 7 (valine + leucine) in a given
> sequence of 400 aa. I need to advance amino acid by amino acid using my
> sliding window.
> I appreciate any help,
> Thanks

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

