[Bioperl-l] Sequence search
Heikki Lehvaslaiho
heikki at ebi.ac.uk
Tue Mar 2 08:04:59 EST 2004
Pierre,
The closest we have is Bio::Tools::SeqPattern, but it does not do the
pattern matching itself. That could be added once we decide exactly what to
implement and how generic it should be. Ideas, anyone?
The code below shows one simple approach to the problem. More advanced
algorithms exist, but this one is simple to work with if you want to go ahead
and solve your immediate problem.
Yours,
-Heikki
##############################################
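#!/usr/bin/perl -w
use strict;
# Slide a fixed-length window along the string one position at a time and
# report every window with at least $threshold characters matching $pattern.
my $s = "abjabjabjbkakjkjbkajkjanakjnakjnakjnakjnakjaankjankajakjna";
my $verbose = 0;        # set to 1 to print every window with its count
my $length = 7;         # window length
my $pattern = '[ak]';   # character class to count
my $threshold = 5;      # minimum number of matches needed to report a window
my $offset = 0;
while (1) {
    my $subs = substr $s, $offset, $length;
    last unless length($subs) == $length;   # stop at the first short window
    # tr/// does not interpolate variables, so count regex matches instead
    my $c = () = $subs =~ /$pattern/g;
    print "\t", $subs, " $c\n" if $verbose;
    print $subs, " $c\n" if $c >= $threshold;
    $offset++;
}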
##############################################
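For the case in your question (a 12 aa window with at least 7 valine +
leucine), the same idea could be wrapped in a subroutine roughly like the
sketch below. Note that rich_windows is a name made up for this example and
not a BioPerl method, and the test sequence is invented. The residue list is
interpolated into a character class, so it is assumed to hold plain letters
only (no "^", "]" or "-").

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return [start, window, count]
# for every window of $length residues that holds at least $threshold
# residues from the set $residues (e.g. 'VL'). Start positions are 1-based.
sub rich_windows {
    my ($seq, $length, $residues, $threshold) = @_;
    my @hits;
    # The range is empty when the sequence is shorter than the window.
    for my $offset (0 .. length($seq) - $length) {
        my $window = substr $seq, $offset, $length;
        # Count matches by forcing the global match into list context.
        my $count = () = $window =~ /[$residues]/g;
        push @hits, [$offset + 1, $window, $count] if $count >= $threshold;
    }
    return @hits;
}

# Made-up sequence, for illustration only.
my $protein = "MKTAYIAKQRLVLVLLGVAVLSSPEDKQ";
for my $hit (rich_windows($protein, 12, 'VL', 7)) {
    printf "%d\t%s\t%d\n", @$hit;
}
```

With a real Bio::Seq object you would pass $seqobj->seq as the first
argument. Re-counting each window from scratch is wasteful but harmless for
a 400 aa sequence.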
On Tuesday 02 Mar 2004 11:00, KHOUEIRY pierre wrote:
> Hi all,
> I'm looking in bioperl for methods that can detect a subsequence of a
> protein sequence that is rich in particular amino acids. In other words, I
> want to find, for example, whether a given sequence of 400 aa contains a
> subsequence of 12 aa (sliding window) with at least 7 valine + leucine
> residues. I need to advance amino acid by amino acid with my sliding window.
> I appreciate any help,
> Thanks
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________