[Bioperl-l] K-mer generating script
Dave Messina
David.Messina at sbc.su.se
Sat Dec 20 01:11:00 UTC 2008
Hi Marco,
Here's some code to generate and print all possible k-mers (n-mers) of a
given length. I'm really just using the module Math::Combinatorics to do all
the dirty work here, so it probably won't be as fast as a custom recursive
function like the one you suggest, but it gets the job done.
See also Bio::Tools::SeqWords and Bio::Tools::SeqStats for related goodies.
Dave
-------------- example code --------------
#!/usr/local/bin/perl
use strict;
use warnings;
use Math::Combinatorics;
# do all codons (3-mers) as an example
generate_possible_kmers(3);
=head2 generate_possible_kmers

 Title   : generate_possible_kmers
 Usage   : generate_possible_kmers($n)
 Function: create and print the list of possible DNA kmers
 Returns : none (the kmers are printed to STDOUT)
 Args    : n - the length of the desired 'mer'

=cut
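sub generate_possible_kmers {
    my ($n) = @_;
    my $alphabet      = [ qw( A C G T ) ];
    my $words_per_row = 10;
    my $i             = 0;

    # outer iterator: each multiset of $n letters (order ignored);
    # every letter may be used up to $n times
    my $o = Math::Combinatorics->new( count=>$n, data=>$alphabet,
                                      frequency=>[$n,$n,$n,$n] );
    while ( my @x = $o->next_multiset ) {
        # inner iterator: each ordering of the letters in that multiset
        my $p = Math::Combinatorics->new( data=>\@x , frequency=>[map{1} @x] );
        while ( my @y = $p->next_string ) {
            print join('', @y), ' ';
            $i++;
            if (($i % $words_per_row) == 0) { print "\n"; }
        }
    }
    # terminate the last row if it is only partially filled
    print "\n" if $i % $words_per_row;
}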
----------------- end code -----------------
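For comparison, the custom recursive approach mentioned above could look
something like the sketch below. It is only an illustration, not code from
the original thread: the name recursive_kmers and its argument order are
made up here, and it uses plain Perl with no modules. The idea is that every
k-mer is one letter followed by a (k-1)-mer, with the empty word as the base
case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# recursive_kmers is a hypothetical helper name (not part of BioPerl or
# Math::Combinatorics). It returns every word of length $k over @alphabet.
sub recursive_kmers {
    my ($k, @alphabet) = @_;

    # base case: there is exactly one word of length 0, the empty string
    return ('') if $k <= 0;

    # build all (k-1)-mers once, then prefix each with every letter
    my @shorter = recursive_kmers($k - 1, @alphabet);
    my @kmers;
    for my $letter (@alphabet) {
        push @kmers, map { $letter . $_ } @shorter;
    }
    return @kmers;
}

# all codons (3-mers) as an example
my @codons = recursive_kmers(3, qw(A C G T));
print scalar(@codons), " 3-mers, from $codons[0] to $codons[-1]\n";
```

Run as is, this prints "64 3-mers, from AAA to TTT". The whole list is held
in memory, which is fine for small k but grows as 4**k for DNA, so for long
words an iterator that emits one k-mer at a time would be the safer design.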