[Bioperl-l] K-mer generating script

Chris Fields cjfields at illinois.edu
Sat Dec 20 05:53:59 UTC 2008


To add to the pile:

Mark-Jason Dominus tackles this problem in Higher-Order Perl using
iterators, which also allow other nifty bits like 'give variants of
A(CTG)T(TGA)', where anything in parentheses is a wildcard.  The nice
advantage of the iterator approach is that you produce one string at a
time instead of holding the whole list in memory, which matters once k
gets large.  As a bonus, you can now download the book for free:

http://hop.perl.plover.com/book/

The relevant chapter is here (p. 135):

http://hop.perl.plover.com/book/pdf/04Iterators.pdf
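
To make the idea concrete, here is a rough sketch of what such an
iterator could look like. This is my own illustration, not the code from
the book, and the names choice_iterator, kmer_iterator and
pattern_iterator are made up for this example. Each call returns the next
string, or undef when there are none left, so only one string exists at
a time:

```perl
use strict;
use warnings;

# Hypothetical helper: takes one array ref of symbols per position and
# returns a closure that yields the next combination on each call, or
# undef once all combinations have been produced.
sub choice_iterator {
    my @positions = @_;
    my @idx  = (0) x @positions;            # odometer: one index per position
    my $done = grep { !@$_ } @positions;    # an empty position means no output
    return sub {
        return if $done;
        my $str = join '', map { $positions[$_][ $idx[$_] ] } 0 .. $#positions;
        # Advance the odometer, rightmost position first.
        my $i = $#positions;
        while ($i >= 0) {
            last if ++$idx[$i] < @{ $positions[$i] };
            $idx[$i--] = 0;                 # this position wrapped; carry left
        }
        $done = 1 if $i < 0;                # every position wrapped: finished
        return $str;
    };
}

# All k-mers over an alphabet: the same symbols at each of the k positions.
sub kmer_iterator {
    my ($k, @alphabet) = @_;
    return choice_iterator( map { [@alphabet] } 1 .. $k );
}

# Variants of a pattern such as 'A(CTG)T(TGA)': a bare letter is fixed,
# a parenthesized group lists the alternatives for one position.
sub pattern_iterator {
    my ($pattern) = @_;
    my @positions;
    while ($pattern =~ /\G(?:\(([^()]+)\)|([^()]))/gc) {
        push @positions, [ split //, defined($1) ? $1 : $2 ];
    }
    die "Malformed pattern '$pattern'\n"
        if (pos($pattern) || 0) != length $pattern;
    return choice_iterator(@positions);
}

# Usage: print the variants of the pattern, one per line.
my $it = pattern_iterator('A(CTG)T(TGA)');
while (defined(my $variant = $it->())) {
    print "$variant\n";
}
```

Swapping in kmer_iterator(3, qw(A T G C)) walks through the 64 3-mers in
the same way.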

chris

On Dec 19, 2008, at 11:02 PM, Jason Stajich wrote:

> Does someone want to put this on the wiki too?
>
> Maybe we could start a little bit of perl snippets for examples like  
> these.
>
> -j
> On Dec 19, 2008, at 7:45 PM, Mark A. Jensen wrote:
>
>> A little sloppy, but it recurses and is general---
>>
>> # ex...
>> my @combs = doit(3, [ qw( A T G C ) ]);
>> 1;
>> # code
>>
>> # Return every string of length $n over the symbols in @$sym.
>> sub doit {
>>  my ($n, $sym) = @_;
>>  my $store = [];
>>  doit_guts($n, $sym, $store, '');
>>  return @$store;
>> }
>>
>> # Extend $str by one symbol per level; store it once $n reaches 0.
>> sub doit_guts {
>>  my ($n, $sym, $store, $str) = @_;
>>  if (!$n) {
>>   push @$store, $str;
>>   return;
>>  }
>>  foreach my $s (@$sym) {
>>   doit_guts($n-1, $sym, $store, $str.$s);
>>  }
>>  return;
>> }
>>
>>
>> ----- Original Message ----- From: "Blanchette, Marco" <MAB at stowers-institute.org>
>> To: <bioperl-l at lists.open-bio.org>
>> Sent: Friday, December 19, 2008 6:25 PM
>> Subject: [Bioperl-l] K-mer generating script
>>
>>
>>> Dear all,
>>>
>>> Does anyone have a little function that I could use to generate
>>> all possible k-mer DNA sequences? For instance, all possible 3-mers
>>> (AAA, AAT, AAC, AAG, etc.). I need something that takes the value
>>> of k as input and returns all possible sequences.
>>>
>>> I know that it's a problem that needs recursive programming,
>>> but I can't get my brain around the problem.
>>>
>>> Many thanks
>>>
>>> Marco
>>> --
>>> Marco Blanchette, Ph.D.
>>> Assistant Investigator
>>> Stowers Institute for Medical Research
>>> 1000 East 50th St.
>>>
>>> Kansas City, MO 64110
>>>
>>> Tel: 816-926-4071
>>> Cell: 816-726-8419
>>> Fax: 816-926-2018
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>
> Jason Stajich
> jason at bioperl.org
>



