[Bioperl-l] K-mer generating script

Cook, Malcolm MEC at stowers.org
Mon Jan 5 19:14:42 UTC 2009


Gang,

I couldn't resist adding the following non-perl solution...

#!/bin/bash
# Usage: pass k as the first argument; prints all 4^k k-mers, space-separated
k=$1
s=$( printf "%${k}s" )   # a string of $k blanks
s=${s// /{A,T,G,C\}}     # substitute '{A,T,G,C}' for each of the k blanks
echo "kmers using bash to expand: $s" >&2
bash -c "echo $s"        # let brace expansion in an inferior bash compute the cross product
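
For anyone who would rather call this as a function, here is a minimal sketch of the same trick. The function name `kmers`, the argument check, and the one-per-line output are my own assumptions for illustration, and `eval` stands in for the inferior bash, so treat it as a sketch rather than a drop-in replacement.

```shell
#!/bin/bash
# kmers K: print all 4^K DNA k-mers, one per line.
# (Hypothetical helper; the name and the validation are illustrative.)
kmers() {
    local k=$1 s alt='{A,T,G,C}'
    # Accept only a positive integer, because the built string is eval'ed below.
    [[ $k =~ ^[1-9][0-9]*$ ]] || { echo "usage: kmers K (positive integer)" >&2; return 1; }
    printf -v s "%${k}s" ''       # s = K blanks
    s=${s// /$alt}                # each blank becomes '{A,T,G,C}'
    eval "printf '%s\n' $s"       # brace expansion computes the cross product
}

kmers 2
```

The leftmost brace group varies slowest, so `kmers 2` starts at AA and ends at CC. Keep in mind that the whole list is expanded in memory before anything is printed, and its size grows as 4^k.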

-- Malcolm


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Chris Fields
> Sent: Friday, December 19, 2008 11:54 PM
> To: Jason Stajich
> Cc: bioperl list; Mark A. Jensen; Blanchette, Marco
> Subject: Re: [Bioperl-l] K-mer generating script
>
> To add to the pile:
>
> Mark-Jason Dominus tackles this problem in Higher-Order Perl
> using iterators, which also allow other nifty bits like
> 'give variants of A(CTG)T(TGA)', where each parenthesized
> group is a wild-card matching any one of the listed letters.
> The nice advantage of the iterator approach is you don't
> tank memory for long strings.
> Furthermore, as a bonus, you can now download the book for
> free:
>
> http://hop.perl.plover.com/book/
>
> The relevant chapter is here (p. 135):
>
> http://hop.perl.plover.com/book/pdf/04Iterators.pdf
>
> chris
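
The iterator idea can be approximated in the shell too. What follows is a rough sketch of my own, not the code from Higher-Order Perl: it treats the pattern as an odometer, decodes each index into one variant, and prints it straight away, so only the current string is held in memory. The names `parse_pattern`, `variants`, and `choices` are made up for this example, and the pattern is assumed to contain only letters and well-formed, non-nested parentheses.

```shell
#!/bin/bash
# Stream every variant of a pattern such as 'A(CTG)T(TGA)', one per line.

# Split the pattern into one string of allowed letters per position.
# Result goes into the global array 'choices'.
parse_pattern() {
    local pat=$1 group='' inside=0 c i
    choices=()
    for (( i = 0; i < ${#pat}; i++ )); do
        c=${pat:i:1}
        case $c in
            '(') inside=1; group='' ;;
            ')') inside=0; choices+=("$group") ;;
            *)   if (( inside )); then group+=$c; else choices+=("$c"); fi ;;
        esac
    done
}

# Count from 0 to (number of variants - 1) and decode each index as a
# mixed-radix number: the rightmost position changes fastest.
variants() {
    parse_pattern "$1"
    local n=${#choices[@]} total=1 idx rem p out set len
    for set in "${choices[@]}"; do total=$(( total * ${#set} )); done
    for (( idx = 0; idx < total; idx++ )); do
        rem=$idx; out=''
        for (( p = n - 1; p >= 0; p-- )); do
            set=${choices[p]}; len=${#set}
            out=${set:rem % len:1}$out
            rem=$(( rem / len ))
        done
        printf '%s\n' "$out"
    done
}

variants 'A(CTG)T(TGA)'
```

Plain k-mers fall out as a special case: `variants '(ATGC)(ATGC)(ATGC)'` streams all 64 3-mers.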
>
> On Dec 19, 2008, at 11:02 PM, Jason Stajich wrote:
>
> > Does someone want to put this on the wiki too?
> >
> > Maybe we could start a small collection of Perl snippets for
> > examples like these.
> >
> > -j
> > On Dec 19, 2008, at 7:45 PM, Mark A. Jensen wrote:
> >
> >> A little sloppy, but it recurses and is general---
> >>
> >> # ex...
> >> @combs = doit(3, [ qw( A T G C ) ]);
> >> 1;
> >> # code
> >>
> >> sub doit {
> >>   my ($n, $sym) = @_;
> >>   my $store = [];                  # collects the finished strings
> >>   doit_guts($n, $sym, $store, '');
> >>   return @$store;
> >> }
> >>
> >> sub doit_guts {
> >>   my ($n, $sym, $store, $str) = @_;
> >>   if (!$n) {                       # nothing left to add: record the string
> >>     push @$store, $str;
> >>     return;
> >>   }
> >>   foreach my $s (@$sym) {          # extend $str by each symbol and recurse
> >>     doit_guts($n-1, $sym, $store, $str.$s);
> >>   }
> >> }
> >>
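
For comparison, the same recursion is short enough to write in bash. This is only a sketch under my own naming (`kmers_rec` is hypothetical): it takes the remaining length, the prefix built so far, and the symbols, and prints each finished string instead of storing it.

```shell
#!/bin/bash
# kmers_rec N PREFIX SYMBOL...
# Print every string formed by appending N symbols to PREFIX, one per line.
kmers_rec() {
    local n=$1 prefix=$2 s
    shift 2                       # "$@" now holds only the symbols
    if (( n == 0 )); then         # nothing left to add: emit the string
        printf '%s\n' "$prefix"
        return
    fi
    for s in "$@"; do             # extend the prefix by each symbol and recurse
        kmers_rec $(( n - 1 )) "$prefix$s" "$@"
    done
}

kmers_rec 2 '' A T G C
```

Because each string is printed as soon as it is complete, any alphabet works, including symbols such as 0 that would be false in a Perl truth test.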
> >>
> >> ----- Original Message -----
> >> From: "Blanchette, Marco" <MAB at stowers-institute.org>
> >> To: <bioperl-l at lists.open-bio.org>
> >> Sent: Friday, December 19, 2008 6:25 PM
> >> Subject: [Bioperl-l] K-mer generating script
> >>
> >>
> >>> Dear all,
> >>>
> >>> Does anyone have a little function that I could use to generate
> >>> all possible k-mer DNA sequences? For instance, all possible
> >>> 3-mers (AAA, AAT, AAC, AAG, etc.). I need something to which I can
> >>> pass the value of k and get back all possible sequences.
> >>>
> >>> I know that it's a problem that needs recursive programming,
> >>> but I can't get my brain around the problem.
> >>>
> >>> Many thanks
> >>>
> >>> Marco
> >>> --
> >>> Marco Blanchette, Ph.D.
> >>> Assistant Investigator
> >>> Stowers Institute for Medical Research 1000 East 50th St.
> >>>
> >>> Kansas City, MO 64110
> >>>
> >>> Tel: 816-926-4071
> >>> Cell: 816-726-8419
> >>> Fax: 816-926-2018
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>
> >
> > Jason Stajich
> > jason at bioperl.org
> >
> >
> >
>
>



