Bioperl: making a random sequence

Andreas Matern alm13@cornell.edu
Wed, 31 May 2000 17:45:19 -0400


Paul Gordon wrote:

> I seem to remember someone on the mailing list talking about writing a
> module to generate random sequences with base composition bias, but I
> could be wrong...

I couldn't find that in the archives.  Was anyone successful?

I've been asked to do something similar (although I find the biological
question questionable <grin>) and I was about to use the example in
Mastering Algorithms with Perl (pg 582 "Loaded Dice and Candy Colors:
Nonuniform Discrete Distributions").

If someone has already done this, however, I'd love to give it a
test-drive...
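
In the meantime, here is a minimal sketch of the cumulative-threshold
approach I have in mind.  It is not the book's code and not anything in
Bioperl: the name random_dna and the example frequencies are made up for
illustration.  It uses strict < style comparisons (rand() returns a value
in [0,1)) and collects the sequence in a string before printing, per
Paul's suggestions below.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): build a random DNA string of
# $length bases, drawing each base according to the weights in %$freq.
sub random_dna {
    my ($length, $freq) = @_;

    # Sort the bases so the cumulative table is built in a fixed order.
    my @bases = sort keys %$freq;
    my $total = 0;
    $total += $freq->{$_} for @bases;

    # Cumulative upper bounds, normalised so the last one is (about) 1.
    my ($sum, @upper) = (0);
    for my $base (@bases) {
        $sum += $freq->{$base} / $total;
        push @upper, $sum;
    }

    my $seq = '';
    for (1 .. $length) {
        my $r = rand();    # uniform in [0, 1)
        my $i = 0;
        # Move on while $r is at or past the current bound; the index
        # guard covers floating-point round-off in the final bound.
        $i++ while $i < $#upper && $r >= $upper[$i];
        $seq .= $bases[$i];
    }
    return $seq;
}

# Example: an AT-rich sequence, collected in a string and printed once.
my $seq = random_dna(1000, { A => 0.35, C => 0.15, G => 0.15, T => 0.35 });
print $seq, "\n";
```

The weights need not sum to 1, since they are normalised first.  The
linear scan is fine for four bases; a larger alphabet might want a binary
search over the cumulative table instead.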

-Andreas

---------------
Andreas Matern
622 Rhodes Hall
Cornell Theory Center
Ithaca, NY 14853

alm13@cornell.edu
http://syntom.cit.cornell.edu/



> -----Original Message-----
> From: owner-vsns-bcd-perl@lists.uni-bielefeld.de
> [mailto:owner-vsns-bcd-perl@lists.uni-bielefeld.de]On Behalf Of Paul
> Gordon
> Sent: Tuesday, May 30, 2000 10:33 AM
> To: Gatherer, D. (Derek)
> Cc: BioPerl
> Subject: Re: Bioperl: making a random sequence
>
>
> > I'm trying to make some 'control' DNA sequences......
> >
> > srand();
> > for($x=1;$x<=1000000;$x++)
> > {
> > 	$r = rand(1);
> > 	if($r <= 0.25){ print "A"; }
> > 	elsif($r <= 0.5){ print "C"; }
> > 	elsif($r <= 0.75){ print "G"; }
> > 	elsif($r <= 1.0){ print "T"; }
> > }
> >
> > This spews out nonsense sequence but the proportions of bases are not
> > equal.  A tends to be overrepresented at around 44% or so.  The others
> > are correspondingly reduced (sorry, I left the exact figures at home).
> > I've checked that rand(1) really does generate numbers between 0 and 1,
> > so why the skew to larger numbers???
>
> There would be a *slight* skewing because rand() returns a number in the
> range [0,1) (i.e. including 0, excluding 1) so your <= operators should be
> < operators.  Your script works fine on my SGI running 5.004_04, but I
> have run into some releases with print buffering problems, which caused
> already-printed output to be printed again (so 'print "a"; print "b";
> print "c"' sometimes prints "aababc").  You could test this by pushing the
> letters onto an array, then printing the array at the end of the loop,
> instead of printing the sequence letter by letter.
>
> > Do I need to tweak srand() somehow?
> I believe that the first time rand() is called, srand is called
> implicitly, so that's not actually necessary (since calling the program
> twice in a row without srand() yields different results).
>
> I seem to remember someone on the mailing list talking about writing a
> module to generate random sequences with base composition bias, but I
> could be wrong...
>
> Regards,
> 	Paul
>
> ________________________________________________________________________
> Paul Gordon                                     Paul.Gordon@nrc.ca
> Genomic Technologies				http://maggie.cbr.nrc.ca
> Institute for Marine Biosciences
> National Research Council Canada
>
> =========== Bioperl Project Mailing List Message Footer =======
> Project URL: http://bio.perl.org/
> For info about how to (un)subscribe, where messages are archived, etc:
> http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
> ====================================================================
