[Bioperl-l] how to use bioperl to do Z-scores test
Sean Davis
sdavis2 at mail.nih.gov
Tue Jun 7 07:08:41 EDT 2005
On Jun 7, 2005, at 6:29 AM, Sean Davis wrote:
>
> On Jun 7, 2005, at 12:25 AM, Frank Lee wrote:
>
>> Hi, Sean
>>
>> What I am doing now is like this:
>>
>> I got a number from my data, say 122. Then I generate one
>> thousand numbers randomly according to similar criteria, say (7, 19,
>> 45, 199, ...). I wish to tell whether the result
>> 122 is typical of the random dataset or whether it is unusually small or
>> large. And I wish to calculate the p-value as a cutoff, since I have
>> thousands of such datasets.
>>
>> Can you give me some suggestions? Thanks!
>>
>
> You could try using code like:
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> # Observed data
> my $datapoint=90;
> # generate 1000 random integers (from 0 to 99)
> my @j = map { int(rand(100)) } 1 .. 1000;
>
> # these lines return the number of permutation values greater than (>)
> # and less than or equal to (<=) the observed value.
> my $count_greater = grep {$_>$datapoint} @j;
> my $count_less = grep {$_<=$datapoint} @j;
>
> # output the result
> # Your mileage may vary depending on whether you want a one-sided or
> # two-sided test
> print "Original Data Point: $datapoint\n";
> print "Permutation values greater than Data Point: $count_greater "
>     . "(p=" . ($count_greater/1000) . ")\n";
> print "Permutation values less than or equal to Data Point: $count_less "
>     . "(p=" . ($count_less/1000) . ")\n";
>
>
> If you are working with a large dataset, you may really want to
> consider moving over to a statistical package like R, which has many
> facilities for doing all kinds of testing like this (and more).
>
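Since the subject line asks about Z-scores: the counting code quoted above gives an empirical p-value, and a z-score can be computed from the same random values. The following is only a minimal sketch, assuming the random values are roughly bell-shaped (otherwise the z-score is hard to interpret). The helper names z_score and empirical_p_upper are made up for illustration, and the +1 correction in the p-value is one common convention, not the only one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: z-score of an observed value relative to a list
# of null (random) values, using the sample standard deviation (n - 1).
sub z_score {
    my ($observed, @null) = @_;
    my $n = @null;
    die "need at least two null values\n" if $n < 2;
    my $mean = sum(@null) / $n;
    my $var  = sum(map { ($_ - $mean) ** 2 } @null) / ($n - 1);
    die "null values have zero variance\n" if $var == 0;
    return ($observed - $mean) / sqrt($var);
}

# Hypothetical helper: one-sided (upper tail) empirical p-value.
# The +1 in numerator and denominator keeps the estimate above zero.
sub empirical_p_upper {
    my ($observed, @null) = @_;
    my $ge = grep { $_ >= $observed } @null;
    return ($ge + 1) / (@null + 1);
}

# Same toy setup as the quoted code: 1000 uniform integers from 0 to 99.
my $datapoint = 90;
my @null = map { int(rand(100)) } 1 .. 1000;

printf "z = %.3f, empirical p (upper tail) = %.4f\n",
    z_score($datapoint, @null), empirical_p_upper($datapoint, @null);
```

With thousands of datasets, the same two subs can be called once per dataset; keep in mind that a cutoff applied to that many p-values usually needs some multiple-testing adjustment.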
And I didn't mention--the Perl rand function generates numbers from a
uniform distribution, which may or may not be what you want.
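To illustrate the point about rand: if the null values should instead be normally distributed, one option is the Box-Muller transform built on top of rand. This is a rough sketch, not a recommendation for any particular dataset; the sub name rand_normal is hypothetical, and the mean of 50 and standard deviation of 15 are arbitrary placeholder values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: one draw from a normal distribution with the given
# mean and standard deviation, using the Box-Muller transform.
sub rand_normal {
    my ($mean, $sd) = @_;
    my $u1 = 1 - rand();          # in (0, 1], so log($u1) is always defined
    my $u2 = rand();              # in [0, 1)
    my $pi = 4 * atan2(1, 1);
    return $mean + $sd * sqrt(-2 * log($u1)) * cos(2 * $pi * $u2);
}

# 1000 null values from Normal(50, 15); both parameters are placeholders.
my @null = map { rand_normal(50, 15) } 1 .. 1000;
printf "first draw: %.2f\n", $null[0];
```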
Sean