[Bioperl-l] how to use bioperl to do Z-scores test

Sean Davis sdavis2 at mail.nih.gov
Tue Jun 7 07:08:41 EDT 2005


On Jun 7, 2005, at 6:29 AM, Sean Davis wrote:

>
> On Jun 7, 2005, at 12:25 AM, Frank Lee wrote:
>
>> Hi, Sean
>>
>> What I am doing now is like this:
>>
>> I got a number from my data, say 122.  Then I generate one
>> thousand numbers randomly according to similar criteria, say (7, 19,
>> 45, 199, ...).  I wish to tell whether the result 122 is typical of
>> the random dataset or whether it is unusually small or large.  I also
>> wish to calculate the p-value as a cutoff, since I have thousands of
>> such datasets.
>>
>> Can you give me some suggestions?  Thanks!
>>
>
> You could try using code like:
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> # Observed data
> my $datapoint=90;
> # generate 1000 random integers (from 0 to 99; int(rand(100)) never
> # returns 100)
> my @j;
> for my $i (1..1000) {
>   $j[$i-1] = int(rand(100));
> }
>
> # These lines return the number of permutation values greater than
> # (or less than or equal to) the observed value.
> my $count_greater = grep {$_>$datapoint} @j;
> my $count_less = grep {$_<=$datapoint} @j;
>
> # output the result
> # Your mileage may vary depending on whether you want a one-sided or
> # two-sided test.
> print "Original Data Point:  $datapoint\n";
> print "Permutation values greater than Data Point:  $count_greater ",
>       "(p=", $count_greater/1000, ") \n";
> print "Permutation values less than Data Point:  $count_less ",
>       "(p=", $count_less/1000, ") \n";
>
>
> If you are working with a large dataset, you may really want to 
> consider moving over to a statistical package like R, which has many 
> facilities for doing all kinds of testing like this (and more).
>
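The counting approach above can also be extended to report a z-score next to the empirical p-value. What follows is only a rough sketch: the helper names z_score and empirical_p are made up for illustration, and the add-one correction in the p-value is one common convention rather than something the script above does. A z-score is also only easy to interpret when the random values are roughly normally distributed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: z-score of an observed value relative to a sample
# of random (null) values. Assumes at least two values that are not all
# identical, otherwise the standard deviation is zero.
sub z_score {
    my ($observed, @sample) = @_;
    my $n    = @sample;
    my $mean = sum(@sample) / $n;
    # Sample variance, with n - 1 in the denominator.
    my $var  = sum(map { ($_ - $mean) ** 2 } @sample) / ($n - 1);
    return ($observed - $mean) / sqrt($var);
}

# Hypothetical helper: upper-tail empirical p-value. The +1 in numerator
# and denominator keeps the estimate from ever being exactly zero.
sub empirical_p {
    my ($observed, @sample) = @_;
    my $at_least = grep { $_ >= $observed } @sample;
    return ($at_least + 1) / (@sample + 1);
}

# Same toy setup as the script above: 1000 uniform integers from 0 to 99.
my @null     = map { int(rand(100)) } 1 .. 1000;
my $observed = 90;

printf "z = %.3f\n", z_score($observed, @null);
printf "empirical p (upper tail) = %.4f\n", empirical_p($observed, @null);
```

For a lower-tail test, flip the comparison in empirical_p to <=; for a two-sided test, one option is to double the smaller of the two tail values and cap the result at 1.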

I should also have mentioned that Perl's rand function draws numbers
from a uniform distribution, which may or may not be what you want.
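
If a normal distribution is closer to what you need, one way to get it from rand is the Box-Muller transform. This is a minimal sketch under my own assumptions: rand_normal is a made-up helper name, and the mean of 50 and standard deviation of 10 are arbitrary placeholders, not values taken from your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: one draw from a normal distribution with the given
# mean and standard deviation, built from two uniform draws (Box-Muller).
sub rand_normal {
    my ($mean, $sd) = @_;
    my $pi = 4 * atan2(1, 1);
    my $u1 = 1 - rand();    # lies in (0, 1], so log($u1) is always defined
    my $u2 = rand();
    my $z  = sqrt(-2 * log($u1)) * cos(2 * $pi * $u2);
    return $mean + $sd * $z;
}

# 1000 null values from a normal distribution (placeholder parameters).
my @null = map { rand_normal(50, 10) } 1 .. 1000;
printf "first draw: %.2f\n", $null[0];
```

Which distribution is appropriate depends on how your real data arise, so treat the choice of a normal distribution here as an illustration only.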

Sean


