TWO SAMPLE PERMUTATION TEST

Name:

TWO SAMPLE <STATISTIC> PERMUTATION TEST

Type:

Analysis Command

Purpose:

Perform a two sample permutation test for a specified statistic.

Description:

1. Compute the desired statistic for the original data.
2. Combine the two data sets into a single data set.
3. Generate a permutation of the combined data, then compute the desired statistic. The first n1 permuted values constitute the first response variable and the following n2 permuted values constitute the second response variable.
4. Repeat step 3 NITER times.
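The permutation step can be illustrated with the shuffle described in Knuth, Section 3.4.2 (often called the Fisher-Yates shuffle). The following Python sketch is for illustration only; it is not the RANDPERM routine that Dataplot adapts, and the function name random_permutation and the sample values are hypothetical.

```python
import random

def random_permutation(values, rng):
    """Return a uniformly random permutation of values (Fisher-Yates shuffle)."""
    perm = list(values)                    # copy so the caller's data is untouched
    for i in range(len(perm) - 1, 0, -1):
        j = rng.randint(0, i)              # uniform index in 0..i, inclusive
        perm[i], perm[j] = perm[j], perm[i]
    return perm

rng = random.Random(1)                     # seeded so the run is repeatable
combined = [1.0, 2.0, 3.0, 4.0, 5.0]       # made-up pooled data with n1 = 2, n2 = 3
perm = random_permutation(combined, rng)

n1 = 2
first, second = perm[:n1], perm[n1:]       # split the permuted values as in step 3
print(first, second)
```

Each of the (n1+n2)! orderings is equally likely, so every split of the pooled data into groups of size n1 and n2 is also equally likely.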

The NITER computed statistics represent the reference distribution. The statistic for the original data is compared to this reference distribution. For example, the cut-offs for a 95% two-sided test are obtained from the 2.5% and 97.5% percentiles of the reference distribution.
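The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Dataplot's implementation: the function name permutation_reference is hypothetical, the samples are made up, and the percentile method of statistics.quantiles may differ from the one Dataplot uses.

```python
import random
import statistics

def permutation_reference(y1, y2, stat=statistics.mean, niter=4000, seed=1):
    """Return (observed, reference) for the difference stat(y1) - stat(y2)."""
    rng = random.Random(seed)              # seeded so the run is repeatable
    n1 = len(y1)
    observed = stat(y1) - stat(y2)         # step 1: statistic for the original data
    combined = list(y1) + list(y2)         # step 2: pool the two samples
    reference = []
    for _ in range(niter):                 # step 4: repeat NITER times
        rng.shuffle(combined)              # step 3: random permutation of the pool
        reference.append(stat(combined[:n1]) - stat(combined[n1:]))
    return observed, reference

# Made-up illustrative samples (not the data used elsewhere in this document).
y1 = [19.1, 20.4, 18.7, 21.0, 19.8, 20.2, 18.9, 20.6]
y2 = [29.5, 31.2, 30.1, 32.4, 28.8, 30.9]
observed, reference = permutation_reference(y1, y2)

# With n=40, statistics.quantiles returns cut points at 2.5%, 5%, ..., 97.5%.
cuts = statistics.quantiles(reference, n=40)
lower_cut, upper_cut = cuts[0], cuts[-1]

# 95% two-sided test: reject when the observed value falls outside the cut-offs.
reject_two_sided = observed < lower_cut or observed > upper_cut
print(observed, lower_cut, upper_cut, reject_two_sided)
```

Because the two made-up samples do not overlap, the observed difference is the most extreme value any split of the pooled data can produce, so it falls below the lower cut-off.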

The exact permutation test is based on all possible permutations of the data. However, the number of distinct permutations, (n1+n2)!/(n1!n2!), grows rapidly as the sample sizes increase. Sampling a random subset of the possible permutations provides a reasonable approximation to the exact test. By default, Dataplot generates 4,000 iterations. To change this, enter the command

If <value> is less than 100, it will be set to 100. If <value> is greater than 100,000, it will be set to 100,000.
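To give a sense of how quickly (n1+n2)!/(n1!n2!) grows, and of the limits on the number of iterations described above, here is a small Python sketch. The function names are hypothetical, and the clamping function only mirrors the documented 100 to 100,000 range; it is not Dataplot code.

```python
import math

def number_of_permutations(n1, n2):
    """(n1+n2)! / (n1! n2!), the number of distinct splits of the pooled data."""
    return math.comb(n1 + n2, n1)

def clamp_niter(value, low=100, high=100_000):
    """Force a requested iteration count into the documented range."""
    return max(low, min(high, value))

print(number_of_permutations(5, 5))        # 252
print(number_of_permutations(10, 10))      # 184756

# For sample sizes of 249 and 79, report only the number of decimal digits.
print(len(str(number_of_permutations(249, 79))))

print(clamp_niter(50), clamp_niter(4000), clamp_niter(1_000_000))
```

Even for two samples of ten observations each, there are more distinct splits than the default 4,000 iterations, which is why random sampling of permutations is used.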

The specified statistic should be one that can be computed from a single response variable (e.g., MEAN, MEDIAN, VARIANCE). By default, Dataplot computes the difference of the statistic between the two samples. For scale statistics (e.g., STANDARD DEVIATION, VARIANCE), the ratio is often preferred to the difference. To request that the ratio be computed, enter

SET PERMUTATION TEST RATIO

To reset the default, enter

SET PERMUTATION TEST DIFFERENCE
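One reason the ratio is often preferred for scale statistics is that it does not change when both samples are rescaled by the same factor, while the difference does. The Python sketch below illustrates this with made-up data. The function name compare and the assumption that the ratio is formed as the first sample's statistic divided by the second's are illustrative, not taken from Dataplot.

```python
import math
import statistics

def compare(y1, y2, stat, mode="difference"):
    """Combine a one-sample statistic across two samples (assumed order: y1 vs y2)."""
    s1, s2 = stat(y1), stat(y2)
    if mode == "ratio":
        return s1 / s2
    return s1 - s2

# Made-up samples; y2 has twice the spread of y1.
y1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y2 = [2.0, 4.0, 6.0, 8.0, 10.0]

# The same samples expressed in units ten times smaller.
y1_scaled = [10.0 * v for v in y1]
y2_scaled = [10.0 * v for v in y2]

diff = compare(y1, y2, statistics.stdev)
diff_scaled = compare(y1_scaled, y2_scaled, statistics.stdev)
ratio = compare(y1, y2, statistics.stdev, mode="ratio")
ratio_scaled = compare(y1_scaled, y2_scaled, statistics.stdev, mode="ratio")

print(diff, diff_scaled)      # the difference scales with the units
print(ratio, ratio_scaled)    # the ratio does not
```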

Permutation tests assume the observations are independent. However, no distributional assumptions are made about the response variables.

Syntax 1:

If LOWER TAILED is specified, a lower tailed test is performed. If UPPER TAILED is specified, an upper tailed test is performed. If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.

To see a list of supported statistics, enter HELP STATISTICS.

Syntax 2:

This syntax performs the two sample permutation test for each pair of the listed variables. The TO syntax is supported.

To see a list of supported statistics, enter HELP STATISTICS.

Examples:

SET PERMUTATION TEST RATIO
TWO SAMPLE STANDARD DEVIATION PERMUTATION TEST Y1 Y2

Note:

This routine uses a random permutation algorithm suggested by Knuth. Specifically, it adapts the RANDPERM routine of Knoble.

Note:

STATVAL	-	value of the test statistic
STATCDF	-	CDF of the test statistic
PVALUE	-	p-value of the two tailed test statistic
PVALUELT	-	p-value of the lower tailed test statistic
PVALUEUT	-	p-value of the upper tailed test statistic
P80	-	80% upper critical value
P90	-	90% upper critical value
P95	-	95% upper critical value
P975	-	97.5% upper critical value
P99	-	99% upper critical value
P995	-	99.5% upper critical value
P999	-	99.9% upper critical value
P20	-	20% lower critical value
P10	-	10% lower critical value
P05	-	5% lower critical value
P025	-	2.5% lower critical value
P01	-	1% lower critical value
P005	-	0.5% lower critical value
P001	-	0.1% lower critical value
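As an illustration of how quantities like those listed above could be derived from a reference distribution, consider the following Python sketch. The conventions used here are assumptions, not confirmed Dataplot behavior: the CDF is taken as the fraction of reference values at or below the observed statistic, the two-tailed p-value as twice the smaller tail (capped at 1), and the percentiles come from statistics.quantiles. The function name tail_summaries and the reference values are made up.

```python
import statistics

def tail_summaries(observed, reference):
    """Summarize an observed statistic against a reference distribution."""
    n = len(reference)
    cdf = sum(1 for r in reference if r <= observed) / n
    p_lower = cdf                                        # lower tailed p-value
    p_upper = sum(1 for r in reference if r >= observed) / n
    p_two = min(1.0, 2.0 * min(p_lower, p_upper))        # assumed two-tailed rule
    # 999 cut points at 0.1% steps; cuts[k - 1] is the (k / 10) percent point.
    cuts = statistics.quantiles(reference, n=1000)
    critical = {
        "P001": cuts[0], "P025": cuts[24], "P05": cuts[49],
        "P95": cuts[949], "P975": cuts[974], "P999": cuts[998],
    }
    return {"STATCDF": cdf, "PVALUELT": p_lower, "PVALUEUT": p_upper,
            "PVALUE": p_two, **critical}

# Made-up reference distribution: 401 evenly spaced values from -2 to 2.
reference = [i / 100 for i in range(-200, 201)]
result = tail_summaries(-10.0, reference)
print(result)
```

With an observed statistic far below every reference value, this sketch gives a CDF of 0, a lower tailed p-value of 0, and an upper tailed p-value of 1.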

Default:

The difference (or the ratio) of the statistic for the two samples will be generated for 4,000 permutations.

Synonyms:

2 SAMPLE is a synonym for TWO SAMPLE.

Related Commands:

LINEAR RANK SUM TEST	=	Perform a 2-sample linear rank sum test.
T TEST	=	Perform a 2-sample t test.
RANK SUM TEST	=	Perform a 2-sample rank sum test for location.
MEDIAN TEST	=	Perform a k-sample medians test.
VAN DER WAERDEN TEST	=	Perform a k-sample Van Der Waerden test.
SIEGEL TUKEY TEST	=	Perform a 2-sample Siegel Tukey test.
SQUARED RANKS TEST	=	Perform a k-sample squared ranks test for homogeneous variances.
KLOTZ TEST	=	Perform a k-sample Klotz test for homogeneous variances.
BIHISTOGRAM	=	Generate a bihistogram.
QUANTILE QUANTILE PLOT	=	Generate a quantile-quantile plot.

References:

Duxbury Press

Knuth (1998), "The Art of Computer Programming: Volume 2 Seminumerical Algorithms, Third Edition", Section 3.4.2, Addison-Wesley.

Knoble RANDPERM algorithm downloaded from: "http://coding.derkeiler.com/Archive/Fortran/comp.lang.fortran/2006-03/msg00748.html"

Applications:

Two Sample Analysis

Implementation Date:

2023/08

Program:

 
set random number generator fibbonacci congruential
seed 32119
.
.           Read the data
.
skip 25
read auto83b.dat y1 y2
retain y2 subset y2 >= 0
.
.           Perform the permutation test
.
lower tailed two sample mean permutation test              y1 y2
upper tailed two sample mean permutation test              y1 y2
two sample mean permutation test                           y1 y2
.
.           Plot the results
.
title offset 7
title case asis
label case asis
y1label Count
x1label Difference of Means for Permutations
let statval = round(statval,3)
let p025 = round(p025,3)
let p975 = round(p975,3)
let pval = round(pvalue2t,3)
let statcdf = round(statcdf,3)
.
x2label color red
x2label Difference of Means for Original Sample: ^statval
x3label color blue
x3label 2.5 Percentile: ^P025, 97.5 Percentile: ^P975
xlimits -0.5 0.5
let niter = 4000
skip 1
read dpst1f.dat z
title Histogram of Difference of Means for ^niter Permutationscr() ...
      (Pvalue: ^pval, CDF: ^statcdf)
.
histogram z
.
line color red
line dash
drawdsds statval 20 statval 90
line color blue
line dash
drawdsds p025 20 p025 90
drawdsds p975 20 p975 90

             Two Sample Permutation Test (Difference)
                               MEAN
  
 First Response Variable:  Y1
 Second Response Variable: Y2
  
 H0: Difference = 0
 Ha: Difference < 0
  
 Sample One Summary Statistics:
 Number of Observations:                             249
 Sample Mean:                                   20.14458
 Sample Median:                                 19.00000
 Sample Standard Deviation:                      6.41470
  
 Sample Two Summary Statistics:
 Number of Observations:                              79
 Sample Mean:                                   30.48101
 Sample Median:                                 32.00000
 Sample Standard Deviation:                      6.10771
  
 Test:
 Number of Permutation Samples:                     4000
 Statistic Value:                              -10.33643
 Test CDF Value:                                 0.00000
 Test P-Value:                                   0.00000
  
  
             Conclusions (Lower 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (<=)     Conclusion
 ------------------------------------------------------------
           80.0%      -10.33643       -0.83209         REJECT
           90.0%      -10.33643       -1.29897         REJECT
           95.0%      -10.33643       -1.63245         REJECT
           99.0%      -10.33643       -2.38263         REJECT
  
  
             Two Sample Permutation Test (Difference)
                               MEAN
  
 First Response Variable:  Y1
 Second Response Variable: Y2
  
 H0: Difference = 0
 Ha: Difference > 0
  
 Sample One Summary Statistics:
 Number of Observations:                             249
 Sample Mean:                                   20.14458
 Sample Median:                                 19.00000
 Sample Standard Deviation:                      6.41470
  
 Sample Two Summary Statistics:
 Number of Observations:                              79
 Sample Mean:                                   30.48101
 Sample Median:                                 32.00000
 Sample Standard Deviation:                      6.10771
  
 Test:
 Number of Permutation Samples:                     4000
 Statistic Value:                              -10.33643
 Test CDF Value:                                 0.00000
 Test P-Value:                                   1.00000
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%      -10.33643        0.85202         ACCEPT
           90.0%      -10.33643        1.30055         ACCEPT
           95.0%      -10.33643        1.65238         ACCEPT
           99.0%      -10.33643        2.36938         ACCEPT
  
  
             Two Sample Permutation Test (Difference)
                               MEAN
  
 First Response Variable:  Y1
 Second Response Variable: Y2
  
 H0: Difference = 0
 Ha: Difference not equal 0
  
 Sample One Summary Statistics:
 Number of Observations:                             249
 Sample Mean:                                   20.14458
 Sample Median:                                 19.00000
 Sample Standard Deviation:                      6.41470
  
 Sample Two Summary Statistics:
 Number of Observations:                              79
 Sample Mean:                                   30.48101
 Sample Median:                                 32.00000
 Sample Standard Deviation:                      6.10771
  
 Test:
 Number of Permutation Samples:                     4000
 Statistic Value:                              -10.33643
 Test CDF Value:                                 0.00000
 Test P-Value:                                   0.00000
  
  
             Conclusions (Two-Tailed Test)
  
 ---------------------------------------------------------------------------
                                                                        Null
    Significance           Test       Critical       Critical     Hypothesis
           Level      Statistic    Region (<=)    Region (>=)     Conclusion
 ---------------------------------------------------------------------------
           80.0%      -10.33643       -1.33232        1.28555         REJECT
           90.0%      -10.33643       -1.69915        1.63487         REJECT
           95.0%      -10.33643       -1.99929        1.90250         REJECT
           99.0%      -10.33643       -2.64950        2.60265         REJECT