K SAMPLE PERMUTATION TEST

Name:

K SAMPLE <STATISTIC> PERMUTATION TEST

Type:

Analysis Command

Purpose:

Perform a k-sample permutation test for a specified statistic.

Description:

1. Compute the desired statistic for the original data.
2. Generate a permutation of the response data. Then compute the desired statistic for the permutation.
3. Repeat step 2 NITER times.

The NITER computed statistics represent the reference distribution. The statistic for the original data is compared to this reference distribution. For example, the cut-offs for a two-sided 95% test are obtained from the 2.5% and 97.5% percentiles of the reference distribution.
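
The procedure described above can be sketched in Python. This is an illustrative sketch only, not Dataplot's implementation: the function names `f_statistic` and `k_sample_permutation_test` are hypothetical, and Python's built-in shuffle stands in for Dataplot's random permutation routine.

```python
import random

def f_statistic(y, groups):
    """One-way ANOVA F statistic for responses y with group-id labels."""
    labels = sorted(set(groups))
    n, k = len(y), len(labels)
    grand_mean = sum(y) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in labels:
        vals = [v for v, lab in zip(y, groups) if lab == g]
        mean_g = sum(vals) / len(vals)
        ss_between += len(vals) * (mean_g - grand_mean) ** 2
        ss_within += sum((v - mean_g) ** 2 for v in vals)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def k_sample_permutation_test(y, groups, statistic, niter=4000, seed=None):
    """Return the observed statistic and the sorted reference distribution."""
    rng = random.Random(seed)
    observed = statistic(y, groups)      # statistic for the original data
    shuffled = list(y)
    reference = []
    for _ in range(niter):
        rng.shuffle(shuffled)            # permute responses; group ids stay fixed
        reference.append(statistic(shuffled, groups))
    reference.sort()
    return observed, reference
```

With the sorted reference distribution in hand, the cut-offs for a two-sided 95% test would be read off near the 2.5% and 97.5% positions of `reference`.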

The permutation test is based on all possible permutations of the data. However, the number of permutations grows rapidly as the sample size increases. Sampling a subset of all possible permutations provides a reasonable approximation to the full permutation test. By default, Dataplot generates 4,000 iterations. To change this, enter the command

SET PERMUTATION TEST SAMPLE SIZE <value>

If <value> is less than 100, it will be set to 100. If <value> is greater than 100,000, it will be set to 100,000.
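
To give a sense of the growth in the number of permutations, and of the limits on the sample size, here is a small Python sketch. The helper names `distinct_group_assignments` and `clamp_niter` are hypothetical; the second simply mirrors the 100 and 100,000 limits stated above.

```python
from math import factorial

def distinct_group_assignments(group_sizes):
    """Number of distinct ways to split the pooled data into groups of
    the given sizes (the multinomial coefficient)."""
    total = factorial(sum(group_sizes))
    for size in group_sizes:
        total //= factorial(size)
    return total

def clamp_niter(value, low=100, high=100_000):
    """Mirror the documented limits on the permutation sample size."""
    return max(low, min(high, value))

# Three groups of five observations each
print(distinct_group_assignments([5, 5, 5]))   # 756756
print(clamp_niter(50), clamp_niter(5000), clamp_niter(1_000_000))
```

Even with only 15 observations in three groups of five there are 756,756 distinct group assignments, which is why a random subset is used in practice.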

The specified statistic should be one that can be computed from a single response variable with a corresponding group-id variable.

This test is most commonly used with the F statistic obtained from a one-way analysis of variance.

Permutation tests assume the observations are independent. However, no distributional assumptions are made about the response variable.

Syntax:

If LOWER TAILED is specified, a lower tailed test is performed. If UPPER TAILED is specified, an upper tailed test is performed. If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.
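
As a rough illustration of how the three tail options relate to the reference distribution, the following Python sketch computes lower tailed, upper tailed, and two-tailed p-values. The function name is hypothetical, and the two-tailed rule used here (twice the smaller tail, capped at 1) is one common convention and an assumption; this page does not state the rule Dataplot uses.

```python
def permutation_p_values(observed, reference):
    """Tail p-values of an observed statistic against permutation statistics."""
    n = len(reference)
    lower = sum(1 for s in reference if s <= observed) / n   # lower tailed
    upper = sum(1 for s in reference if s >= observed) / n   # upper tailed
    # Assumed convention: double the smaller tail, capped at 1
    two_sided = min(1.0, 2.0 * min(lower, upper))
    return {"lower": lower, "upper": upper, "two_sided": two_sided}
```

For statistics such as the one-way ANOVA F statistic, where only large values indicate a departure from the null hypothesis, the upper tailed p-value is typically the one of interest.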

Examples:

Note:

Of these, the ONE WAY ANOVA F STATISTIC and KRUSKAL WALLIS TEST statistics are probably the ones of most interest.

Note:

This routine uses a random permutation algorithm suggested by Knuth. Specifically, it adapts the RANDPERM routine of Knoble.

Note:

STATVAL	-	value of the test statistic
STATCDF	-	CDF of the test statistic
PVALUE	-	p-value of the two tailed test statistic
PVALUELT	-	p-value of the lower tailed test statistic
PVALUEUT	-	p-value of the upper tailed test statistic
P80	-	80% upper critical value
P90	-	90% upper critical value
P95	-	95% upper critical value
P975	-	97.5% upper critical value
P99	-	99% upper critical value
P995	-	99.5% upper critical value
P999	-	99.9% upper critical value
P20	-	20% lower critical value
P10	-	10% lower critical value
P05	-	5% lower critical value
P025	-	2.5% lower critical value
P01	-	1% lower critical value
P005	-	0.5% lower critical value
P001	-	0.1% lower critical value
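
The critical values listed above are percentiles of the reference distribution. A minimal Python sketch of that idea follows. The names `percentile`, `CRITICAL_LEVELS`, and `critical_values` are hypothetical, and the linear-interpolation percentile rule is an assumption; Dataplot's percentile method may differ, so values need not match exactly.

```python
def percentile(values, pct):
    """Percentile by linear interpolation between order statistics
    (an assumed rule, not necessarily the one Dataplot uses)."""
    ordered = sorted(values)
    pos = pct * (len(ordered) - 1) / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

# Parameter name -> percentile of the reference distribution
CRITICAL_LEVELS = {
    "P80": 80.0, "P90": 90.0, "P95": 95.0, "P975": 97.5,
    "P99": 99.0, "P995": 99.5, "P999": 99.9,
    "P20": 20.0, "P10": 10.0, "P05": 5.0, "P025": 2.5,
    "P01": 1.0, "P005": 0.5, "P001": 0.1,
}

def critical_values(reference):
    """Map each parameter name to its percentile of the reference statistics."""
    return {name: percentile(reference, pct)
            for name, pct in CRITICAL_LEVELS.items()}
```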

Note:

Note that although this example compares differences of means, you could use other location statistics such as the MEDIAN or BIWEIGHT LOCATION.

Default:

The number of permutations defaults to 4,000.

Synonyms:

None

Related Commands:

TWO SAMPLE PERMUTATION TEST	=	Perform a 2-sample permutation test.
LINEAR RANK SUM TEST	=	Perform a 2-sample linear rank sum test.
ONE WAY ANOVA	=	Perform a one-way analysis of variance.
KRUSKAL WALLIS TEST	=	Perform a k-sample Kruskal-Wallis test.
MEDIAN TEST	=	Perform a k-sample medians test.
SQUARED RANKS TEST	=	Perform a k-sample squared ranks test for homogeneous variances.

References:

Addison-Wesley

Knoble RANDPERM algorithm downloaded from: "http://coding.derkeiler.com/Archive/Fortran/comp.lang.fortran/2006-03/msg00748.html"

Higgins (2004), "Introduction to Modern Nonparametric Statistics," Duxbury Press, Chapter 3.

Applications:

K Sample Analysis

Implementation Date:

2023/09

Program 1:

 
set permutation test sample size 5000
set random number generator fibbonacci congruential
seed 88807
.
. Step 1:   Create the data (from Higgins, p. 85)
.
read x y
 1    6.08
 1   22.29
 1    7.51
 1   34.36
 1   23.68
 2   30.45
 2   22.71
 2   44.52
 2   31.47
 2   36.81
 3   32.04
 3   28.03
 3   32.74
 3   23.84
 3   29.64
end of data
.
. Step 2:   Perform the permutation test
.
upper tailed k sample one way anova f statistic permutation test y x

             K-Sample Permutation Test
               ONE WAY ANOVA F-VALUE
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                3.78144
 Test CDF Value:                                 0.95040
 Test P-Value:                                   0.04960
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        3.78144        1.85538         REJECT
           90.0%        3.78144        2.78072         REJECT
           95.0%        3.78144        3.77866         REJECT
           99.0%        3.78144        6.03793         ACCEPT

.
.           Step 3: Plot the results
.
title offset 7
title case asis
label case asis
y1label Count
x1label One Way Anova F-Statistic for Permutations
let statval = round(statval,4)
let p95  = round(p95,3)
let p99  = round(p99,3)
let pval = round(pvalueut,4)
let statcdf = round(statcdf,4)
.
x2label color red
x2label One Way Anova F-Statistic for Original Sample: ^statval
x3label color blue
x3label 95 Percentile: ^P95, 99 Percentile: ^P99
xlimits -5.0 10.0
let niter = 5000
skip 1
read dpst1f.dat z
title Histogram of One Way Anova F Statistic for ^niter Permutationscr() ...
      (Pvalue: ^pval, CDF: ^statcdf)
.
histogram z
.
line color red
line dash
line thickness 0.3
drawdsds statval 20 statval 90
line thickness 0.1
line color blue
line dash
drawdsds p95 20 p95 90
drawdsds p99 20 p99 90

 
.
.           Step 4: Multiple comparisons
.
let xdist = distinct x
let ndist = size xdist
let icnt = 0
if ndist >= 3
   loop for k = 1 1 ndist
       let xval1 = xdist(k)
       let jstrt = k + 1
       loop for j = jstrt 1 ndist
           let xval2 = xdist(j)
           let ytemp1 = y
           let ytemp2 = y
           retain ytemp1 subset x = xval1
           retain ytemp2 subset x = xval2
           two sample mean permutation test ytemp1 ytemp2
           let icnt = icnt + 1
           let group1(icnt) = xval1
           let group2(icnt) = xval2
           let pvalmc(icnt)   = pvalue2t
           delete ytemp1 ytemp2
       end of loop
   end of loop
end of if
write1 ksamp_mc.out "   Group-ID One   Group-ID Two         P-Value"
write1 ksamp_mc.out "----------------------------------------------"
write1 ksamp_mc.out group1 group2 pvalmc

   Group-ID One   Group-ID Two         P-Value
----------------------------------------------
         1.00000        2.00000        0.05840
         1.00000        3.00000        0.11320
         2.00000        3.00000        0.36960
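
For readers who want to follow the pairwise comparisons of Step 4 outside Dataplot, here is a hedged Python sketch of the same idea using the data from Step 1. The function name `two_sample_permutation_p` is hypothetical, the two-tailed p-value is taken as the proportion of permuted absolute differences at least as large as the observed one (an assumed convention), and Python's random number generator differs from Dataplot's, so the p-values will be close to but not identical to the table above.

```python
import random
from itertools import combinations
from statistics import mean

x = [1] * 5 + [2] * 5 + [3] * 5
y = [6.08, 22.29, 7.51, 34.36, 23.68,
     30.45, 22.71, 44.52, 31.47, 36.81,
     32.04, 28.03, 32.74, 23.84, 29.64]

def two_sample_permutation_p(a, b, location=mean, niter=5000, rng=None):
    """Two-tailed permutation p-value for a difference in location."""
    rng = rng or random.Random()
    observed = abs(location(a) - location(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(niter):
        rng.shuffle(pooled)                      # reassign group membership
        diff = abs(location(pooled[:len(a)]) - location(pooled[len(a):]))
        if diff >= observed - 1e-12:             # tolerance for float round-off
            hits += 1
    return hits / niter

rng = random.Random(88807)
pairwise = {}
for g1, g2 in combinations(sorted(set(x)), 2):   # all pairs of group ids
    a = [v for v, g in zip(y, x) if g == g1]
    b = [v for v, g in zip(y, x) if g == g2]
    pairwise[(g1, g2)] = two_sample_permutation_p(a, b, rng=rng)

for (g1, g2), p in pairwise.items():
    print(f"{g1:>12} {g2:>12} {p:>14.4f}")
```

Passing `location=statistics.median` instead of the default would compare medians rather than means.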

Program 2:

 
set permutation test sample size 5000
set random number generator fibbonacci congruential
seed 49217
.
. Step 1:   Create the data (from Higgins, p. 85)
.
read x y
 1    6.08
 1   22.29
 1    7.51
 1   34.36
 1   23.68
 2   30.45
 2   22.71
 2   44.52
 2   31.47
 2   36.81
 3   32.04
 3   28.03
 3   32.74
 3   23.84
 3   29.64
end of data
.
. Step 2:   Perform the permutation test
.
echo on
upper tailed k sample one way anova f statistic permutation test y x
upper tailed k sample kruskal wallis test permutation test y x
kruskal wallis y x
upper tailed k sample squared ranks test permutation test y x
squared ranks y x
upper tailed k sample anderson darling k sample test permutation test y x
anderson darling k sample test y x
upper tailed k sample cochran variance outlier test permutation test y x
cochran variance outlier test y x
upper tailed k sample median test permutation test y x
median test y x
echo off

       ****************************************************************************
       **  upper tailed k sample one way anova f statistic permutation test y x  **
       ****************************************************************************
  
  
             K-Sample Permutation Test
               ONE WAY ANOVA F-VALUE
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                3.78144
 Test CDF Value:                                 0.94720
 Test P-Value:                                   0.05280
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        3.78144        1.92732         REJECT
           90.0%        3.78144        2.90249         REJECT
           95.0%        3.78144        3.89597         ACCEPT
           99.0%        3.78144        6.12665         ACCEPT
  
  
       **********************************************************************
       **  upper tailed k sample kruskal wallis test permutation test y x  **
       **********************************************************************
  
  
             K-Sample Permutation Test
               KRUSKALL WALLIS TEST
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                4.16000
 Test CDF Value:                                 0.86820
 Test P-Value:                                   0.12500
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        4.16000        3.42000         REJECT
           90.0%        4.16000        4.56000         ACCEPT
           95.0%        4.16000        5.82000         ACCEPT
           99.0%        4.16000        8.00000         ACCEPT
  
  
       **************************
       **  kruskal wallis y x  **
       **************************
  
  
             Kruskal-Wallis One Factor Test
  
 Response Variable: Y
 Group-ID Variable: X
  
 H0: Samples Come From Identical Populations
 Ha: Samples Do Not Come From Identical Populations
  
 Summary Statistics:
 Total Number of Observations:                                  15
 Number of Groups:                                               3
  
 Kruskal-Wallis Test Statistic Value:                      4.16000
 CDF of Test Statistic:                                    0.87507
 P-Value:                                                  0.12493
  
  
 Percent Points of the Chi-Square Reference Distribution
 -----------------------------------
   Percent Point               Value
 -----------------------------------
             0.0    =          0.000
            50.0    =          1.386
            75.0    =          2.773
            90.0    =          4.605
            95.0    =          5.991
            97.5    =          7.378
            99.0    =          9.210
            99.9    =         13.816
  
 Conclusions (Upper 1-Tailed Test)
 ----------------------------------------------
   Alpha    CDF   Critical Value     Conclusion
 ----------------------------------------------
     10%    90%            4.605      Accept H0
      5%    95%            5.991      Accept H0
    2.5%  97.5%            7.378      Accept H0
      1%    99%            9.210      Accept H0
  
  
             Multiple Comparisons Table
  
 ---------------------------------------------------------------------------------------
     I    J  |Ri/Ni - Rj/Nj|         90% CV         95% CV         99% CV        P-VALUE
 ---------------------------------------------------------------------------------------
     1    2          5.60000        4.56488        5.58048        7.82344        0.00006
     1    3          4.00000        4.56488        5.58048        7.82344        0.00088
     2    3          1.60000        4.56488        5.58048        7.82344        0.06779
  
  
       *********************************************************************
       **  upper tailed k sample squared ranks test permutation test y x  **
       *********************************************************************
  
  
             K-Sample Permutation Test
                 SQUARED RANK TEST
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                5.23351
 Test CDF Value:                                 0.77720
 Test P-Value:                                   0.22280
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        5.23351        5.48205         ACCEPT
           90.0%        5.23351        6.52571         ACCEPT
           95.0%        5.23351        7.77241         ACCEPT
           99.0%        5.23351        9.57074         ACCEPT
  
  
       *************************
       **  squared ranks y x  **
       *************************
  
  
             Squared Ranks Test
  
 Response Variable: Y
 Group-ID Variable: X
  
 H0: Samples Have Equal Variability
 Ha: Samples Do Not Have Equal Variability
  
 Summary Statistics:
 Total Number of Observations:                         15
 Number of Groups:                                      3
  
 Squared Ranks Test Statistic Value:              5.23351
 CDF of Test Statistic:                           0.92696
 P-Value:                                         0.07304
  
  
 Percent Points of the Chi-Square Reference Distribution
 -----------------------------------
   Percent Point               Value
 -----------------------------------
             0.0    =          0.000
            50.0    =          1.386
            75.0    =          2.773
            90.0    =          4.605
            95.0    =          5.991
            97.5    =          7.378
            99.0    =          9.210
            99.9    =         13.816
  
             Upper-Tailed Test: Chi-Square Approximation
  
 H0: Variances Are Equal; Ha: Variance Are Not Equal
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic      Value (>)     Conclusion
 ------------------------------------------------------------
           80.0%        5.23351        3.21888         REJECT
           90.0%        5.23351        4.60517         REJECT
           95.0%        5.23351        5.99146         ACCEPT
           99.0%        5.23351        9.21034         ACCEPT
  
  
             Multiple Comparisons Table
  
 ---------------------------------------------------------------------------------------
     I    J  |Si/Ni - Sj/Nj|         90% CV         95% CV         99% CV        P-Value
 ---------------------------------------------------------------------------------------
     1    2         63.20000      116.14987      171.14898      394.78593        0.25304
     1    3        105.80000      116.14987      171.14898      394.78593        0.11705
     2    3         42.60000      116.14987      171.14898      394.78593        0.39629
  
  
       *********************************************************************************
       **  upper tailed k sample anderson darling k sample test permutation test y x  **
       *********************************************************************************
  
  
             K-Sample Permutation Test
             ANDERSON DARLING K-SAMPLE TEST
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                1.76560
 Test CDF Value:                                 0.92440
 Test P-Value:                                   0.07560
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        1.76560        1.34619         REJECT
           90.0%        1.76560        1.66064         REJECT
           95.0%        1.76560        1.94778         ACCEPT
           99.0%        1.76560        2.58359         ACCEPT
  
  
       ******************************************
       **  anderson darling k sample test y x  **
       ******************************************
  
  
             Anderson-Darling K-Sample Test for Common Groups
  
 Response Variable: Y
 Group-ID Variable: X
  
 H0: The Groups Are Homogeneous
 Ha: The Groups Are Not Homogeneous
  
 Summary Statistics:
 Total Number of Observations:                        15
 Number of Groups:                                     3
 Minimum Batch Size:                                   5
 Maximum Batch Size:                                   5
  
 Test Statistic Value:                           1.76560
 Test Statistic Standard Error:                  0.45946
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------------------
                                                                     Null
         Null   Significance           Test       Critical     Hypothesis
   Hypothesis          Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------------------
  Homogeneous          50.0%        1.76560        1.13711         REJECT
  Homogeneous          75.0%        1.76560        1.44702         REJECT
  Homogeneous          90.0%        1.76560        1.72594         REJECT
  Homogeneous          95.0%        1.76560        1.89286         ACCEPT
  Homogeneous          97.5%        1.76560        2.03764         ACCEPT
  Homogeneous          99.0%        1.76560        2.20598         ACCEPT
  Homogeneous          99.9%        1.76560        2.55696         ACCEPT
  
  
       ********************************************************************************
       **  upper tailed k sample cochran variance outlier test permutation test y x  **
       ********************************************************************************
  
  
             K-Sample Permutation Test
             COCHRAN VARIANCE OUTLIER TEST
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                0.64473
 Test CDF Value:                                 0.79260
 Test P-Value:                                   0.20740
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        0.64473        0.64761         ACCEPT
           90.0%        0.64473        0.69848         ACCEPT
           95.0%        0.64473        0.82783         ACCEPT
           99.0%        0.64473        0.88012         ACCEPT
  
  
       *****************************************
       **  cochran variance outlier test y x  **
       *****************************************
  
  
             Cochran Variance Outlier Test
  
 Response Variable: Y
 Group-ID Variable: X
  
 H0: Largest Variance is Not an Outlier
 Ha: Largest Variance is an Outlier
  
 Summary Statistics:
 Total Number of Observations:                        15
 Number of Groups:                                     3
 Number of Groups with Positive Variance:              3
 Group with Largest Variance:                          1
 Largest Variance:                             141.84233
 Sum of Variance:                              880.01148
  
 Cochran Test Statistic Value:                   0.64473
 CDF of Test Statistic:                          0.82896
 P-Value:                                        0.17104
  
  
 Percent Points of the Reference Distribution
 -----------------------------------
   Percent Point               Value
 -----------------------------------
             0.1    =        0.40230
             0.5    =        0.40308
             1.0    =        0.40405
             2.5    =        0.40698
             5.0    =        0.41192
            10.0    =        0.42201
            25.0    =        0.45418
            50.0    =        0.51726
            75.0    =        0.60490
            90.0    =        0.69343
            95.0    =        0.74566
            97.5    =        0.78836
            99.0    =        0.83347
            99.5    =        0.86083
            99.9    =        0.90789
  
 Conclusions (Upper 1-Tailed Test)
 ----------------------------------------------
   Alpha    CDF   Critical Value     Conclusion
 ----------------------------------------------
     10%    90%          0.69343      Accept H0
      5%    95%          0.74566      Accept H0
    2.5%  97.5%          0.78836      Accept H0
      1%    99%          0.83347      Accept H0
  
  
       **************************************************************
       **  upper tailed k sample median test permutation test y x  **
       **************************************************************
  
  
             K-Sample Permutation Test
                    MEDIAN TEST
  
 Response Variable:  Y
 Group-ID Variable:  X
  
  
 Test:
 Number of Permutation Samples:                     5000
 Statistic Value:                                3.75000
 Test CDF Value:                                 0.70900
 Test P-Value:                                   0.06440
  
  
             Conclusions (Upper 1-Tailed Test)
  
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic    Region (>=)     Conclusion
 ------------------------------------------------------------
           80.0%        3.75000        3.75000         REJECT
           90.0%        3.75000        3.75000         REJECT
           95.0%        3.75000        6.96429         ACCEPT
           99.0%        3.75000       10.17857         ACCEPT
  
  
       ***********************
       **  median test y x  **
       ***********************
  
  
             Median Test
  
 Response Variable: Y
 Group-ID Variable: X
 H0: Samples Have Equal Medians
 Ha: At Least Two Samples Have Different Medians
  
 Summary Statistics:
 Original Number of Observations:                            15
 Number of Observations After Omitting
 Groups With Less Than Two Observations:                     15
 Number of Groups:                                            3
 Grand Median:                                               30
 Number of Points > the Grand Median:                         7
 Number of Points <= the Grand Median:                        8
  
 Median Test Statistic Value:                           3.75000
 CDF of Test Statistic:                                 0.84665
 P-Value:                                               0.15335
  
  
 Percent Points of the Chi-Square Reference Distribution
 -----------------------------------
   Percent Point               Value
 -----------------------------------
             0.0    =          0.000
            50.0    =          1.386
            75.0    =          2.773
            90.0    =          4.605
            95.0    =          5.991
            97.5    =          7.378
            99.0    =          9.210
            99.9    =         13.816
  
             Upper-Tailed Test: Chi-Square Approximation
  
 H0: Medians Are Equal; Ha: Medians Are Not Equal
 ------------------------------------------------------------
                                                         Null
    Significance           Test       Critical     Hypothesis
           Level      Statistic      Value (>)     Conclusion
 ------------------------------------------------------------
           90.0%        3.75000        4.60517         ACCEPT
           95.0%        3.75000        5.99146         ACCEPT
           97.5%        3.75000        7.37776         ACCEPT
           99.0%        3.75000        9.21034         ACCEPT
           99.9%        3.75000       13.81551         ACCEPT