
TWO SAMPLE PERMUTATION TEST

Name:
    TWO SAMPLE <STATISTIC> PERMUTATION TEST
Type:
    Analysis Command
Purpose:
    Perform a two sample permutation test for a specified statistic.
Description:
    Given random variables Y1 and Y2 with sample sizes n1 and n2, respectively, permutation tests are performed as follows:

    1. Compute the desired statistic for the original data.

    2. Combine the 2 data sets into a single data set.

    3. Generate a permutation of the combined data. Then compute the desired statistic (the first n1 permuted values constitute the first response variable and the following n2 permuted values constitute the second response variable).

    4. Repeat step 3 NITER number of times.

    The NITER computed statistics represent the reference distribution. The statistic for the original data is compared to this reference distribution. For example, the cut-offs for a 95% two-sided test are obtained from the 2.5% and 97.5% percentiles of the reference distribution.
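
    The steps above can be sketched in Python. This is only an illustrative sketch, not Dataplot's implementation: the function names, the use of Python's random.shuffle, and the crude percentile rule are assumptions made for the example, and the sample data are invented.

```python
import random
import statistics

def two_sample_permutation(y1, y2, stat=statistics.mean, niter=4000, seed=None):
    """Return (observed statistic, list of niter permuted statistics)."""
    rng = random.Random(seed)
    n1 = len(y1)
    # Step 1: statistic for the original data (difference between samples).
    observed = stat(y1) - stat(y2)
    # Step 2: combine the two data sets into a single data set.
    pooled = list(y1) + list(y2)
    reference = []
    # Steps 3 and 4: permute, treat the first n1 values as sample one and
    # the remaining n2 values as sample two, recompute; repeat niter times.
    for _ in range(niter):
        rng.shuffle(pooled)
        reference.append(stat(pooled[:n1]) - stat(pooled[n1:]))
    return observed, reference

def percentile(values, p):
    """Crude percentile (p in 0..100): pick the sorted value at the rounded rank."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1))))
    return ordered[k]

# Invented data, used only to exercise the sketch.
observed, reference = two_sample_permutation(
    [1.2, 2.3, 1.9, 2.8], [3.1, 3.9, 4.4, 3.6, 4.0], niter=1000, seed=1
)
# Cut-offs for a 95% two-sided test.
lower, upper = percentile(reference, 2.5), percentile(reference, 97.5)
print(observed, lower, upper)
```

    The observed statistic is then compared with the lower and upper cut-offs taken from the reference distribution.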

    An exact permutation test is based on all possible permutations of the data. However, the number of distinct ways to split the combined data into the two samples, (n1+n2)!/(n1!n2!), grows rapidly as the sample sizes increase. In practice, sampling a random subset of the possible permutations provides a reasonable approximation to the exact test. By default, Dataplot generates 4,000 permutations. To change this, enter the command

      SET PERMUTATION TEST SAMPLE SIZE <value>

    If <value> is less than 100, it will be set to 100. If <value> is greater than 100,000, it will be set to 100,000.
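
    To give a feel for how quickly (n1+n2)!/(n1!n2!) grows, and for the limits applied to the requested number of permutations, here is a small Python sketch. The function names are hypothetical and chosen only for this example.

```python
import math

def count_permutations(n1, n2):
    """Number of distinct splits of n1+n2 values into samples of size n1 and n2."""
    # (n1+n2)! / (n1! n2!) is the binomial coefficient C(n1+n2, n1).
    return math.comb(n1 + n2, n1)

def clamp_niter(value, low=100, high=100_000):
    """Apply the documented limits: below 100 becomes 100, above 100,000 becomes 100,000."""
    return max(low, min(high, value))

print(count_permutations(5, 5))    # two samples of 5 observations each
print(count_permutations(20, 20))  # already far more than 100,000
print(clamp_niter(50), clamp_niter(4000), clamp_niter(1_000_000))
```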

    The specified statistic should be one that can be computed from a single response variable (e.g., MEAN, MEDIAN, VARIANCE). By default, Dataplot will compute the difference of the statistic between the two samples. For scale statistics (e.g., STANDARD DEVIATION, VARIANCE), it is often preferred to compute the ratio rather than the difference. To specify that the ratio be computed, enter

      SET PERMUTATION TEST RATIO

    To reset the default, enter

      SET PERMUTATION TEST DIFFERENCE
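
    The two ways of comparing the samples can be illustrated with a short Python sketch. The function name and the mode argument are hypothetical, and the orientation of the ratio (first sample over second) is an assumption, since it is not stated here.

```python
import statistics

def compare(y1, y2, stat, mode="difference"):
    """Combine a one-sample statistic across two samples."""
    s1, s2 = stat(y1), stat(y2)
    if mode == "ratio":
        # Assumed orientation: first sample divided by second sample.
        return s1 / s2
    return s1 - s2

a = [1, 2, 3]   # sample standard deviation 1
b = [2, 4, 6]   # sample standard deviation 2
print(compare(a, b, statistics.stdev, "difference"))
print(compare(a, b, statistics.stdev, "ratio"))
```

    For a scale statistic, a ratio near 1 indicates similar spread regardless of the units of the data, which is one reason the ratio is often preferred.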

    Permutation tests assume the observations are independent. However, no distributional assumptions are made about the response variables.

Syntax 1:
    <LOWER TAILED/UPPER TAILED> TWO SAMPLE <stat> PERMUTATION TEST
                                  <y1> <y2>             <SUBSET/EXCEPT/FOR qualification>
    where <stat> is the desired statistic;
    <y1> is the first response variable;
    <y2> is the second response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    If LOWER TAILED is specified, a lower tailed test is performed. If UPPER TAILED is specified, an upper tailed test is performed. If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.

    To see a list of supported statistics, enter HELP STATISTICS.

Syntax 2:
    <LOWER TAILED/UPPER TAILED> TWO SAMPLE <stat>
                            PERMUTATION TEST <y1> ... <yk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <stat> is the desired statistic;
                <y1> ... <yk> is a list of two or more response variables;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax performs all the pairwise two sample permutation tests for the listed variables. This syntax supports the TO syntax.

    If LOWER TAILED is specified, a lower tailed test is performed. If UPPER TAILED is specified, an upper tailed test is performed. If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.

    To see a list of supported statistics, enter HELP STATISTICS.
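
    As a rough illustration of which tests Syntax 2 implies, the Python sketch below enumerates the pairs for a list of variable names. It assumes each unordered pair is tested once; the order in which Dataplot actually runs the tests is not stated here, and the function name is hypothetical.

```python
from itertools import combinations

def pairwise_tests(names):
    """Return every unordered pair of variable names, each pair listed once."""
    return list(combinations(names, 2))

pairs = pairwise_tests(["Y1", "Y2", "Y3", "Y4"])
for first, second in pairs:
    print(first, second)
```

    With k variables this gives k(k-1)/2 tests, so four variables yield six tests.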

Examples:
    TWO SAMPLE MEAN PERMUTATION TEST Y1 Y2
    TWO SAMPLE MEDIAN PERMUTATION TEST Y1 Y2
    TWO SAMPLE MEDIAN PERMUTATION TEST Y1 Y2 SUBSET Y2 > 0
    LOWER TAILED TWO SAMPLE MEDIAN PERMUTATION TEST Y1 Y2
    UPPER TAILED TWO SAMPLE MEDIAN PERMUTATION TEST Y1 Y2

    SET PERMUTATION TEST RATIO
    TWO SAMPLE STANDARD DEVIATION PERMUTATION TEST Y1 Y2

Note:
    This routine uses a random permutation algorithm suggested by Knuth. Specifically, it adapts the RANDPERM routine of Knoble.
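
    For readers unfamiliar with it, the shuffling algorithm given in Knuth's Section 3.4.2 (Algorithm P) can be sketched in Python as follows. This is a generic sketch of that algorithm, not a translation of the RANDPERM routine, and the function name is an assumption.

```python
import random

def knuth_shuffle(items, rng=random):
    """Return a uniformly random permutation of items (original left untouched)."""
    a = list(items)
    # Walk from the last position down to the second, swapping each
    # position with a randomly chosen position at or before it.
    for j in range(len(a) - 1, 0, -1):
        k = rng.randint(0, j)  # randint includes both end points
        a[j], a[k] = a[k], a[j]
    return a

data = list(range(10))
shuffled = knuth_shuffle(data, random.Random(3))
print(shuffled)
```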
Note:
    The following parameters are saved after the two sample permutation test is performed.

      STATVAL - value of the test statistic
      STATCDF - CDF of the test statistic
      PVALUE - p-value of the two tailed test statistic
      PVALUELT - p-value of the lower tailed test statistic
      PVALUEUT - p-value of the upper tailed test statistic
      P80 - 80% upper critical value
      P90 - 90% upper critical value
      P95 - 95% upper critical value
      P975 - 97.5% upper critical value
      P99 - 99% upper critical value
      P995 - 99.5% upper critical value
      P999 - 99.9% upper critical value
      P20 - 20% lower critical value
      P10 - 10% lower critical value
      P05 - 5% lower critical value
      P025 - 2.5% lower critical value
      P01 - 1% lower critical value
      P005 - 0.5% lower critical value
      P001 - 0.1% lower critical value
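
    One common way to obtain tail p-values from a reference distribution is sketched below in Python. The exact formulas Dataplot uses are not given here, so the counting rule and the doubling rule for the two-tailed value are assumptions, as are the function name and the toy reference values.

```python
def tail_p_values(observed, reference):
    """Return (lower tailed, upper tailed, two tailed) p-values."""
    n = len(reference)
    # Proportion of permuted statistics at or below / at or above the observed one.
    p_lower = sum(r <= observed for r in reference) / n
    p_upper = sum(r >= observed for r in reference) / n
    # Assumed convention: double the smaller tail, capped at 1.
    p_two = min(1.0, 2 * min(p_lower, p_upper))
    return p_lower, p_upper, p_two

toy_reference = [-2, -1, 0, 1, 2]
print(tail_p_values(-10, toy_reference))  # observed far below every permuted value
print(tail_p_values(0, toy_reference))    # observed in the middle
```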
Default:
    The difference (or the ratio) of the statistic for the two samples will be generated for 4,000 permutations.
Synonyms:
    2 SAMPLE is a synonym for TWO SAMPLE
Related Commands:
References:
    Higgins (2004), "Introduction to Modern Nonparametric Statistics," Duxbury Press, Chapter 2.

    Knuth (1998), "The Art of Computer Programming: Volume 2 Seminumerical Algorithms, Third Edition", Section 3.4.2, Addison-Wesley.

    Knoble RANDPERM algorithm downloaded from: "http://coding.derkeiler.com/Archive/Fortran/comp.lang.fortran/2006-03/msg00748.html"

Applications:
    Two Sample Analysis
Implementation Date:
    2023/08
Program:
     
    set random number generator fibbonacci congruential
    seed 32119
    .
    .           Read the data
    .
    skip 25
    read auto83b.dat y1 y2
    retain y2 subset y2 >= 0
    .
    .           Perform the permutation test
    .
    lower tailed two sample mean permutation test              y1 y2
    upper tailed two sample mean permutation test              y1 y2
    two sample mean permutation test                           y1 y2
    .
    .           Plot the results
    .
    title offset 7
    title case asis
    label case asis
    y1label Count
    x1label Difference of Means for Permutations
    let statval = round(statval,3)
    let p025 = round(p025,3)
    let p975 = round(p975,3)
    let pval = round(pvalue2t,3)
    let statcdf = round(statcdf,3)
    .
    x2label color red
    x2label Difference of Means for Original Sample: ^statval
    x3label color blue
    x3label 2.5 Percentile: ^P025, 97.5 Percentile: ^P975
    xlimits -0.5 0.5
    let niter = 4000
    skip 1
    read dpst1f.dat z
    title Histogram of Difference of Means for ^niter Permutationscr() ...
          (Pvalue: ^pval, CDF: ^statcdf)
    .
    histogram z
    .
    line color red
    line dash
    drawdsds statval 20 statval 90
    line color blue
    line dash
    drawdsds p025 20 p025 90
    drawdsds p975 20 p975 90
        
    The following output is generated.
                 Two Sample Permutation Test (Difference)
                                   MEAN
      
     First Response Variable:  Y1
     Second Response Variable: Y2
      
     H0: Difference = 0
     Ha: Difference < 0
      
     Sample One Summary Statistics:
     Number of Observations:                             249
     Sample Mean:                                   20.14458
     Sample Median:                                 19.00000
     Sample Standard Deviation:                      6.41470
      
     Sample Two Summary Statistics:
     Number of Observations:                              79
     Sample Mean:                                   30.48101
     Sample Median:                                 32.00000
     Sample Standard Deviation:                      6.10771
      
     Test:
     Number of Permutation Samples:                     4000
     Statistic Value:                              -10.33643
     Test CDF Value:                                 0.00000
     Test P-Value:                                   0.00000
      
      
                 Conclusions (Lower 1-Tailed Test)
      
     ------------------------------------------------------------
                                                             Null
        Significance           Test       Critical     Hypothesis
               Level      Statistic    Region (<=)     Conclusion
     ------------------------------------------------------------
               80.0%      -10.33643       -0.83209         REJECT
               90.0%      -10.33643       -1.29897         REJECT
               95.0%      -10.33643       -1.63245         REJECT
               99.0%      -10.33643       -2.38263         REJECT
      
      
                 Two Sample Permutation Test (Difference)
                                   MEAN
      
     First Response Variable:  Y1
     Second Response Variable: Y2
      
     H0: Difference = 0
     Ha: Difference > 0
      
     Sample One Summary Statistics:
     Number of Observations:                             249
     Sample Mean:                                   20.14458
     Sample Median:                                 19.00000
     Sample Standard Deviation:                      6.41470
      
     Sample Two Summary Statistics:
     Number of Observations:                              79
     Sample Mean:                                   30.48101
     Sample Median:                                 32.00000
     Sample Standard Deviation:                      6.10771
      
     Test:
     Number of Permutation Samples:                     4000
     Statistic Value:                              -10.33643
     Test CDF Value:                                 0.00000
     Test P-Value:                                   1.00000
      
      
                 Conclusions (Upper 1-Tailed Test)
      
     ------------------------------------------------------------
                                                             Null
        Significance           Test       Critical     Hypothesis
               Level      Statistic    Region (>=)     Conclusion
     ------------------------------------------------------------
               80.0%      -10.33643        0.85202         ACCEPT
               90.0%      -10.33643        1.30055         ACCEPT
               95.0%      -10.33643        1.65238         ACCEPT
               99.0%      -10.33643        2.36938         ACCEPT
      
      
                 Two Sample Permutation Test (Difference)
                                   MEAN
      
     First Response Variable:  Y1
     Second Response Variable: Y2
      
     H0: Difference = 0
     Ha: Difference not equal 0
      
     Sample One Summary Statistics:
     Number of Observations:                             249
     Sample Mean:                                   20.14458
     Sample Median:                                 19.00000
     Sample Standard Deviation:                      6.41470
      
     Sample Two Summary Statistics:
     Number of Observations:                              79
     Sample Mean:                                   30.48101
     Sample Median:                                 32.00000
     Sample Standard Deviation:                      6.10771
      
     Test:
     Number of Permutation Samples:                     4000
     Statistic Value:                              -10.33643
     Test CDF Value:                                 0.00000
     Test P-Value:                                   0.00000
      
      
                 Conclusions (Two-Tailed Test)
      
     ---------------------------------------------------------------------------
                                                                            Null
        Significance           Test       Critical       Critical     Hypothesis
               Level      Statistic    Region (<=)    Region (>=)     Conclusion
     ---------------------------------------------------------------------------
               80.0%      -10.33643       -1.33232        1.28555         REJECT
               90.0%      -10.33643       -1.69915        1.63487         REJECT
               95.0%      -10.33643       -1.99929        1.90250         REJECT
               99.0%      -10.33643       -2.64950        2.60265         REJECT
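
    The decision rule implied by the critical region columns in the tables above can be sketched in Python as follows. The function name is hypothetical; the numbers are the statistic and the 95% critical values copied from the three tables.

```python
def conclusion(statistic, lower_crit=None, upper_crit=None):
    """Reject H0 when the statistic falls in a critical region that is in use."""
    in_lower = lower_crit is not None and statistic <= lower_crit
    in_upper = upper_crit is not None and statistic >= upper_crit
    return "REJECT" if (in_lower or in_upper) else "ACCEPT"

stat_value = -10.33643
print(conclusion(stat_value, lower_crit=-1.63245))                       # lower tailed
print(conclusion(stat_value, upper_crit=1.65238))                        # upper tailed
print(conclusion(stat_value, lower_crit=-1.99929, upper_crit=1.90250))   # two-tailed
```

    These three calls reproduce the REJECT, ACCEPT and REJECT conclusions shown in the 95.0% rows.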
        
Date created: 08/04/2023
Last updated: 09/25/2023

Please email comments on this WWW page to alan.heckert@nist.gov.