Name: FISHER TWO SAMPLE RANDOMIZATION TEST
Randomization tests can be used when these assumptions are questionable. Fisher introduced randomization tests (also referred to as permutation tests) in 1935. The randomization test for the equality of the means for two samples is computed as follows:
The primary drawback to this test is that NTOTAL grows rapidly as n1 and n2 increase. A test based on the full set of permutations may be computationally prohibitive except for relatively small samples. For larger n1 and n2, one approach is to generate a random subset of the complete set of permutations (typically on the order of 4,000 to 10,000 random permutations will be generated).

For this command, Dataplot uses the algorithm of Richards and Byrd. This algorithm generates the complete set of permutations. The advantage of this algorithm is that exact p-values are obtained for one-tailed tests and also for two-tailed tests when n1 = n2. If n1 is not equal to n2, an approximate p-value is obtained for the two-tailed test. The primary drawback is that this test is limited to small sample sizes. Dataplot currently limits the maximum value of n1 and n2 to 22. See the Note section below for some guidance on generating this test for larger samples based on randomly sampling the permutations.

If the two samples are not randomly drawn from larger populations, the inference will be valid for the observations under study but not necessarily for the populations from which the observations are drawn.
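The complete-enumeration approach described above can be sketched in Python. This is a minimal illustration, not Dataplot's implementation and not the Richards and Byrd algorithm. It assumes that the test statistic is the sum of the first sample, that the lower-tailed p-value is the fraction of regroupings with a sum no larger than the observed one, and that NTOTAL is the binomial coefficient C(n1+n2, n1). The function name is hypothetical. The data are the Y1 and Y2 values from the sample program on this page.

```python
from itertools import combinations
from math import comb


def exact_lower_tail_count(x, y):
    """Enumerate every way to pick len(x) values from the pooled data.

    Returns (hits, ntotal), where hits is the number of regroupings whose
    sum is <= the observed sum of x. The choice of statistic (sum of the
    first sample) is an assumption made for this sketch.
    """
    pooled = list(x) + list(y)
    n1 = len(x)
    observed = sum(x)
    hits = 0
    # Index-based combinations keep tied values as distinct observations.
    for idx in combinations(range(len(pooled)), n1):
        if sum(pooled[i] for i in idx) <= observed:
            hits += 1
    return hits, comb(len(pooled), n1)


y1 = [0, 1, 1, 0, -2]
y2 = [6, 7, 7, 4, -3, 9, 14]
hits, ntotal = exact_lower_tail_count(y1, y2)
print(hits, ntotal, round(hits / ntotal, 5))  # 11 792 0.01389
```

Under these assumptions the result agrees with the exact lower-tailed p-value of 0.01389 reported for Y1 and Y2 in the sample output. Because the loop visits all C(n1+n2, n1) regroupings, the running time grows very quickly with the sample sizes, which is the drawback noted above.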
FISHER TWO SAMPLE RANDOMIZATION TEST <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable; <y2> is the second response variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional.

The <y1> and <y2> need not be the same length. Either <y1> or <y2> (or both) may be matrix arguments. If a matrix argument is given, the response variable will consist of all observations in that matrix. Although matrix arguments are allowed, they are rarely used for this command because of the limit on the size of the response variable.
FISHER TWO SAMPLE RANDOMIZATION TEST <y1> ... <yk> <SUBSET/EXCEPT/FOR qualification>
where <y1> ... <yk> is a list of 1 to 30 response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax will implement all the pairwise Fisher two sample randomization tests for the listed response variables. For example,

    FISHER TWO SAMPLE RANDOMIZATION TEST Y1 TO Y4

is equivalent to

    FISHER TWO SAMPLE RANDOMIZATION TEST Y1 Y2
    FISHER TWO SAMPLE RANDOMIZATION TEST Y1 Y3
    FISHER TWO SAMPLE RANDOMIZATION TEST Y1 Y4
    FISHER TWO SAMPLE RANDOMIZATION TEST Y2 Y3
    FISHER TWO SAMPLE RANDOMIZATION TEST Y2 Y4
    FISHER TWO SAMPLE RANDOMIZATION TEST Y3 Y4

The <y1>, ..., <yk> need not be the same length. Any of the listed response variables may be matrix arguments. If a matrix argument is given, the response variable will consist of all observations in that matrix. Although matrix arguments are allowed, they are rarely used for this command because of the limit on the size of the response variable.
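As a rough illustration of the pairwise expansion described above, the following Python sketch lists the two-variable commands that a variable list stands for. It only builds command strings and does not run Dataplot. The helper name is hypothetical. A list of k variables yields k(k-1)/2 pairs, so the maximum of 30 variables corresponds to 435 tests.

```python
from itertools import combinations


def pairwise_commands(names):
    """Return one two-sample command string per unordered pair of names."""
    prefix = "FISHER TWO SAMPLE RANDOMIZATION TEST"
    return [f"{prefix} {a} {b}" for a, b in combinations(names, 2)]


commands = pairwise_commands(["Y1", "Y2", "Y3", "Y4"])
for command in commands:
    print(command)
print(len(commands))  # 6
```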
    PVALUE   = the p-value for the two-sided test
    PVALUELT = the p-value for the lower tailed test
    LET A = FISHER TWO SAMPLE RANDOMIZATION TEST PVALUE Y1 Y2

In addition to the above LET command, built-in statistics are supported for 20+ different commands (enter HELP STATISTICS for details).
The SAMPLE RANDOM PERMUTATION command can be used to implement other randomization tests (and to accommodate sample sizes greater than allowed here). The Program 2 and Program 3 examples demonstrate this. Although these examples use the difference of means statistic, other statistics can easily be substituted.
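The idea of sampling permutations at random for larger samples can be sketched in Python as follows. This is not a translation of the Dataplot programs and does not use the SAMPLE RANDOM PERMUTATION command. The function name, the default of 10,000 resamples, the fixed seed, and the "add one" correction to the estimate are all assumptions made for illustration. The statistic is the absolute difference of means, and another statistic could be swapped in at the marked lines.

```python
import random


def sampled_randomization_pvalue(x, y, n_resamples=10000, seed=0):
    """Estimate a two-sided p-value from randomly shuffled regroupings."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    # Statistic: absolute difference of means (replace here if desired).
    observed = abs(sum(x) / n1 - sum(y) / n2)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        first, second = pooled[:n1], pooled[n1:]
        stat = abs(sum(first) / n1 - sum(second) / n2)
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            hits += 1
    # Counting the observed arrangement itself keeps the estimate above 0.
    return (hits + 1) / (n_resamples + 1)


y1 = [0, 1, 1, 0, -2]
y2 = [6, 7, 7, 4, -3, 9, 14]
p_estimate = sampled_randomization_pvalue(y1, y2)
print(p_estimate)
```

The estimate varies with the seed and the number of resamples, and it is based on a different two-sided definition than the approximate two-tailed p-value that Dataplot reports, so the two values should not be expected to match exactly.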
Fisher (1935), "Design of Experiments", Edinburgh: Oliver and Boyd.

Conover (1999), "Practical Nonparametric Statistics", Third Edition, Wiley, p. 410.

Higgins (2004), "Introduction to Modern Nonparametric Statistics", Thomson/Brooks/Cole, Duxbury Advanced Series, Chapter 2.
    . Example from p. 410 of Conover (1999), "Practical Nonparametric
    . Statistics", Third Edition, Wiley.
    .
    let y1 = data 0 1 1 0 -2
    let y2 = data 6 7 7 4 -3 9 14
    let y3 = data 9 2 3 5 7
    let y4 = data 6 8 9 12 15
    set write decimals 5
    .
    let t = fisher two sample rand test y1 y2
    let pval = fisher two sample rand test pvalue y1 y2
    .
    print t pval
    .
    fisher two sample rand test y1 y2
    fisher two sample rand test y1 y2 y3 y4

The following output is generated

     PARAMETERS AND CONSTANTS--

        T       --        0.00000
        PVAL    --        0.02778

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y1
    Second Response Variable: Y2

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    0.00000
    Sum of Observations:                     0.00000

    Sample with Larger Mean:
    Number of Observations:                        7
    Mean:                                    6.28571
    Sum of Observations:                    44.00000

    Difference of Means:                    -6.28571

    Test Statistic:                          0.00000
    Approximate P-Value (two-tailed test):   0.02778
    Exact P-Value (lower-tailed test):       0.01389

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y1
    Second Response Variable: Y2

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    0.00000
    Sum of Observations:                     0.00000

    Sample with Larger Mean:
    Number of Observations:                        7
    Mean:                                    6.28571
    Sum of Observations:                    44.00000

    Difference of Means:                    -6.28571

    Test Statistic:                          0.00000
    Approximate P-Value (two-tailed test):   0.02778
    Exact P-Value (lower-tailed test):       0.01389

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y1
    Second Response Variable: Y3

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    0.00000
    Sum of Observations:                     0.00000

    Sample with Larger Mean:
    Number of Observations:                        5
    Mean:                                    5.20000
    Sum of Observations:                    26.00000

    Difference of Means:                    -5.20000

    Test Statistic:                          0.00000
    Approximate P-Value (two-tailed test):   0.00794
    Exact P-Value (lower-tailed test):       0.00397

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y1
    Second Response Variable: Y4

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    0.00000
    Sum of Observations:                     0.00000

    Sample with Larger Mean:
    Number of Observations:                        5
    Mean:                                   10.00000
    Sum of Observations:                    50.00000

    Difference of Means:                   -10.00000

    Test Statistic:                          0.00000
    Approximate P-Value (two-tailed test):   0.00794
    Exact P-Value (lower-tailed test):       0.00397

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y2
    Second Response Variable: Y3

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    6.28571
    Sum of Observations:                    26.00000

    Sample with Larger Mean:
    Number of Observations:                        7
    Mean:                                    5.20000
    Sum of Observations:                    44.00000

    Difference of Means:                     1.08571

    Test Statistic:                         26.00000
    Approximate P-Value (two-tailed test):   0.72222
    Exact P-Value (lower-tailed test):       0.36111

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y2
    Second Response Variable: Y4

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    5.20000
    Sum of Observations:                    26.00000

    Sample with Larger Mean:
    Number of Observations:                        5
    Mean:                                   10.00000
    Sum of Observations:                    50.00000

    Difference of Means:                    -4.80000

    Test Statistic:                         26.00000
    Approximate P-Value (two-tailed test):   0.06349
    Exact P-Value (lower-tailed test):       0.03175

            Two Sample Two-Sided Fisher Randomization Test
                        (Independent Samples)

    First Response Variable:  Y3
    Second Response Variable: Y4

    H0: E(X) = E(Y)
    Ha: E(X) <> E(Y)

    Summary Statistics:
    Sample with Smaller Mean:
    Number of Observations:                        5
    Mean:                                    5.20000
    Sum of Observations:                    26.00000

    Sample with Larger Mean:
    Number of Observations:                        5
    Mean:                                   10.00000
    Sum of Observations:                    50.00000

    Difference of Means:                    -4.80000

    Test Statistic:                         26.00000
    Approximate P-Value (two-tailed test):   0.06349
    Exact P-Value (lower-tailed test):       0.03175
Date created: 12/18/2017
Last updated: 12/11/2023
Please email comments on this WWW page to alan.heckert@nist.gov.