KRUSKAL WALLIS

Name:

KRUSKAL WALLIS

Type:

Analysis Command

Purpose:

Perform a Kruskal Wallis test that k samples come from identical populations.

Description:

The Kruskal Wallis test can be applied in the one factor ANOVA case. It is a non-parametric test for the situation where the ANOVA normality assumptions may not apply. Although this test is for identical populations, it is designed to be sensitive to unequal means.

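Let n_i (i = 1, 2, ..., k) represent the sample sizes for each of the k groups (i.e., samples) in the data, and let n = n_1 + n_2 + ... + n_k be the total number of observations. Rank the combined sample from 1 to n, assigning tied observations the average of the ranks they span, and compute R_i = the sum of the ranks for group i. The Kruskal Wallis test statistic is then: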

\( H = \frac{12} {n(n+1)} \sum_{i=1}^{k}{\frac{R_{i}^{2}} {n_i}} - 3(n+1) \)

If the null hypothesis of identical populations is true, this statistic approximately follows a chi-square distribution with k-1 degrees of freedom. Each of the n_i should be at least 5 for the approximation to be valid.
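The ranking and summation steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the formula for H, not Dataplot's implementation: the function names `average_ranks` and `kruskal_wallis_h` and the toy data are hypothetical, ties receive average ranks, and no tie correction is applied to H.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0  # average of the 1-based ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1), without tie correction."""
    combined = [v for g in groups for v in g]
    n = len(combined)
    ranks = average_ranks(combined)
    total = 0.0
    start = 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])  # rank sum for this group
        total += r_i ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


# Toy data (hypothetical): three groups of three observations each.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(h)
```

For this toy data the rank sums are 6, 15, and 24, so H = (12/90)(12 + 75 + 192) - 30 = 7.2. The groups are deliberately tiny to keep the arithmetic checkable by hand; with n_i below 5 the chi-square approximation should not be relied on.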

We reject the null hypothesis of identical populations if the test statistic H is greater than CHIPPF(1-ALPHA,K-1), where CHIPPF is the chi-square percent point function and ALPHA is the significance level.

More formally,

H₀: All of the k population distribution functions are identical

H_A: At least one of the populations tends to yield larger observations than at least one of the other populations

Test Statistic: \( H = \frac{12} {n(n+1)} \sum_{i=1}^{k}{\frac{R_{i}^{2}} {n_i}} - 3(n+1) \)

Significance Level: \( \alpha \), typically set to 0.05.

Critical Region: H > CHIPPF(\( 1 - \alpha \),k - 1) where CHIPPF is the chi-square percent point function.

Conclusion: Reject the null hypothesis if the test statistic is in the critical region.
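The decision rule can also be expressed through the p-value: H falls in the upper-tail critical region exactly when the chi-square upper-tail probability of H is below the significance level. The sketch below is a minimal stdlib-only illustration, assuming a power-series evaluation of the chi-square CDF; `chi_square_cdf` and `kruskal_wallis_decision` are hypothetical helper names, not Dataplot functions. The value H = 41.10239 with k = 4 is taken from the sample output on this page.

```python
import math


def chi_square_cdf(x, df):
    """Chi-square CDF, computed as the regularized lower incomplete gamma
    function P(df/2, x/2) using its power series."""
    if x <= 0:
        return 0.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    # Series: sum over n of z^n / (a (a+1) ... (a+n)).
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def kruskal_wallis_decision(h, k, alpha=0.05):
    """Return (p_value, reject) for an upper-tailed test with k - 1 degrees of freedom."""
    p_value = 1.0 - chi_square_cdf(h, k - 1)
    return p_value, p_value < alpha


p_value, reject = kruskal_wallis_decision(41.10239, 4, alpha=0.05)
print(p_value, reject)
```

The series is adequate for moderate values of H; a production implementation would normally rely on a statistics library for the chi-square distribution instead.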

Syntax 1:

Syntax 2:

This syntax is used for the case when the data for each group is stored in a separate variable. This syntax accepts matrix arguments.

Examples:

Note:

The Kruskal Wallis test is based on the following assumptions:

All samples are random samples from their respective populations.
In addition to independence within each sample, there is mutual independence among the various samples.
The measurement scale is at least ordinal (i.e., the data can be ranked).
Either the k population distribution functions are identical or else some of the populations tend to yield larger values than other populations do.

Note:

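If the null hypothesis is rejected, pairs of groups can be compared to identify which populations differ. Populations i and j appear to differ if the following inequality is satisfied:

\( \left| \frac{R_{i}}{N_{i}} - \frac{R_{j}}{N_{j}} \right| > \mbox{TPPF}(1 - \alpha/2) \sqrt{\frac{s^2(N-1-T)}{N-k}} \sqrt{\frac{1}{N_i} + \frac{1}{N_j}} \)

where TPPF is the t percent point function with N - k degrees of freedom, T is the Kruskal-Wallis test statistic (H above), N and N_i are the total and group sample sizes (n and n_i above), and s^2 is the sample variance of the combined ranks, which equals N(N+1)/12 when there are no ties.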
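A rough Python sketch of the pairwise comparison follows. It assumes no ties, so that s^2 = N(N+1)/12, and it takes the t percent point as an input supplied by the caller (here 2.447, the standard table value for the 0.975 point with 6 degrees of freedom). The function names and the toy rank sums are hypothetical and are not part of Dataplot.

```python
import math


def rank_sample_variance(n_total):
    """Sample variance of the ranks 1..N when there are no ties (assumption)."""
    return n_total * (n_total + 1) / 12.0


def pairwise_critical_value(t_crit, s2, t_stat, n_total, k, n_i, n_j):
    """Right-hand side of the multiple comparison inequality."""
    spread = math.sqrt(s2 * (n_total - 1 - t_stat) / (n_total - k))
    return t_crit * spread * math.sqrt(1.0 / n_i + 1.0 / n_j)


def compare_pairs(rank_sums, sizes, t_stat, s2, t_crit):
    """Return (i, j, |Ri/Ni - Rj/Nj|, critical value, differ?) for every pair i < j."""
    n_total = sum(sizes)
    k = len(sizes)
    results = []
    for i in range(k):
        for j in range(i + 1, k):
            diff = abs(rank_sums[i] / sizes[i] - rank_sums[j] / sizes[j])
            cv = pairwise_critical_value(t_crit, s2, t_stat, n_total, k,
                                         sizes[i], sizes[j])
            results.append((i + 1, j + 1, diff, cv, diff > cv))
    return results


# Toy inputs (hypothetical): 3 groups of 3, rank sums 6, 15, 24, T = 7.2.
results = compare_pairs([6.0, 15.0, 24.0], [3, 3, 3],
                        t_stat=7.2, s2=rank_sample_variance(9), t_crit=2.447)
for row in results:
    print(row)
```

With these inputs s^2(N-1-T)/(N-k) = 7.5(0.8)/6 = 1, so every pair shares the critical value 2.447 times the square root of 2/3, about 1.998. The mean rank differences are 3, 6, and 3, so all three pairs are flagged as different.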

Note:

The output was reformatted for the 2011/6 version. The SET WRITE DECIMALS command can now be used to specify the number of digits to include in the output.

Note:

with Y denoting the response variable, X denoting the group-id variable, and ALPHA denoting the significance level for the critical value.

In addition to the above LET command, built-in statistics are supported for more than 20 different commands (enter HELP STATISTICS for details).

Default:

None

Synonyms:

Related Commands:

ANOVA	= Perform an analysis of variance.
MEDIAN POLISH	= Carries out a robust ANOVA.
YATES ANALYSIS	= Analyze a Yates design.
BLOCK PLOT	= Generate a block plot.
T TEST	= Carries out a t-test.
RANK SUM TEST	= Perform a rank sum test.
SIGNED RANK TEST	= Perform a signed rank test.
PLOT	= Plots (e.g., residuals and GANOVA).

Reference:

Conover, "Practical Nonparametric Statistics."

Walpole and Myers (1978), "Probability and Statistics for Engineers and Scientists," Second Edition, Macmillan.

Applications:

Analysis of Variance

Implementation Date:

Program:

 
SKIP 25
READ SPLETT2.DAT Y MACHINE
SET WRITE DECIMALS 5
KRUSKAL WALLIS Y MACHINE

            Kruskal-Wallis One Factor Test
 
Response Variable: Y
Group-ID Variable: MACHINE
 
H0: Samples Come From Identical Populations
Ha: Samples Do Not Come From Identical Populations
 
Summary Statistics:
Total Number of Observations:                                  99
Number of Groups:                                               4
 
Kruskal-Wallis Test Statistic Value:                     41.10239
CDF of Test Statistic:                                    0.99999
P-Value:                                                  0.00000
 
 
Percent Points of the Chi-Square Reference Distribution
-----------------------------------
  Percent Point               Value
-----------------------------------
            0.0    =          0.000
           50.0    =          2.366
           75.0    =          4.107
           90.0    =          6.251
           95.0    =          7.815
           97.5    =          9.348
           99.0    =         11.345
           99.9    =         16.265
 
Conclusions (Upper 1-Tailed Test)
----------------------------------------------
  Alpha    CDF   Critical Value     Conclusion
----------------------------------------------
    10%    90%            6.251      Reject H0
     5%    95%            7.815      Reject H0
   2.5%  97.5%            9.348      Reject H0
     1%    99%           11.345      Reject H0
 
 
            Multiple Comparisons Table
 
------------------------------------------------------------------------
    I    J  |Ri/Ni - Rj/Nj|         90% CV         95% CV         99% CV
------------------------------------------------------------------------
    1    2         18.82083       10.54643       12.60485       16.68947
    1    3         47.56083       10.54643       12.60485       16.68947
    1    4          4.98083       10.54643       12.60485       16.68947
    2    3         28.74000       10.43825       12.47556       16.51830
    2    4         13.83999       10.43825       12.47556       16.51830
    3    4         42.58000       10.43825       12.47556       16.51830