7. Product and Process Comparisons
7.4. Comparisons based on data from more than two processes


The Kruskal-Wallis (KW) Test for Comparing Populations with Unknown Distributions
A nonparametric test for comparing population medians by Kruskal and Wallis 
The KW procedure tests the null hypothesis that \(k\)
samples from possibly different populations actually originate from
similar populations, at least as far as their central tendencies,
or medians, are concerned. The test assumes that the variables under
consideration have underlying continuous distributions.
In what follows assume we have \(k\) samples, and the sample size of the \(i\)th sample is \(n_i, \,\, i=1, \, 2, \, \ldots, \, k\). 

Test based on ranks of combined data
In the computation of the KW statistic, each observation is replaced by its rank in an ordered combination of all the \(k\) samples. By this we mean that the data from the \(k\) samples combined are ranked in a single series. The minimum observation is replaced by a rank of 1, the next-to-the-smallest by a rank of 2, and the largest or maximum observation is replaced by the rank of \(N\), where \(N\) is the total number of observations in all the samples (\(N\) is the sum of the \(n_i\)).
Compute the sum of the ranks for each sample
The next step is to compute the sum of the ranks for each of the original samples. The KW test determines whether these sums of ranks are so different by sample that they are not likely to have all come from the same population.
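The ranking and rank-sum steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the small data set are hypothetical, and tied observations are assumed to share the average of the ranks they occupy (midranks), a convention the text above does not spell out.

```python
def pooled_ranks(samples):
    """Rank all observations in one combined series, keeping track of
    which sample each rank belongs to. Ties get the average rank
    (an assumption; the text does not state a tie rule)."""
    # Tag every observation with the index of its sample, then sort.
    pooled = sorted((x, i) for i, sample in enumerate(samples) for x in sample)
    ranks = [[] for _ in samples]
    pos = 0
    while pos < len(pooled):
        # Find the run of tied values that starts at position pos.
        end = pos
        while end < len(pooled) and pooled[end][0] == pooled[pos][0]:
            end += 1
        # Sorted positions pos..end-1 hold ranks pos+1..end; use their mean.
        midrank = (pos + 1 + end) / 2
        for _, i in pooled[pos:end]:
            ranks[i].append(midrank)
        pos = end
    return ranks


def rank_sums(samples):
    """Sum of the pooled ranks for each original sample."""
    return [sum(r) for r in pooled_ranks(samples)]


# Hypothetical data: three samples, N = 8 observations, one tie at 2.2.
samples = [[1.2, 3.4, 2.2], [2.2, 5.0], [0.7, 4.1, 4.8]]
sums = rank_sums(samples)
n_total = sum(len(s) for s in samples)
print(sums)                                   # [10.5, 11.5, 14.0]
print(sum(sums) == n_total * (n_total + 1) / 2)  # True
```

A useful sanity check is that the rank sums always add up to \(N(N+1)/2\), whatever the data.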
Test statistic follows a \(\chi^2\) distribution 
It can be shown that if the \(k\)
samples come from the same population, that is, if the null hypothesis is true,
then the test statistic, \(H\),
used in the KW procedure is distributed approximately as a chi-square statistic with
df = \(k-1\),
provided that the sample sizes of the \(k\)
samples are not too small (say, \(n_i > 4\),
for all \(i\)).
\(H\)
is defined as follows:
$$ H = \frac{12}{N(N+1)} \, \sum_{i=1}^k \frac{R_i^2}{n_i} - 3(N+1) \, , $$
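where \(k\) is the number of samples, \(n_i\) is the number of observations in the \(i\)th sample, \(N\) is the total number of observations (the sum of the \(n_i\)), and \(R_i\) is the sum of the ranks for the \(i\)th sample.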
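The formula for \(H\) translates directly into code once the rank sums and sample sizes are known. The sketch below uses a hypothetical function name and hypothetical inputs, and it applies the formula exactly as written, with no adjustment for tied observations.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    n_total = sum(sizes)  # N, the total number of observations
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical case 1: two samples of size 2 whose rank sums are equal
# (ranks 1 and 4 versus ranks 2 and 3). H is numerically zero.
h_equal = kruskal_wallis_h([5, 5], [2, 2])

# Hypothetical case 2: the samples separate completely
# (ranks 1 and 2 versus ranks 3 and 4), so H is larger.
h_apart = kruskal_wallis_h([3, 7], [2, 2])

print(h_equal, h_apart)
```

When every sample has the same mean rank, the first term equals \(3(N+1)\) and \(H\) vanishes; the more the rank sums differ, the larger \(H\) becomes.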


Example  
An illustrative example 
The following data are from a comparison of four investment firms. The
observations represent the percentage growth of recommended funds during
a three-month period.
Step 1: Express the data in terms of their ranks


Compute the test statistic 
The corresponding \(H\)
test statistic is
$$ H = \frac{12}{19(20)} \left[ \frac{65^2}{4} + \frac{41.5^2}{5} + \frac{17.5^2}{5} + \frac{66^2}{5} \right]
 - 3(20) = 13.678 \, . $$
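For readers who want to follow the arithmetic, the computation can be broken into its individual terms, using the rank sums 65, 41.5, 17.5, and 66, the sample sizes 4, 5, 5, and 5, and \(N = 19\) that appear in the expression for \(H\):
$$ \begin{aligned}
\frac{65^2}{4} + \frac{41.5^2}{5} + \frac{17.5^2}{5} + \frac{66^2}{5} &= 1056.25 + 344.45 + 61.25 + 871.20 = 2333.15 \, , \\
\frac{12}{19(20)} \, (2333.15) &\approx 73.678 \, , \\
H &\approx 73.678 - 3(20) = 13.678 \, .
\end{aligned} $$
As a consistency check, the four rank sums add up to \(190 = 19(20)/2\), as they must when 19 observations are ranked in a single series.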
From the chi-square table
in Chapter 1, the critical value for \(1 - \alpha = 0.95\)
with df = \(k - 1 = 3\)
is 7.815. Since 13.678 > 7.815, we reject the null hypothesis.
Note that the rejection region for the KW procedure is one-sided, since we only reject the null hypothesis when the \(H\) statistic is too large.
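The one-sided decision can also be expressed as an upper-tail probability. The sketch below is an illustration under stated assumptions: it uses a closed-form expression for the upper tail of the chi-square distribution that holds for 3 degrees of freedom only, so that nothing beyond the Python standard library is needed, and the function name is hypothetical. For other degrees of freedom a statistics library would normally be used instead.

```python
import math


def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom.

    Closed form valid for df = 3 only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


h = 13.678     # test statistic from the example
alpha = 0.05   # significance level, 1 - alpha = 0.95

# Only large values of H count against the null hypothesis,
# so the p-value is the upper-tail area beyond H.
p_value = chi2_upper_tail_df3(h)
reject = p_value < alpha
print(f"p-value = {p_value:.4f}, reject null hypothesis: {reject}")
```

Comparing the upper-tail probability with \(\alpha\) is equivalent to comparing \(H\) with the tabulated critical value; both lead to rejection here.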