7.4.2.3. The ANOVA table and tests of hypotheses about means

Sums of Squares help us compute the variance estimates displayed in ANOVA Tables

The sums of squares SST and SSE previously computed for the one-way ANOVA are used to form two mean squares, one for treatments and the second for error. These mean squares are denoted by MST and MSE respectively. These are typically displayed in a tabular form, known as an ANOVA Table. The ANOVA table also shows the statistics used to test hypotheses about the population means.

When the null hypothesis of equal means is true, the two mean squares estimate the same quantity (the error variance) and should be of approximately equal magnitude. In other words, their ratio should be close to 1. If the null hypothesis is false, MST should be larger than MSE.

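The mean squares are formed by dividing each sum of squares by its associated degrees of freedom.

Let N = Σ n_i be the total number of observations, where n_i is the sample size of the i-th treatment and k is the number of treatments. Then the degrees of freedom for treatment are DFT = k - 1, and the degrees of freedom for error are DFE = N - k.
The corresponding mean squares are: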

MST = SST / DFT
MSE = SSE / DFE

The F-test

The test statistic used in testing the equality of treatment means is F = MST / MSE.

The critical value is the table value of the F distribution, based on the chosen α level and the degrees of freedom DFT and DFE.

The calculations are displayed in an ANOVA table, as follows:

Source	SS	DF	MS	F

Treatments	SST	k-1	SST / (k-1)	MST/MSE
Error	SSE	N-k	SSE / (N-k)

Total (corrected)	SS	N-1

The word "source" stands for source of variation. Some authors prefer to use "among" and "within" instead of "treatments" and "error" respectively.
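
The table layout above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the handbook: the function name `anova_rows` and its arguments are hypothetical, the input numbers are made up, and it assumes SST and SSE have already been computed and that the corrected total sum of squares equals SST + SSE, as in a one-way layout.

```python
def anova_rows(sst, sse, group_sizes):
    """Return one-way ANOVA table rows as (source, SS, DF, MS, F) tuples.

    sst, sse    -- treatment and error sums of squares (assumed precomputed)
    group_sizes -- the sample sizes n_i, one entry per treatment
    """
    k = len(group_sizes)         # number of treatments
    n_total = sum(group_sizes)   # N = sum of the n_i
    dft = k - 1                  # degrees of freedom for treatments
    dfe = n_total - k            # degrees of freedom for error
    mst = sst / dft              # mean square for treatments
    mse = sse / dfe              # mean square for error
    return [
        ("Treatments", sst, dft, mst, mst / mse),
        ("Error", sse, dfe, mse, None),
        # One-way layout: corrected total SS is SST + SSE, with N - 1 DF.
        ("Total (corrected)", sst + sse, n_total - 1, None, None),
    ]


# Illustrative inputs only; they do not come from any data set.
for row in anova_rows(sst=30.0, sse=20.0, group_sizes=[4, 4, 4]):
    print(row)
```

Cells that the table leaves blank (MS and F for the total, F for error) are represented here as `None`.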

ANOVA Table Example

A numerical example

The data below resulted from measuring the difference in resistance resulting from subjecting identical resistors to three different temperatures for a period of 24 hours. The sample size of each group was 5. In the language of Design of Experiments we have an experiment in which each of three treatments was replicated 5 times.

	Sample 1	Sample 2	Sample 3

	6.9	8.3	8.0
	5.4	6.8	10.5
	5.8	7.8	8.1
	4.6	9.2	6.9
	4.0	6.5	9.3

means	5.34	7.72	8.56

The resulting ANOVA table is

Example ANOVA table


Source	SS	DF	MS	F

Treatments	27.897	2	13.949	9.59
Error	17.452	12	1.454

Total (corrected)	45.349	14
Correction Factor	779.041	1
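
As a check on the table, the entries can be recomputed from the raw data. The sketch below assumes the usual shortcut formulas based on the correction factor (the squared grand total divided by N); the variable names are illustrative and not taken from the handbook.

```python
samples = [
    [6.9, 5.4, 5.8, 4.6, 4.0],   # Sample 1
    [8.3, 6.8, 7.8, 9.2, 6.5],   # Sample 2
    [8.0, 10.5, 8.1, 6.9, 9.3],  # Sample 3
]

k = len(samples)                              # number of treatments
n_total = sum(len(s) for s in samples)        # N
grand_total = sum(sum(s) for s in samples)

# Correction factor: (grand total)^2 / N
correction_factor = grand_total ** 2 / n_total

# Corrected total SS: sum of squared observations minus the correction factor
ss_total = sum(y * y for s in samples for y in s) - correction_factor

# Treatment SS: sum over groups of (group total)^2 / n_i, minus the correction factor
sst = sum(sum(s) ** 2 / len(s) for s in samples) - correction_factor

# Error SS by subtraction
sse = ss_total - sst

mst = sst / (k - 1)
mse = sse / (n_total - k)
f_stat = mst / mse

print(f"CF={correction_factor:.3f}  SST={sst:.3f}  SSE={sse:.3f}  SS={ss_total:.3f}")
print(f"MST={mst:.3f}  MSE={mse:.3f}  F={f_stat:.2f}")
```

The printed values should agree with the table entries to the precision shown there.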

The test statistic is the F value of 9.59. Using an α of .05, we have that F_{.05; 2, 12} = 3.89. Since the test statistic is much larger than the critical value, we reject the null hypothesis of equal population means and conclude that there is a (statistically) significant difference among the population means. The p-value for 9.59 is .00325, so the test statistic is significant at that level.
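
The critical value and p-value quoted here can be checked without tables. The sketch below relies on a special case: when the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form (1 + 2F/DFE)^(-DFE/2). The function names are hypothetical, and the shortcut does not apply for other numerator degrees of freedom, where a statistics library (for example SciPy's `scipy.stats.f`) would normally be used instead.

```python
def f_sf_num_df2(f_value, dfe):
    """Upper-tail probability P(F > f_value) for an F(2, dfe) distribution.

    Valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f_value / dfe) ** (-dfe / 2.0)


def f_crit_num_df2(alpha, dfe):
    """Upper-alpha critical value of F(2, dfe), from inverting the formula above."""
    return (dfe / 2.0) * (alpha ** (-2.0 / dfe) - 1.0)


# Values from the resistor example: F = 9.59 with DFT = 2 and DFE = 12.
f_stat, dfe, alpha = 9.59, 12, 0.05

critical = f_crit_num_df2(alpha, dfe)
p_value = f_sf_num_df2(f_stat, dfe)

print(f"critical value = {critical:.2f}, p-value = {p_value:.5f}")
print("reject H0" if f_stat > critical else "fail to reject H0")
```

Comparing the statistic with the critical value and comparing the p-value with α are equivalent decision rules, so both lead to the same conclusion.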

The populations here are resistor readings while operating under the three different temperatures. What we do not know at this point is which of the three means are different from the others, and by how much.

There are several techniques we might use to further analyze the differences. These are:

constructing confidence intervals around the difference of two means,

estimating combinations of factor levels with confidence bounds,

multiple comparisons of combinations of factor levels tested simultaneously.