5. Process Improvement
5.5. Advanced topics
5.5.9. An EDA approach to experimental design


Purpose 
The half-normal probability plot answers the question: what are the important factors?
The half-normal probability plot is a graphical tool that uses the estimated effects, ordered by magnitude, to help assess which factors are important and which are unimportant. A half-normal distribution is the distribution of |X|, where X has a normal distribution centered at zero.
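
As a minimal sketch of the definition above: if X is normal with mean zero and standard deviation sigma, then P(|X| <= x) = 2*Phi(x/sigma) - 1, so half-normal quantiles can be obtained from normal quantiles. The function names below are illustrative and not part of the handbook.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def half_normal_cdf(x, sigma=1.0):
    """P(|X| <= x) for X ~ Normal(0, sigma)."""
    if x <= 0:
        return 0.0
    return 2.0 * _STD_NORMAL.cdf(x / sigma) - 1.0


def half_normal_quantile(p, sigma=1.0):
    """Inverse of half_normal_cdf, valid for 0 <= p < 1."""
    return sigma * _STD_NORMAL.inv_cdf((1.0 + p) / 2.0)


# The median of the standard half-normal is the 75th percentile
# of the standard normal.
median = half_normal_quantile(0.5)
print(round(median, 4))
```

These quantiles are what a half-normal probability plot uses as its theoretical reference values.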

Output 
The outputs from the half-normal probability plot are


Definition 
A half-normal probability plot is formed by


Motivation 
To provide a rationale for the half-normal probability plot,
we first discuss the motivation for the normal probability
plot (which also finds frequent use in these 2-level designs).
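The basis for the normal probability plot is the mathematical form for each (and all) of the estimated effects. As discussed for the effects plot, the estimated effects are the optimal least squares estimates. Because of the orthogonality of the 2^{k} full factorial and the 2^{k-p} fractional factorial designs, all least squares estimators for main effects and interactions simplify to the form:

\text{estimated effect} = \bar{Y}(+) - \bar{Y}(-)

where \bar{Y}(+) is the average of all response values for which the factor or interaction takes on a "+" value, and \bar{Y}(-) is the average of all response values for which it takes on a "-" value.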
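
The difference-of-averages form can be sketched as follows for a 2^{3} full factorial design. The response values are hypothetical (generated from a made-up noise-free model), and the helper names `full_factorial` and `estimated_effects` are illustrative only.

```python
from itertools import combinations, product
from math import prod


def full_factorial(k):
    """All 2^k runs of a 2-level design, coded as -1/+1 tuples."""
    return list(product((-1, 1), repeat=k))


def estimated_effects(design, y):
    """Map each main effect and interaction to mean(y at +) - mean(y at -)."""
    k = len(design[0])
    effects = {}
    for order in range(1, k + 1):
        for cols in combinations(range(k), order):
            # An interaction column is the elementwise product of its factor columns.
            signs = [prod(run[c] for c in cols) for run in design]
            plus = [yi for s, yi in zip(signs, y) if s > 0]
            minus = [yi for s, yi in zip(signs, y) if s < 0]
            name = "*".join(f"X{c + 1}" for c in cols)
            effects[name] = sum(plus) / len(plus) - sum(minus) / len(minus)
    return effects


design = full_factorial(3)
# Hypothetical responses: y = 50 + 10*X1 - 3*X2, with X3 inactive and no noise.
y = [50 + 10 * x1 - 3 * x2 for x1, x2, x3 in design]
effects = estimated_effects(design, y)
for name, value in effects.items():
    print(f"{name:10s} {value:7.2f}")
```

With n = 8 runs the sketch yields n - 1 = 7 estimated effects. Because an effect is the change from the "-" level to the "+" level, it is twice the corresponding coefficient in the coded model.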
Under rather general conditions, the Central Limit Theorem allows that the difference-of-sums form for the estimated effects tends to follow a normal distribution (for a large enough sample size n). The question arises as to what normal distribution; that is, a normal distribution with what mean and what standard deviation? Since all estimators have an identical form (a difference of averages), the standard deviations, though unknown, will in fact be the same under the assumption of constant σ. This is good in that it simplifies the normality analysis.

As for the means, however, there will be differences from one effect to the next, and these differences depend on whether a factor is unimportant or important. Unimportant factors are those that have near-zero effects and important factors are those whose effects are considerably removed from zero. Thus, unimportant effects tend to have a normal distribution centered near zero while important effects tend to have a normal distribution centered at their respective true large (but unknown) effect values.

In the simplest experimental case, if the experiment were such that no factors were important (that is, all effects were near zero), the (n-1) estimated effects would behave like random drawings from a normal distribution centered at zero. We can test for such normality (and hence test for a null-effect experiment) by using the normal probability plot. Normal probability plots are easy to interpret. In simplest terms: if the plot is linear, the data are consistent with normality; if it is nonlinear, they are not.
On the other hand, if the truth behind the experiment is that there is exactly one factor that was important (that is, significantly non-zero), and all remaining factors are unimportant (that is, near-zero), then the normal probability plot of all (n-1) effects is near-linear for the (n-2) unimportant factors and the remaining single important factor would stand well off the line. Similarly, if the experiment were such that some subset of factors were important and all remaining factors were unimportant, then the normal probability plot of all (n-1) effects would be near-linear for all unimportant factors with the remaining important factors all well off the line.

In real life, with the number of important factors unknown, this suggests that one could form a normal probability plot of the (n-1) estimated effects and draw a line through those (unimportant) effects in the vicinity of zero. The remaining effects, which fall off the line, are then identified and declared important. The above rationale and methodology works well in practice, with the net effect that the normal probability plot of the effects is an important, commonly used and successfully employed tool for identifying important factors in 2-level full and fractional factorial experiments.

Following the lead of Cuthbert Daniel (1976), we augment the methodology and arrive at a further improvement. Specifically, the sign of each estimate is completely arbitrary and will reverse depending on how the initial assignments were made (e.g., we could assign "-" to treatment A and "+" to treatment B or just as easily assign "+" to treatment A and "-" to treatment B). This arbitrariness is addressed by dealing with the effect magnitudes rather than the signed effects. If the signed effects follow a normal distribution centered at zero, the absolute values of the effects follow a half-normal distribution.
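
The arbitrariness of the sign can be illustrated with a small sketch: swapping which treatment is coded "+" and which is coded "-" reverses the sign of the estimated effect but leaves its magnitude unchanged. The four response values and the `effect` helper are hypothetical.

```python
def effect(levels, y):
    """Difference of averages: mean(y at +1) - mean(y at -1)."""
    plus = [yi for s, yi in zip(levels, y) if s > 0]
    minus = [yi for s, yi in zip(levels, y) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


# Hypothetical data: treatment A coded -1, treatment B coded +1.
levels = [-1, 1, -1, 1]
y = [10.0, 14.0, 11.0, 15.0]

original = effect(levels, y)
# Same data with the coding swapped: A is now +1 and B is -1.
flipped = effect([-s for s in levels], y)

print(original, flipped, abs(original) == abs(flipped))
```

Working with the absolute values removes the dependence on this coding choice.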
In this new context, one tests for important versus unimportant factors by generating a half-normal probability plot of the absolute value of the effects. As before, linearity implies half-normality, which in turn implies all factors are unimportant. More typically, however, the half-normal probability plot will be only partially linear. Unimportant (that is, near-zero) effects manifest themselves as being near zero and on a line while important (that is, large) effects manifest themselves by being off the line and well-displaced from zero.
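
A rough sketch of how the plot coordinates might be computed is given below. It rests on several assumptions that are not taken from the text: the plotting positions (i - 0.5)/m are one common approximation to the theoretical order statistic medians, the reference line through the origin is fitted to the smaller half of the effects as an informal heuristic, and the effect values are made up for illustration (they are not the defective springs estimates).

```python
from statistics import NormalDist


def half_normal_plot_points(effects):
    """Return (theoretical quantile, |effect|, name) tuples, smallest |effect| first."""
    std_normal = NormalDist()
    ranked = sorted(effects.items(), key=lambda kv: abs(kv[1]))
    m = len(ranked)
    points = []
    for i, (name, value) in enumerate(ranked, start=1):
        p = (i - 0.5) / m                        # assumed plotting position
        q = std_normal.inv_cdf((1.0 + p) / 2.0)  # half-normal quantile
        points.append((q, abs(value), name))
    return points


# Hypothetical estimated effects from an 8-run, 2-level experiment.
effects = {
    "X1": 23.0, "X2": -5.0, "X3": 1.5,
    "X1*X2": -0.8, "X1*X3": 10.0, "X2*X3": 0.4, "X1*X2*X3": -1.1,
}
points = half_normal_plot_points(effects)

# Heuristic reference line through the origin, fitted by least squares
# to the smaller half of the points (assumed to be the unimportant effects).
small = points[: len(points) // 2 + 1]
slope = sum(q * a for q, a, _ in small) / sum(q * q for q, _, _ in small)

for q, a, name in points:
    # A ratio far above 1 means the point sits well above the reference line.
    print(f"{name:10s} quantile={q:5.2f} |effect|={a:6.2f} ratio={a / (slope * q):6.2f}")
```

Plotting |effect| against the quantile gives the half-normal probability plot; effects whose points lie far above the reference line are the candidates for "important". The final judgment is graphical, so the ratio is only an aid to reading the plot.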

Plot for defective springs data 
The half-normal probability plot of the effects for the defective
springs data set is as follows.


How to interpret 
From the half-normal probability plot, we look for the following:


Conclusions for the defective springs data 
The application of the half-normal probability plot to the
defective springs data set results in the following conclusions:
