5.
Process Improvement
5.5.
Advanced topics
5.5.9.

An EDA approach to experimental design


Introduction

This section presents an
exploratory data analysis (EDA)
approach to analyzing the data from a designed experiment. This
material is meant to complement, not replace, the more modelbased
approach for analyzing experiment designs given in
section 4 of this chapter.
Choosing an appropriate design is discussed in detail in
section 3 of this chapter.


Starting point

Problem category

The problem category we will address is the screening problem.
Two characteristics of screening problems are:
 There are many factors to consider.
 Each of these factors may be either continuous or
discrete.

Desired output

The desired output from the analysis of a screening problem is:
 A ranked list (by order of importance) of factors.
 The best settings for each of the factors.
 A good model.
 Insight.

Problem essentials

The essentials of the screening problem are:
 There are k factors with n observations.
 The generic model is:
Y = f(X_{1},
X_{2}, ...,
X_{k}) + ε

Design type

In particular, the EDA approach is applied to 2^{k}
full factorial and
2^{kp}
fractional factorial designs.
An EDA approach is particularly applicable to screening designs
because we are in the preliminary stages of understanding our
process.

EDA philosophy

EDA is not a single technique. It is an approach to analyzing
data.
 EDA is datadriven. That is, we do not assume an initial
model. Rather, we attempt to let the data speak for
themselves.
 EDA is questionbased. That is, we select a technique
to answer one or more questions.
 EDA utilizes multiple techniques rather than depending on a
single technique. Different plots have a different basis,
focus, and sensitivities, and therefore may bring out different
aspects of the data. When multiple
techniques give us a redundancy of conclusions, this increases
our confidence that our conclusions are valid. When they give
conflicting conclusions, this may be giving us a clue as
to the nature of our data.
 EDA tools are often graphical. The primary objective is
to provide insight into the data, which graphical
techniques often provide more readily than quantitative
techniques.

10Step process

The following is a 10step EDA process for analyzing the data
from 2^{k} full factorial and
2^{kp} fractional factorial designs.
 Ordered data plot
 DOE scatter plot
 DOE mean plot
 Interaction effects matrix plot
 Block plot
 DOE Youden plot
 Effects plot
 Halfnormal probability plot
 Cumulative residual standard
deviation plot
 DOE contour plot
Each of these plots will be presented with the following format:
 Purpose of the plot
 Output of the plot
 Definition of the plot
 Motivation for the plot
 An example of the plot using the defective springs data
 A discussion of how to interpret the plot
 Conclusions we can draw from the plot for the defective springs
data


Data set

Defective springs data

The plots presented in this section are demonstrated with
a data set from
Box and Bisgaard (1987).
These data are from a 2^{3} full factorial data set that
contains the following variables:
 Response variable Y = percentage of springs without
cracks
 Factor 1 = oven temperature (2 levels: 1450 and 1600 F)
 Factor 2 = carbon concentration (2 levels: 0.5% and 0.7%)
 Factor 3 = quench temperature (2 levels: 70 and 120 F)
Y X1 X2 X3
Percent Oven Carbon Quench
Acceptable Temperature Concentration Temperature

67 1 1 1
79 +1 1 1
61 1 +1 1
75 +1 +1 1
59 1 1 +1
90 +1 1 +1
52 1 +1 +1
87 +1 +1 +1
(The reader can download the data as a
text file.)
