5.5. Advanced topics
5.5.9. An EDA approach to experimental design
5.5.9.9. Cumulative residual standard deviation plot
|Interpolation and extrapolation||
The previous section illustrated how to compute predicted values at
the points included in the design. One of the virtues of modeling
is that the resulting prediction equation is not restricted to
the design data points. From the prediction equation, predicted
values can be computed elsewhere: within the domain of the data
(interpolation) or outside the domain of the data (extrapolation).
The ability to predict away from the design points gives added insight into the nature of the response. This insight is "free" and is an important benefit of the entire model-building exercise.
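As a minimal sketch of computing predictions away from the design points, the following Python fragment fits a model with main effects and the two-factor interaction to a hypothetical 2^2 full factorial in coded units (-1, +1), then evaluates it at one interior point and one exterior point. The response values, the helper names `model_matrix` and `predict`, and the use of NumPy least squares are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

# Hypothetical 2^2 full factorial in coded units (-1 = low, +1 = high).
X1 = np.array([-1.0, 1.0, -1.0, 1.0])
X2 = np.array([-1.0, -1.0, 1.0, 1.0])
Y = np.array([2.0, 4.0, 6.0, 12.0])  # made-up responses, for illustration only


def model_matrix(x1, x2):
    """Columns: constant, X1, X2, X1*X2."""
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))
    x2 = np.atleast_1d(np.asarray(x2, dtype=float))
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])


# Least-squares coefficients of the prediction equation.
coef, *_ = np.linalg.lstsq(model_matrix(X1, X2), Y, rcond=None)


def predict(x1, x2):
    """Predicted response at arbitrary coded settings."""
    return model_matrix(x1, x2) @ coef


inside = predict(0.0, 0.5)[0]   # interpolation: inside the (-1, +1) square
outside = predict(2.0, 2.0)[0]  # extrapolation: outside the design region

print("coefficients:", coef)
print("interpolated prediction:", inside)
print("extrapolated prediction:", outside)
```

The same `predict` call serves both cases; nothing in the arithmetic distinguishes a trustworthy interior prediction from a risky exterior one, which is why the judgment has to come from the analyst.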
|Predict with caution||
Can we be fooled and misled by such a mathematical and computational
exercise? After all, is not the data the only thing that is "real", and
everything else artificial? The answer is "yes", and so
interpolation/extrapolation is a double-edged sword that must be wielded
with care. The best attitude, especially for extrapolation, is
to view the derived conclusions with extra caution.
By construction, the recommended fitted models should be good at the design points. If the full model is used, the fit at the design points will be perfect. If the full model is reduced just a bit, the fit will typically still be quite good. By continuity, one would expect a perfect or good fit at the design points to carry over to the immediate vicinity of the design points. However, such local goodness does not guarantee that the derived model will be good at some distance from the design points.
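The point can be illustrated with a small numerical sketch. The "true" response below is a made-up function with curvature in X1, chosen only because a two-level design cannot detect that curvature; the function name `true_response` and all numbers are assumptions for illustration. The full model reproduces the design points exactly, the reduced model (interaction dropped) fits them closely, yet the prediction error grows away from the design points.

```python
import numpy as np


def true_response(x1, x2):
    # Hypothetical "truth" with a quadratic term in x1 (illustration only).
    return 10 + 2 * x1 + 3 * x2 + x1 * x2 + 4 * x1**2


# 2^2 design in coded units and the (noise-free) responses at its corners.
x1 = np.array([-1.0, 1.0, -1.0, 1.0])
x2 = np.array([-1.0, -1.0, 1.0, 1.0])
y = true_response(x1, x2)


def terms(a, b):
    """Full-model columns: constant, X1, X2, X1*X2."""
    a = np.atleast_1d(np.asarray(a, dtype=float))
    b = np.atleast_1d(np.asarray(b, dtype=float))
    return np.column_stack([np.ones_like(a), a, b, a * b])


# Full model: as many coefficients as design points.
full_coef, *_ = np.linalg.lstsq(terms(x1, x2), y, rcond=None)
full_resid = y - terms(x1, x2) @ full_coef

# Reduced model: drop the interaction column.
reduced_coef, *_ = np.linalg.lstsq(terms(x1, x2)[:, :3], y, rcond=None)
reduced_resid = y - terms(x1, x2)[:, :3] @ reduced_coef


def full_model_error(a, b):
    """Absolute error of the full model relative to the hypothetical truth."""
    return float(np.abs(terms(a, b) @ full_coef - true_response(a, b))[0])


err_near = full_model_error(0.9, 0.9)  # close to a design point
err_center = full_model_error(0.0, 0.0)  # interior, far from the corners
err_far = full_model_error(2.0, 2.0)  # well outside the design region

print("full-model residuals:   ", full_resid)
print("reduced-model residuals:", reduced_resid)
print("errors (near, center, far):", err_near, err_center, err_far)
```

In this constructed case the error is smallest next to a design point and larger both at the center and outside the region, so a perfect fit at the design points says little about the fit elsewhere.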
|Do confirmation runs||
Modeling and prediction allow us to go beyond the data to gain
additional insights, but they must be done with great caution.
Interpolation is generally safer than extrapolation, but
mis-prediction, error, and misinterpretation are liable to occur
in either case.
The analyst should definitely perform the model-building process and enjoy the ability to predict elsewhere, but must always be prepared to validate the interpolated and extrapolated predictions by collecting additional real, confirmatory data. The general empirical model that we recommend knows "nothing" about the engineering, physics, or chemistry surrounding your particular measurement problem, and although the model is the best generic model available, it must nonetheless be confirmed by additional data. Such additional data can be planned before the experiment or collected after it. If planned beforehand, a recommended procedure for checking the validity of the fitted model is to augment the usual 2^k or 2^(k-p) designs with additional points at the center of the design. This is discussed in the next section.
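One common way to use center points, sketched here under stated assumptions, is to compare the average response at the center with the average response at the factorial points. A model containing only main effects and interactions in coded units predicts the factorial average at the center, so a large gap suggests curvature that the model cannot represent. The data, the variable names, and the choice to estimate the standard error from the replicated center runs alone are illustrative assumptions; the handbook's own procedure may differ in detail.

```python
import numpy as np

# Hypothetical responses at the four corners of a 2^2 design
# and at four replicated center-point runs (coded settings 0, 0).
corner_y = np.array([2.0, 4.0, 6.0, 12.0])
center_y = np.array([7.0, 7.4, 6.8, 7.2])

n_f = corner_y.size
n_c = center_y.size

# The model's prediction at the center equals the corner average.
predicted_center = corner_y.mean()
observed_center = center_y.mean()

# Gap between prediction and confirmation data.
curvature = predicted_center - observed_center

# Standard error of the gap, using the center replicates as the
# estimate of run-to-run variability (n_c - 1 degrees of freedom).
s_center = center_y.std(ddof=1)
std_err = s_center * np.sqrt(1.0 / n_f + 1.0 / n_c)

t_ratio = curvature / std_err

print("predicted at center:", predicted_center)
print("observed at center: ", observed_center)
print("curvature estimate: ", curvature)
print("t-like ratio:       ", t_ratio)
```

A ratio that is large in magnitude relative to a t distribution with `n_c - 1` degrees of freedom would indicate that the fitted model mispredicts at the center, and that interior predictions should not be trusted without a richer model.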
|Applies only for continuous factors||Of course, all such discussion of interpolation and extrapolation makes sense only in the context of continuous, ordinal factors such as temperature, time, pressure, and size. Interpolation and extrapolation make no sense for discrete, non-ordinal factors such as supplier, operator, or design type.|