1.4.2.6. Filter Transmittance

Graphical Output and Interpretation

Goal The goal of this analysis is threefold:
1. Determine if the univariate model:

$$Y_{i} = C + E_{i}$$

is appropriate and valid.

2. Determine if the typical underlying assumptions for an "in control" measurement process are valid. These assumptions are:
   1. random drawings;
   2. from a fixed distribution;
   3. with the distribution having a fixed location; and
   4. the distribution having a fixed scale.
3. Determine if the confidence interval

$$\bar{Y} \pm 2s/\sqrt{N}$$

is appropriate and valid, where s is the standard deviation of the original data and N is the number of observations.
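
As a minimal sketch of the interval in the third goal, the function below computes the mean plus or minus 2s/sqrt(N) for a list of readings. The function name and the sample values are hypothetical (they are not the filter transmittance data), and s is assumed here to be the sample standard deviation with an N - 1 denominator. The interval is only meaningful when the four assumptions listed above hold.

```python
import math
import statistics


def approx_confidence_interval(y):
    """Return (lower, upper) for mean(y) +/- 2*s/sqrt(N).

    Assumption: s is the sample standard deviation (N - 1 denominator).
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(y)
    s = statistics.stdev(y)
    half_width = 2.0 * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical readings, used only to show the call.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
low, high = approx_confidence_interval(sample)
print(f"approximate interval: ({low:.3f}, {high:.3f})")
```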

4-Plot of Data
Interpretation The assumptions are addressed by the graphics shown above:
1. The run sequence plot (upper left) indicates a significant shift in location around x=35.

2. The linear appearance in the lag plot (upper right) indicates a non-random pattern in the data.

3. Since the lag plot indicates significant non-randomness, we do not make any interpretation of either the histogram (lower left) or the normal probability plot (lower right).
The serious violation of the randomness assumption means that the univariate model
$$Y_{i} = C + E_{i}$$
is not valid. Given the linear appearance of the lag plot, the first step might be to consider a model of the type
$$Y_{i} = A_0 + A_1 Y_{i-1} + E_{i}$$
However, in this case, discussions with the scientist revealed that non-randomness was entirely unexpected. An examination of the experimental process revealed that the sampling rate for the automatic data acquisition system was too fast. That is, the equipment did not have sufficient time to reset before the next sample started, resulting in the current measurement being contaminated by the previous measurement. The solution was to rerun the experiment, allowing more time between samples.
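
For reference, the lag-1 model mentioned above could be fit by ordinary least squares, regressing each observation on its predecessor. The sketch below is one way to do that with the Python standard library. The function name and the synthetic series are hypothetical, and this is not the remedy adopted in this case study, where the experiment was rerun instead.

```python
import statistics


def fit_lag1_model(y):
    """Least-squares estimates (a0, a1) for Y[i] = a0 + a1*Y[i-1] + E[i]."""
    y = list(y)
    if len(y) < 3:
        raise ValueError("need at least three observations")
    prev, curr = y[:-1], y[1:]          # the pairs a lag plot displays
    mean_prev = statistics.fmean(prev)
    mean_curr = statistics.fmean(curr)
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("previous values are constant; slope is undefined")
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    a1 = sxy / sxx
    a0 = mean_curr - a1 * mean_prev
    return a0, a1


# Synthetic, noise-free series built from known coefficients (illustration only).
series = [1.0]
for _ in range(19):
    series.append(0.5 + 0.8 * series[-1])

a0, a1 = fit_lag1_model(series)
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}")
```

An estimated slope well away from zero corresponds to the linear pattern seen in a lag plot. Whether that pattern reflects real structure or an artifact of data collection still has to be settled by the experimenter.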

Simple graphical techniques can be quite effective in revealing unexpected results in the data. When this occurs, it is important to investigate whether the unexpected result is due to problems in the experiment and data collection or is indicative of unexpected underlying structure in the data. This determination cannot be made on the basis of statistics alone. The role of the graphical and statistical analysis is to detect problems or unexpected results in the data. Resolving the issues requires the knowledge of the scientist or engineer.

Individual Plots Although it is generally unnecessary, the plots can be generated individually to give more detail. Since the lag plot indicates significant non-randomness, we omit the distributional plots.
Run Sequence Plot

Lag Plot
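
The two individual plots have simple numerical companions that can be computed alongside them. The sketch below shows a difference in means across a chosen split point (for the location shift seen in a run sequence plot) and the lag-1 autocorrelation (for the linear pattern seen in a lag plot). The function names and toy inputs are hypothetical. The split index is supplied by the analyst after looking at the plot and is not estimated from the data.

```python
import statistics


def location_shift(y, split):
    """Mean of y[split:] minus mean of y[:split]."""
    if not 0 < split < len(y):
        raise ValueError("split must leave observations on both sides")
    return statistics.fmean(y[split:]) - statistics.fmean(y[:split])


def lag1_autocorrelation(y):
    """Lag-1 autocorrelation: sum of adjacent deviation products
    divided by the sum of squared deviations from the overall mean."""
    mean = statistics.fmean(y)
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    return num / denom


# Toy inputs, used only to show the calls.
print(location_shift([1, 1, 1, 3, 3, 3], split=3))
print(lag1_autocorrelation([1, 2, 3, 4, 5]))
```

Values near zero for both quantities are consistent with fixed location and random drawings. They supplement the plots and do not replace them.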