
## Underlying Assumptions

### Assumptions Underlying a Measurement Process

There are four assumptions that typically underlie all measurement processes; namely, that the data from the process at hand "behave like":
1. random drawings;
2. from a fixed distribution;
3. with the distribution having fixed location; and
4. with the distribution having fixed variation.

### Univariate or Single Response Variable

The "fixed location" referred to in item 3 above differs for different problem types. The simplest problem type is univariate; that is, a single variable. For the univariate problem, the general model
response = deterministic component + random component
becomes
response = constant + error

### Assumptions for Univariate Model

For this case, the "fixed location" is simply the unknown constant. We can thus imagine the process at hand to be operating under constant conditions that produce a single column of data with the properties that
• the data are uncorrelated with one another;
• the random component has a fixed distribution;
• the deterministic component consists of only a constant; and
• the random component has fixed variation.
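
The four properties above can be illustrated with a small simulation. The following is a minimal sketch, not a prescribed procedure: it assumes normally distributed errors, and the helper names (`simulate_univariate`, `lag1_autocorrelation`, `half_summaries`) as well as the constant, spread, and sample size are hypothetical choices made for illustration. It generates data from response = constant + error and computes rough numerical checks of randomness, fixed location, and fixed variation.

```python
import numpy as np

def simulate_univariate(constant, sigma, n, seed=0):
    """Draw n values from response = constant + error (i.i.d. normal error, an assumption)."""
    rng = np.random.default_rng(seed)
    return constant + rng.normal(loc=0.0, scale=sigma, size=n)

def lag1_autocorrelation(y):
    """Sample lag-1 autocorrelation; values near 0 are consistent with uncorrelated data."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

def half_summaries(y):
    """(mean, standard deviation) of the first and second halves of the run."""
    y = np.asarray(y, dtype=float)
    first, second = y[: len(y) // 2], y[len(y) // 2 :]
    return (first.mean(), first.std(ddof=1)), (second.mean(), second.std(ddof=1))

# Hypothetical process: constant 10, error standard deviation 2.
y = simulate_univariate(constant=10.0, sigma=2.0, n=1000)
print("lag-1 autocorrelation:", lag1_autocorrelation(y))
print("(mean, sd) by half:", half_summaries(y))
```

If the process meets the assumptions, the lag-1 autocorrelation should be close to zero and the two halves should have similar means and standard deviations. These are only rough numerical summaries and do not replace a fuller check of the assumptions.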

### Extrapolation to a Function of Many Variables

The power and importance of the univariate model lie in the fact that it can easily be extended to the more general case where the deterministic component is not just a constant, but is in fact a function of many variables, and the engineering objective is to characterize and model the function.
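
In symbols, one way to write this extension is shown below. The notation is an assumption introduced here for illustration and does not come from the text: $Y_i$ is the $i$-th response, $X_{1i}, \ldots, X_{ki}$ are the values of $k$ factors, $f$ is the deterministic function, and $E_i$ is the random error.

```latex
\[
  Y_i = f(X_{1i}, X_{2i}, \ldots, X_{ki}) + E_i , \qquad i = 1, \ldots, n
\]
% The univariate model is the special case in which f is a constant C:
\[
  Y_i = C + E_i
\]
```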

### Residuals Will Behave According to Univariate Assumptions

The key point is that regardless of how many factors there are, and regardless of how complicated the function is, if the engineer succeeds in choosing a good model, then the differences (residuals) between the raw response data and the predicted values from the fitted model should themselves behave like a univariate process. Specifically, the residuals from a good fit will behave like:
• random drawings;
• from a fixed distribution;
• with fixed location (namely, 0 in this case); and
• with fixed variation.

### Validation of Model

Thus, if the residuals from the fitted model do in fact behave like the ideal, then testing the underlying assumptions becomes a tool for validating the chosen model and assessing its quality of fit. On the other hand, if the residuals from the chosen fitted model violate one or more of the above univariate assumptions, then the chosen fitted model is inadequate and an opportunity exists for arriving at an improved model.
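
The contrast between an adequate and an inadequate model can be sketched numerically. The example below is an illustration under stated assumptions, not a method taken from the text: the quadratic "true" process, the noise level, and the helper names `residuals` and `lag1_autocorrelation` are all hypothetical. It fits a straight line and a quadratic to the same data using `numpy.polyfit` and compares how the residuals from each fit behave.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 200)
# Hypothetical process: quadratic deterministic part plus i.i.d. normal error.
y = 1.0 + 0.5 * x + 0.3 * x**2 + rng.normal(0.0, 1.0, size=x.size)

def residuals(x, y, degree):
    """Residuals (raw response minus fitted value) from a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, deg=degree)
    return y - np.polyval(coeffs, x)

def lag1_autocorrelation(r):
    """Sample lag-1 autocorrelation of residuals taken in order of x."""
    d = r - r.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

for degree in (1, 2):
    r = residuals(x, y, degree)
    print(f"degree {degree}: mean residual = {r.mean():.3g}, "
          f"lag-1 autocorrelation = {lag1_autocorrelation(r):.3f}")
```

With data of this kind, the straight-line residuals retain the leftover curvature and so are strongly correlated with their neighbors, violating the "random drawings" assumption, while the quadratic residuals should look much closer to uncorrelated noise centered at 0.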