1.
Exploratory Data Analysis
1.2.
EDA Assumptions
1.2.1.

Underlying Assumptions


Assumptions Underlying a Measurement Process

There are four assumptions that typically underlie all measurement
processes; namely, that the data from the process at hand "behave
like":
 random drawings;
 from a fixed distribution;
 with the distribution having fixed location; and
 with the distribution having fixed variation.

Univariate or Single Response Variable

The "fixed location" referred to in item 3 above differs for
different problem types. The simplest problem type is univariate;
that is, a single variable. For the univariate problem, the
general model
response = deterministic component + random component
becomes
response = constant + error

Assumptions for Univariate Model

For this case, the "fixed location" is simply the unknown constant.
We can thus imagine the process at hand to be operating under
constant conditions that produce a single column of data with
the properties that
 the data are uncorrelated with one another;
 the random component has a fixed distribution;
 the deterministic component consists of only a constant; and
 the random component has fixed variation.

Extrapolation to a Function of Many Variables

The universal power and importance of the univariate model is that
it can easily be extended to the more general case where the
deterministic component is not just a constant, but is in fact a
function of many variables, and the engineering objective is to
characterize and model the function.

Residuals Will Behave According to Univariate Assumptions

The key point is that regardless of how many factors there are,
and regardless of how complicated the function is, if the engineer
succeeds in choosing a good model, then the differences (residuals)
between the raw response data and the predicted values from the
fitted model should themselves behave like a univariate process.
Furthermore, the residuals from this univariate process fit will
behave like:
 random drawings;
 from a fixed distribution;
 with fixed location (namely, 0 in this case); and
 with fixed variation.

Validation of Model

Thus if the residuals from
the fitted model do in fact behave like the ideal, then testing
of underlying assumptions becomes a tool for the validation and
quality of fit of the chosen model. On the other hand, if the
residuals from the chosen fitted model violate one or more of the
above univariate assumptions, then the chosen fitted model is
inadequate and an opportunity exists for arriving at an improved
model.
