1.2.5.1. Consequences of Non-Randomness

Randomness Assumption

There are four assumptions that typically underlie a measurement process:

randomness;
fixed location;
fixed variation; and
fixed distribution.

The randomness assumption is the most critical but the least tested.

Consequences of Non-Randomness

If the randomness assumption does not hold, then

All of the usual statistical tests are invalid.
The calculated uncertainties for commonly used statistics become meaningless.
The calculated minimal sample size required for a pre-specified tolerance becomes meaningless.
The simple model: y = constant + error becomes invalid.
The parameter estimates become suspect and non-supportable.
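The second consequence in the list above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a first-order autoregressive series, y[t] = phi * y[t-1] + noise, as a stand-in for non-random data, and the names ar1_series, naive_standard_error, and compare are hypothetical helpers, not part of any library. It compares the standard error that the usual formula s / sqrt(n) reports with the spread that the sample mean actually shows across repeated series.

```python
import math
import random
import statistics


def ar1_series(n, phi, rng):
    """Generate n points of y[t] = phi * y[t-1] + e[t], with e[t] ~ N(0, 1).

    phi = 0 gives independent (random) data; phi near 1 gives strongly
    dependent data. The first point is drawn from N(0, 1) for simplicity.
    """
    y = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y


def naive_standard_error(y):
    """Standard error of the mean computed as if the points were independent."""
    return statistics.stdev(y) / math.sqrt(len(y))


def compare(phi, n=200, trials=500, seed=1):
    """Return (actual, claimed) uncertainty of the sample mean.

    actual:  standard deviation of the sample mean across many simulated series.
    claimed: average of the naive standard error reported within each series.
    """
    rng = random.Random(seed)
    means, claimed = [], []
    for _ in range(trials):
        y = ar1_series(n, phi, rng)
        means.append(statistics.fmean(y))
        claimed.append(naive_standard_error(y))
    return statistics.stdev(means), statistics.fmean(claimed)


for phi in (0.0, 0.8):
    actual, claimed = compare(phi)
    print(f"phi={phi}: actual={actual:.3f}  claimed={claimed:.3f}")
```

For phi = 0 the two numbers should roughly agree. For phi = 0.8 the claimed uncertainty should come out several times smaller than the actual spread of the mean, which is the sense in which the calculated uncertainty loses its meaning.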

Non-Randomness Due to Autocorrelation

One specific and common type of non-randomness is autocorrelation. Autocorrelation is the correlation between Y_t and Y_{t-k}, where k is an integer that defines the lag. That is, autocorrelation is a time-dependent form of non-randomness: the value at the current point depends on the value at the previous point if k = 1 (or on the value k points earlier otherwise). Autocorrelation is typically detected with an autocorrelation plot or a lag plot.
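The quantity shown on an autocorrelation plot can be computed directly. The following is a minimal sketch, assuming the common sample estimate in which the lag-k sum of products of deviations from the mean is divided by the total sum of squared deviations. The function name autocorrelation and the two test series are illustrative choices, not taken from the text above.

```python
import random


def autocorrelation(y, k):
    """Sample autocorrelation of the sequence y at lag k (0 <= k < len(y)).

    Assumes y is not constant; a constant series makes the denominator zero.
    """
    n = len(y)
    mean = sum(y) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((v - mean) ** 2 for v in y)
    # Numerator: sum of products of deviations that are k steps apart.
    num = sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k))
    return num / denom


rng = random.Random(0)

# Independent draws: the randomness assumption holds.
white = [rng.gauss(0, 1) for _ in range(2000)]

# Running sum of the same draws (a random walk): each point is the
# previous point plus noise, so adjacent values are strongly related.
walk = []
total = 0.0
for e in white:
    total += e
    walk.append(total)

print("lag-1, independent draws:", round(autocorrelation(white, 1), 3))
print("lag-1, random walk:      ", round(autocorrelation(walk, 1), 3))
```

For the independent draws the lag-1 value should be close to zero, while for the random walk it should be close to one. Repeating the calculation for k = 1, 2, 3, ... gives the values that an autocorrelation plot displays.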

If the data are not random due to autocorrelation, then

Adjacent data values may be related.
There may not be n independent snapshots of the phenomenon under study.
There may be undetected "junk" outliers.
There may be undetected "information-rich" outliers.
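The point about not having n independent snapshots can be made concrete with a rough calculation. The sketch below assumes one particular dependence structure, a stationary first-order autoregressive process with lag-1 autocorrelation rho and large n, for which n (1 - rho) / (1 + rho) is a standard approximation to the equivalent number of independent observations when estimating the mean. The function name effective_sample_size is hypothetical, and other forms of autocorrelation would need a different formula.

```python
def effective_sample_size(n, rho):
    """Approximate number of independent observations in n autocorrelated points.

    Assumes a stationary first-order autoregressive process with lag-1
    autocorrelation rho (-1 < rho < 1) and large n. This is an
    approximation for the sample mean, not a general result.
    """
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1 - rho) / (1 + rho)


for rho in (0.0, 0.5, 0.8):
    print(f"n=100, rho={rho}: about {effective_sample_size(100, rho):.1f} independent points")
```

Under these assumptions, 100 points with rho = 0.5 carry about as much information about the mean as 33 independent points, and with rho = 0.8 only about 11.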