1.4.2.3. Random Walk

## Develop A Better Model

### Lag Plot Suggests Better Model

Since the underlying assumptions did not hold, we need to develop a better model.

The lag plot showed a distinct linear pattern. Given the definition of the lag plot, $Y_i$ versus $Y_{i-1}$, a good candidate model is a model of the form

$$Y_{i} = A_0 + A_1 Y_{i-1} + E_{i}$$

### Fit Output

A linear fit of this model generated the following results.

| Coefficient | Estimate | Std. Error | t-Value |
|---|---|---|---|
| A0 | 0.050165 | 0.024171 | 2.075 |
| A1 | 0.987087 | 0.006313 | 156.350 |

Residual Standard Deviation = 0.2931
Residual Degrees of Freedom = 497
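
A fit of this form can be sketched in Python as an ordinary least-squares regression of each value on its predecessor. This is only an illustration under stated assumptions: the data are a simulated random walk (uniform steps, arbitrary seed), not the data set analyzed above, so the numbers it produces will not match the table. The function name `fit_ar1` is hypothetical.

```python
import numpy as np

def fit_ar1(y):
    """Least-squares fit of y[i] = a0 + a1 * y[i-1] + e[i] (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    x, target = y[:-1], y[1:]                  # lagged values and current values
    X = np.column_stack([np.ones_like(x), x])  # design matrix: intercept, lag-1 term
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    dof = len(target) - 2                      # observations minus fitted parameters
    resid_sd = np.sqrt(resid @ resid / dof)    # residual standard deviation
    cov = resid_sd**2 * np.linalg.inv(X.T @ X) # covariance of the estimates
    std_err = np.sqrt(np.diag(cov))
    return coef, std_err, coef / std_err, resid_sd, dof

# Assumption: simulated data standing in for the real series.
rng = np.random.default_rng(0)
y = np.cumsum(rng.uniform(-0.5, 0.5, size=500))

coef, std_err, t_values, resid_sd, dof = fit_ar1(y)
for name, c, s, t in zip(("A0", "A1"), coef, std_err, t_values):
    print(f"{name}  estimate={c:.6f}  std_err={s:.6f}  t={t:.3f}")
print(f"Residual standard deviation = {resid_sd:.4f}")
print(f"Residual degrees of freedom = {dof}")
```

With 500 observations, the lagged regression uses 499 pairs and two parameters, which gives the 497 residual degrees of freedom reported above.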


The slope parameter, $A_1$, has a t-value of 156.350, which is statistically significant. The residual standard deviation of the fit is 0.2931, compared with the standard deviation of the original data shown in the summary table, 2.078675. That is, the fit to the autoregressive model has reduced the variability, as measured by the standard deviation, by roughly a factor of 7.
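
As a quick check of the arithmetic, the reduction factor is the ratio of the two standard deviations, and each t-value is the estimate divided by its standard error (small differences from the reported values come from rounding in the printed figures):

$$\frac{2.078675}{0.2931} \approx 7.09, \qquad t_{A_1} = \frac{0.987087}{0.006313} \approx 156.4, \qquad t_{A_0} = \frac{0.050165}{0.024171} \approx 2.075$$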

### Time Series Model

This model is an example of a time series model. A more extensive discussion of time series is given in the Process Monitoring chapter.