4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.2. Alaska Pipeline

4.6.2.5. Weighting to Improve Fit

Weighting

When the assumption of constant standard deviation of the errors (i.e., homogeneous variances) is violated, another approach is to perform a weighted fit. In a weighted fit, we give less weight to the less precise measurements and more weight to the more precise measurements when estimating the unknown parameters in the model.
Fit for Estimating Weights

For the pipeline data, we chose approximate replicate groups so that each group has four observations (the last group only has three). This was done by first sorting the data by the predictor variable and then taking four points in succession to form each replicate group.
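
The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the Dataplot procedure: the function names `replicate_groups` and `group_summaries` are hypothetical, the data are synthetic stand-ins rather than the pipeline measurements, and the choice to summarize each group by the mean of the predictor and the variance of the response is an assumption.

```python
import random
from statistics import mean, variance

def replicate_groups(x, y, size=4):
    """Sort (x, y) pairs by x and cut them into consecutive groups of `size`.

    The final group keeps whatever is left over, so it may be smaller.
    Ties in x are broken by y here; a different tie rule can move
    observations into different groups.
    """
    pairs = sorted(zip(x, y))
    return [pairs[i:i + size] for i in range(0, len(pairs), size)]

def group_summaries(groups):
    """Return (mean of x, sample variance of y) for each group (assumed summary)."""
    summaries = []
    for g in groups:
        xs = [p[0] for p in g]
        ys = [p[1] for p in g]
        summaries.append((mean(xs), variance(ys)))
    return summaries

# Synthetic stand-in data, NOT the pipeline measurements: 107 points
# whose scatter grows with x.
rng = random.Random(0)
x = [rng.uniform(1, 80) for _ in range(107)]
y = [xi + rng.gauss(0, 0.1 * xi) for xi in x]

groups = replicate_groups(x, y)
summaries = group_summaries(groups)
print(len(groups), len(groups[-1]))  # 27 groups; the last one has 3 points
```

With 107 observations this yields 26 groups of four plus one group of three, which matches the sample size of 27 in the fit for estimating weights.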

Using the power function model with the data for estimating the weights, Dataplot generated the following output for the fit of ln(variances) against ln(means) for the replicate groups. The output has been edited slightly for display.


LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N       =       27
NUMBER OF VARIABLES =        1
NO REPLICATION CASE


PARAMETER ESTIMATES           (APPROX. ST. DEV.)    T VALUE
1  A0                  -3.18451       (0.8265    )         -3.9
2  A1       XTEMP       1.69001       (0.2344    )          7.2

RESIDUAL    STANDARD DEVIATION =         0.8561206460
RESIDUAL    DEGREES OF FREEDOM =          25

[Plot of replicate variances against replicate means, with fit]

The fit output and the plot of the replicate variances against the replicate means show that a linear fit (on the log scale) is reasonable, with an estimated slope of 1.69. Because this data set has only a small number of replicates, other software may give a slightly different estimate for the slope. For example, S-PLUS generated a slope estimate of 1.52. The difference is caused by the sorting of the predictor variable: where the data contain actual replicates, different sorting algorithms may put some observations in different replicate groups. The slope will be used as the exponent in the weight function. In practice, any value in the range 1.5 to 2.0 is probably reasonable and should produce comparable results for the weighted fit.
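
As a rough illustration of how such a slope is obtained, the sketch below fits a straight line to ln(variances) against ln(means) by ordinary least squares. The helper names `fit_line` and `power_law_exponent` are hypothetical, and the four (mean, variance) pairs are made up to follow an exact power law with exponent 1.5. They are not the replicate-group values from the pipeline data.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = a0 + a1*x; returns (a0, a1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def power_law_exponent(means, variances):
    """Fit ln(variance) = a0 + a1*ln(mean); a1 is the power-law exponent."""
    log_m = [math.log(m) for m in means]
    log_v = [math.log(v) for v in variances]
    return fit_line(log_m, log_v)

# Made-up group summaries that follow variance = 0.5 * mean**1.5 exactly.
means = [1.0, 2.0, 4.0, 8.0]
variances = [0.5 * m ** 1.5 for m in means]

a0, a1 = power_law_exponent(means, variances)
print(a0, a1)  # a1 recovers the exponent 1.5 up to rounding error
```

With real replicate groups the points scatter around the line, so the estimated exponent depends on how the groups were formed.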

We used an estimate of 1.5 for the exponent in the weighting function.
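
The step from the fitted slope to the weights can be written out as follows. This is a sketch under two assumptions that are not stated explicitly here: that the power function model relates the error variance to the lab measurement x_i, and that the weights are taken inversely proportional to that variance. The symbols \gamma_1, \gamma_2 and w_i are notation introduced for this illustration.

```latex
\ln(\hat{\sigma}_i^2) = \gamma_1 + \gamma_2 \ln(\bar{x}_i)
\quad\Longrightarrow\quad
\sigma_i^2 \propto x_i^{\gamma_2}

w_i = \frac{1}{x_i^{\gamma_2}} = \frac{1}{x_i^{1.5}}
\qquad \text{with } \gamma_2 = 1.5
```

Only the relative sizes of the weights matter, so the proportionality constant e^{\gamma_1} can be dropped.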

Residual Plot for Weight Function

[Plot of residuals from the fit for estimating weights]

The residual plot from the fit to determine an appropriate weighting function reveals no obvious problems.

Numerical Output from Weighted Fit

Dataplot generated the following output for the weighted fit of the model that relates the field measurements to the lab measurements (edited slightly for display).
LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N       =      107
NUMBER OF VARIABLES =        1
REPLICATION CASE
REPLICATION STANDARD DEVIATION =     0.6112687111D+01
REPLICATION DEGREES OF FREEDOM =          29
NUMBER OF DISTINCT SUBSETS     =          78


PARAMETER ESTIMATES           (APPROX. ST. DEV.)    T VALUE
1  A0                   2.35234       (0.5431    )          4.3
2  A1       LAB        0.806363       (0.2265E-01)          36.

RESIDUAL    STANDARD DEVIATION =         0.3645902574
RESIDUAL    DEGREES OF FREEDOM =         105
REPLICATION STANDARD DEVIATION =         6.1126871109
REPLICATION DEGREES OF FREEDOM =          29

This output shows a slope of 0.81 and an intercept term of 2.35. For comparison, the original (unweighted) model had a slope of 0.73 and an intercept of 4.99.
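
For readers who want to see what a weighted straight-line fit computes, here is a minimal sketch in plain Python. It minimizes the weighted sum of squared residuals in closed form. The function name `weighted_line_fit` is hypothetical, the weight form 1/x**1.5 is an assumption based on the exponent chosen above, and the data are an artificial exact line rather than the pipeline measurements, so the numbers are unrelated to the Dataplot output.

```python
def weighted_line_fit(x, y, w):
    """Fit y = a0 + a1*x by minimizing sum(w_i * (y_i - a0 - a1*x_i)**2).

    Returns (a0, a1).
    """
    sw = sum(w)
    # Weighted means of x and y.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sums of squares and cross-products about those means.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Artificial data on an exact line (illustrative values only).
x = [2.0, 5.0, 10.0, 20.0, 40.0]
y = [1.0 + 0.5 * xi for xi in x]

# Assumed weight function: inverse of x raised to the chosen exponent 1.5.
w = [1.0 / xi ** 1.5 for xi in x]

a0, a1 = weighted_line_fit(x, y, w)
print(a0, a1)  # recovers intercept 1.0 and slope 0.5 up to rounding error
```

On noisy data the weights pull the fitted line toward the small-x observations, which is why the weighted and unweighted estimates can differ.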
Plot of Predicted Values

[Plot of predicted values with the raw data]

The plot of the predicted values with the data indicates a good fit.

Diagnostic Plots of Weighted Residuals

[6-plot of the weighted residuals]

We need to verify that the weighting did not result in the other regression assumptions being violated. A 6-plot, after weighting the residuals, indicates that the regression assumptions are satisfied.

Plot of Weighted Residuals vs Lab Defect Size

[Plot of weighted residuals versus the predictor variable]

To check the assumption of homogeneous variances for the errors in more detail, we generate a full-sized plot of the weighted residuals versus the predictor variable. This plot suggests that the errors now have homogeneous variances.
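
The weighted residuals used in these diagnostic plots can be illustrated with a short sketch. It assumes the common definition in which each raw residual is multiplied by the square root of its weight. The function name `weighted_residuals` and the numbers are hypothetical, constructed so that the raw residuals grow like x**0.75 and the assumed weights 1/x**1.5 remove that trend.

```python
import math

def weighted_residuals(y, fitted, w):
    """Scale each raw residual y_i - fitted_i by sqrt(w_i) (assumed definition)."""
    return [math.sqrt(wi) * (yi - fi) for yi, fi, wi in zip(y, fitted, w)]

# Illustrative values only: raw residuals alternate in sign and grow as x**0.75.
x = [1.0, 4.0, 9.0, 16.0]
fitted = [1.0 + 0.5 * xi for xi in x]
signs = [1.0, -1.0, 1.0, -1.0]
y = [fi + s * xi ** 0.75 for fi, s, xi in zip(fitted, signs, x)]

w = [1.0 / xi ** 1.5 for xi in x]  # assumed weight function
raw = [yi - fi for yi, fi in zip(y, fitted)]
scaled = weighted_residuals(y, fitted, w)

print(raw)     # magnitudes increase with x
print(scaled)  # magnitudes are all close to 1
```

Plotting `scaled` against `x` is the kind of check described above: a band of roughly constant width suggests the weights have evened out the variances.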
