
# CALIBRATION

Name:
CALIBRATION
Type:
Analysis Command
Purpose:
Compute a linear or quadratic calibration using multiple methods.
Description:
The goal of calibration is to quantitatively convert measurements made on one of two measurement scales to the other measurement scale. There is also a model that describes the relationship between the two measurement scales.

The primary measurement scale is usually the scientifically relevant scale and measurements on this scale are typically more precise (relatively) than measurements on the secondary scale. However, the secondary scale is typically the easier measurement to obtain (i.e., it is typically cheaper or faster or more readily available).

So given a measurement on the secondary scale, we want to convert that to an estimate of the measurement on the primary scale. The steps involved are:

1. We start with a series of points that have been measured on both scales. The secondary measurement is treated as the response variable, Y, and the primary measurement is treated as the independent variable, X.

2. We fit Y as a function of X. Currently, Dataplot supports calibration for the case where Y and X can be fit with either a linear fit

Y = A0 + A1*X

or a quadratic fit

Y = A0 + A1*X + A2*X**2

This is typically referred to as the calibration curve.

Although these are the most common calibration models in practice, other calibration models are also used. For example, the fit could be multi-linear (i.e., more than one X variable), a higher order polynomial, or non-linear. These cases are not supported directly. However, you can use a bootstrap approach for many of these problems.

3. We then have one or more points measured on the secondary scale with no corresponding measurement on the primary scale.

We use the calibration curve to estimate the value of the measurement on the primary scale. In addition, we estimate a confidence interval for the estimated value on the primary scale.
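The three steps above can be sketched in a few lines of Python. This is only an illustration of the workflow, not Dataplot code or Dataplot's implementation: the function names `fit_line` and `calibrate` and the data values are made up for the example, and only the point estimate is computed (no confidence interval).

```python
def fit_line(x, y):
    """Ordinary least squares fit of Y = A0 + A1*X; returns (a0, a1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ssdx = sum((xi - xbar) ** 2 for xi in x)
    a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ssdx
    a0 = ybar - a1 * xbar
    return a0, a1


def calibrate(y0, a0, a1):
    """Convert a secondary-scale reading y0 to the primary scale."""
    return (y0 - a0) / a1


# Step 1: points measured on both scales (hypothetical values).
x = [1.0, 2.0, 3.0, 4.0]      # primary scale
y = [5.0, 8.0, 11.0, 14.0]    # secondary scale, here exactly 2 + 3*x

# Step 2: fit the calibration curve.
a0, a1 = fit_line(x, y)

# Step 3: convert new secondary-scale readings to the primary scale.
new_readings = [6.5, 12.5]
estimates = [calibrate(y0, a0, a1) for y0 in new_readings]
print(a0, a1, estimates)
```

Because the example data lie exactly on a line, the fit recovers the intercept 2 and slope 3, and the two readings map back to 1.5 and 3.5 on the primary scale.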

The calibration problem has received significant attention, and a number of different methods have been proposed for the calibration estimates. Most of these methods return the same value for the point estimate. However, the method for obtaining the confidence interval is typically different. We describe the "classical" method in some detail. For the other methods, we give references to the literature.

Given that in the calibration problem the primary measurement (the higher quality measurement) is assigned to the independent variable (x axis) and the secondary measurement is assigned to the dependent variable (y axis), a reasonable question is why we don't simply switch the axes and assign the secondary measurement to the independent variable. The reason is that least squares fitting assumes that the values of the independent variable are fixed (i.e., measured without error). To come as close as possible to satisfying this assumption, we assign the higher quality measurement to the independent variable.

When Dataplot performs a calibration, it first prints out a summary of the initial fit. It then loops through each point being calibrated and prints the estimate for the primary scale and the corresponding confidence limits.

Calibration is discussed in the NIST/SEMATECH e-Handbook of Statistical Methods.

Description of Methods:
In this section, we only give the final computational formulas. A reference is given for most methods that discusses the derivation of the formula.

The following are some quantities that are used by several methods:

$$\bar{y}$$: mean of the Y (secondary measurement) values
$$\bar{x}$$: mean of the X (primary measurement) values
A0: intercept value for the fit between Y and X
A1: slope value for the fit between Y and X
ssdx: $$\sum_{i=1}^{n}{(X_{i} - \bar{x})^2}$$
ssx: $$\sum_{i=1}^{n}{X_{i}^2}$$
ssdy: $$\sum_{i=1}^{n}{(Y_{i} - \bar{y})^2}$$
s: the residual standard deviation

For most of these methods, given a calibration point, Y0, the X0 is estimated from the original fit by

X0 = (Y0 - A0)/A1

with A0 and A1 denoting the coefficients from the original fit:

Y = A0 + A1 X
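As a quick check of this formula, the short Python sketch below applies it to the intercept and slope printed in the sample output of the Program section (13.5058 and 0.7902). The function name `point_estimate` is only illustrative. Because the printed coefficients are rounded to four decimals, the results should agree with the X0 values in the sample output only to within about 0.01.

```python
# Intercept and slope as printed (rounded) in the sample output.
A0 = 13.5058
A1 = 0.7902


def point_estimate(y0, a0=A0, a1=A1):
    """Classical calibration estimate X0 = (Y0 - A0) / A1."""
    return (y0 - a0) / a1


for y0 in (150.0, 200.0, 250.0, 300.0):
    print(f"Y0 = {y0:8.4f}   X0 = {point_estimate(y0):9.4f}")
```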

Dataplot generates the linear calibration using the following methods:

1. Inverse Prediction Limits (Eisenhart)

This method was originally recommended by Churchill Eisenhart and is based on inverting the prediction limits for Y given X0. The prediction interval is

$$Y_0 = \bar{y} + A1 (X_0 - \bar{x}) \pm t_{(1-\alpha/2,N-2)}s \sqrt{1 + \frac{1}{N} + \frac{(X_0 - \bar{x})^2}{ssdx}}$$

The uncertainty is obtained from the linear regression prediction interval

$$\hat{Y} \pm t_{1 - \alpha/2,\nu} \hat{\sigma}_{p}$$

with $$\hat{\sigma}_{p}$$ denoting the standard deviation of the predicted value. The formula for $$\hat{\sigma}_{p}$$ is

$$\hat{\sigma}_{p} = \sqrt{\hat{\sigma}^2 + \hat{\sigma}_{f}^2}$$

with

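$$\begin{array}{lcl} \hat{\sigma}^2 & = & \mbox{variance of the residuals} \\ & = & \sum_{i=1}^{N}{\frac{(Y_i - \hat{Y}_i)^2}{N-2}} \\ \hat{\sigma}_{f}^2 & = & \mbox{variance of the fitted value at } X_0 \\ & = & \hat{\sigma}^2 \left( \frac{1}{N} + \frac{(X_0 - \bar{x})^2}{ssdx} \right) \end{array}$$

and $$\nu = N - 2$$ denoting the residual degrees of freedom for the linear fit.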

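To find the confidence limits for X0 (X0L and X0U), we solve the following two equations for X, with $$\hat{\sigma}_{p}$$ evaluated at X. The smaller of the two solutions is X0L and the larger is X0U.

$$Y_0 = (A0 + A1 \times X) - t_{1 - \alpha/2,\nu} \hat{\sigma}_{p}$$

$$Y_0 = (A0 + A1 \times X) + t_{1 - \alpha/2,\nu} \hat{\sigma}_{p}$$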
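The inversion can be illustrated with a small Python sketch. It is a hedged example under stated assumptions, not Dataplot's code: it uses the standard straight-line prediction interval (X centered at its mean, residual variance with N-2 degrees of freedom), squares the prediction-limit equation to get a quadratic in X, and takes the two roots as the limits. The function name, the data, and the hard-coded t quantile (2.776 for 4 degrees of freedom at 95%) are all choices made for the example.

```python
import math


def inverse_prediction_limits(x, y, y0, t_crit):
    """Return (x0, lower, upper) by inverting the prediction limits of
    a straight-line fit. t_crit is the t quantile t(1-alpha/2, N-2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ssdx = sum((xi - xbar) ** 2 for xi in x)
    a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ssdx
    a0 = ybar - a1 * xbar
    # Residual variance with N-2 degrees of freedom.
    s2 = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)

    # With d = X - xbar and c = y0 - ybar, squaring
    #   c - a1*d = +/- t*s*sqrt(1 + 1/n + d**2/ssdx)
    # gives the quadratic qa*d**2 + qb*d + qc = 0.
    c = y0 - ybar
    g = t_crit ** 2 * s2 / ssdx
    qa = a1 ** 2 - g
    if qa <= 0:
        raise ValueError("slope not significant; limits are not finite")
    qb = -2.0 * a1 * c
    qc = c ** 2 - t_crit ** 2 * s2 * (1.0 + 1.0 / n)
    root = math.sqrt(qb ** 2 - 4.0 * qa * qc)
    d_lo = (-qb - root) / (2.0 * qa)
    d_hi = (-qb + root) / (2.0 * qa)
    return (y0 - a0) / a1, xbar + d_lo, xbar + d_hi


# Hypothetical data: 6 points, so N-2 = 4 and t(0.975, 4) = 2.776.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x0, lower, upper = inverse_prediction_limits(x, y, 9.0, 2.776)
print(x0, lower, upper)
```

The resulting interval is generally not symmetric about X0, which matches the pattern of the Inverse Prediction Limits rows in the sample output.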

2. Graybill-Iyer

This method is described on pages 427-431 of the Graybill and Iyer textbook (see the Reference section below).

3. Neter-Wasserman-Kutner

This method is described on pages 135-137 of the Neter, Wasserman, and Kutner textbook (see the Reference section below).

4. Propagation of Error

5. Inverse (Krutchkoff)

This method is described in the Krutchkoff paper (see the Reference Section below).

6. Maximum Likelihood

7. Bootstrap

For this method, the confidence limits are obtained by generating bootstrap samples, obtaining the point estimate for each bootstrap sample, and then computing a confidence interval based on the percentiles of these bootstrap point estimates. For example, a 95% confidence interval would be obtained from the 2.5 and 97.5 percentiles.

There are two methods for generating the bootstrap samples.

1. In the first approach, the least squares fit is computed from the original data. The residuals are then resampled. The residuals are added to the predicted values of the original fit to obtain a new Y vector. This new Y vector is then fit against the original X variable and the point estimate for the calibration is obtained from these Y and X. We call this approach residual resampling (or the Efron approach).

2. In the second approach, rows of the original data (both the Y vector and the corresponding rows of the X variables) are resampled. The resampled data are then fit. We call this approach data resampling (or the Wu approach).

Hamilton (see Reference below) gives some guidance on the contrasts between these approaches.

1. Residual resampling assumes fixed X values and independent and identically distributed residuals.

2. Data resampling does not assume independent and identically distributed residuals.

Given the above, if the assumption of fixed X is realistic (that is, we could readily collect new Y's with the same X values), then residual resampling is justified. For example, this would be the case in a designed experiment. However, if this assumption is not realistic (i.e., the X values vary randomly as well as the Y's), then data resampling is preferred.

The CALIBRATION command will generate estimates for both types of bootstrap sampling.
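The two resampling schemes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Dataplot's implementation: the function names, the number of bootstrap samples (2000), the seed, the simple order-statistic percentile rule, and the data are all choices made for the example.

```python
import random


def fit_line(x, y):
    """Least squares fit of Y = A0 + A1*X; returns (a0, a1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ssdx = sum((xi - xbar) ** 2 for xi in x)
    a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ssdx
    return ybar - a1 * xbar, a1


def percentile_interval(values, alpha):
    """Simple order-statistic percentile interval (no interpolation)."""
    v = sorted(values)
    lo_idx = int((alpha / 2.0) * len(v))
    hi_idx = int((1.0 - alpha / 2.0) * len(v)) - 1
    return v[lo_idx], v[hi_idx]


def bootstrap_calibration(x, y, y0, nboot=2000, alpha=0.05, seed=1):
    """Return (x0, residual-resampling CI, data-resampling CI)."""
    rng = random.Random(seed)
    n = len(x)
    a0, a1 = fit_line(x, y)
    pred = [a0 + a1 * xi for xi in x]
    res = [yi - pi for yi, pi in zip(y, pred)]
    est_res, est_data = [], []
    while len(est_res) < nboot:
        # Residual resampling: new Y = fitted value + resampled residual,
        # refit against the original X.
        ystar = [pi + rng.choice(res) for pi in pred]
        b0, b1 = fit_line(x, ystar)
        est_res.append((y0 - b0) / b1)
    while len(est_data) < nboot:
        # Data resampling: resample (X, Y) rows together, then refit.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope undefined
        b0, b1 = fit_line(xs, [y[i] for i in idx])
        est_data.append((y0 - b0) / b1)
    return ((y0 - a0) / a1,
            percentile_interval(est_res, alpha),
            percentile_interval(est_data, alpha))


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x0, ci_residual, ci_data = bootstrap_calibration(x, y, 9.0)
print(x0, ci_residual, ci_data)
```

The only difference between the two loops is what gets resampled: residuals around the original fit (X held fixed) in the first, whole (X, Y) rows in the second.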

Additional methods may be supported in future releases.

Dataplot generates the quadratic calibration using the following methods:

1. Inverse Prediction Limits (Eisenhart)

This uses the same idea as the linear calibration inverse prediction limits. That is, we invert the quadratic regression equation to obtain the point estimates, and the confidence intervals are based on inverting the quadratic prediction limits. The algebraic details are not given here.

2. Bootstrap

See the comments above for the bootstrap using linear calibration. The same basic ideas apply except that we perform a quadratic rather than a linear fit.

Additional methods may be supported in future releases.
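For the quadratic point estimate, inverting Y0 = A0 + A1*X + A2*X**2 means solving a quadratic equation, which can have two roots. The Python sketch below shows one way to do this. How Dataplot chooses between the roots is not stated here, so keeping only roots inside the range of the calibration data is an assumption of this example, as are the function name and the coefficients.

```python
import math


def invert_quadratic(y0, a0, a1, a2, x_min, x_max):
    """Solve y0 = a0 + a1*x + a2*x**2 for x and return the roots that
    fall inside [x_min, x_max] (assumed to be the calibration range)."""
    if a2 == 0.0:
        roots = [(y0 - a0) / a1]          # reduces to the linear case
    else:
        disc = a1 ** 2 - 4.0 * a2 * (a0 - y0)
        if disc < 0.0:
            return []                      # y0 is never reached by the curve
        r = math.sqrt(disc)
        roots = [(-a1 - r) / (2.0 * a2), (-a1 + r) / (2.0 * a2)]
    return sorted(x for x in roots if x_min <= x <= x_max)


# Hypothetical coefficients: Y = 1 + 2*X + 0.5*X**2, calibrated on 0 <= X <= 10.
result = invert_quadratic(11.5, 1.0, 2.0, 0.5, 0.0, 10.0)
print(result)
```

For these coefficients the two algebraic roots are -7 and 3, and only 3 lies in the assumed calibration range.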

Syntax 1:
LINEAR CALIBRATION <y> <x> <y0>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable (secondary measurements);
<x> is the independent variable (primary measurements);
<y0> is a number, parameter, or variable containing the secondary measurements where the calibration is to be performed;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes a linear calibration analysis.

Syntax 2:
QUADRATIC CALIBRATION <y> <x> <y0>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable (secondary measurements);
<x> is the independent variable (primary measurements);
<y0> is a number, parameter, or variable containing the secondary measurements where the calibration is to be performed;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes a quadratic calibration analysis.

Examples:
LINEAR CALIBRATION Y X Y0
LINEAR CALIBRATION Y X Y0 SUBSET X > 2
Note:
To simplify the generation of additional plots and analysis, a number of results are written to external files.

The following variables are written to the file dpst1f.dat.

Column 1 - method id
Column 2 - Y0 (i.e., calibration point on secondary scale)
Column 3 - X0 (i.e., for Y0, the estimate on the primary scale)
Column 4 - lower confidence limit
Column 5 - upper confidence limit

The following variables are written to the file dpst2f.dat.

Column 1 - Y0
Columns 2 thru 9 - X0 for each of the 8 methods (only 3 methods for quadratic calibration)

The following variables are written to the file dpst3f.dat.

Column 1 - Y0
Columns 2 thru 9 - lower limit for X0 for each of the 8 methods (only 3 methods for quadratic calibration)

The following variables are written to the file dpst4f.dat.

Column 1 - Y0
Columns 2 thru 9 - upper limit for X0 for each of the 8 methods (only 3 methods for quadratic calibration)
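If these files are post-processed outside Dataplot, the five-column layout of dpst1f.dat can be read with a few lines of Python. This sketch assumes the file is plain text with whitespace-separated numeric columns, which is not confirmed here. The function name `parse_dpst1f` is hypothetical, and the sample text is built by hand from the first two rows of the sample output rather than copied from a real file.

```python
def parse_dpst1f(text):
    """Parse text laid out like dpst1f.dat: five whitespace-separated
    numeric columns per row. Rows that do not fit are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            method, y0, x0, lower, upper = (float(f) for f in fields)
        except ValueError:
            continue
        rows.append({"method": int(method), "y0": y0, "x0": x0,
                     "lower": lower, "upper": upper})
    return rows


# Hand-built stand-in for the file contents (assumed layout).
sample = """\
  1.0   150.0   172.7309    90.6910   246.3664
  2.0   150.0   172.7309    90.6910   246.3664
"""
rows = parse_dpst1f(sample)
widths = [r["upper"] - r["lower"] for r in rows]
print(rows[0], widths)
```

With a real file, the same function could be applied to the result of `open("dpst1f.dat").read()`, for example to compare interval widths across methods.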
Note:
The default confidence limits are for a 95% confidence interval (i.e., $$\alpha$$ = 0.05). To use a different alpha, enter the command (before entering the CALIBRATION command):

LET ALPHA = <value>

For example, to generate 90% confidence intervals, enter

LET ALPHA = 0.10
Default:
None
Synonyms:
None
Related Commands:
FIT = Perform a fit.
BOOTSTRAP FIT = Generate a bootstrap fit.
BOOTSTRAP PLOT = Generate a bootstrap plot.
References:
Churchill Eisenhart (1939). "The Interpretation of Certain Regression Methods and Their Use in Biological and Industrial Research," Annals of Mathematical Statistics, Vol. 10, pp. 162-182.

F. Graybill and H. Iyer. "Regression Analysis," First Edition, Duxbury Press, pp. 427-431.

R. G. Krutchkoff (1967). "Classical and Inverse Methods of Calibration," Technometrics, Vol. 9, pp. 425-439.

Neter, Wasserman, and Kutner. "Applied Linear Statistical Models," Third Edition, Irwin, pp. 173-175.

B. Hoadley (1970). "A Bayesian Look at Inverse Linear Regression," Journal of the American Statistical Association, Vol. 65, pp. 356-369.

H. Scheffe (1973). "A Statistical Theory of Calibration," Annals of Statistics, Vol. 1, pp. 1-37.

P. J. Brown (1982). "Multivariate Calibration," (with discussion), Journal of the Royal Statistical Society, Series B, Vol. 44, pp. 287-321.

A. Racine-Poon (1988). "A Bayesian Approach to Nonlinear Calibration Problems," Journal of the American Statistical Association, Vol. 83, pp. 650-656.

C. Osborne (1991). "Statistical Calibration: A Review," International Statistical Review, Vol. 59, pp. 309-336.

Hamilton (1992). "Regression with Graphics: A Second Course in Applied Statistics," Duxbury Press.

Applications:
Calibration
Implementation Date:
2003/7
Program:

SKIP 25
LET Y0 = DATA 150 200 250 300
.
LINEAR CALIBRATION Y X Y0

The following output is generated:
            Linear Calibration Analysis
Summary of Linear Fit Between Y        and X

Number of Observations:                              16
Estimate of Intercept:                          13.5058
SD(Intercept):                                  21.0476
t(Intercept):                                    0.6416
Estimate of Slope:                               0.7902
SD(Slope):                                       0.0710
t(Slope):                                       11.1236
Residual Standard Deviation:                    26.2077

Linear Calibration Summary

Y0 =     150.0000
--------------------------------------------------------------------------
                                                        95%            95%
Method                                    X0    Lower Limit    Upper Limit
--------------------------------------------------------------------------
1. Inverse Prediction Limits:       172.7309        90.6910       246.3664
2. Graybill-Iyer:                   172.7309        90.6910       246.3664
3. Neter-Wasserman-Kutner:          172.7309        96.4652       248.9967
4. Propogation of Error:            172.7309        83.3505       262.1114
5. Inverse (Krutchkoff):            183.7929       106.8873       260.6986
6. Maximum Likelihood:              172.7309       142.7454       194.8587
7. Bootstrap (Residuals):           172.7309       145.7333       192.7617
8. Bootstrap (Data):                172.7309       146.6036       193.8464
--------------------------------------------------------------------------

Y0 =     200.0000
--------------------------------------------------------------------------
                                                        95%            95%
Method                                    X0    Lower Limit    Upper Limit
--------------------------------------------------------------------------
1. Inverse Prediction Limits:       236.0051       158.9668       309.5252
2. Graybill-Iyer:                   236.0051       158.9668       309.5252
3. Neter-Wasserman-Kutner:          236.0051       162.1587       309.8515
4. Propogation of Error:            236.0051       134.6394       337.3707
5. Inverse (Krutchkoff):            240.6357       168.0773       313.1941
6. Maximum Likelihood:              236.0051       216.1080       253.3034
7. Bootstrap (Residuals):           236.0051       217.8168       251.9630
8. Bootstrap (Data):                236.0051       220.3722       254.0829
--------------------------------------------------------------------------

Y0 =     250.0000
--------------------------------------------------------------------------
                                                        95%            95%
Method                                    X0    Lower Limit    Upper Limit
--------------------------------------------------------------------------
1. Inverse Prediction Limits:       299.2792       225.1548       374.7718
2. Graybill-Iyer:                   299.2792       225.1548       374.7718
3. Neter-Wasserman-Kutner:          299.2792       225.8776       372.6809
4. Propogation of Error:            299.2792       185.8826       412.6759
5. Inverse (Krutchkoff):            297.4784       225.7040       369.2529
6. Maximum Likelihood:              299.2792       283.0275       316.7070
7. Bootstrap (Residuals):           299.2792       283.5742       316.0866
8. Bootstrap (Data):                299.2792       283.3301       318.4212
--------------------------------------------------------------------------

Y0 =     300.0000
--------------------------------------------------------------------------
                                                        95%            95%
Method                                    X0    Lower Limit    Upper Limit
--------------------------------------------------------------------------
1. Inverse Prediction Limits:       362.5533       289.2164       442.1448
2. Graybill-Iyer:                   362.5533       289.2164       442.1448
3. Neter-Wasserman-Kutner:          362.5533       287.5867       437.5200
4. Propogation of Error:            362.5533       237.0930       488.0137
5. Inverse (Krutchkoff):            354.3211       279.7673       428.8750
6. Maximum Likelihood:              362.5533       343.2002       387.2145
7. Bootstrap (Residuals):           362.5533       345.3757       384.6494
8. Bootstrap (Data):                362.5533       342.0917       388.3067
--------------------------------------------------------------------------


NIST is an agency of the U.S. Commerce Department.

Date created: 09/09/2010
Last updated: 10/13/2015