
Name: ORTHOGONAL DISTANCE FIT
Ordinary least squares fitting estimates the parameters of a model of the form

    y = f(x; beta) + error

where y is a response variable, f is a linear or nonlinear function, x is a list of one or more independent (or factor) variables, and beta is a list of parameters in the function to be estimated. The least squares fit generates estimates for beta and predicted and residual values for y. You can also specify weights for the response variable y. Weighting is typically applied to give more weight to observations that are known to be more precise.

In ordinary least squares fitting, the independent variables are assumed to be fixed (i.e., there is no measurement error). However, in many measurement processes, there can be significant error in the independent variables as well as the dependent variables. This is commonly referred to as the measurement error model or the errors-in-variables problem. Orthogonal distance fitting provides one method for fitting these errors-in-variables models.

Dataplot supports orthogonal distance fitting using the ODRPACK library (see the References section below). A mathematical description of orthogonal distance fitting is beyond the scope of this help file. We have placed a Postscript copy of the ODRPACK User's Guide on the Dataplot web site (see the References section below) for those who are interested in the mathematical details of orthogonal distance regression. This help file concentrates on applying orthogonal distance fitting within Dataplot.

As mentioned above, ordinary least squares allows you to specify weights for the response variable and starting values for the parameters. It returns estimates for the model parameters (beta) and predicted and residual values for the response variable. For orthogonal distance fitting, you can additionally specify the following:
In addition to the estimated (i.e., predicted) response variable, orthogonal distance fitting returns an estimate for the design matrix. More specifically, it returns the residuals, DELTA, which are added to the original design matrix to obtain the estimated (i.e., predicted) design matrix. These topics are discussed in more detail in the various "Note:" sections below.

Although errors-in-variables models were the primary motivation for incorporating ODRPACK into Dataplot, ODRPACK provides the following 2 additional capabilities:

    1. fitting implicit models;
    2. fitting models with multiple response variables.
These topics are discussed in "Note:" sections below.
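For the special case of a straight line with equal, unweighted errors in x and y, the orthogonal distance fit has a closed-form solution, which makes the contrast with ordinary least squares easy to see. The following Python sketch is illustrative only: it is not Dataplot code, it does not call ODRPACK, and the function names (ols_line, orthogonal_line) and the data are made up for this example.

```python
import math


def _centered_sums(x, y):
    """Return the means and the centered sums of squares and cross products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy


def ols_line(x, y):
    """Ordinary least squares: minimizes vertical distances (x assumed exact)."""
    mean_x, mean_y, sxx, _, sxy = _centered_sums(x, y)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def orthogonal_line(x, y):
    """Orthogonal regression: minimizes perpendicular distances to the line.

    Assumes sxy != 0, i.e. x and y are not perfectly uncorrelated.
    """
    mean_x, mean_y, sxx, syy, sxy = _centered_sums(x, y)
    diff = syy - sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return mean_y - slope * mean_x, slope


# Made-up data with scatter in both coordinates.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
print("OLS (intercept, slope):       ", ols_line(x, y))
print("Orthogonal (intercept, slope):", orthogonal_line(x, y))
```

When the data fall exactly on a line, the two fits agree. With scatter, the orthogonal slope is never smaller in magnitude than the ordinary least squares slope, because ordinary least squares attributes all of the scatter to y.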
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
      <f> is:

This syntax is appropriate for applying orthogonal distance fitting to linear, polynomial, multilinear, and nonlinear models.

<SUBSET/EXCEPT/FOR qualification>
where <f> is:

This syntax is appropriate for applying orthogonal distance fitting to implicit models.
<SUBSET/EXCEPT/FOR qualification>
where <y1> ... <yk> is a list of 2 to 5 response (= dependent) variables;
      <f1> ... <fk> is a list of 2 to 5 function names (the number of functions must equal the number of response variables);
      and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used when there are multiple response variables.

NOTE: This syntax is still being tested.
    ORTHOGONAL DISTANCE FIT Y = A0 + A1*X1 + A2*X1**2
    ORTHOGONAL DISTANCE FIT Y = A0 + A1*X1 + A2*X2
    ORTHOGONAL DISTANCE FIT Y = A0 + A1*X1 SUBSET X1 > 1
    ORTHOGONAL DISTANCE FIT Y = A+B*EXP(C*X)
Starting values are often determined from previous fits to similar data. If no such fits are available, you may need to do some preliminary analysis to determine starting values. In Dataplot, the PREFIT command can often be useful for this purpose.
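One simple form of preliminary analysis is to evaluate the residual sum of squares on a coarse grid of candidate parameter values and start the fit from the best grid point. The Python sketch below illustrates that idea for the exponential model Y = A+B*EXP(C*X) used in the examples above. It is not Dataplot code and is not a description of how PREFIT works internally; the function names, the grid values, and the synthetic data are all assumptions made for illustration.

```python
import itertools
import math


def model(x, a, b, c):
    """Exponential model y = a + b*exp(c*x)."""
    return a + b * math.exp(c * x)


def grid_start(x, y, a_grid, b_grid, c_grid):
    """Return the grid point (a, b, c) with the smallest residual sum of squares."""
    best, best_sse = None, float("inf")
    for a, b, c in itertools.product(a_grid, b_grid, c_grid):
        sse = sum((yi - model(xi, a, b, c)) ** 2 for xi, yi in zip(x, y))
        if sse < best_sse:
            best, best_sse = (a, b, c), sse
    return best, best_sse


# Synthetic data generated from a = 2, b = 3, c = -0.5 (hypothetical values).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [model(xi, 2.0, 3.0, -0.5) for xi in x]

start, sse = grid_start(
    x, y,
    a_grid=[1.0, 2.0, 3.0],
    b_grid=[1.0, 3.0, 5.0],
    c_grid=[-1.0, -0.5, -0.1],
)
print("starting values:", start, "residual sum of squares:", sse)
```

A coarse grid is usually enough, since the goal is only to start the iterative fit in the right neighborhood.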
See the Note below regarding multiresponse fits for a discussion of how weights are specified for multiresponse fits.
A common case is to specify which columns of the design matrix are to be fixed and which are to be estimated. For example, suppose there are three independent variables where the first and third are to be estimated while the second is fixed. Then enter the commands

    LET YERR = DATA 1 0 1
    ORTHOGONAL DISTANCE ERROR YERR

That is, a single variable (it does not have to be called YERR) is specified, and its number of rows must equal the number of columns in the design matrix. A zero indicates that the corresponding column is considered fixed, and a nonzero value (here, that means the absolute value is greater than 0.5) means that it will be estimated.

Note that order is important in the above command. Dataplot creates a list of independent variables when it parses the function in the ORTHOGONAL DISTANCE FIT command. This parsing is left to right, so the order of the values in the YERR variable is relative to the variable names as they are first encountered (left to right) in the function.

You can also specify which values are fixed at the observation level. For example, suppose there are two independent variables with eight observations each. For the first variable, we want the first two observations to be fixed, and for the second variable we want the first four observations to be fixed. We would enter the commands

    LET YERR1 = DATA 0 0 1 1 1 1 1 1
    LET YERR2 = DATA 0 0 0 0 1 1 1 1
    ORTHOGONAL DISTANCE ERROR YERR1 YERR2

As for the one variable case, order is important in the command. YERR1 applies to the first variable name encountered in the function (left to right), YERR2 corresponds to the second variable name encountered, and so on.

The default is to assume all values in the design matrix are to be estimated. In this case, the ORTHOGONAL DISTANCE ERROR command does not need to be entered. If you entered a previous ORTHOGONAL DISTANCE ERROR command, you can reset the default by entering
The general idea is to "fix" values in the independent variable that are known to be precise and to estimate values that have significant measurement errors. In most cases, sufficient precision will be determined at the variable level. However, there may be cases where a measurement for a given independent variable is known to be precise within a given range, but it may be error prone outside of that range.
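The two rules above (variables are taken in order of first appearance, and a flag counts as "estimated" when its absolute value exceeds 0.5) can be sketched in Python as follows. This is an illustration, not Dataplot's parser: the helper names are hypothetical, the model expression is made up, and the sketch assumes the names of the independent variables are known in advance, since this help file does not describe how Dataplot tells variables from parameters.

```python
import re


def variables_in_order(expression, variable_names):
    """List the independent variables in order of first appearance, left to right."""
    known = {name.upper() for name in variable_names}
    ordered = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", expression):
        name = token.upper()
        if name in known and name not in ordered:
            ordered.append(name)
    return ordered


def is_estimated(flag):
    """Rule from the text: zero means fixed, |flag| > 0.5 means estimated."""
    return abs(flag) > 0.5


# Hypothetical model with two independent variables; X2 appears first.
expression = "A0 + A1*X2 + A2*EXP(A3*X1)"
order = variables_in_order(expression, ["X1", "X2"])

# Observation-level flags matching the eight-observation example in the text.
yerr1 = [0, 0, 1, 1, 1, 1, 1, 1]  # first two observations fixed
yerr2 = [0, 0, 0, 0, 1, 1, 1, 1]  # first four observations fixed

for name, flags in zip(order, [yerr1, yerr2]):
    fixed_rows = [i + 1 for i, flag in enumerate(flags) if not is_estimated(flag)]
    print(name, "is fixed at observations", fixed_rows)
```

Because X2 is encountered first in this expression, the first flag variable applies to X2 rather than X1, which is the ordering pitfall the text warns about.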
    ORTHOGONAL DISTANCE DELTA WEIGHTS RHO

That is, a single variable (it does not have to be called RHO) is specified, and its number of rows must equal the number of columns in the design matrix. Each value is the weight applied to the corresponding column.

Note that order is important in the above command. Dataplot creates a list of independent variables when it parses the function in the ORTHOGONAL DISTANCE FIT command. This parsing is left to right, so the order of the values in the RHO variable is relative to the variable names as they are first encountered (left to right) in the function.

You can also specify weights at the observation level. For example, suppose there are two independent variables with eight observations each. The following shows an example of specifying individual weights.

    LET RHO2 = DATA 1 1 1 1 2 2 2 2
    ORTHOGONAL DISTANCE DELTA WEIGHTS RHO1 RHO2

As for the one variable case, order is important in the command. RHO1 applies to the first variable name encountered in the function (left to right), RHO2 corresponds to the second variable name encountered, and so on.

The default is the unweighted case. That is, all points in the design matrix will have a weight of 1. Note that if an observation or column has been designated as fixed, the weight is ignored. If you entered a previous ORTHOGONAL DISTANCE DELTA WEIGHTS command, you can reset the default by entering
The general idea is to provide greater weight to independent variables or observations that are known to be more precise.
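To see what the weights do, it helps to look at the criterion that weighted orthogonal distance regression minimizes: the weighted squared errors attributed to y plus the weighted squared errors (DELTA) attributed to x. The Python sketch below evaluates that criterion for a single independent variable. It is a simplified illustration rather than Dataplot or ODRPACK code; the function names and data are hypothetical, and exactly how Dataplot passes its weight variables to ODRPACK is not covered here.

```python
def odr_objective(f, beta, x, y, delta, w_y=None, w_delta=None):
    """Weighted orthogonal distance criterion for one independent variable.

    eps_i = f(x_i + delta_i, beta) - y_i is the error attributed to y, and
    delta_i is the error attributed to x. The criterion is the sum over i of
    w_y[i] * eps_i**2 + w_delta[i] * delta_i**2.
    """
    n = len(x)
    w_y = [1.0] * n if w_y is None else w_y
    w_delta = [1.0] * n if w_delta is None else w_delta
    total = 0.0
    for i in range(n):
        eps = f(x[i] + delta[i], beta) - y[i]
        total += w_y[i] * eps ** 2 + w_delta[i] * delta[i] ** 2
    return total


def line(x, beta):
    """Straight-line model beta[0] + beta[1]*x."""
    return beta[0] + beta[1] * x


x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 2.5]
beta = [0.0, 1.0]

# Attribute the whole discrepancy of the third point to y (all delta = 0).
all_in_y = odr_objective(line, beta, x, y, [0.0, 0.0, 0.0])
# Split the discrepancy between x and y by moving the third x value by 0.25.
split = odr_objective(line, beta, x, y, [0.0, 0.0, 0.25])
# A larger delta weight on the third point makes moving x more expensive.
split_heavy = odr_objective(line, beta, x, y, [0.0, 0.0, 0.25],
                            w_delta=[1.0, 1.0, 4.0])
print(all_in_y, split, split_heavy)
```

With unit weights, splitting the discrepancy between x and y gives a smaller criterion than putting it all in y. Raising the delta weight for a point that is known to be precise in x penalizes that adjustment, so the fit moves x less for that point.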
You can specify starting values by columns of the design matrix. For example, suppose there are two independent variables where the first independent variable will be assigned a starting value of 2 and the second independent variable will be assigned a starting value of 7. You can enter the commands

    LET DEL = DATA 2 7
    ORTHOGONAL DISTANCE DELTA DEL

That is, a single variable (it does not have to be called DEL) is specified, and its number of rows must equal the number of columns in the design matrix. Each value is the starting value assigned to the corresponding column.

Note that order is important in the above command. Dataplot creates a list of independent variables when it parses the function in the ORTHOGONAL DISTANCE FIT command. This parsing is left to right, so the order of the values in the DEL variable is relative to the variable names as they are first encountered (left to right) in the function.

You can also specify starting values at the observation level. For example, suppose there are two independent variables with eight observations each. The following shows an example of specifying individual starting values.

    LET DEL2 = DATA 0 0 0 0 1 1 1 1
    ORTHOGONAL DISTANCE DELTA DEL1 DEL2

As for the one variable case, order is important in the command. DEL1 applies to the first variable name encountered in the function (left to right), DEL2 corresponds to the second variable name encountered, and so on.
You can define a number of convergence criteria. By default, these are chosen automatically by ODRPACK, and we recommend that these default values be used unless you have a good reason for changing them. See the ODRPACK User's Guide for more information on these parameters.
The default is INTERMEDIATE. In addition, Dataplot writes the following information to files after an orthogonal distance fit:
In addition, the internal parameters RESSD and RESDF will contain the residual standard deviation and the residual degrees of freedom for the fitted model.
On most platforms, Dataplot can be compiled in a mode where single precision is treated as double precision (so Dataplot function evaluation returns double precision results). If you desire higher precision results from the orthogonal distance regression, you should install a double precision version of Dataplot on your system. Contact Alan Heckert for additional information. For NIST users, we maintain a double precision version on the Sun that can be crossmounted from the /itl/apps directory. If you have crossmounted this directory, enter
to run the double precision version.
An implicit function is defined as:

    f(x; beta) = 0
The ODRPACK software can fit implicit models and Dataplot supports this capability. See Syntax 2 above and the Program 2 example below. Basically, the response variable is omitted. The other options described above work the same for the implicit model as for the explicit model.
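As an illustration of what "the response variable is omitted" means, the Python sketch below evaluates an implicit model for a conic section in two variables, v and h. There is no y: a point lies on the fitted curve exactly when the function evaluates to zero, and the fit adjusts the parameters so that the observed points come as close as possible to that curve. The function name and the parameter values are assumptions chosen for illustration, and this is not Dataplot or ODRPACK code.

```python
import math


def conic(v, h, beta):
    """Implicit model f(v, h; beta); points on the curve satisfy f = 0."""
    b1, b2, b3, b4, b5 = beta
    return (b3 * (v - b1) ** 2
            + 2.0 * b4 * (v - b1) * (h - b2)
            + b5 * (h - b2) ** 2
            - 1.0)


# Hypothetical parameter values, used only to exercise the function.
beta = (1.0, 3.0, 0.09, 0.02, 0.08)

# A point constructed to lie on the curve: h = b2 and v = b1 + 1/sqrt(b3).
v_on = beta[0] + 1.0 / math.sqrt(beta[2])
h_on = beta[1]

print("on the curve: ", conic(v_on, h_on, beta))
print("off the curve:", conic(v_on + 1.0, h_on, beta))
```

The first value is zero up to rounding error. The second is nonzero, which is how a point that does not satisfy the implicit equation shows up.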
Syntax 3 shows the basic syntax for the multiresponse case. The primary point is that a function is specified for each response variable (you must give the name of a previously defined function, not a functional expression). The other difference is that you can specify a weight variable for each response variable. Use the command

    ORTHOGONAL DISTANCE Y WEIGHTS <varlist>

where <varlist> is a list of variables that define the weights for each of the response variables (i.e., if there are 3 response variables, there should be a list of 3 weight variables). Note that for the single response variable case, you can specify the weights either with this command or with the WEIGHTS command. If both are given, the ORTHOGONAL DISTANCE Y WEIGHTS command takes precedence.
If you need to run an orthogonal distance fit that is beyond Dataplot's capabilities, we recommend you download the ODRPACK Fortran source and use ODRPACK directly.
    ERRORS IN VARIABLES FIT
    ERRORS IN VARIABLES REGRESSION
"User's Reference Guide for ODRPACK Version 2.01: Software for Weighted Orthogonal Distance Regression", Paul Boggs, Janet Donaldson, Richard Byrd, and Robert Schnabel, NISTIR 89-4103 (Revised). Note: for those interested in the mathematics of orthogonal distance regression, we have put a Postscript copy of the ODRPACK User's Guide on the Dataplot web site at:
Program 1:
    . Performs an orthogonal distance analysis of example from
    . Wayne Fuller's book (see header of FULLODR1.DAT for reference).
    . This is example 1 from version 2.01 of ODRPACK User's Guide.
    .
    skip 25
    read fullodr1.dat x y
    let n = size y
    .
    let b1 = 1500.0
    let b2 = 50.0
    let b3 = 0.1
    let function f = b1 + b2*(exp(b3*x) - 1)**2
    .
    let yerr = 0 for i = 1 1 n
    let yerr = 1 subset x = 1 to 99
    .
    orthogonal distance error yerr
    .
    orthogonal distance fit y = f
Program 2:
    . Performs an orthogonal distance analysis of example from
    . Wayne Fuller's book (see header of FULLODR2.DAT for reference).
    . This is example 2 from version 2.01 of ODRPACK User's Guide.
    .
    . Note that this is an example of fitting an implicit function.
    .
    skip 25
    read fullodr2.dat v h
    let n = size v
    .
    let b1 = 1.0
    let b2 = 3.0
    let b3 = 0.09
    let b4 = 0.02
    let b5 = 0.08
    let function f1 = b3*(v-b1)**2
    let function f2 = 2*b4*(v-b1)*(h-b2)
    let function f3 = b5*(h-b2)**2
    let function f = f1 + f2 + f3 - 1.0
    .
    orthogonal distance fit f
Program 3:
    . Performs an orthogonal distance analysis of Draper/Smith
    . data set (p. 521 of 2nd. ed.). From version 1.3 of ODRPACK
    . User's Guide.
    .
    skip 25
    read draps521.dat y x1 x2
    .
    let a1 = 0.01155
    let a2 = 5000.0
    let function f = exp(a1*x1*exp(a2*(1/x2 - 1/620)))
    .
    let rho = data 5.0 3.0
    let yerr = data 1 0
    .
    orthogonal distance error yerr
    orthogonal distance delta weights rho
    .
    orthogonal distance fit y = f
Program 4:
    . Performs an orthogonal distance analysis of Draper/Smith
    . data set (p. 518 of 2nd. ed.). From ODRPACK ACM article.
    .
    skip 25
    read draps518.dat y x
    .
    let b1 = 4
    let b2 = 5
    let b3 = 200
    let function f = B1*10**(b2*x/(b3+x))
    .
    let yerr = data 1
    .
    orthogonal distance error yerr
    .
    orthogonal distance fit y = f
Date created: 6/5/2001 