CONDITION INDICES
Name:
Type:
Purpose:
Compute condition indices of a regression design matrix.
Description:
Condition indices are a measure of the multi-colinearity in a
regression design matrix X (i.e., the matrix of independent
variables). Multi-colinearity results when the columns of X have
significant interdependence (i.e., one or more columns of X
are close to a linear combination of the other columns).
Multi-colinearity can result in numerically unstable estimates
of the regression coefficients (small changes in X can
result in large changes to the estimated regression coefficients).
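The instability described above can be illustrated with a small
sketch. It is written in Python rather than Dataplot, and the data
values and the helper name fit_two_columns are hypothetical, chosen
only for illustration. Two nearly identical columns are fit without
an intercept, and a single entry of X is then changed by 0.02.

```python
def fit_two_columns(x1, x2, y):
    """Least squares for y ~ b1*x1 + b2*x2 (no intercept),
    solved from the 2x2 normal equations."""
    a11 = sum(v * v for v in x1)
    a12 = sum(u * v for u, v in zip(x1, x2))
    a22 = sum(v * v for v in x2)
    r1 = sum(u * v for u, v in zip(x1, y))
    r2 = sum(u * v for u, v in zip(x2, y))
    det = a11 * a22 - a12 * a12  # close to zero when x1 and x2 are nearly colinear
    return ((a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det)

# Hypothetical data: x2 is almost a copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, 3.0, 4.01]
y = [2.0, 4.1, 5.9, 8.0]

b_before = fit_two_columns(x1, x2, y)

# Change a single entry of X by 0.02 and refit.
x2_changed = [1.0, 2.0, 3.0, 3.99]
b_after = fit_two_columns(x1, x2_changed, y)

print("before:", b_before)
print("after: ", b_after)
```

In this sketch the individual coefficients change sign and size,
while their sum stays close to 2. This is the typical pattern: the
data determine the combined effect well but not how to split it
between the two columns.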
Pairwise colinearity can be detected by examining a correlation
matrix of the independent variables. However, a correlation
matrix will not reveal colinearity that involves three or more
variables at once.
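A small sketch of why a correlation matrix can miss this kind of
colinearity is given below. It is written in Python rather than
Dataplot, and the columns are hypothetical. The fourth column is an
exact sum of the other three, yet no pairwise correlation is close
to one.

```python
import math

def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical data: three mutually uncorrelated columns ...
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x3 = [1, -1, 1, -1, 1, -1, 1, -1]
# ... and a fourth that is an exact linear combination of them.
x4 = [a + b + c for a, b, c in zip(x1, x2, x3)]

columns = [x1, x2, x3, x4]
largest = max(
    abs(correlation(columns[i], columns[j]))
    for i in range(4)
    for j in range(i + 1, 4)
)
print("largest pairwise correlation:", round(largest, 3))
```

The largest pairwise correlation here is about 0.577, which would
not normally be flagged, even though the design matrix is exactly
rank deficient.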
There are a number of approaches to dealing with
multi-colinearity. Some of these include:
- Delete one or more of the independent variables from
the fit.
- Perform a principal components regression.
- Compute the regression using a singular value
decomposition approach. Note that Dataplot uses
a modified Gram-Schmidt method (Dataplot can perform
a singular value decomposition, but this has not
been incorporated into the fit).
Condition indices are one measure that can be used to
detect multi-colinearity (variance inflation factors are
another). The condition indices are calculated as follows:
- Scale the columns of the X matrix to have unit
sums of squares.
- Calculate the singular values of the scaled X
matrix and square them.
Condition indices between 30 and 100 indicate moderate to
strong colinearity.
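The calculation can be sketched in Python as shown below. This is
not Dataplot's implementation. It assumes the commonly used
definition of Belsley, Kuh, and Welsch, in which each condition
index is the largest singular value of the column-scaled X divided
by one of its singular values, so the smallest index is always 1.
The function names are hypothetical. To keep the sketch free of
external libraries, the singular values are obtained as square
roots of the eigenvalues of X'X using Jacobi rotations. A true
singular value decomposition is more accurate when the colinearity
is severe.

```python
import math

def scale_columns(x):
    """Scale each column of x (a list of rows) to unit sum of squares."""
    p = len(x[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in x)) for j in range(p)]
    return [[row[j] / norms[j] for j in range(p)] for row in x]

def symmetric_eigenvalues(a, sweeps=50, tol=1e-28):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in a]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q] (smaller-angle root).
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                a[p][q] = a[q][p] = 0.0
    return [a[i][i] for i in range(n)]

def condition_indices(x):
    """Condition indices (assumed definition: largest singular value
    of the column-scaled matrix divided by each singular value)."""
    z = scale_columns(x)
    p = len(z[0])
    gram = [[sum(row[i] * row[j] for row in z) for j in range(p)] for i in range(p)]
    eig = symmetric_eigenvalues(gram)
    sv = sorted((math.sqrt(max(e, 0.0)) for e in eig), reverse=True)
    return [sv[0] / s if s > 0.0 else math.inf for s in sv]

# Hypothetical 2x2 example: unit-length columns at an angle whose cosine is 0.6.
example = [[1.0, 0.6],
           [0.0, 0.8]]
print(condition_indices(example))  # approximately [1.0, 2.0]
```

For the small example, the eigenvalues of X'X are 1.6 and 0.4, so
the second index is the square root of 4. Because the definition
used here is an assumption, the values need not match Dataplot's
output for the same matrix.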
Syntax:
LET <y1> = CONDITION INDICES <mat1>
<SUBSET/EXCEPT/FOR qualification>
where <mat1> is the design matrix for which the condition
indices are to be computed;
<y1> is a vector where the resulting condition
indices are saved;
and where the <SUBSET/EXCEPT/FOR qualification> is
optional (and rarely used in this context).
Examples:
LET Y = CONDITION INDICES X
Note:
Matrices are created with either the READ MATRIX, CREATE MATRIX,
or MATRIX DEFINITION command. Enter HELP MATRIX DEFINITION,
HELP CREATE MATRIX, and HELP READ MATRIX for details.
Note:
The columns of a matrix are accessible as variables by appending
an index to the matrix name. For example, the 4x4 matrix C has
columns C1, C2, C3, and C4. These columns can be operated on
like any other DATAPLOT variable.
Note:
The maximum matrix size that DATAPLOT can handle is set when
DATAPLOT is built at a particular site. Enter the command
HELP MATRIX DIMENSION for details on the maximum matrix size
that can be accommodated.
Default:
Synonyms:
Related Commands:
VARIANCE INFLATION FACTORS = Compute variance inflation factors.
CREATE MATRIX              = Create a matrix from a list of variables.
FIT                        = Perform a least squares fit.
CATCHER MATRIX             = Compute the catcher matrix.
PARTIAL REGRESSION PLOT    = Generate a partial regression plot.
Reference:
"Efficient Computing of Regression Diagnostics", Velleman and
Welsch, American Statistician, November, 1981, Vol. 35, No. 4,
pp. 234-242.
Applications:
Implementation Date:
Program:
DIMENSION 100 COLUMNS
SKIP 25
READ HALD647.DAT Y X1 X2 X3 X4
SKIP 0
LET N = SIZE X1
LET X0 = SEQUENCE 1 1 N
LET Z = CREATE MATRIX X0 X1 X2 X3 X4
LET C = CONDITION INDICES Z
SET WRITE DECIMALS 2
PRINT C
The following output is generated.
VARIABLES--C
1.00
7.11
10.19
55.34
149.90
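As a reading aid, the printed values can be compared with the
30 to 100 guideline given in the Description. The short Python
sketch below does only that. Treating values above 100 as at least
as serious as those in the 30 to 100 range is an assumption, since
the guideline above does not cover that case explicitly.

```python
# Condition indices copied from the output above.
indices = [1.00, 7.11, 10.19, 55.34, 149.90]

# Guideline from the Description: 30 to 100 is moderate to strong.
moderate_to_strong = [c for c in indices if 30.0 <= c <= 100.0]

# Assumption: values above 100 are at least as serious.
beyond_range = [c for c in indices if c > 100.0]

print("30 to 100:", moderate_to_strong)
print("above 100:", beyond_range)
```

Under that reading, two of the five indices for this data set point
to colinearity among the columns of the design matrix.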
Date created: 8/6/2002
Last updated: 4/4/2003
Please email comments on this WWW page to
alan.heckert@nist.gov.