
# CORRELATION ABSOLUTE VALUE

Name:
CORRELATION ABSOLUTE VALUE (LET)
Type:
Let Subcommand
Purpose:
Compute the absolute value of the correlation coefficient between two variables.
Description:
The correlation coefficient is a measure of the linear relationship between two variables. It is computed as:

$$S_{xx} = \sum_{i=1}^{N}{(X_{i}-\bar{X})^2}$$

$$S_{yy} = \sum_{i=1}^{N}{(Y_{i}-\bar{Y})^2}$$

$$S_{xy} = \sum_{i=1}^{N}{(X_{i}-\bar{X}) (Y_{i} - \bar{Y})}$$

$$r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$$

A perfect linear relationship yields a correlation coefficient of +1 (or -1 for a negative relationship) and no linear relationship yields a correlation coefficient of 0.
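
As an illustration only (this is not Dataplot code), the computation can be sketched in Python. The function name `abs_correlation` and the sample data are hypothetical. The sketch forms the three sums defined above, divides S_xy by the square root of S_xx times S_yy (the usual Pearson normalization), and takes the absolute value.

```python
import math


def abs_correlation(x, y):
    """Return |r| for two equal-length sequences (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of elements")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Pearson correlation, then drop the sign.
    return abs(s_xy / math.sqrt(s_xx * s_yy))


# Made-up data: one perfectly increasing and one perfectly decreasing relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]

print(abs_correlation(x, y_up))    # 1.0
print(abs_correlation(x, y_down))  # 1.0 (r itself is -1)
```

Both calls return 1 because the sign of the relationship is discarded. The sketch does not guard against a constant input, for which S_xx or S_yy is zero and the ratio is undefined.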

This command takes the absolute value of the correlation coefficient. That is, we are interested in the magnitude of the correlation without regard to its direction. For example, when screening a large number of pairwise correlations, we may want to identify those that exceed a threshold regardless of whether the relationship is positive or negative.
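
The screening use case can be sketched in Python as follows (again an illustration, not Dataplot code). The column names, the data values, and the 0.7 threshold are all made up for the example, and `abs_corr` is a hypothetical helper.

```python
import math
from itertools import combinations


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length lists (hypothetical helper)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return abs(sxy / math.sqrt(sxx * syy))


# Made-up variables; "b" is a perfectly decreasing function of "a".
columns = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [5.0, 4.0, 3.0, 2.0, 1.0],
    "c": [2.0, 1.0, 4.0, 3.0, 5.0],
    "d": [1.0, 3.0, 2.0, 3.0, 1.0],
}
threshold = 0.7  # arbitrary cutoff chosen for illustration

# Keep every pair whose correlation magnitude exceeds the cutoff.
flagged = [
    (p, q)
    for p, q in combinations(columns, 2)
    if abs_corr(columns[p], columns[q]) > threshold
]
print(flagged)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

The pair (a, b) has r = -1 and (b, c) has r = -0.8. A test on the signed value `r > threshold` would miss both, while the test on the absolute value flags them.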

Syntax:
LET <par> = CORRELATION ABSOLUTE VALUE <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<par> is a parameter where the computed absolute value of the correlation is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
LET A = CORRELATION ABSOLUTE VALUE Y1 Y2
LET A = CORRELATION ABSOLUTE VALUE Y1 Y2 SUBSET TAG > 2
Note:
The two variables must have the same number of elements.
Note:
Dataplot statistics can be used in a number of commands. For details, enter

This statistic is most commonly used when displaying a large number of correlations (e.g., with the STATISTIC PLOT, CROSS TABULATE, or FLUCTUATION PLOT commands) where we want to identify the "large" correlations without distinguishing between positive and negative relationships.

Default:
None
Synonyms:
None
Related Commands:
CORRELATION = Compute the correlation.
RANK CORRELATION = Compute the rank correlation.
COVARIANCE = Compute the covariance.
CROSS TABULATE = Compute a statistic based on a cross-tabulation.
STATISTIC PLOT = Generate a statistic versus subset plot.
FLUCTUATION PLOT = Generate a fluctuation plot.
Reference:
Consult any introductory statistics text.
Applications:
Linear Regression
Implementation Date:
2011/08
Program:

SKIP 25
READ IRIS.DAT Y1 Y2 Y3 Y4 TAG
.
TITLE CASE ASIS
TITLE OFFSET 2
LABEL CASE ASIS
TIC MARK OFFSET UNITS DATA
Y1LABEL |Correlation|
YLIMITS 0 1
MAJOR YTIC MARK NUMBER 6
MINOR YTIC MARK NUMBER 1
Y1TIC MARK LABEL DECIMAL 1
Y1LABEL DISPLACEMENT 20
X1LABEL Species
XLIMITS 1 3
MAJOR XTIC MARK NUMBER 3
MINOR XTIC MARK NUMBER 0
XTIC MARK OFFSET 0.3 0.3
X1LABEL DISPLACEMENT 14
CHARACTER X BLANK
LINES BLANK SOLID
.
MULTIPLOT CORNER COORDINATES 5 5 95 95
MULTIPLOT SCALE FACTOR 2
MULTIPLOT 2 3
.
TITLE Sepal Length vs Sepal Width
CORRELATION ABSOLUTE VALUE PLOT Y1 Y2 TAG
.
TITLE Sepal Length vs Petal Length
CORRELATION ABSOLUTE VALUE PLOT Y1 Y3 TAG
.
TITLE Sepal Length vs Petal Width
CORRELATION ABSOLUTE VALUE PLOT Y1 Y4 TAG
.
TITLE Sepal Width vs Petal Length
CORRELATION ABSOLUTE VALUE PLOT Y2 Y3 TAG
.
TITLE Sepal Width vs Petal Width
CORRELATION ABSOLUTE VALUE PLOT Y2 Y4 TAG
.
TITLE Petal Length vs Petal Width
CORRELATION ABSOLUTE VALUE PLOT Y3 Y4 TAG
.
END OF MULTIPLOT
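
As we read the program above, each panel plots the absolute correlation of one pair of variables against the group variable TAG. The Python sketch below illustrates that "statistic versus subset" idea under this reading. It is not Dataplot code and does not read IRIS.DAT; the data are made up, and `abs_corr` and `abs_corr_by_group` are hypothetical names.

```python
import math
from collections import defaultdict


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length lists (hypothetical helper)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return abs(sxy / math.sqrt(sxx * syy))


def abs_corr_by_group(y1, y2, tag):
    """Compute |r| between y1 and y2 separately for each distinct tag value."""
    groups = defaultdict(lambda: ([], []))
    for a, b, t in zip(y1, y2, tag):
        groups[t][0].append(a)
        groups[t][1].append(b)
    return {t: abs_corr(xs, ys) for t, (xs, ys) in sorted(groups.items())}


# Made-up data with three groups of three observations each.
y1 = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y2 = [2.0, 4.0, 6.0, 6.0, 4.0, 2.0, 1.0, 3.0, 2.0]
tag = [1, 1, 1, 2, 2, 2, 3, 3, 3]

result = abs_corr_by_group(y1, y2, tag)
print(result)
```

In this toy data, groups 1 and 2 both give 1 (one increasing, one decreasing), and group 3 gives 0.5. Plotting these values against the group identifier corresponds to one panel of the multiplot.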


NIST is an agency of the U.S. Commerce Department.

Date created: 08/24/2011
Last updated: 11/02/2015

Please email comments on this WWW page to alan.heckert@nist.gov.