SPEARMAN DISSIMILARITY

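Name:

SPEARMAN DISSIMILARITY

Type:

Let Subcommand

Purpose:

Compute the dissimilarity of two variables based on the Spearman rank correlation coefficient.

Description:

The Spearman rank correlation coefficient is the Pearson correlation coefficient computed on the ranks of the two variables rather than on their raw values.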

The rank correlation is recommended in the following cases:

When the underlying data does not have a meaningful numerical measure, but it can be ranked;
When the relationship between the two variables is not linear;
When the normality assumption for two variables is not valid.

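A perfect linear relationship between the ranks (that is, a perfectly monotonic relationship between the two variables) yields a correlation coefficient of +1 (or -1 for a negative relationship), and no linear relationship between the ranks yields a correlation coefficient of 0.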
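
The following Python sketch illustrates the idea of correlating ranks rather than raw values. It is not Dataplot code and not Dataplot's implementation; the function names (average_ranks, pearson, spearman) are hypothetical, and giving tied values the average of their ranks is an assumption. For data that are monotonic but not linear, the Pearson coefficient falls below 1 while the rank correlation is exactly 1.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("the two variables must have the same number of elements")
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic in x, but not linear

print("Pearson :", pearson(x, y))   # below 1 because the relationship is not linear
print("Spearman:", spearman(x, y))  # 1.0 because the ranks agree exactly
```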

In some applications, such as clustering, it can be useful to transform the correlation coefficient to a dissimilarity measure. The transformation used here is

\( d = \frac{1 - r}{2} \)

This converts the correlation coefficient with values between -1 and 1 to a score between 0 and 1. High positive correlation (i.e., very similar) results in a dissimilarity near 0 and high negative correlation (i.e., very dissimilar) results in a dissimilarity near 1.
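
As a quick check of the transformation, the Python sketch below (not Dataplot code; the function name is hypothetical) evaluates d at the two endpoints, at zero, and at the rank correlation of 0.9486 reported in Program 1.

```python
def correlation_to_dissimilarity(r):
    """Map a correlation r in [-1, 1] to a dissimilarity d = (1 - r)/2 in [0, 1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return (1.0 - r) / 2.0


# Endpoints and midpoint of the correlation scale.
for r in (1.0, 0.0, -1.0):
    print(r, correlation_to_dissimilarity(r))  # 0.0, 0.5 and 1.0 respectively

# Rank correlation printed by Program 1.
print(round(correlation_to_dissimilarity(0.9486), 4))  # 0.0257
```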

If a similarity score is preferred, you can use

\( s = 1 - d \)

where d is defined as above.
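
Substituting the definition of d expresses the similarity directly in terms of the correlation coefficient r:

\( s = 1 - d = 1 - \frac{1 - r}{2} = \frac{1 + r}{2} \)

so r = 1 gives s = 1 and r = -1 gives s = 0. As a worked value, the rank correlation of 0.9486 reported in Program 1 corresponds to s = (1 + 0.9486)/2 = 0.9743.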

Syntax 1:

Syntax 2:

Examples:

Note:

The two variables must have the same number of elements.

Default:

None

Synonyms:

SPEARMAN DISTANCE is a synonym for SPEARMAN DISSIMILARITY.

Related Commands:

RANK CORRELATION	=	Compute Spearman rank correlation coefficient.
PEARSON DISSIMILARITY	=	Compute the dissimilarity of two variables based on Pearson correlation.
CORRELATION	=	Compute the Pearson correlation of two variables.
KENDALL TAU DISSIMILARITY	=	Compute the dissimilarity of two variables based on Kendall's tau correlation.
COSINE DISTANCE	=	Compute the cosine distance.
MANHATTAN DISTANCE	=	Compute the Manhattan distance.
EUCLIDEAN DISTANCE	=	Compute the Euclidean distance.
MATRIX DISTANCE	=	Compute various distance metrics for a matrix.
GENERATE MATRIX <stat>	=	Compute a matrix of pairwise statistic values.

Reference:

Wiley

Applications:

Clustering

Implementation Date:

Program 1:

 
SKIP 25
READ BERGER1.DAT Y X
LET CORR = RANK CORRELATION Y X
LET D    = SPEARMAN DISSIMILARITY Y X
SET WRITE DECIMALS 4
PRINT CORR D

 PARAMETERS AND CONSTANTS--

    CORR    --         0.9486
    D       --         0.0257

Program 2:

 
SKIP 25
READ IRIS.DAT Y1 Y2 Y3 Y4
SET WRITE DECIMALS 3
.
LET M = GENERATE MATRIX SPEARMAN DISSIMILARITY Y1 Y2 Y3 Y4
PRINT M

        MATRIX M       --            4 ROWS
                       --            4 COLUMNS

 VARIABLES--M1             M2             M3             M4      

          0.000          0.583          0.059          0.131
          0.583          0.000          0.655          0.506
          0.059          0.655         -0.000          0.084
          0.131          0.506          0.084         -0.000
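
A pairwise matrix of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not Dataplot's implementation: the function names are hypothetical, tied values are assumed to receive average ranks, and the three short columns are made-up values rather than the contents of IRIS.DAT.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    sorted_vals = sorted(values)
    # index() gives the first 0-based position of v; count() gives the tie size.
    return [sorted_vals.index(v) + (sorted_vals.count(v) + 1) / 2 for v in values]


def spearman_dissimilarity(x, y):
    """d = (1 - r)/2, where r is the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    r = sxy / (sxx * syy) ** 0.5
    return (1.0 - r) / 2.0


def dissimilarity_matrix(columns):
    """Square matrix of pairwise Spearman dissimilarities between columns."""
    n = len(columns)
    return [
        [spearman_dissimilarity(columns[i], columns[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: three variables with five observations each.
cols = [
    [5.1, 4.9, 6.3, 5.8, 7.1],
    [3.5, 3.0, 3.3, 2.7, 3.1],
    [1.4, 1.5, 6.0, 5.1, 5.9],
]

m = dissimilarity_matrix(cols)
for row in m:
    print("  ".join(f"{v:7.3f}" for v in row))
```

The matrix is symmetric with a diagonal of zero, because each variable has a rank correlation of 1 with itself. In floating-point arithmetic a diagonal entry can come out as a tiny negative number, which is one plausible reading of the -0.000 entries in the output above.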

Program 3:

 
SKIP 25
READ IRIS.DAT Y1 Y2 Y3 Y4 TAG
.
TITLE CASE ASIS
TITLE OFFSET 2
CASE ASIS
TIC MARK OFFSET UNITS DATA
YLIMITS 0 1
MAJOR YTIC MARK NUMBER 6
MINOR YTIC MARK NUMBER 1
Y1TIC MARK LABEL DECIMAL 1
XLIMITS 1 3
MAJOR XTIC MARK NUMBER 3
MINOR XTIC MARK NUMBER 0
XTIC MARK OFFSET 0.3 0.3
CHARACTER X BLANK
LINES BLANK SOLID
.
MULTIPLOT CORNER COORDINATES 5 5 95 95
MULTIPLOT SCALE FACTOR 2
MULTIPLOT 2 3
.
TITLE Sepal Length vs Sepal Width
SPEARMAN DISSIMILARITY PLOT Y1 Y2 TAG
.
TITLE Sepal Length vs Petal Length
SPEARMAN DISSIMILARITY PLOT Y1 Y3 TAG
.
TITLE Sepal Length vs Petal Width
SPEARMAN DISSIMILARITY PLOT Y1 Y4 TAG
.
TITLE Sepal Width vs Petal Length
SPEARMAN DISSIMILARITY PLOT Y2 Y3 TAG
.
TITLE Sepal Width vs Petal Width
SPEARMAN DISSIMILARITY PLOT Y2 Y4 TAG
.
TITLE Petal Length vs Petal Width
SPEARMAN DISSIMILARITY PLOT Y3 Y4 TAG
.
END OF MULTIPLOT
.
JUSTIFICATION CENTER
MOVE 50 5
TEXT Species
DIRECTION VERTICAL
MOVE 5 50
TEXT Spearman Rank Dissimilarity Coefficient
DIRECTION HORIZONTAL