Many statistics have one of these properties. However, it can be difficult to find statistics that are both resistant and have robustness of efficiency.
The standard covariance estimate is the optimal estimator for Gaussian data. However, it is not resistant and it does not have robustness of efficiency. The rank covariance statistic is one example of a robust estimate of correlation.
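The lack of resistance of the standard covariance can be illustrated with a small sketch. The data values and the function name sample_covariance below are made up for this illustration and are not part of Dataplot; the point is only that a single wild observation can change the sign and magnitude of the estimate.

```python
def sample_covariance(x, y):
    """Standard (n - 1 divisor) sample covariance of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical, perfectly linear data: y = 2 * x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
clean = sample_covariance(x, y)          # 5.0

# Replace one y value with a gross outlier
y_bad = y[:-1] + [-1000.0]
contaminated = sample_covariance(x, y_bad)  # -500.0

print(clean, contaminated)
```

One bad point out of five is enough to move the estimate from 5.0 to -500.0, which is what "not resistant" means in practice.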
The biweight midcovariance estimator is resistant and also has robustness of efficiency. Mosteller and Tukey recommend using the MAD or interquartile range for exploratory work where moderate efficiency in a variety of situations is adequate. The biweight midcovariance estimator can be considered for situations where high performance is needed.
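The biweight midcovariance estimate is defined as:
sbxy = n*SUM[ai*(Xi - Mx)*(1 - ui**2)**2 * bi*(Yi - My)*(1 - vi**2)**2] /
       {SUM[ai*(1 - ui**2)*(1 - 5*ui**2)] * SUM[bi*(1 - vi**2)*(1 - 5*vi**2)]}
where the sums run over i = 1, ..., n and
Mx = median X
My = median Y
ui = (Xi - Mx)/[9*MAD(X)]
vi = (Yi - My)/[9*MAD(Y)]
ai = 1 if -1 <= ui <= 1, and 0 otherwise
bi = 1 if -1 <= vi <= 1, and 0 otherwise
MAD = median absolute deviation (computed separately for X and for Y)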
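The definition above can be sketched in Python as follows. This is a minimal illustration and not Dataplot's implementation: it assumes the usual biweight midcovariance form (n times the weighted cross-product sum, divided by the product of the two weighted sums), an unscaled MAD, and a hypothetical tuning argument c that defaults to the constant 9 used in ui and vi. The function names and data are invented for the example.

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation about the median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def biweight_midcovariance(x, y, c=9.0):
    """Biweight midcovariance of two equal-length lists (illustrative sketch)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mx, my = median(x), median(y)
    mad_x, mad_y = mad(x), mad(y)
    if mad_x == 0 or mad_y == 0:
        raise ValueError("MAD is zero, so ui or vi is undefined")
    num = den_x = den_y = 0.0
    for xi, yi in zip(x, y):
        u = (xi - mx) / (c * mad_x)
        v = (yi - my) / (c * mad_y)
        a = 1.0 if -1.0 <= u <= 1.0 else 0.0   # ai: zero weight outside [-1, 1]
        b = 1.0 if -1.0 <= v <= 1.0 else 0.0   # bi: zero weight outside [-1, 1]
        num += a * (xi - mx) * (1 - u * u) ** 2 * b * (yi - my) * (1 - v * v) ** 2
        den_x += a * (1 - u * u) * (1 - 5 * u * u)
        den_y += b * (1 - v * v) * (1 - 5 * v * v)
    if den_x * den_y == 0:
        raise ValueError("denominator is zero for these data")
    return n * num / (den_x * den_y)


# Hypothetical data: a clean linear relation, then the same data with one wild pair
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
clean = biweight_midcovariance(x, y)

x_bad = x[:-1] + [500.0]
y_bad = y[:-1] + [-1000.0]
robust = biweight_midcovariance(x_bad, y_bad)

# Both estimates are positive: the outlying pair receives zero weight
print(clean, robust)
```

Points more than 9 MADs from the median drop out entirely, and points closer in are down-weighted smoothly, which is where the resistance comes from.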
LET <par> = BIWEIGHT MIDCOVARIANCE <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<par> is a parameter where the computed biweight midcovariance is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
LET A = BIWEIGHT MIDCOVARIANCE Y1 Y2 SUBSET TAG > 2
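The SET COVARIANCE TYPE command can be used to specify an alternate covariance measure to compute in the COVARIANCE MATRIX command. BIWEIGHT is among the supported types; for example, SET COVARIANCE TYPE BIWEIGHT selects the biweight midcovariance.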
Mosteller and Tukey (1977), "Data Analysis and Regression: A Second Course in Statistics," Addison-Wesley, pp. 203-209.
Program 1:
SKIP 25
READ IRIS.DAT Y1 Y2 Y3 Y4 X
LET M = CREATE MATRIX Y1 Y2 Y3 Y4
SET COVARIANCE TYPE BIWEIGHT
LET B = COVARIANCE MATRIX Y1 Y2 Y3 Y4

Program 2:
SKIP 25
READ IRIS.DAT Y1 Y2 Y3 Y4 X
.
MULTIPLOT CORNER COORDINATES 0 0 100 95
MULTIPLOT 2 1
BOOTSTRAP SAMPLES 500
BOOTSTRAP BIWEIGHT MIDCOVARIANCE PLOT Y1 Y2
X1LABEL B025 = ^B025, B975=^B975
HISTOGRAM YPLOT
END OF MULTIPLOT
MOVE 50 96
JUSTIFICATION CENTER
TEXT BIWEIGHT MIDCOVARIANCE BOOTSTRAP: IRIS DATA
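
The idea behind the bootstrap program (500 resamples, with B025 and B975 reported on the plot label) can be sketched in Python. This is an illustration under stated assumptions, not a description of Dataplot's internals: it assumes that (x, y) pairs are resampled together with replacement and that B025 and B975 are the 2.5 and 97.5 percentiles of the bootstrap values, taken here by a nearest-rank rule. All function names and the data are hypothetical, and an ordinary covariance stands in for the statistic; any two-argument function, such as a biweight midcovariance routine, could be passed the same way.

```python
import math
import random


def covariance(x, y):
    """Stand-in statistic: ordinary (n - 1 divisor) sample covariance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


def nearest_rank_percentile(sorted_values, p):
    """p-th percentile of an already sorted list, nearest-rank rule (an assumption)."""
    k = math.ceil(p / 100.0 * len(sorted_values)) - 1
    return sorted_values[max(0, min(k, len(sorted_values) - 1))]


def bootstrap_interval(x, y, statistic, n_boot=500, seed=12345):
    """Resample (x, y) pairs with replacement and return (b025, b975, estimates)."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    return (nearest_rank_percentile(estimates, 2.5),
            nearest_rank_percentile(estimates, 97.5),
            estimates)


# Hypothetical data (not the iris measurements): y is roughly 2 * x
x = [float(i) for i in range(1, 21)]
y = [2.0 * xi + (1.0 if i % 2 == 0 else -1.0) for i, xi in enumerate(x)]

b025, b975, estimates = bootstrap_interval(x, y, covariance)
print("B025 =", b025, "B975 =", b975)
```

A histogram of the values in estimates corresponds to the second panel that Program 2 draws with HISTOGRAM YPLOT.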