Dataplot Vol 2 Auxiliary Chapter

PERCENTAGE BEND CORRELATION

Name:
    PERCENTAGE BEND CORRELATION (LET)
Type:
    Let Subcommand
Purpose:
    Compute the percentage bend correlation between two variables.
Description:
    Mosteller and Tukey (see Reference section below) define two types of robustness:

    1. resistance means that changing a small part of the data, even by a large amount, does not cause a large change in the estimate

    2. robustness of efficiency means that the statistic has high efficiency in a variety of situations rather than in any one situation. Efficiency means that the estimate is close to the optimal estimate given that we know what distribution the data come from. A useful measure of efficiency is:

        Efficiency = (lowest variance feasible)/ (actual variance)

    Many statistics have one of these properties. However, it can be difficult to find statistics that are both resistant and have robustness of efficiency.
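
    As a rough illustration of the efficiency ratio above, the following sketch estimates it by simulation for the sample mean and the sample median. It assumes Gaussian data, for which the lowest feasible variance of an unbiased location estimate is sigma**2/n; the function name, sample size, replicate count, and seed are illustrative choices, not part of Dataplot.

```python
import random
import statistics


def relative_efficiency(estimator, n=25, reps=2000, sigma=1.0, seed=1):
    """Estimate (lowest variance feasible) / (actual variance) by simulation.

    Assumption: the data are Gaussian, so the lowest feasible variance of an
    unbiased location estimate is sigma**2 / n (attained by the sample mean).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        estimates.append(estimator(sample))
    actual_variance = statistics.pvariance(estimates)
    lowest_variance = sigma ** 2 / n
    return lowest_variance / actual_variance


eff_mean = relative_efficiency(statistics.mean)
eff_median = relative_efficiency(statistics.median)
print(f"efficiency of the mean:   {eff_mean:.2f}")
print(f"efficiency of the median: {eff_median:.2f}")
```

    Under these assumptions the mean should come out near 1 and the median noticeably lower, which is the usual price paid for the median's resistance.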

    The Pearson correlation coefficient is an optimal estimator for Gaussian data. However, it is not resistant and it does not have robustness of efficiency.
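
    The lack of resistance can be seen with a small made-up data set: replacing a single response value by a gross outlier is enough to reverse the sign of the Pearson correlation. The data and the helper function below are illustrative only.

```python
import math


def pearson(x, y):
    """Ordinary (Pearson) product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical, nearly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y_clean = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 9.1, 9.9]

# Same data with one gross outlier in the last position.
y_bad = y_clean[:-1] + [-50.0]

r_clean = pearson(x, y_clean)
r_bad = pearson(x, y_bad)
print(f"clean data:       r = {r_clean:.3f}")
print(f"with one outlier: r = {r_bad:.3f}")
```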

    The percentage bend correlation estimator, discussed in Shoemaker and Hettmansperger and also by Wilcox, is both resistant and has robustness of efficiency. The rationale and derivation for this estimate are given in these references.

    The percentage bend correlation between two variables X and Y is computed as follows:

    1. Set m = (1 - beta)*n + 0.5. Round m down to the nearest integer.

    2. Let W(i) = |X(i) - M(x)| for i = 1, ..., n, where M(x) is the median of X.

    3. Sort the W(i) in ascending order.

    4. Set what(x) = W(m) (i.e., the mth order statistic). W(m) is the estimate of the (1-beta) quantile of W.

    5. Sort the X values. Compute the number of values of (X(i) - M(x))/what(x) that are less than -1 and the number that are greater than +1, and store these counts in i1 and i2 respectively. Then compute

        S(x) = SUM[i=i1+1 to n-i2][X(i)]
        phihat(x) = (what(x)*(i2-i1) + S(x))/(n - i1 - i2)
        U(i) = (X(i) - phihat(x))/what(x)

       Here S(x) sums the sorted X values, while U(i) is computed for each observation.

    6. Repeat the above calculations on the Y variable. Store the corresponding quantities in what(y), phihat(y), and V(i).

    7. Define the function

        psi(x) = max[-1, min(1,x)]

    8. Compute

        A(i) = psi(U(i))
        B(i) = psi(V(i))

    9. Compute the percentage bend correlation

        rho(pb) = SUM[i=1 to n][A(i)*B(i)]/SQRT(SUM[i=1 to n][A(i)**2]*SUM[i=1 to n][B(i)**2])

    The value of beta is selected between 0 and 0.5. Higher values of beta result in a higher breakdown point at the expense of lower efficiency.
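
    The computation described above can be sketched in Python as follows. This is a minimal illustration of the listed steps, not Dataplot's own implementation; the function names and the sample data are hypothetical, and the handling of a zero value of what is an assumption.

```python
import math
import statistics


def _bend_scores(values, beta):
    """Return psi((X(i) - phihat)/what) for each observation (steps 1-5, 7-8)."""
    n = len(values)
    m = math.floor((1.0 - beta) * n + 0.5)             # step 1
    med = statistics.median(values)
    w = sorted(abs(v - med) for v in values)           # steps 2-3
    omega = w[m - 1]                                   # step 4: mth order statistic
    if omega == 0:
        # Assumption: treat a zero scale estimate as an error.
        raise ValueError("scale estimate is zero; too many tied values")
    i1 = sum(1 for v in values if (v - med) / omega < -1)   # step 5
    i2 = sum(1 for v in values if (v - med) / omega > 1)
    ordered = sorted(values)
    s = sum(ordered[i1:n - i2])                        # sorted values i1+1 .. n-i2
    phi = (omega * (i2 - i1) + s) / (n - i1 - i2)
    # psi(u) = max(-1, min(1, u)) applied to each standardized observation
    return [max(-1.0, min(1.0, (v - phi) / omega)) for v in values]


def percentage_bend_correlation(x, y, beta=0.1):
    """Percentage bend correlation of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if not 0 < beta <= 0.5:
        raise ValueError("beta must satisfy 0 < beta <= 0.5")
    a = _bend_scores(x, beta)                          # A(i)
    b = _bend_scores(y, beta)                          # B(i), step 6
    num = sum(ai * bi for ai, bi in zip(a, b))         # step 9
    den = math.sqrt(sum(ai * ai for ai in a) * sum(bi * bi for bi in b))
    return num / den


# Hypothetical data: nearly linear, with one gross outlier in y.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 9.1, -50.0]
r_pb = percentage_bend_correlation(x, y)

# Exactly linear data for comparison.
r_linear = percentage_bend_correlation(x, [2.0 * v + 1.0 for v in x])
print(f"with one outlier: {r_pb:.3f}")
print(f"exactly linear:   {r_linear:.3f}")
```

    Because psi bounds each standardized value to the interval [-1, 1], the outlying observation contributes no more to the sums than an observation sitting exactly at the bend.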

Syntax:
    LET <par> = PERCENTAGE BEND CORRELATION <y1> <y2>
                                <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                  <y2> is the second response variable;
                  <par> is a parameter where the computed percentage bend correlation is stored;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    LET A = PERCENTAGE BEND CORRELATION Y1 Y2
    LET A = PERCENTAGE BEND CORRELATION Y1 Y2 SUBSET TAG > 2
Note:
    To set the value of beta, enter the command

      LET BETA = <value>
    where <value> is greater than 0 and less than or equal to 0.5. The default value for beta is 0.1.
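
    To see what a given beta implies, the order-statistic index m = (1 - beta)*n + 0.5 (rounded down) can be tabulated for a few values. The sketch below is illustrative only; the function name and the sample size of 20 are assumptions.

```python
import math


def bend_index(n, beta):
    """Index m of the order statistic of |X - median| used as the scale."""
    if not 0 < beta <= 0.5:
        raise ValueError("beta must satisfy 0 < beta <= 0.5")
    return math.floor((1.0 - beta) * n + 0.5)


n = 20
for beta in (0.1, 0.2, 0.5):
    m = bend_index(n, beta)
    # At most n - m absolute deviations from the median can exceed W(m).
    print(f"beta = {beta}: m = {m}, at most {n - m} deviations exceed W(m)")
```

    Larger values of beta lower m, so the scale is taken from a smaller order statistic and more observations are candidates for bending.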
Note:
    Support for the percentage bend correlation has been added to the following plots and commands:

      PERCENTAGE BEND CORRELATION PLOT
      CROSS TABULATE PERCENTAGE BEND CORRELATION PLOT
      BOOTSTRAP PERCENTAGE BEND CORRELATION PLOT
      JACKNIFE PERCENTAGE BEND CORRELATION PLOT
      PERCENTAGE BEND CORRELATION INTERACTION STATISTIC PLOT
Default:
    None
Synonyms:
    None
Related Commands:
    PERCENTAGE BEND MIDVARIANCE = Compute the percentage bend midvariance of a variable.
    BIWEIGHT CORRELATION = Compute a biweight correlation estimate between two variables.
    WINSORIZED CORRELATION = Compute a Winsorized correlation estimate between two variables.
    CORRELATION = Compute the correlation between two variables.
    RANK CORRELATION = Compute the rank correlation between two variables.
    VARIANCE = Compute the variance of a variable.
    STATISTIC PLOT = Generate a statistic versus group plot for a given statistic.
    CROSS TABULATE PLOT = Generate a statistic versus group plot for a given statistic and two group variables.
    BOOTSTRAP PLOT = Generate a bootstrap plot for a given statistic.
    INFLUENCE CURVE = Generate an influence curve for a given statistic.
    INTERACTION STATISTIC PLOT = Generate an interaction plot for a given statistic.
Reference:
    "Robust Estimates of and Tests for the One- and Two-Sample Scale Models", Shoemaker and Hettmansperger, Biometrika 69, 1982, pp. 47-54.

    "Introduction to Robust Estimation and Hypothesis Testing", Rand Wilcox, Academic Press, 1997.

    "Data Analysis and Regression: A Second Course in Statistics", Mosteller and Tukey, Addison-Wesley, 1977, pp. 203-209.

Applications:
    Robust Data Analysis
Implementation Date:
    2002/7
Program 1:
    SKIP 25
    READ IRIS.DAT Y1 Y2 Y3 Y4 X
    LET M = CREATE MATRIX Y1 Y2 Y3 Y4
    SET CORRELATION TYPE PERCENTAGE BEND
    LET B = CORRELATION MATRIX M
        
Program 2:
     
    SKIP 25
    READ IRIS.DAT Y1 Y2 Y3 Y4 X
    .
    MULTIPLOT CORNER COORDINATES 0 0 100 95
    MULTIPLOT SCALE FACTOR 2
    MULTIPLOT 2 1
    BOOTSTRAP SAMPLES 500
    BOOTSTRAP PERCENTAGE BEND CORRELATION PLOT Y1 Y2
    X1LABEL DISPLACEMENT 12
    X1LABEL B025 = ^B025, B975=^B975
    HISTOGRAM YPLOT
    END OF MULTIPLOT
    MOVE 50 96
    JUSTIFICATION CENTER
    TEXT PERCENTAGE BEND CORRELATION BOOTSTRAP: IRIS DATA
        
    plot generated by sample program

Date created: 8/12/2002
Last updated: 4/4/2003
Please email comments on this WWW page to alan.heckert@nist.gov.