
RANK CORRELATION INDEPENDENCE TEST

Name:
    RANK CORRELATION INDEPENDENCE TEST
Type:
    Analysis Command
Purpose:
    Perform a Spearman rho rank correlation test for whether two samples are independent (i.e., not correlated).
Description:
    If the measurements in the two samples are replaced with their ranks (and average ranks in the case of ties) and the Pearson correlation coefficient is computed, the result is the Spearman rho correlation coefficient.
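
    The construction above (replace values by average ranks, then take the Pearson correlation) can be sketched in a few lines of Python. This is an illustrative sketch only, not Dataplot's implementation; the function names and the sample data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(y1, y2):
    """Spearman rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(y1), average_ranks(y2))

# Hypothetical data with one tie in y1
y1 = [10, 20, 20, 40, 50]
y2 = [1.2, 2.5, 2.1, 3.9, 3.0]
print(average_ranks(y1))
print(spearman_rho(y1, y2))
```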

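    A value of +1 indicates perfect positive correlation of the ranks, a value of -1 indicates perfect negative correlation of the ranks, and a value of 0 indicates no monotonic relationship. Independent samples have a population rank correlation of zero, which is the hypothesis this command tests.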

    The rank correlation independence test tests whether the rank correlation coefficient is equal to zero.

    For larger n (e.g., n > 30) or the case where there are many ties, the p-th upper quantile of the rank correlation statistic can be approximated by

      \( w_{p} = \frac{z_{p}}{\sqrt{n-1}} \)

    with \( z_{p} \) and n denoting the p-th quantile of the standard normal distribution and the sample size, respectively. The lower quantile is the negative of the upper quantile.
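
    As a minimal sketch of the approximation above, the following Python evaluates the upper quantile formula for several levels. The function name is hypothetical and the code is not Dataplot's implementation. For n = 12 the values should agree, to about four decimals, with the normal approximation critical values shown in the last table of the sample output on this page.

```python
import math
from statistics import NormalDist

def rank_corr_upper_quantile(p, n):
    """Approximate p-th upper quantile: w_p = z_p / sqrt(n - 1)."""
    z_p = NormalDist().inv_cdf(p)  # p-th quantile of the standard normal
    return z_p / math.sqrt(n - 1)

n = 12
for p in (0.90, 0.95, 0.975, 0.99, 0.995, 0.999):
    w = rank_corr_upper_quantile(p, n)
    # The lower quantile is the negative of the upper quantile.
    print(f"{100 * p:5.1f}%  upper = {w:8.5f}  lower = {-w:8.5f}")
```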

    For a two-sided test, the p-value is computed as twice the minimum of the lower tailed and upper tailed p-values.
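
    The p-value calculation can be sketched as follows. The standardization rho*sqrt(n-1) comes from inverting the quantile formula above. Taking the lower tailed p-value to be the CDF and the upper tailed p-value to be its complement is an assumption, chosen because it matches the sample output on this page (where PVALLT equals STATCDF). The function name is hypothetical.

```python
import math
from statistics import NormalDist

def rank_corr_pvalues(rho, n):
    """Normal-approximation CDF and p-values for a Spearman rho statistic."""
    cdf = NormalDist().cdf(rho * math.sqrt(n - 1))
    p_lower = cdf          # small when rho is strongly negative
    p_upper = 1.0 - cdf    # small when rho is strongly positive
    p_two_sided = 2.0 * min(p_lower, p_upper)
    return cdf, p_lower, p_upper, p_two_sided

# Statistic and sample size taken from the sample program output on this page
cdf, p_lower, p_upper, p_two_sided = rank_corr_pvalues(0.59002, 12)
print(f"cdf = {cdf:.5f}, lower = {p_lower:.5f}, "
      f"upper = {p_upper:.5f}, two-sided = {p_two_sided:.5f}")
```

    With the statistic from the sample program (0.59002, n = 12), the results should land close to the STATCDF, PVALLT, PVALUT, and PVALUE values printed there.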

    For n ≤ 30, tabulated quantiles (from Table A10 on p. 542 of Conover) are used. These quantiles are exact when there are no ties in the data.
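
    The reason the tabulated quantiles are exact without ties is that, under independence, every pairing of the ranks 1..n is equally likely, so the null distribution can be enumerated. The sketch below does this by brute force for small n, using the no-ties identity rho = 1 - 6*sum(d^2)/(n(n^2-1)). It illustrates the null distribution only and makes no claim about how Table A10 or Dataplot's lookup was built. The function name is hypothetical, and the n! enumeration is practical only for roughly n ≤ 9.

```python
from itertools import permutations

def exact_upper_tail_pvalue(rho_obs, n):
    """P(rho >= rho_obs) when all n! rank pairings are equally likely (no ties)."""
    base = list(range(1, n + 1))
    denom = n * (n * n - 1)
    count = 0
    total = 0
    for perm in permutations(base):
        d2 = sum((a - b) ** 2 for a, b in zip(base, perm))
        rho = 1.0 - 6.0 * d2 / denom  # Spearman rho without ties
        total += 1
        if rho >= rho_obs - 1e-12:    # small tolerance for float comparison
            count += 1
    return count / total

# Hypothetical example: chance of rho >= 0.9 with n = 5 untied observations
print(exact_upper_tail_pvalue(0.9, 5))
```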

Syntax 1:
    <LOWER TAILED/UPPER TAILED> RANK CORRELATION
                            INDEPENDENCE TEST <y1> <y2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <LOWER TAILED/UPPER TAILED> is an optional keyword that specifies either a lower tailed or an upper tailed test;
                <y1> is the first response variable;
                <y2> is the second response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.

    Lower tailed tests are used to test for negative correlation and upper tailed tests are used to test for positive correlation.

Syntax 2:
    <LOWER TAILED/UPPER TAILED> RANK CORRELATION
                            INDEPENDENCE TEST <y1> ... <yk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <LOWER TAILED/UPPER TAILED> is an optional keyword that specifies either a lower tailed or an upper tailed test;
                <y1> ... <yk> is a list of 1 to 30 response variables;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax will perform all the pair-wise tests for the <y1> ... <yk> response variables. For example,

      RANK CORRELATION INDEPENDENCE TEST Y1 TO Y4

    is equivalent to

      RANK CORRELATION INDEPENDENCE TEST Y1 Y2
      RANK CORRELATION INDEPENDENCE TEST Y1 Y3
      RANK CORRELATION INDEPENDENCE TEST Y1 Y4
      RANK CORRELATION INDEPENDENCE TEST Y2 Y3
      RANK CORRELATION INDEPENDENCE TEST Y2 Y4
      RANK CORRELATION INDEPENDENCE TEST Y3 Y4
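
    The pair-wise expansion above is simply every unordered pair of the k variables, k(k-1)/2 tests in all. A small Python sketch (the helper name is hypothetical; it only builds the command strings and does not run Dataplot):

```python
from itertools import combinations

def pairwise_commands(variables):
    """List one two-variable test command per unordered pair of variables."""
    return [
        f"RANK CORRELATION INDEPENDENCE TEST {a} {b}"
        for a, b in combinations(variables, 2)
    ]

for cmd in pairwise_commands(["Y1", "Y2", "Y3", "Y4"]):
    print(cmd)
```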
Examples:
    RANK CORRELATION INDEPENDENCE TEST Y1 Y2
    RANK CORRELATION INDEPENDENCE TEST Y1 TO Y5
    LOWER TAILED RANK CORRELATION INDEPENDENCE TEST Y1 Y2
    UPPER TAILED RANK CORRELATION INDEPENDENCE TEST Y1 Y2
Note:
    This command can be used to test for trend in a single response variable by correlating it with its observation sequence number. For example,

      LET N = SIZE Y
      LET X = SEQUENCE 1 1 N
      RANK CORRELATION INDEPENDENCE TEST Y X

    According to Conover, this test is more powerful than the Cox and Stuart test. However, it is not as widely applicable as the Cox and Stuart test.

    This test for trend is referred to as the Daniels test for trend.
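
    As a rough illustration of the trend test described above, the Python sketch below correlates the ranks of a series with its index 1..n and reports normal-approximation p-values. The data and function names are hypothetical, and the use of the normal approximation here is an assumption for brevity (for small n, Dataplot uses tabulated critical values by default).

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    sorted_vals = sorted(values)
    ranks = []
    for v in values:
        first = sorted_vals.index(v) + 1   # first 1-based position of v
        count = sorted_vals.count(v)       # number of values tied with v
        ranks.append(first + (count - 1) / 2.0)
    return ranks

def spearman_rho(a, b):
    """Pearson correlation of the average ranks of a and b."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(ra, rb))
    var_a = sum((u - ma) ** 2 for u in ra)
    var_b = sum((v - mb) ** 2 for v in rb)
    return cov / math.sqrt(var_a * var_b)

def daniels_trend_test(y):
    """Return rho of y against 1..n, then lower and upper tailed p-values."""
    n = len(y)
    x = list(range(1, n + 1))              # mirrors LET X = SEQUENCE 1 1 N
    rho = spearman_rho(y, x)
    cdf = NormalDist().cdf(rho * math.sqrt(n - 1))
    return rho, cdf, 1.0 - cdf

# Hypothetical series with an upward drift
y = [2.1, 2.4, 2.2, 2.9, 3.1, 3.0, 3.6, 3.8]
rho, p_lower, p_upper = daniels_trend_test(y)
print(f"rho = {rho:.5f}, upper tailed p-value = {p_upper:.5f}")
```

    A small upper tailed p-value points to an increasing trend; a small lower tailed p-value points to a decreasing one.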

Note:
    The KENDALL TAU INDEPENDENCE TEST can be used to perform a test for independence based on the Kendall tau statistic.

    The CORRELATION CONFIDENCE LIMITS command can be used to generate a confidence interval for the Pearson correlation coefficient. This can be used for a parametric test for independence (i.e., does the confidence interval contain zero?).

Note:
    By default, critical values are based on tabulated values for n ≤ 30. The command

      SET RANK CORRELATION CRITICAL VALUES NORMAL APPROXIMATION

    can be used to specify that they should be based on the normal approximation given above. This may be preferred if there are ties in the data. To reset the default, enter the command

      SET RANK CORRELATION CRITICAL VALUES TABLE
Note:
    The RANK CORRELATION INDEPENDENCE TEST will accept matrix arguments. If a matrix is given, the data elements in the matrix will be collected in column order to form a vector before performing the test.
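
    To make "column order" concrete, the sketch below stacks the columns of a small matrix into one vector. Reading "column order" as first column top to bottom, then the second column, and so on, is an assumption about the wording above; the helper name is hypothetical.

```python
def flatten_column_order(matrix):
    """Stack the columns of a list-of-rows matrix into a single vector."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]

# Hypothetical 3 x 2 matrix
m = [[1, 4],
     [2, 5],
     [3, 6]]
print(flatten_column_order(m))
```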
Note:
    Dataplot saves the following internal parameters after a rank correlation independence test:

      STATVAL = the value of the test statistic
      STATCDF = the CDF of the test statistic
      PVALUE = the p-value for the two-sided test
      PVALUELT = the p-value for the lower tailed test
      PVALUEUT = the p-value for the upper tailed test
      CUTLOW90 = the 90% lower tailed critical value
      CUTUPP90 = the 90% upper tailed critical value
      CUTLOW95 = the 95% lower tailed critical value
      CUTUPP95 = the 95% upper tailed critical value
      CTLOW975 = the 97.5% lower tailed critical value
      CTUPP975 = the 97.5% upper tailed critical value
      CUTLOW99 = the 99% lower tailed critical value
      CUTUPP99 = the 99% upper tailed critical value
      CTLOW995 = the 99.5% lower tailed critical value
      CTUPP995 = the 99.5% upper tailed critical value
      CTLOW999 = the 99.9% lower tailed critical value
      CTUPP999 = the 99.9% upper tailed critical value
Note:
    The following statistics can also be computed:

      LET A = RANK CORRELATION Y1 Y2
      LET A = RANK CORRELATION CDF Y1 Y2
      LET A = RANK CORRELATION PVALUE Y1 Y2
      LET A = RANK CORRELATION LOWER TAILED PVALUE Y1 Y2
      LET A = RANK CORRELATION UPPER TAILED PVALUE Y1 Y2

    The cdf and p-values are based on the normal approximation given above.

Note:
    The run sequence plot can be used to graphically assess whether or not there is a trend in the data. The 4-plot can be used to assess the more general assumption of "independent, identically distributed" data.

    The paired data can also be analyzed using other techniques for comparing two response variables (e.g., t-test, bihistogram, quantile-quantile plot).

Default:
    None
Synonyms:
    None
Reference:
    Conover (1999), "Practical Nonparametric Statistics," Third Edition, Wiley, pp. 319-327.
Applications:
    Confirmatory Data Analysis
Implementation Date:
    2013/3
Program:
     
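    . Skip the 25 header lines of the data file and read the two responses
    skip 25
    read kendall.dat y1 y2
    set write decimals 5
    .
    . Compute the statistic, CDF, and p-values with the LET subcommands
    let statval  = rank correlation y1 y2
    let statcdf  = rank correlation cdf y1 y2
    let pvalue   = rank correlation pvalue y1 y2
    let pvallt   = rank correlation lower tailed pvalue y1 y2
    let pvalut   = rank correlation upper tailed pvalue y1 y2
    print statval statcdf pvalue pvallt pvalut
    .
    . Two-tailed test with the default tabulated critical values
    rank correlation independence test y1 y2
    .
    . Upper tailed test with the default tabulated critical values
    upper tailed rank correlation independence test y1 y2
    .
    . Upper tailed test with normal approximation critical values
    set rank correlation critical values normal approximation
    upper tailed rank correlation independence test y1 y2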
        
    The following output is generated.
     
     PARAMETERS AND CONSTANTS--
    
        STATVAL --        0.59002
        STATCDF --        0.97482
        PVALUE  --        0.05036
        PVALLT  --        0.97482
        PVALUT  --        0.02518
     
    
                Two Sample Rank Correlation Test for Independence
     
    First Response Variable:  Y1
    Second Response Variable: Y2
     
    H0: The Two Samples are Independent
    Ha: The Two Samples Are Not Independent
     
    Number of Observations:                               12
     
    Sample One Summary Statistics:
    Sample Mean:                                   587.08333
    Sample Standard Deviation:                      58.01482
    Sample Minimum:                                530.00000
    Sample Maximum:                                740.00000
     
    Sample Two Summary Statistics:
    Sample Mean:                                     3.59999
    Sample Standard Deviation:                       0.28603
    Sample Minimum:                                  3.20000
    Sample Maximum:                                  4.00000
     
    Test:
    Spearman Rho Rank Correlation Value:             0.59001
    CDF Value (Normal Approximation):                0.97481
    Two-Sided P-Value (Normal Approximation):        0.05036
     
     
                Conclusions (Two-Tailed Test)
     
    H0: Samples are Independent
    ------------------------------------------------------------
                                                            Null
       Significance           Test       Critical     Hypothesis
              Level      Statistic   Region (+/-)     Conclusion
    ------------------------------------------------------------
              80.0%        0.59001        0.39860         REJECT
              90.0%        0.59001        0.49650         REJECT
              95.0%        0.59001        0.58040         REJECT
              99.0%        0.59001        0.72030         ACCEPT
     
     
                Two Sample Rank Correlation Test for Independence
     
    First Response Variable:  Y1
    Second Response Variable: Y2
     
    H0: The Two Samples are Independent
    Ha: The Two Samples Are Positively Correlated
     
    Number of Observations:                                   12
     
    Sample One Summary Statistics:
    Sample Mean:                                       587.08333
    Sample Standard Deviation:                          58.01482
    Sample Minimum:                                    530.00000
    Sample Maximum:                                    740.00000
     
    Sample Two Summary Statistics:
    Sample Mean:                                         3.59999
    Sample Standard Deviation:                           0.28603
    Sample Minimum:                                      3.20000
    Sample Maximum:                                      4.00000
     
    Test:
    Spearman Rho Rank Correlation Value:                 0.59001
    CDF Value (Normal Approximation):                    0.97481
    Upper Tailed P-Value (Normal Approximation):         0.02518
     
     
                Conclusions (Upper 1-Tailed Test)
     
    H0: Samples are Independent
    ------------------------------------------------------------
                                                            Null
       Significance           Test       Critical     Hypothesis
              Level      Statistic     Region (>)     Conclusion
    ------------------------------------------------------------
              90.0%        0.59001        0.39860         REJECT
              95.0%        0.59001        0.49650         REJECT
              97.5%        0.59001        0.58040         REJECT
              99.0%        0.59001        0.67130         ACCEPT
              99.5%        0.59001        0.72030         ACCEPT
              99.9%        0.59001        0.81120         ACCEPT
     
     
                Two Sample Rank Correlation Test for Independence
     
    First Response Variable:  Y1
    Second Response Variable: Y2
     
    H0: The Two Samples are Independent
    Ha: The Two Samples Are Positively Correlated
     
    Number of Observations:                                   12
     
    Sample One Summary Statistics:
    Sample Mean:                                       587.08333
    Sample Standard Deviation:                          58.01482
    Sample Minimum:                                    530.00000
    Sample Maximum:                                    740.00000
     
    Sample Two Summary Statistics:
    Sample Mean:                                         3.59999
    Sample Standard Deviation:                           0.28603
    Sample Minimum:                                      3.20000
    Sample Maximum:                                      4.00000
     
    Test:
    Spearman Rho Rank Correlation Value:                 0.59001
    CDF Value (Normal Approximation):                    0.97481
    Upper Tailed P-Value (Normal Approximation):         0.02518
     
     
                Conclusions (Upper 1-Tailed Test)
     
    H0: Samples are Independent
    ------------------------------------------------------------
                                                            Null
       Significance           Test       Critical     Hypothesis
              Level      Statistic     Region (>)     Conclusion
    ------------------------------------------------------------
              90.0%        0.59001        0.38640         REJECT
              95.0%        0.59001        0.49594         REJECT
              97.5%        0.59001        0.59095         ACCEPT
              99.0%        0.59001        0.70142         ACCEPT
              99.5%        0.59001        0.77664         ACCEPT
              99.9%        0.59001        0.93174         ACCEPT
     
        
Date created: 03/08/2013
Last updated: 12/11/2023

Please email comments on this WWW page to alan.heckert@nist.gov.