
KENDALL TAU INDEPENDENCE TEST

Name:
KENDALL TAU INDEPENDENCE TEST
Type:
Analysis Command
Purpose:
Perform a Kendall tau test for whether two samples are independent (i.e., not correlated).
Description:
Kendall's tau coefficient is a measure of concordance between two paired variables. Two pairs (Xi,Yi) and (Xj,Yj) are classified by the sign of the slope between them:

$$\frac{Y_j - Y_i}{X_j - X_i}$$ > 0 - pair is concordant

$$\frac{Y_j - Y_i}{X_j - X_i}$$ < 0 - pair is discordant

$$\frac{Y_j - Y_i}{X_j - X_i}$$ = 0 - pair is considered a tie

Xi = Xj - pair is not compared

Kendall's tau is computed as

$$\tau = \frac{N_c - N_d}{N_c + N_d}$$

with Nc and Nd denoting the number of concordant pairs and the number of discordant pairs, respectively, in the sample. Ties add 0.5 to both the concordant and discordant counts. There are $$\left( \begin{array}{c} n \\ 2 \end{array} \right)$$ possible pairs in the bivariate sample.

A value of +1 indicates that all pairs are concordant, a value of -1 indicates that all pairs are discordant, and a value of 0 indicates no relation (i.e., independence).
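The pair-counting rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not Dataplot's implementation; the function name kendall_tau and the sample data are hypothetical.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau using the pair-counting convention described above."""
    nc = nd = 0.0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:
            continue              # equal X values: pair is not compared
        slope = (yj - yi) / (xj - xi)
        if slope > 0:
            nc += 1.0             # concordant pair
        elif slope < 0:
            nd += 1.0             # discordant pair
        else:
            nc += 0.5             # tie in Y: half to each count
            nd += 0.5
    return (nc - nd) / (nc + nd)

# Hypothetical data: 8 concordant and 2 discordant pairs out of 10
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(kendall_tau(x, y))
```

The sketch raises a ZeroDivisionError when no pair can be compared (for example, when all X values are equal); how Dataplot handles that case is not covered here.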

The Kendall tau independence test is a test of whether the Kendall tau coefficient is equal to zero.

For larger n (e.g., n > 60) or the case where there are many ties, the p-th upper quantile of the Kendall tau statistic can be approximated by

$$w_{p} = z_{p} \frac{\sqrt{2(2n + 5)}}{3\sqrt{n(n-1)}}$$

with zp and n denoting the p-th quantile of the standard normal distribution and the sample size, respectively. The lower quantile is the negative of the upper quantile.
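As a rough check of the approximation, the quantile formula can be evaluated with Python's standard library. The function name tau_upper_quantile is hypothetical, and n = 12 is used only because it is the sample size in the example program on this page, not because the approximation is recommended for samples that small.

```python
from math import sqrt
from statistics import NormalDist

def tau_upper_quantile(p, n):
    """Normal-approximation p-th upper quantile of Kendall's tau."""
    z_p = NormalDist().inv_cdf(p)   # p-th quantile of the standard normal
    return z_p * sqrt(2 * (2 * n + 5)) / (3 * sqrt(n * (n - 1)))

n = 12
for p in (0.90, 0.95, 0.975, 0.99, 0.995):
    w = tau_upper_quantile(p, n)
    # the lower quantile is the negative of the upper quantile
    print(f"p = {p:5.3f}   upper = {w:8.5f}   lower = {-w:8.5f}")
```

For n = 12 and p = 0.95 this evaluates to roughly 0.3634, in line with the normal-approximation critical value shown in the sample output on this page.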

For a two-sided test, the p-value is computed as twice the minimum of the lower tailed and upper tailed p-values.
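The p-value calculation under the normal approximation can be sketched as follows. This assumes the statistic is treated as normal with mean zero and the standard deviation implied by the quantile formula above; the function name tau_p_values is hypothetical and the code is not taken from Dataplot.

```python
from math import sqrt
from statistics import NormalDist

def tau_p_values(tau, n):
    """Return (cdf, lower-tailed, upper-tailed, two-sided) p-values."""
    # standard deviation implied by w_p = z_p * sd
    sd = sqrt(2 * (2 * n + 5)) / (3 * sqrt(n * (n - 1)))
    cdf = NormalDist().cdf(tau / sd)
    lower = cdf                      # small when tau is strongly negative
    upper = 1.0 - cdf                # small when tau is strongly positive
    two_sided = min(1.0, 2.0 * min(lower, upper))
    return cdf, lower, upper, two_sided

cdf, lower, upper, two_sided = tau_p_values(0.43548, 12)
print(f"cdf = {cdf:.5f}  lower = {lower:.5f}  "
      f"upper = {upper:.5f}  two-sided = {two_sided:.5f}")
```

With tau = 0.43548 and n = 12 (the values from the example program on this page), the sketch gives a CDF near 0.9756 and a two-sided p-value near 0.0487, which agrees closely with the sample output.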

For n ≤ 60, tabulated quantiles (from Table A11 on pp. 543-544 of Conover) are used. These quantiles are exact when there are no ties in the data.

Syntax 1:
<LOWER TAILED/UPPER TAILED> KENDALL TAU INDEPENDENCE TEST
<y1> <y2>       <SUBSET/EXCEPT/FOR qualification>
where <LOWER TAILED/UPPER TAILED> is an optional keyword that specifies either a lower tailed or an upper tailed test;
<y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

If neither LOWER TAILED nor UPPER TAILED is specified, a two-tailed test is performed.

Lower tailed tests are used to test for discordance (i.e., negative correlation) and upper tailed tests are used to test for concordance (i.e., positive correlation).

Syntax 2:
<LOWER TAILED/UPPER TAILED> KENDALL TAU INDEPENDENCE TEST
<y1> ... <yk>       <SUBSET/EXCEPT/FOR qualification>
where <LOWER TAILED/UPPER TAILED> is an optional keyword that specifies either a lower tailed or an upper tailed test;
<y1> ... <yk> is a list of 1 to 30 response variables;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax will perform all the pair-wise tests for the <y1> ... <yk> response variables. For example,

KENDALL TAU INDEPENDENCE TEST Y1 TO Y4

is equivalent to

KENDALL TAU INDEPENDENCE TEST Y1 Y2
KENDALL TAU INDEPENDENCE TEST Y1 Y3
KENDALL TAU INDEPENDENCE TEST Y1 Y4
KENDALL TAU INDEPENDENCE TEST Y2 Y3
KENDALL TAU INDEPENDENCE TEST Y2 Y4
KENDALL TAU INDEPENDENCE TEST Y3 Y4
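
The expansion above is simply every unordered pair of the k variables, so k variables produce k(k-1)/2 tests. A small Python sketch of that enumeration (the helper name pairwise_commands is hypothetical and only builds the equivalent command strings):

```python
from itertools import combinations

def pairwise_commands(names, tail=""):
    """List the two-variable commands equivalent to a multi-variable call."""
    command = "KENDALL TAU INDEPENDENCE TEST"
    if tail:                          # e.g. "LOWER TAILED" or "UPPER TAILED"
        command = f"{tail} {command}"
    return [f"{command} {a} {b}" for a, b in combinations(names, 2)]

for line in pairwise_commands(["Y1", "Y2", "Y3", "Y4"]):
    print(line)
```
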
Examples:
KENDALL TAU INDEPENDENCE TEST Y1 Y2
KENDALL TAU INDEPENDENCE TEST Y1 TO Y5
LOWER TAILED KENDALL TAU INDEPENDENCE TEST Y1 Y2
UPPER TAILED KENDALL TAU INDEPENDENCE TEST Y1 Y2
Note:
This command can be used to test for trend in a single response variable by pairing it with its observation order. For example,

LET N = SIZE Y
LET X = SEQUENCE 1 1 N
KENDALL TAU INDEPENDENCE TEST Y X

According to Conover, this test is more powerful than the Cox and Stuart test. However, it is not as widely applicable as the Cox and Stuart test.
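
To illustrate why pairing Y with the sequence 1, 2, ..., n tests for trend: every pair is compared in time order, so a concordant pair is simply a later value that exceeds an earlier one. A minimal Python sketch (the function name trend_tau and the data are hypothetical; this is not Dataplot code):

```python
from itertools import combinations

def trend_tau(y):
    """Kendall tau of y against its observation order 1..n."""
    nc = nd = 0.0
    # combinations() yields (earlier, later) values, so Xj - Xi > 0 always
    for earlier, later in combinations(y, 2):
        if later > earlier:
            nc += 1.0                 # upward step: concordant
        elif later < earlier:
            nd += 1.0                 # downward step: discordant
        else:
            nc += 0.5                 # tied values: half to each count
            nd += 0.5
    return (nc - nd) / (nc + nd)

print(trend_tau([1.0, 3.0, 2.0, 4.0]))   # mostly increasing series
```

A value near +1 suggests an upward trend, a value near -1 a downward trend, and a value near 0 no monotone trend.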

Note:
This test can be used to perform the Jonckheere-Terpstra test. Given two or more independent samples, the Jonckheere-Terpstra test is used to test the null hypothesis that all the samples came from the same distribution against the ordered alternative that the distributions differ in a specified direction. That is,

H0: F1(x) = F2(x) = .... = Fk(x)

versus

Ha: F1(x) <= F2(x) <= .... <= Fk(x)

or

Ha: F1(x) >= F2(x) >= .... >= Fk(x)

Although the Jonckheere-Terpstra test is based only on the number of concordant pairs (Nc above), applying the Kendall tau independence test gives an equivalent result.

For Dataplot, if Y is a response variable and X is a group-id variable, then to test for "less than" (positive concordance), use

UPPER TAILED KENDALL TAU INDEPENDENCE TEST Y X

and to test for "greater than" (negative concordance), use

LOWER TAILED KENDALL TAU INDEPENDENCE TEST Y X
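
The connection to the Jonckheere-Terpstra test can be seen by counting pairs with the group-id as X: observations in the same group have equal X and are not compared, so only between-group pairs contribute. The Python sketch below is illustrative only; the function name concordance_counts and the data are hypothetical, and scoring tied responses as 0.5 follows the convention stated in the Description.

```python
from itertools import combinations

def concordance_counts(y, group):
    """Return (Nc, Nd, tau) with the group-id playing the role of X."""
    nc = nd = 0.0
    for (yi, gi), (yj, gj) in combinations(zip(y, group), 2):
        if gi == gj:
            continue                  # same group: pair is not compared
        sign = (yj - yi) * (gj - gi)  # same sign as the slope
        if sign > 0:
            nc += 1.0                 # higher group has the larger response
        elif sign < 0:
            nd += 1.0
        else:
            nc += 0.5                 # tied responses: half to each count
            nd += 0.5
    return nc, nd, (nc - nd) / (nc + nd)

# Hypothetical data: three groups of two observations each
y     = [10, 12, 11, 15, 14, 18]
group = [1, 1, 2, 2, 3, 3]
print(concordance_counts(y, group))
```

Here Nc counts the between-group pairs in which the higher-numbered group has the larger response, which is the quantity the Jonckheere-Terpstra test is based on.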
Note:
The RANK CORRELATION INDEPENDENCE TEST can be used to perform a test for independence based on the Spearman rho rank correlation.

The CORRELATION CONFIDENCE LIMITS command can be used to generate a confidence interval for the Pearson correlation coefficient. This can be used for a parametric test for independence (i.e., does the confidence interval contain zero?).

Note:
By default, critical values are based on tabulated values for n <= 60. The command

SET KENDALL TAU CRITICAL VALUES NORMAL APPROXIMATION

can be used to specify that they should be based on the normal approximation given above. This may be preferred if there are ties in the data. To reset the default, enter the command

SET KENDALL TAU CRITICAL VALUES TABLE
Note:
The KENDALL TAU INDEPENDENCE TEST will accept matrix arguments. If a matrix is given, the data elements in the matrix will be collected in column order to form a vector before performing the test.
Note:
Dataplot saves the following internal parameters after a Kendall tau independence test:

 STATVAL  = the value of the test statistic
 STATCDF  = the CDF of the test statistic
 PVALUE   = the p-value for the two-sided test
 PVALUELT = the p-value for the lower tailed test
 PVALUEUT = the p-value for the upper tailed test
 CUTLOW90 = the 90% lower tailed critical value
 CUTUPP90 = the 90% upper tailed critical value
 CUTLOW95 = the 95% lower tailed critical value
 CUTUPP95 = the 95% upper tailed critical value
 CTLOW975 = the 97.5% lower tailed critical value
 CTUPP975 = the 97.5% upper tailed critical value
 CUTLOW99 = the 99% lower tailed critical value
 CUTUPP99 = the 99% upper tailed critical value
 CTLOW995 = the 99.5% lower tailed critical value
 CTUPP995 = the 99.5% upper tailed critical value
Note:
The following statistics can also be computed

LET A = KENDALL TAU Y1 Y2
LET A = KENDALL TAU CDF Y1 Y2
LET A = KENDALL TAU PVALUE Y1 Y2
LET A = KENDALL TAU LOWER TAILED PVALUE Y1 Y2
LET A = KENDALL TAU UPPER TAILED PVALUE Y1 Y2

The cdf and p-values are based on the normal approximation given above.

To see a list of commands in which these statistics can be used, enter

Note:
The run sequence plot can be used to graphically assess whether or not there is trend in the data. The 4-plot can be used to assess the more general assumption of "independent, identically distributed" data.

The paired data can also be analyzed using other techniques for comparing two response variables (e.g., t-test, bihistogram, quantile-quantile plot).

Default:
None
Synonyms:
None
Related Commands:
 RANK CORRELATION INDEPENDENCE TEST = Compute an independence test based on the Spearman rho rank correlation statistic.
 CORRELATION CONFIDENCE LIMITS      = Generate confidence limits for the Pearson correlation coefficient.
 COX AND STUART TEST                = Compute a Cox and Stuart trend test.
 T-TEST                             = Compute a t-test.
 4-PLOT                             = Generate a 4-plot.
 RUN SEQUENCE PLOT                  = Generate a run sequence plot.
 BIHISTOGRAM                        = Generate a bihistogram.
 QUANTILE-QUANTILE PLOT             = Generate a quantile-quantile plot.
Reference:
Conover (1999), "Practical Nonparametric Statistics", Third Edition, Wiley, pp. 319-327.
Applications:
Confirmatory Data Analysis
Implementation Date:
2013/3
Program:

read kendall.dat y1 y2
set write decimals 5
.
let statval  = kendall tau y1 y2
let statcdf  = kendall tau cdf y1 y2
let pvalue   = kendall tau pvalue y1 y2
let pvallt   = kendall tau lower tailed pvalue y1 y2
let pvalut   = kendall tau upper tailed pvalue y1 y2
print statval statcdf pvalue pvallt pvalut
.
kendall tau independence test y1 y2
upper tailed kendall tau independence test y1 y2
.
set kendall tau critical values normal approximation
upper tailed kendall tau independence test y1 y2

The following output is generated.

PARAMETERS AND CONSTANTS--

STATVAL --        0.43548
STATCDF --        0.97563
PVALUE  --        0.04873
PVALLT  --        0.97563
PVALUT  --        0.02437

Two Sample Kendall Tau Test for Independence

First Response Variable:  Y1
Second Response Variable: Y2

H0: The Two Samples are Independent
Ha: Pairs of Samples Tend to be Either
Concordant or Discordant

Number of Observations:                               12

Sample One Summary Statistics:
Sample Mean:                                   587.08333
Sample Standard Deviation:                      58.01482
Sample Minimum:                                530.00000
Sample Maximum:                                740.00000

Sample Two Summary Statistics:
Sample Mean:                                     3.59999
Sample Standard Deviation:                       0.28603
Sample Minimum:                                  3.20000
Sample Maximum:                                  4.00000

Test:
Kendall Tau Test Statistic Value:                0.43548
CDF Value (Normal Approximation):                0.97563
Two-Sided P-Value (Normal Approximation):        0.04873

Conclusions (Two-Tailed Test)

H0: Samples are Independent
------------------------------------------------------------
                                                     Null
Significance           Test       Critical     Hypothesis
       Level      Statistic   Region (+/-)     Conclusion
------------------------------------------------------------
       80.0%        0.43548        0.27270         REJECT
       90.0%        0.43548        0.36360         REJECT
       95.0%        0.43548        0.42420         REJECT
       99.0%        0.43548        0.54550         ACCEPT

Two Sample Kendall Tau Test for Independence

First Response Variable:  Y1
Second Response Variable: Y2

H0: The Two Samples are Independent
Ha: Pairs of Samples Tend to be Concordant

Number of Observations:                                  12

Sample One Summary Statistics:
Sample Mean:                                      587.08333
Sample Standard Deviation:                         58.01482
Sample Minimum:                                   530.00000
Sample Maximum:                                   740.00000

Sample Two Summary Statistics:
Sample Mean:                                        3.59999
Sample Standard Deviation:                          0.28603
Sample Minimum:                                     3.20000
Sample Maximum:                                     4.00000

Test:
Kendall Tau Test Statistic Value:                   0.43548
CDF Value (Normal Approximation):                   0.97563
Upper Tailed P-Value (Normal Approximation):        0.02436

Conclusions (Upper 1-Tailed Test)

H0: Samples are Independent
------------------------------------------------------------
                                                     Null
Significance           Test       Critical     Hypothesis
       Level      Statistic     Region (>)     Conclusion
------------------------------------------------------------
       90.0%        0.43548        0.27270         REJECT
       95.0%        0.43548        0.36360         REJECT
       97.5%        0.43548        0.42420         REJECT
       99.0%        0.43548        0.51520         ACCEPT
       99.5%        0.43548        0.54550         ACCEPT

Two Sample Kendall Tau Test for Independence

First Response Variable:  Y1
Second Response Variable: Y2

H0: The Two Samples are Independent
Ha: Pairs of Samples Tend to be Concordant

Number of Observations:                                  12

Sample One Summary Statistics:
Sample Mean:                                      587.08333
Sample Standard Deviation:                         58.01482
Sample Minimum:                                   530.00000
Sample Maximum:                                   740.00000

Sample Two Summary Statistics:
Sample Mean:                                        3.59999
Sample Standard Deviation:                          0.28603
Sample Minimum:                                     3.20000
Sample Maximum:                                     4.00000

Test:
Kendall Tau Test Statistic Value:                   0.43548
CDF Value (Normal Approximation):                   0.97563
Upper Tailed P-Value (Normal Approximation):        0.02436

Conclusions (Upper 1-Tailed Test)

H0: Samples are Independent
------------------------------------------------------------
                                                     Null
Significance           Test       Critical     Hypothesis
       Level      Statistic     Region (>)     Conclusion
------------------------------------------------------------
       90.0%        0.43548        0.28316         REJECT
       95.0%        0.43548        0.36344         REJECT
       97.5%        0.43548        0.43306         REJECT
       99.0%        0.43548        0.51402         ACCEPT
       99.5%        0.43548        0.56914         ACCEPT



NIST is an agency of the U.S. Commerce Department.

Date created: 03/08/2013
Last updated: 10/30/2015

Please email comments on this WWW page to alan.heckert@nist.gov.