
Name: PEARSON CONTINGENCY COEFFICIENT
A common question regarding a two-way contingency table is whether the row and column variables are independent. By independence, we mean that the row and column variables are unassociated (i.e., knowing the value of the row variable does not help us predict the value of the column variable, and likewise knowing the value of the column variable does not help us predict the value of the row variable). A more technical definition of independence is that

\[ P(\text{row } i, \text{column } j) = P(\text{row } i) \, P(\text{column } j) \quad \text{for all } i, j \]
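The standard test statistic for determining independence is the chi-square test statistic:

\[ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \]

where r and c are the numbers of rows and columns, \(O_{ij}\) is the observed count in cell (i, j), and \(E_{ij} = R_i C_j / N\) is the count expected under independence (\(R_i\) is the total of row i, \(C_j\) is the total of column j, and N is the total sample size).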
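As a rough illustration of how the chi-square statistic compares the observed counts with the counts expected under independence (row total times column total, divided by the grand total), the following Python sketch may help. It is not part of Dataplot; the function name `chi_square` and both example tables are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def chi_square(table):
    """Chi-square statistic for a two-way table given as a list of rows.

    Assumes every row total and column total is positive; otherwise an
    expected count of zero would cause a division by zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in cell (i, j) if rows and columns are independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical table whose rows are exactly proportional (independence)
independent = [[10, 20], [30, 60]]
# Hypothetical table with a clear association between rows and columns
associated = [[30, 10], [10, 30]]

chi2_independent = chi_square(independent)
chi2_associated = chi_square(associated)
print(chi2_independent, chi2_associated)
```

In the first table every observed count equals its expected count, so the statistic is zero; in the second, each of the four cells is 10 away from its expected count of 20, which contributes 5 per cell.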
One criticism of this statistic is that it does not give a meaningful description of the degree of dependence (or strength of association). That is, it is useful for determining whether there is dependence. However, since the strength of that association depends on the degrees of freedom as well as on the value of the test statistic, it is not easy to interpret the strength of association from the statistic alone. Pearson's contingency coefficient is one method that provides an easier-to-interpret measure of strength of association. Specifically, it is:

\[ C = \sqrt{\frac{\chi^2}{N + \chi^2}} \]

where

\(\chi^2\) = the chi-square test statistic
N = the total sample size

So this statistic basically rescales the chi-square statistic to a value between 0 (no association) and 1 (maximum association). In practice the coefficient cannot actually reach 1; its largest attainable value depends on the number of rows and columns. It has the desirable property of scale invariance. That is, if the sample size increases, the value of Pearson's contingency coefficient does not change as long as all cell counts in the table change by the same proportion.

The data for the contingency table can be specified in either of the following two ways:
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
      <y2> is the second response variable;
      <par> is a parameter where the computed Pearson contingency coefficient is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

Use this syntax for raw data.

<y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <m> is a matrix containing the contingency table;
      <p> is a parameter where the computed Pearson contingency coefficient is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

Use this syntax if your data are already in the form of a contingency table.
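The two forms of input carry the same information: raw data consist of one (y1, y2) pair per observation, while the contingency table holds the count of each distinct pair. A minimal Python sketch of that conversion is shown below. It is an illustration only, not how Dataplot does it internally; the function name `cross_tabulate` and the sample values are hypothetical.

```python
from collections import Counter

def cross_tabulate(y1, y2):
    """Build a two-way table of counts from two equal-length raw variables."""
    if len(y1) != len(y2):
        raise ValueError("y1 and y2 must have the same number of observations")
    row_levels = sorted(set(y1))   # distinct values of the first variable
    col_levels = sorted(set(y2))   # distinct values of the second variable
    counts = Counter(zip(y1, y2))  # number of observations for each (y1, y2) pair
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table

# Hypothetical raw data: six observations of two categorical variables
y1 = [1, 1, 2, 2, 2, 1]
y2 = [1, 2, 1, 1, 2, 1]
row_levels, col_levels, table = cross_tabulate(y1, y2)
print(table)
```

Each row of `table` corresponds to a level of the first variable and each column to a level of the second, which is the layout a matrix of counts would use.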
LET A = MATRIX GRAND PEARSON CONTINGENCY COEFFICIENT M
CROSS TABULATE PEARSON CONTINGENCY COEFFICIENT ... Y1 Y2 X1 X2
PEARSON CONTINGENCY COEFFICIENT PLOT Y1 Y2 X
BOOTSTRAP PEARSON CONTINGENCY COEFFICIENT PLOT Y1 Y2

The above commands expect the variables to have the same number of observations. Note that the above commands are only available if you have raw data.
Friendly (2000), "Visualizing Categorical Data", SAS Institute Inc., p. 61.
. Sample data from page 61 of Friendly
read matrix m
5 29 14 16
15 54 14 10
20 84 17 94
68 119 26 7
end of data
.
let a = matrix pearson contingency coefficient m

The resulting Pearson's contingency coefficient is 0.435.
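The reported value can be cross-checked outside Dataplot. The Python sketch below recomputes the coefficient for the same 4 x 4 table using the standard definition, the square root of chi-square divided by (N + chi-square), and also illustrates scale invariance by multiplying every cell by 10. The function name `pearson_contingency_coefficient` is a hypothetical helper, not a Dataplot or library routine.

```python
import math

def pearson_contingency_coefficient(table):
    """Pearson's contingency coefficient for a table given as a list of rows.

    Assumes all row and column totals are positive.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n + chi2))

# Table from the sample program above
friendly = [
    [5, 29, 14, 16],
    [15, 54, 14, 10],
    [20, 84, 17, 94],
    [68, 119, 26, 7],
]
c = pearson_contingency_coefficient(friendly)

# Same table with every cell count multiplied by 10
scaled = [[10 * count for count in row] for row in friendly]
c_scaled = pearson_contingency_coefficient(scaled)

print(round(c, 3), round(c_scaled, 3))
```

Both printed values should agree with the 0.435 reported above, since multiplying all counts by a common factor scales chi-square and N by that same factor and leaves their ratio unchanged.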
Date created: 7/24/2007 