Name: CRAMER CONTINGENCY COEFFICIENT
A common question about a two-way contingency table is whether the row and column variables are independent. By independence, we mean that the row and column variables are unassociated (i.e., knowing the value of the row variable will not help us predict the value of the column variable, and likewise knowing the value of the column variable will not help us predict the value of the row variable). A more technical definition of independence is that
\[ P(\text{row} = i, \text{column} = j) = P(\text{row} = i) \, P(\text{column} = j) \quad \text{for all } i, j \]
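The standard test statistic for determining independence is the chi-square test statistic:
\[ T = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \]
where r and c are the numbers of rows and columns, O_{ij} is the observed count in cell (i, j), and E_{ij} = R_i C_j / N is the count expected under independence, with R_i the total of row i, C_j the total of column j, and N the total sample size.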
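As an illustration of the chi-square statistic (in Python, not Dataplot), the following minimal sketch computes T from a table of observed counts, taking the expected count in each cell to be (row total)(column total)/N as implied by independence. The function name `chi_square_statistic` and the two small 2x2 tables are hypothetical and chosen only for illustration; the sketch assumes every row and column total is positive.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic T for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    t = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in cell (i, j) if rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / n
            t += (observed - expected) ** 2 / expected
    return t


# Hypothetical tables: the rows of the first are proportional to each other,
# the rows of the second are not.
independent = [[10, 20], [20, 40]]
dependent = [[10, 20], [20, 10]]

print(chi_square_statistic(independent))
print(chi_square_statistic(dependent))
```

The first table gives T = 0 because every observed count equals its expected count; the second gives T of about 6.67.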
One criticism of this statistic is that it does not give a meaningful description of the degree of dependence (or strength of association). That is, it is useful for determining whether there is dependence, but because the strength of that association depends on the degrees of freedom as well as on the value of the test statistic, the strength of association is not easy to interpret. Cramer's contingency coefficient is one way to provide an easier-to-interpret measure of the strength of association. Specifically, it is:
\[ V = \sqrt{\frac{T}{N (q - 1)}} \]
where
    T = the chi-square test statistic
    N = the total sample size
    q = minimum(number of rows, number of columns)
This statistic is based on the fact that the maximum value of T is:
\[ T_{\max} = N (q - 1) \]
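To make the scaling concrete, the sketch below (Python, not Dataplot) computes the coefficient as the square root of T / (N (q - 1)). The function names and both tables are hypothetical. In the first table each row has counts in exactly one column, so T reaches its maximum N (q - 1) and the coefficient is 1; in the second the rows are proportional, so T and the coefficient are 0. The sketch assumes at least two rows and two columns and positive row and column totals.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic T for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def cramers_v(table):
    """Cramer's contingency coefficient: sqrt(T / (N * (q - 1)))."""
    n = sum(sum(row) for row in table)
    q = min(len(table), len(table[0]))  # smaller of the two table dimensions
    return math.sqrt(chi_square_statistic(table) / (n * (q - 1)))


perfect = [[30, 0, 0], [0, 20, 0], [0, 0, 10]]  # N = 60, q = 3
unrelated = [[10, 20], [20, 40]]

print(chi_square_statistic(perfect))  # approximately N * (q - 1) = 120
print(cramers_v(perfect))             # approximately 1
print(cramers_v(unrelated))           # 0.0
```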
Dividing the chi-square statistic by this maximum and taking the square root scales it to a value between 0 (no association) and 1 (maximum association). The coefficient has the desirable property of scale invariance. That is, if the sample size increases, the value of Cramer's contingency coefficient does not change as long as the cell counts keep the same proportions relative to each other.

The data for the contingency table can be specified in either of the following two ways:
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
      <y2> is the second response variable;
      <par> is a parameter where the computed Cramer contingency coefficient is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

Use this syntax for raw data.
<SUBSET/EXCEPT/FOR qualification>
where <m> is a matrix containing the contingency table;
      <p> is a parameter where the computed Cramer contingency coefficient is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

Use this syntax if your data is a contingency table.
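The two input forms carry the same information: raw data list one pair of responses per observation, while the matrix holds the count of each distinct pair. As an illustration only (Python, and not a description of how Dataplot tabulates internally), the following sketch builds the contingency table from two raw response variables. The function name `cross_tabulate` and the sample values of `y1` and `y2` are hypothetical.

```python
from collections import Counter


def cross_tabulate(y1, y2):
    """Return (row_levels, col_levels, table) of counts for paired responses."""
    if len(y1) != len(y2):
        raise ValueError("y1 and y2 must have the same length")
    row_levels = sorted(set(y1))
    col_levels = sorted(set(y2))
    counts = Counter(zip(y1, y2))  # pairs that never occur count as 0
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Hypothetical raw data: six observations, two levels per variable.
y1 = [1, 1, 2, 2, 2, 1]
y2 = [1, 2, 1, 1, 2, 2]

rows, cols, table = cross_tabulate(y1, y2)
print(table)  # [[1, 2], [2, 1]]
```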
LET A = MATRIX GRAND CRAMER CONTINGENCY COEFICIENT M
Note that these commands are only available if you have raw data.
Friendly (2000), "Visualizing Categorical Data", SAS Institute Inc., p. 61.
. Example from page 61 of Friendly
read matrix m
5 29 14 16
15 54 14 10
20 84 17 94
68 119 26 7
end of data
.
let a = matrix cramer contingency coefficient m

The result is 0.279.
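As an independent cross-check of this example (in Python rather than Dataplot), the sketch below recomputes the coefficient directly from the definition, sqrt(T / (N (q - 1))). It assumes the 16 values form a 4x4 table read row by row; the function name `cramers_v` is hypothetical. It also multiplies every count by 10 to illustrate scale invariance.

```python
import math


def cramers_v(table):
    """Cramer's contingency coefficient for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    t = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            t += (observed - expected) ** 2 / expected
    q = min(len(table), len(table[0]))
    return math.sqrt(t / (n * (q - 1)))


# The matrix from the example, assumed to be 4 rows by 4 columns.
m = [
    [5, 29, 14, 16],
    [15, 54, 14, 10],
    [20, 84, 17, 94],
    [68, 119, 26, 7],
]

v = cramers_v(m)
print(round(v, 3))  # 0.279

# Same proportions, ten times the sample size: the coefficient is unchanged.
scaled = [[10 * count for count in row] for row in m]
print(math.isclose(cramers_v(scaled), v))
```

Here N = 592 and q = 4, and the chi-square statistic is about 138.29, which gives the 0.279 quoted for the Dataplot run.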
NIST is an agency of the U.S. Commerce Department.
Date created: 07/24/2007