
# ODDS RATIO INDEPENDENCE TEST

Name:
ODDS RATIO INDEPENDENCE TEST (LET)
Type:
Analysis Command
Purpose:
Perform a log odds ratio test of independence for a 2x2 contingency table.
Description:
Given two variables where each variable has exactly two possible outcomes (typically defined as success and failure), we define the odds ratio as:

o = (N11/N12)/ (N21/N22)
= (N11N22)/ (N12N21)

where

N11 = number of successes in sample 1
N21 = number of failures in sample 1
N12 = number of successes in sample 2
N22 = number of failures in sample 2

The first form shows the meaning of the odds ratio most clearly, although the second form is the one more commonly given in the literature.

The log odds ratio is the logarithm of the odds ratio:

l(o) = LOG{(N11/N12)/ (N21/N22)}
= LOG{(N11N22)/ (N12N21)}

Alternatively, the log odds ratio can be given in terms of the proportions:

l(o) = LOG{(p11/p12)/ (p21/p22)}
= LOG{(p11p22)/ (p12p21)}

where

p11 = N11/ (N11 + N21)
= proportion of successes in sample 1
p21 = N21/ (N11 + N21)
= proportion of failures in sample 1
p12 = N12/ (N12 + N22)
= proportion of successes in sample 2
p22 = N22/ (N12 + N22)
= proportion of failures in sample 2
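
As a quick check on the two forms above, the following minimal Python sketch computes the log odds ratio from the counts and from the proportions and confirms that they agree. The function names are illustrative only (they are not Dataplot commands), and the counts are those used in Program 1 on this page.

```python
import math

def log_odds_ratio(n11, n21, n12, n22):
    """Log odds ratio computed directly from the four cell counts."""
    return math.log((n11 * n22) / (n12 * n21))

def log_odds_ratio_from_proportions(n11, n21, n12, n22):
    """Same quantity computed from the within-sample proportions."""
    p11 = n11 / (n11 + n21)  # proportion of successes in sample 1
    p21 = n21 / (n11 + n21)  # proportion of failures in sample 1
    p12 = n12 / (n12 + n22)  # proportion of successes in sample 2
    p22 = n22 / (n12 + n22)  # proportion of failures in sample 2
    return math.log((p11 * p22) / (p12 * p21))

# Counts from Program 1: 53 successes / 7 failures vs. 48 successes / 12 failures.
from_counts = log_odds_ratio(53, 7, 48, 12)
from_props = log_odds_ratio_from_proportions(53, 7, 48, 12)
print(round(from_counts, 4), round(from_props, 4))
```

Both values round to the Log(Odds Ratio) reported in the Program 1 output, because the sample sizes cancel in the ratio of proportions.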

Success and failure can denote any binary response. Dataplot expects "success" to be coded as "1" and "failure" to be coded as "0".

The bias corrected version of the statistic is:

l'(o) = LOG[{(N11+0.5) (N22+0.5)}/ {(N12+0.5) (N21+0.5)}]

In addition to reducing bias, this statistic has the advantage that it is still defined when a cell count is zero. The uncorrected odds ratio is undefined when N12 or N21 is zero, and its logarithm is also undefined when N11 or N22 is zero.
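
The bias corrected formula can be sketched in Python as follows. This is an illustration of the formula only, not Dataplot's implementation; the function name is hypothetical, and the table with a zero cell is made up to show that the corrected statistic remains finite.

```python
import math

def bias_corrected_log_odds_ratio(n11, n21, n12, n22):
    """Log odds ratio with 0.5 added to every cell count."""
    numerator = (n11 + 0.5) * (n22 + 0.5)
    denominator = (n12 + 0.5) * (n21 + 0.5)
    return math.log(numerator / denominator)

# Counts from Program 1 on this page.
corrected = bias_corrected_log_odds_ratio(53, 7, 48, 12)

# Hypothetical table with N21 = 0: the uncorrected ratio would divide by zero,
# but the corrected statistic is still a finite number.
zero_cell = bias_corrected_log_odds_ratio(53, 0, 48, 12)

print(round(corrected, 4), math.isfinite(zero_cell))
```

For the Program 1 counts the result agrees with the bias corrected value in the output listing (0.6089).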

Note that N11, N21, N12, and N22 define a 2x2 contingency table. These types of contingency tables are also referred to as fourfold tables.

A common question with regard to a two-way contingency table is whether we have independence. By independence, we mean that the row and column variables are unassociated (i.e., knowing the value of the row variable will not help us predict the value of the column variable, and likewise knowing the value of the column variable will not help us predict the value of the row variable).

A more technical definition for independence is that

P(row i, column j) = P(row i)*P(column j)       for all i,j
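
Multiplying both sides of this definition by the total count gives the cell counts we would expect if the table were exactly independent: (row total)*(column total)/N. The Python sketch below illustrates this, assuming a table laid out as in the Description (rows are success/failure, columns are the two samples). The function name is illustrative, and the counts are those of Program 1.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: successes, failures. Columns: sample 1, sample 2 (Program 1 counts).
observed = [[53, 48],
            [7, 12]]
expected = expected_counts(observed)
print(expected)
```

The further the observed counts are from these expected counts, the stronger the evidence against independence.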

One test of independence for the special case described above (i.e., we have success/failure data) is the log odds ratio test for independence.

H0: The two-way table is independent
Ha: The two-way table is not independent
Test Statistic: The log odds ratio independence test statistic is:

$$T = \frac{(N_{1} + N_{2})(N_{11}N_{22} - N_{12}N_{21})^2} {N_{1}N_{2}(N_{11} + N_{12})(N_{21} + N_{22})}$$

where N1 = N11 + N21 and N2 = N12 + N22 are the sizes of sample 1 and sample 2.

Some analysts prefer to use the Yates corrected version of the test statistic:

$$T = \frac{(N_{1} + N_{2})(|N_{11}N_{22} - N_{12}N_{21}| - 0.5(N_{1} + N_{2}))^2} {N_{1}N_{2}(N_{11} + N_{12})(N_{21} + N_{22})}$$

Significance Level: $$\alpha$$
Critical Region: T > $$\phi(1 - \alpha)$$, where $$\phi$$ is the percent point function of the standard normal distribution
Conclusion: Reject the independence hypothesis if the value of the test statistic is greater than the normal critical value.
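
The two test statistics can be evaluated with a few lines of Python. The sketch below is not Dataplot's code. It assumes that N1 and N2 are the two sample sizes (N11 + N21 and N12 + N22), and it reaches a decision by comparing the standard normal CDF of the statistic with the confidence level, which is how the tables in the Program 1 output appear to reach their conclusions. The function name is hypothetical.

```python
from statistics import NormalDist

def independence_statistics(n11, n21, n12, n22):
    """Return (T, T with Yates correction) for a 2x2 table."""
    n1 = n11 + n21            # size of sample 1 (assumed meaning of N1)
    n2 = n12 + n22            # size of sample 2 (assumed meaning of N2)
    n = n1 + n2
    cross = n11 * n22 - n12 * n21
    denom = n1 * n2 * (n11 + n12) * (n21 + n22)
    t = n * cross ** 2 / denom
    t_yates = n * (abs(cross) - 0.5 * n) ** 2 / denom
    return t, t_yates

# Counts from Program 1 on this page.
t, t_yates = independence_statistics(53, 7, 48, 12)
cdf = NormalDist().cdf(t)
cdf_yates = NormalDist().cdf(t_yates)

# Reject independence at a given confidence level when the CDF exceeds it.
reject_at_90 = cdf > 0.90
reject_at_95 = cdf > 0.95
print(round(t, 4), round(t_yates, 4), reject_at_90, reject_at_95)
```

For these counts the sketch gives statistics that match the Program 1 listing (1.5633 and 1.0005), with independence rejected at the 90% level but not at the 95% level for the uncorrected statistic.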
Syntax 1:
ODDS RATIO INDEPENDENCE TEST <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table).
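
To illustrate what cross-tabulating raw data means here, the Python sketch below counts successes and failures in two samples coded as 1 and 0. It assumes that each response variable holds one sample (as in Program 2, where the two variables have different lengths). How Dataplot performs the tabulation internally is not described on this page, so the function and the data are purely illustrative.

```python
def tabulate_binary(y1, y2):
    """Return (n11, n21, n12, n22) from two samples coded 1 = success, 0 = failure."""
    for y in (y1, y2):
        if any(v not in (0, 1) for v in y):
            raise ValueError("responses must be coded as 0 or 1")
    n11 = sum(y1)              # successes in sample 1
    n21 = len(y1) - n11        # failures in sample 1
    n12 = sum(y2)              # successes in sample 2
    n22 = len(y2) - n12        # failures in sample 2
    return n11, n21, n12, n22

y1 = [1, 1, 0, 1, 1]   # hypothetical sample 1
y2 = [1, 0, 0, 1]      # hypothetical sample 2
counts = tabulate_binary(y1, y2)
print(counts)
```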

Syntax 2:
ODDS RATIO INDEPENDENCE TEST <m>
<SUBSET/EXCEPT/FOR qualification>
where <m> is a matrix containing the two-way table;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used for the case where the data have already been cross-tabulated into a two-way contingency table.

Syntax 3:
ODDS RATIO INDEPENDENCE TEST <n11> <n12> <n21> <n22>
where <n11> is a parameter containing the value for row 1, column 1 of a 2x2 table;
<n12> is a parameter containing the value for row 1, column 2 of a 2x2 table;
<n21> is a parameter containing the value for row 2, column 1 of a 2x2 table;
<n22> is a parameter containing the value for row 2, column 2 of a 2x2 table.

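This syntax is used for the special case where you have a 2x2 table. In this case, you can enter the four values directly, although you need to be careful to enter the parameters in the order given above. The order is row-wise: in Program 1 below, the four values are the successes and failures for sample 1 followed by the successes and failures for sample 2.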

Examples:
ODDS RATIO INDEPENDENCE TEST Y1 Y2
ODDS RATIO INDEPENDENCE TEST M
ODDS RATIO INDEPENDENCE TEST N11 N12 N21 N22
Note:
Dataplot performs this test for both the bias corrected log(odds ratio) case and the no bias correction case.
Note:
The following information is written to the file dpst1f.dat (in the current directory):

Column 1 - significance level
Column 2 - lower confidence limit (uncorrected case)
Column 3 - upper confidence limit (uncorrected case)
Column 4 - lower confidence limit (corrected case)
Column 5 - upper confidence limit (corrected case)

To read this information into Dataplot, enter

READ DPST1F.DAT SIGLEV UNCLOWCL UNCUPPCL CORLOWCL CORUPPCL

The following internal parameters are automatically saved after running this command:

STATVAL = test statistic (uncorrected)
STATVALY = test statistic (with Yates correction)
STATCDF = cdf value for test statistic (uncorrected)
STATCDFY = cdf value for test statistic (with Yates correction)
ODDSRATI = value of the log(odds ratio)
ODDSRASE = value of the standard error of the log(odds ratio)
ODDSRABC = value of the bias corrected log(odds ratio)
ODDSBCSE = value of the bias corrected standard error of the log(odds ratio)
Note:
The CHI-SQUARE INDEPENDENCE TEST performs an alternative test for independence.

The chi-square independence test is more general in the sense that it applies to RxC contingency tables, not just 2x2 tables.

Default:
None
Synonyms:
None
Related Commands:
CHI-SQUARE INDEPENDENCE TEST = Perform a chi-square test for independence.
FISHER EXACT TEST = Perform Fisher's exact test.
ASSOCIATION PLOT = Generate an association plot.
SIEVE PLOT = Generate a sieve plot.
ROSE PLOT = Generate a Rose plot.
BINARY TABULATION PLOT = Generate a binary tabulation plot.
ROC CURVE = Generate a ROC curve.
ODDS RATIO = Compute the bias corrected odds ratio.
LOG ODDS RATIO = Compute the bias corrected log(odds ratio).
References:
Andrew Rukhin, private communication.

Fleiss, Levin, and Paik (2003), "Statistical Methods for Rates and Proportions," Third Edition, Wiley, pp. 234-238.

Applications:
Categorical Data Analysis
Implementation Date:
2007/2
Program 1:

let n11 = 53
let n21 = 7
let n12 = 48
let n22 = 12
.
set write decimals 4
odds ratio independence test n11 n21 n12 n22

The following output is generated.

Log(Odds Ratio) Test for Independence
2x2 Table (Log(Odds Ratio) = 0)

H0: The Two Variables Are Independent
Ha: The Two Variables Are Not Independent

Sample 1:
Number of Observations:                    60
Number of Successes:                       53
Number of Failures:                        7
Probability of Success:                    0.8833
Probability of Failure:                    0.1167

Sample 2:
Number of Observations:                    60
Number of Successes:                       48
Number of Failures:                        12
Probability of Success:                    0.8000
Probability of Failure:                    0.2000

Log(Odds Ratio) = Log(n11*n22/(n12*n21)):
Log(Odds Ratio):                           0.6381
Standard Error of Log(Odds Ratio):         0.5156

Log(Odds Ratio) (Bias Corrected):          0.6089
Standard Error (Bias Corrected):           0.5026

Large Sample Confidence Interval for Log(Odds Ratio)

---------------------------------------------------------------------------
                 Uncorrected Ratio            Bias Corrected Ratio
                 (  0.6380874    )             (  0.6089435    )
Confidence          Lower          Upper          Lower          Upper
 Value (%)          Limit          Limit          Limit          Limit
---------------------------------------------------------------------------
     50.00         0.6381         0.6381         0.6089         0.6089
     80.00         0.2041         1.0721         0.1859         1.0320
     90.00        -0.0227         1.2989        -0.0352         1.2531
     95.00        -0.2101         1.4863        -0.2178         1.4357
     97.50        -0.3726         1.6487        -0.3762         1.5941
     99.00        -0.5615         1.8377        -0.5604         1.7783

Test for Independence:
Chi-Square Test Statistic:                   1.5633
CDF of Test Statistic:                       0.9410

Test Statistic with Yates Correction:        1.0005
CDF of Test Statistic with Yates Correction: 0.8415

Without Yates Correction:

---------------------------------------------------------------------------
                                         Null Hypothesis           Null
       Null     Confidence       Critical     Acceptance     Hypothesis
 Hypothesis          Level          Value       Interval     Conclusion
---------------------------------------------------------------------------
Independent          50.0%           0.00      (0,0.500)         REJECT
Independent          80.0%           0.84      (0,0.800)         REJECT
Independent          90.0%           1.28      (0,0.900)         REJECT
Independent          95.0%           1.64      (0,0.950)         ACCEPT
Independent          97.5%           1.96      (0,0.975)         ACCEPT
Independent          99.0%           2.33      (0,0.990)         ACCEPT

With Yates Bias Correction:

---------------------------------------------------------------------------
                                         Null Hypothesis           Null
       Null     Confidence       Critical     Acceptance     Hypothesis
 Hypothesis          Level          Value       Interval     Conclusion
---------------------------------------------------------------------------
Independent          50.0%           0.00      (0,0.500)         REJECT
Independent          80.0%           0.84      (0,0.800)         REJECT
Independent          90.0%           1.28      (0,0.900)         ACCEPT
Independent          95.0%           1.64      (0,0.950)         ACCEPT
Independent          97.5%           1.96      (0,0.975)         ACCEPT
Independent          99.0%           2.33      (0,0.990)         ACCEPT
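
The confidence limits in the listing above can be reproduced with the short Python sketch below. Two things here are inferred from the printed numbers rather than stated on this page, so treat them as assumptions: the standard error is taken to be the usual large-sample value, the square root of the sum of the reciprocal cell counts, and the limits are taken to be the estimate plus or minus z times the standard error, with z the normal percent point at the listed confidence value. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def log_odds_ratio_interval(n11, n21, n12, n22, confidence):
    """Return (lower, upper, standard error) for the uncorrected log odds ratio."""
    estimate = math.log((n11 * n22) / (n12 * n21))
    # Assumed large-sample standard error of the log odds ratio.
    se = math.sqrt(1 / n11 + 1 / n21 + 1 / n12 + 1 / n22)
    # Assumed convention: percent point taken at the confidence value itself.
    z = NormalDist().inv_cdf(confidence)
    return estimate - z * se, estimate + z * se, se

# Counts from Program 1, 95% row of the confidence interval table.
lower, upper, se = log_odds_ratio_interval(53, 7, 48, 12, 0.95)
print(round(lower, 4), round(upper, 4), round(se, 4))
```

Under these assumptions the sketch agrees with the standard error (0.5156) and the 95% limits (-0.2101, 1.4863) shown for the uncorrected ratio.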

Program 2:

let n = 1
let p = 0.9
let y1 = binomial rand numb for i = 1 1 200
let p = 0.68
let y2 = binomial rand numb for i = 1 1 130
.
set write decimals 4
odds ratio independence test y1 y2

The following output is generated.

Log(Odds Ratio) Test for Independence
2x2 Table (Log(Odds Ratio) = 0)

H0: The Two Variables Are Independent
Ha: The Two Variables Are Not Independent

Sample 1:
Number of Observations:                    200
Number of Successes:                       175
Number of Failures:                        25
Probability of Success:                    0.8750
Probability of Failure:                    0.1250

Sample 2:
Number of Observations:                    130
Number of Successes:                       88
Number of Failures:                        42
Probability of Success:                    0.6769
Probability of Failure:                    0.3231

Log(Odds Ratio) = Log(n11*n22/(n12*n21)):
Log(Odds Ratio):                           1.2062
Standard Error of Log(Odds Ratio):         0.2844

Log(Odds Ratio) (Bias Corrected):          1.1955
Standard Error (Bias Corrected):           0.2824

Large Sample Confidence Interval for Log(Odds Ratio)

---------------------------------------------------------------------------
                 Uncorrected Ratio            Bias Corrected Ratio
                 (   1.206243    )             (   1.195462    )
Confidence          Lower          Upper          Lower          Upper
 Value (%)          Limit          Limit          Limit          Limit
---------------------------------------------------------------------------
     50.00         1.2062         1.2062         1.1955         1.1955
     80.00         0.9669         1.4456         0.9578         1.4331
     90.00         0.8418         1.5707         0.8336         1.5574
     95.00         0.7384         1.6741         0.7310         1.6599
     97.50         0.6488         1.7637         0.6420         1.7489
     99.00         0.5446         1.8679         0.5385         1.8524

Test for Independence:
Chi-Square Test Statistic:                   19.1040
CDF of Test Statistic:                       1.0000

Test Statistic with Yates Correction:        17.8995
CDF of Test Statistic with Yates Correction: 1.0000

Without Yates Correction:

---------------------------------------------------------------------------
                                         Null Hypothesis           Null
       Null     Confidence       Critical     Acceptance     Hypothesis
 Hypothesis          Level          Value       Interval     Conclusion
---------------------------------------------------------------------------
Independent          50.0%           0.00      (0,0.500)         REJECT
Independent          80.0%           0.84      (0,0.800)         REJECT
Independent          90.0%           1.28      (0,0.900)         REJECT
Independent          95.0%           1.64      (0,0.950)         REJECT
Independent          97.5%           1.96      (0,0.975)         REJECT
Independent          99.0%           2.33      (0,0.990)         REJECT

With Yates Bias Correction:

---------------------------------------------------------------------------
                                         Null Hypothesis           Null
       Null     Confidence       Critical     Acceptance     Hypothesis
 Hypothesis          Level          Value       Interval     Conclusion
---------------------------------------------------------------------------
Independent          50.0%           0.00      (0,0.500)         REJECT
Independent          80.0%           0.84      (0,0.800)         REJECT
Independent          90.0%           1.28      (0,0.900)         REJECT
Independent          95.0%           1.64      (0,0.950)         REJECT
Independent          97.5%           1.96      (0,0.975)         REJECT
Independent          99.0%           2.33      (0,0.990)         REJECT
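
As a hand check of the Program 2 listing, the test statistics can be worked out from the tabulated counts (175 successes and 25 failures in sample 1, 88 successes and 42 failures in sample 2). This assumes that N1 and N2 in the test statistic are the two sample sizes, 200 and 130.

$$T = \frac{(200 + 130)(175 \cdot 42 - 88 \cdot 25)^2} {200 \cdot 130 \cdot (175 + 88)(25 + 42)} = \frac{330 \cdot 5150^2}{458146000} \approx 19.104$$

With the Yates correction, 0.5(200 + 130) = 165 is subtracted from the absolute cross-product difference before squaring:

$$T = \frac{330 \cdot (5150 - 165)^2}{458146000} = \frac{330 \cdot 4985^2}{458146000} \approx 17.8995$$

Both values agree with the statistics reported in the output above.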


NIST is an agency of the U.S. Commerce Department.

Date created: 01/07/2008
Last updated: 11/04/2015