Dataplot Vol 1 Auxiliary Chapter

ODDS RATIO INDEPENDENCE TEST

Name:
    ODDS RATIO INDEPENDENCE TEST (LET)
Type:
    Analysis Command
Purpose:
    Perform a log odds ratio test of independence for a 2x2 contingency table.
Description:
    Given two variables where each variable has exactly two possible outcomes (typically defined as success and failure), we define the odds ratio as:

      o = (N11/N12)/(N21/N22)
        = (N11*N22)/(N12*N21)

    where

      N11 = number of successes in sample 1
      N21 = number of failures in sample 1
      N12 = number of successes in sample 2
      N22 = number of failures in sample 2

    The first definition shows the meaning of the odds ratio clearly, although it is more commonly given in the literature with the second definition.
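
    As a quick check that the two forms of the odds ratio agree, the following Python sketch evaluates both. The function name odds_ratio is hypothetical (it is not a Dataplot command), and the counts are the ones used in Program 1 on this page.

```python
def odds_ratio(n11, n21, n12, n22):
    """Return both forms of the odds ratio for a 2x2 table.

    n11, n21 = successes, failures in sample 1
    n12, n22 = successes, failures in sample 2
    """
    first_form = (n11 / n12) / (n21 / n22)     # ratio of ratios
    second_form = (n11 * n22) / (n12 * n21)    # cross-product form
    return first_form, second_form


# Counts taken from Program 1 on this page.
first, second = odds_ratio(53, 7, 48, 12)
print(first, second)
```

    Both values come out at roughly 1.89, so sample 1 has about 1.9 times the odds of success of sample 2.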

    The log odds ratio is the logarithm of the odds ratio:

      l(o) = LOG{(N11/N12)/(N21/N22)}
           = LOG{(N11*N22)/(N12*N21)}

    Alternatively, the log odds ratio can be given in terms of the proportions

      l(o) = LOG{(p11/p12)/(p21/p22)}
           = LOG{(p11*p22)/(p12*p21)}

    where

      p11 = N11/ (N11 + N21)
            = proportion of successes in sample 1
      p21 = N21/ (N11 + N21)
            = proportion of failures in sample 1
      p12 = N12/ (N12 + N22)
            = proportion of successes in sample 2
      p22 = N22/ (N12 + N22)
            = proportion of failures in sample 2
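
    The count form and the proportion form give the same log odds ratio because the sample sizes cancel. A minimal Python sketch, using the Program 1 counts from this page and variable names chosen only for illustration:

```python
import math

# Counts from Program 1: successes/failures in sample 1 and sample 2.
n11, n21, n12, n22 = 53, 7, 48, 12

# Proportions as defined above.
p11 = n11 / (n11 + n21)
p21 = n21 / (n11 + n21)
p12 = n12 / (n12 + n22)
p22 = n22 / (n12 + n22)

log_or_counts = math.log((n11 * n22) / (n12 * n21))
log_or_props = math.log((p11 * p22) / (p12 * p21))
print(log_or_counts, log_or_props)
```

    Both values are approximately 0.638, in line with the LOG(ODDS RATIO) value printed in the Program 1 output.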

    Success and failure can denote any binary response. Dataplot expects "success" to be coded as "1" and "failure" to be coded as "0".

    The bias corrected version of the statistic is:

      l'(o) = LOG[{(N11+0.5) (N22+0.5)}/ {(N12+0.5) (N21+0.5)}]

    In addition to reducing bias, this statistic has the advantage that it is still defined when a cell count is zero. The uncorrected log odds ratio is undefined when N12 or N21 is zero (division by zero) and also when N11 or N22 is zero (logarithm of zero).
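
    The effect of the 0.5 correction can be sketched in Python as follows. The function log_odds_ratio and its corrected flag are hypothetical names used for illustration, and the table with a zero cell is made-up data.

```python
import math


def log_odds_ratio(n11, n21, n12, n22, corrected=False):
    """Log odds ratio; adds 0.5 to every cell when corrected=True.

    Returns None when the uncorrected statistic is undefined.
    """
    c = 0.5 if corrected else 0.0
    numerator = (n11 + c) * (n22 + c)
    denominator = (n12 + c) * (n21 + c)
    if numerator == 0 or denominator == 0:
        return None
    return math.log(numerator / denominator)


# Program 1 counts: both versions are defined.
plain = log_odds_ratio(53, 7, 48, 12)
bias_corrected = log_odds_ratio(53, 7, 48, 12, corrected=True)

# Made-up table with no failures in sample 1 (N21 = 0).
zero_plain = log_odds_ratio(10, 0, 8, 2)
zero_corrected = log_odds_ratio(10, 0, 8, 2, corrected=True)
print(plain, bias_corrected, zero_plain, zero_corrected)
```

    For the Program 1 counts the corrected value is about 0.609, consistent with the bias corrected value in the printed output, and for the table with a zero cell only the corrected version returns a number.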

    Note that N11, N21, N12, and N22 define a 2x2 contingency table. These types of contingency tables are also referred to as fourfold tables.

    A common question with regard to a two-way contingency table is whether we have independence. By independence, we mean that the row and column variables are unassociated (i.e., knowing the value of the row variable will not help us predict the value of the column variable, and likewise knowing the value of the column variable will not help us predict the value of the row variable).

    A more technical definition for independence is that

      P(row i, column j) = P(row i)*P(column j)       for all i,j
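
    The product rule can be checked directly on observed proportions. The Python sketch below measures the largest gap between a joint proportion and the product of its marginal proportions. The function name, the table layout (rows are outcomes, columns are samples), and the exactly independent table are assumptions made for illustration.

```python
def max_independence_gap(table):
    """Largest |P(row i, column j) - P(row i)*P(column j)| over all cells."""
    total = sum(sum(row) for row in table)
    row_p = [sum(row) / total for row in table]
    col_p = [sum(col) / total for col in zip(*table)]
    return max(
        abs(table[i][j] / total - row_p[i] * col_p[j])
        for i in range(len(table))
        for j in range(len(table[0]))
    )


# Made-up table whose rows are proportional, so the product rule holds exactly.
independent_gap = max_independence_gap([[20, 30], [40, 60]])

# Program 1 counts laid out as [[N11, N12], [N21, N22]].
program1_gap = max_independence_gap([[53, 48], [7, 12]])
print(independent_gap, program1_gap)
```

    A gap of zero (up to rounding) means the observed table satisfies the product rule exactly. A nonzero gap in a sample does not by itself show dependence, which is why a formal test is needed.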

    One test of independence for the special case described above (i.e., we have success/failure data) is the log odds ratio test for independence.

      H0: The two-way table is independent
      Ha: The two-way table is not independent
      Test Statistic: The log odds ratio independence test statistic is:

        T = (N1 + N2)*(N11*N22 - N12*N21)**2/{N1*N2*(N11+N12)*(N21+N22)}

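      Some analysts prefer to use the Yates corrected version of the test statistic:

        T = (N1 + N2)*(|N11*N22 - N12*N21| - 0.5*(N1 + N2))**2/{N1*N2*(N11+N12)*(N21+N22)}

      In both formulas, N1 = N11 + N21 and N2 = N12 + N22 are the numbers of observations in sample 1 and sample 2.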
      Significance Level: alpha
      Critical Region: T > NORPPF(1 - alpha)

      where NORPPF is the percent point function of the standard normal distribution

      Conclusion: Reject the independence hypothesis if the value of the test statistic is greater than the normal critical value.
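
      The test can be sketched in Python as follows. This is an illustration rather than Dataplot's implementation, and it rests on several assumptions: N1 and N2 are taken to be the two sample sizes (N11 + N21 and N12 + N22), the critical value is taken as the standard normal percent point at 1 - alpha (which reproduces the 1.64 critical value printed for the 95% row in Program 1), and the function name and returned keys are hypothetical.

```python
from statistics import NormalDist


def odds_ratio_independence(n11, n21, n12, n22, alpha=0.05):
    """Test statistic, Yates corrected statistic, and normal-based decision."""
    n1 = n11 + n21            # assumed: number of observations in sample 1
    n2 = n12 + n22            # assumed: number of observations in sample 2
    total = n1 + n2
    denom = n1 * n2 * (n11 + n12) * (n21 + n22)
    cross = n11 * n22 - n12 * n21

    t = total * cross ** 2 / denom
    t_yates = total * (abs(cross) - 0.5 * total) ** 2 / denom

    normal = NormalDist()     # standard normal, mean 0 and sigma 1
    critical = normal.inv_cdf(1.0 - alpha)
    return {
        "t": t,
        "t_yates": t_yates,
        "cdf": normal.cdf(t),
        "cdf_yates": normal.cdf(t_yates),
        "critical": critical,
        "reject": t > critical,
        "reject_yates": t_yates > critical,
    }


# Program 1 counts at the 95% and 90% levels.
result_95 = odds_ratio_independence(53, 7, 48, 12, alpha=0.05)
result_90 = odds_ratio_independence(53, 7, 48, 12, alpha=0.10)
print(result_95)
```

      For the Program 1 counts this gives statistics of about 1.5633 and 1.0005, matching the printed output. Independence is not rejected at the 95% level, while at the 90% level it is rejected only without the Yates correction, which agrees with the conclusion tables in Program 1.
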
    Syntax 1:
      ODDS RATIO INDEPENDENCE TEST <y1> <y2>
                              <SUBSET/EXCEPT/FOR qualification>
      where <y1> is the first response variable;
                  <y2> is the second response variable;
      and where the <SUBSET/EXCEPT/FOR qualification> is optional.

      This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table).

    Syntax 2:
      ODDS RATIO INDEPENDENCE TEST <m>
                              <SUBSET/EXCEPT/FOR qualification>
      where <m> is a matrix containing the two-way table;
      and where the <SUBSET/EXCEPT/FOR qualification> is optional.

      This syntax is used for the case where the data have already been cross-tabulated into a two-way contingency table.

    Syntax 3:
      ODDS RATIO INDEPENDENCE TEST <n11> <n12> <n21> <n22>
      where <n11> is a parameter containing the value for row 1, column 1 of a 2x2 table;
                  <n12> is a parameter containing the value for row 1, column 2 of a 2x2 table;
                  <n21> is a parameter containing the value for row 2, column 1 of a 2x2 table;
                  <n22> is a parameter containing the value for row 2, column 2 of a 2x2 table.

      This syntax is used for the special case where you have a 2x2 table. In this case, you can enter the 4 values directly, although you do need to be careful that the parameters are entered in the order expected above.

    Examples:
      ODDS RATIO INDEPENDENCE TEST Y1 Y2
      ODDS RATIO INDEPENDENCE TEST M
      ODDS RATIO INDEPENDENCE TEST N11 N12 N21 N22
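
      The raw-data form and the four-count form carry the same information. The Python sketch below shows one way two 0/1 response variables could be reduced to the four counts. The function name tabulate and the short data vectors are made up for illustration. As in Program 2, the two samples are allowed to have different lengths.

```python
def tabulate(y1, y2):
    """Reduce two 0/1 response vectors to (N11, N21, N12, N22).

    "1" is treated as success and "0" as failure.
    """
    for value in list(y1) + list(y2):
        if value not in (0, 1):
            raise ValueError("responses must be coded as 0 or 1")
    n11 = sum(y1)              # successes in sample 1
    n21 = len(y1) - n11        # failures in sample 1
    n12 = sum(y2)              # successes in sample 2
    n22 = len(y2) - n12        # failures in sample 2
    return n11, n21, n12, n22


# Made-up raw data: five observations in sample 1, four in sample 2.
counts = tabulate([1, 1, 0, 1, 1], [1, 0, 0, 1])
print(counts)
```
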
    Note:
      Dataplot performs this test for both the bias corrected log(odds ratio) case and the no bias correction case.
    Note:
      The following information is written to the file dpst1f.dat (in the current directory):

        Column 1 - significance level
        Column 2 - lower confidence limit (uncorrected case)
        Column 3 - upper confidence limit (uncorrected case)
        Column 4 - lower confidence limit (corrected case)
        Column 5 - upper confidence limit (corrected case)

      To read this information into Dataplot, enter

        SET READ FORMAT F10.5,1X,4E15.7
        READ DPST1F.DAT SIGLEV UNCLOWCL UNCUPPCL CORLOWCL CORUPPCL

      The following internal parameters are automatically saved after running this command:

        STATVAL = test statistic (uncorrected)
        STATVALY = test statistic (with Yates correction)
        STATCDF = cdf value for test statistic (uncorrected)
        STATCDFY = cdf value for test statistic (with Yates correction)
        ODDSRATI = value of the log(odds ratio)
        ODDSRASE = value of the standard error of the log(odds ratio)
        ODDSRABC = value of the bias corrected log(odds ratio)
        ODDSBCSE = value of the bias corrected standard error of the log(odds ratio)
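
      This page does not state how the standard errors and confidence limits are computed. The Python sketch below uses the usual large-sample standard error of a log odds ratio, SQRT(1/N11 + 1/N21 + 1/N12 + 1/N22), with 0.5 added to each count for the bias corrected case. That choice is an assumption, but it reproduces the standard errors printed in Program 1. Likewise, the printed limits in Program 1 are reproduced by log(odds ratio) +/- z*SE with z taken as the normal percent point at the tabulated confidence value. This is inferred from the output rather than documented, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def log_odds_summary(n11, n21, n12, n22, correction=0.0):
    """Log odds ratio and its large-sample standard error."""
    a, b, c, d = (x + correction for x in (n11, n21, n12, n22))
    log_or = math.log((a * d) / (c * b))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def limits(log_or, se, confidence_value):
    """Lower and upper limits as laid out in the Program 1 table."""
    z = NormalDist().inv_cdf(confidence_value / 100.0)
    return log_or - z * se, log_or + z * se


# Program 1 counts, uncorrected and bias corrected.
log_or, se = log_odds_summary(53, 7, 48, 12)
log_or_bc, se_bc = log_odds_summary(53, 7, 48, 12, correction=0.5)
lower_95, upper_95 = limits(log_or, se, 95.0)
lower_50, upper_50 = limits(log_or, se, 50.0)
print(se, se_bc, lower_95, upper_95)
```

      With these assumptions the 95.000 row of the uncorrected table (about -0.2101 and 1.4862) is recovered, and the 50.000 row collapses to the point estimate because the normal percent point at 0.5 is zero.
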
    Note:
      The CHI-SQUARE INDEPENDENCE TEST performs an alternative test for independence.

      The chi-square independence test is more general in the sense that it applies to RxC contingency tables, not just 2x2 tables.
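
      For a 2x2 table, the test statistic T defined in the Description is algebraically equal to the Pearson chi-square statistic, the sum over cells of (observed - expected)**2/expected. The Python sketch below checks this numerically for the Program 1 counts. It illustrates the textbook Pearson statistic only; it makes no claim about what the CHI-SQUARE INDEPENDENCE TEST command prints, and the function names are hypothetical.

```python
def pearson_chi_square(table):
    """Sum of (observed - expected)**2 / expected for an RxC table."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def shortcut_statistic(n11, n21, n12, n22):
    """Closed form for the 2x2 case (uncorrected T from the Description)."""
    n1, n2 = n11 + n21, n12 + n22
    cross = n11 * n22 - n12 * n21
    return (n1 + n2) * cross ** 2 / (n1 * n2 * (n11 + n12) * (n21 + n22))


# Program 1 counts laid out as [[N11, N12], [N21, N22]].
general = pearson_chi_square([[53, 48], [7, 12]])
closed_form = shortcut_statistic(53, 7, 48, 12)
print(general, closed_form)
```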

    Default:
      None
    Synonyms:
      None
    Related Commands:
    Reference:
      Andrew Rukhin, private communication.

      Fleiss, Levin, and Paik (2003), "Statistical Methods for Rates and Proportions", Third Edition, pp. 234-238.

    Applications:
      Categorical Data Analysis
    Implementation Date:
      2007/2
    Program 1:
       
      let n11 = 53
      let n21 = 7
      let n12 = 48
      let n22 = 12
      .
      odds ratio independence test n11 n21 n12 n22
          
      The following output is generated.
       
                 LOG(ODDS RATIO) TEST FOR INDEPENDENCE (2X2 TABLE)
        
       NULL HYPOTHESIS: THE TWO VARIABLES ARE INDEPENDENT (LOG (ODDS RATIO) = 0)
       ALTERNATIVE HYPOTHESIS: THE TWO VARIABLES ARE NOT INDEPENDENT
        
       SAMPLE 1:
       NUMBER OF OBSERVATIONS                    =       60
       NUMBER OF SUCCESSES                       =       53
       NUMBER OF FAILURES                        =        7
       PROBABILITY OF SUCCESS                    =   0.8833333
       PROBABILITY OF FAILURE                    =   0.1166667
        
       SAMPLE 2:
       NUMBER OF OBSERVATIONS                    =       60
       NUMBER OF SUCCESSES                       =       48
       NUMBER OF FAILURES                        =       12
       PROBABILITY OF SUCCESS                    =   0.8000000
       PROBABILITY OF FAILURE                    =   0.2000000
        
       LOG(ODDS RATIO) = LOG(n11*n22/(n12*n21)):
       LOG(ODDS RATIO)                            =   0.6380874
       STANDARD ERROR OF LOG(ODDS RATIO)          =   0.5156469
        
       LOG(ODDS RATIO) (BIAS CORRECTED)           =   0.6089435
       STANDARD ERROR (BIAS CORRECTED)            =   0.5026365
        
       LARGE SAMPLE CONFIDENCE INTERVAL FOR LOG(ODDS RATIO)
                                UNCORRECTED RATIO         BIAS CORRECTED RATIO
                                (  0.6380874    )           (  0.6089435    )
          CONFIDENCE           LOWER         UPPER         LOWER         UPPER
          VALUE (%)            LIMIT         LIMIT         LIMIT         LIMIT
       -----------------------------------------------------------------------
            50.000          0.638087      0.638087      0.608943      0.608943
            80.000          0.204108       1.07207      0.185914       1.03197
            90.000          -.227407E-01   1.29892      -.352110E-01   1.25310
            95.000          -.210076       1.48625      -.217820       1.43571
            97.500          -.372562       1.64874      -.376206       1.59409
            99.000          -.561487       1.83766      -.560364       1.77825
        
       TEST FOR INDEPENDENCE:
       CHI-SQUARE TEST STATISTIC                  =    1.563314
       CDF OF TEST STATISTIC                      =   0.9410107
        
       TEST STATISTIC (WITH YATES CORRECTION)     =    1.000521
       CDF OF TEST STATISTIC (YATES CORRECTION)   =   0.8414708
        
       WITHOUT YATES BIAS CORRECTION:
                                             NULL HYPOTHESIS   NULL
       NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
       HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
       ===================================================================
       INDEPENDENT      50.0%        0.00     (0,0.500)        REJECT
       INDEPENDENT      80.0%        0.84     (0,0.800)        REJECT
       INDEPENDENT      90.0%        1.28     (0,0.900)        REJECT
       INDEPENDENT      95.0%        1.64     (0,0.950)        ACCEPT
       INDEPENDENT      97.5%        1.96     (0,0.975)        ACCEPT
       INDEPENDENT      99.0%        2.33     (0,0.990)        ACCEPT
        
       WITH YATES BIAS CORRECTION:
                                             NULL HYPOTHESIS   NULL
       NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
       HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
       ===================================================================
       INDEPENDENT      50.0%        0.00     (0,0.500)        REJECT
       INDEPENDENT      80.0%        0.84     (0,0.800)        REJECT
       INDEPENDENT      90.0%        1.28     (0,0.900)        ACCEPT
       INDEPENDENT      95.0%        1.64     (0,0.950)        ACCEPT
       INDEPENDENT      97.5%        1.96     (0,0.975)        ACCEPT
       INDEPENDENT      99.0%        2.33     (0,0.990)        ACCEPT
          
    Program 2:
       
      let n = 1
      let p = 0.9
      let y1 = binomial rand numb for i = 1 1 200
      let p = 0.68
      let y2 = binomial rand numb for i = 1 1 130
      .
      odds ratio independence test y1 y2
          
      The following output is generated.
       
       THE SUM OF THE      200 OBSERVATIONS =   0.1750000E+03
        
       THE SUM OF THE      130 OBSERVATIONS =   0.8800000E+02
        
                 LOG(ODDS RATIO) TEST FOR INDEPENDENCE (2X2 TABLE)
        
       NULL HYPOTHESIS: THE TWO VARIABLES ARE INDEPENDENT (LOG (ODDS RATIO) = 0)
       ALTERNATIVE HYPOTHESIS: THE TWO VARIABLES ARE NOT INDEPENDENT
        
       SAMPLE 1:
       NUMBER OF OBSERVATIONS                    =      200
       NUMBER OF SUCCESSES                       =      175
       NUMBER OF FAILURES                        =       25
       PROBABILITY OF SUCCESS                    =   0.8750000
       PROBABILITY OF FAILURE                    =   0.1250000
        
       SAMPLE 2:
       NUMBER OF OBSERVATIONS                    =      130
       NUMBER OF SUCCESSES                       =       88
       NUMBER OF FAILURES                        =       42
       PROBABILITY OF SUCCESS                    =   0.6769231
       PROBABILITY OF FAILURE                    =   0.3230769
        
       LOG(ODDS RATIO) = LOG(n11*n22/(n12*n21)):
       LOG(ODDS RATIO)                            =    1.206243
       STANDARD ERROR OF LOG(ODDS RATIO)          =   0.2844072
        
       LOG(ODDS RATIO) (BIAS CORRECTED)           =    1.195462
       STANDARD ERROR (BIAS CORRECTED)            =   0.2823872
        
       LARGE SAMPLE CONFIDENCE INTERVAL FOR LOG(ODDS RATIO)
                                UNCORRECTED RATIO         BIAS CORRECTED RATIO
                                (   1.206243    )           (   1.195462    )
          CONFIDENCE           LOWER         UPPER         LOWER         UPPER
          VALUE (%)            LIMIT         LIMIT         LIMIT         LIMIT
       -----------------------------------------------------------------------
            50.000           1.20624       1.20624       1.19546       1.19546
            80.000          0.966880       1.44561      0.957799       1.43313
            90.000          0.841761       1.57073      0.833568       1.55736
            95.000          0.738435       1.67405      0.730976       1.65995
            97.500          0.648815       1.76367      0.641993       1.74893
            99.000          0.544613       1.86787      0.538531       1.85239
        
       TEST FOR INDEPENDENCE:
       CHI-SQUARE TEST STATISTIC                  =    19.10401
       CDF OF TEST STATISTIC                      =    1.000000
        
       TEST STATISTIC (WITH YATES CORRECTION)     =    17.89948
       CDF OF TEST STATISTIC (YATES CORRECTION)   =    1.000000
        
       WITHOUT YATES BIAS CORRECTION:
                                             NULL HYPOTHESIS   NULL
       NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
       HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
       ===================================================================
       INDEPENDENT      50.0%        0.00     (0,0.500)        REJECT
       INDEPENDENT      80.0%        0.84     (0,0.800)        REJECT
       INDEPENDENT      90.0%        1.28     (0,0.900)        REJECT
       INDEPENDENT      95.0%        1.64     (0,0.950)        REJECT
       INDEPENDENT      97.5%        1.96     (0,0.975)        REJECT
       INDEPENDENT      99.0%        2.33     (0,0.990)        REJECT
        
       WITH YATES BIAS CORRECTION:
                                             NULL HYPOTHESIS   NULL
       NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
       HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
       ===================================================================
       INDEPENDENT      50.0%        0.00     (0,0.500)        REJECT
       INDEPENDENT      80.0%        0.84     (0,0.800)        REJECT
       INDEPENDENT      90.0%        1.28     (0,0.900)        REJECT
       INDEPENDENT      95.0%        1.64     (0,0.950)        REJECT
       INDEPENDENT      97.5%        1.96     (0,0.975)        REJECT
       INDEPENDENT      99.0%        2.33     (0,0.990)        REJECT
          

    Date created: 1/7/2008
    Last updated: 1/7/2008
    Please email comments on this WWW page to alan.heckert@nist.gov.