
MANTEL-HAENSZEL TEST

Name:
    MANTEL-HAENSZEL TEST (LET)
Type:
    Analysis Command
Purpose:
    Perform a Mantel-Haenszel test of a series of fourfold (2x2) tables.
Description:
    Given two variables where each variable has exactly two possible outcomes (typically defined as success and failure), we define the odds ratio as:

      o = (N11/N12)/ (N21/N22)
          = (N11N22)/ (N12N21)

    where

      N11 = number of successes in sample 1
      N21 = number of failures in sample 1
      N12 = number of successes in sample 2
      N22 = number of failures in sample 2

    The first form shows the meaning of the odds ratio most clearly; the second form is the one more commonly given in the literature.

    The log odds ratio is the logarithm of the odds ratio:

      l(o) = LOG{(N11/N12)/ (N21/N22)}
             = LOG{(N11N22)/ (N12N21)}

    Alternatively, the log odds ratio can be given in terms of the proportions

      l(o) = LOG{(p11/p12)/ (p21/p22)}
             = LOG{(p11p22)/ (p12p21)}

    where

      p11 = N11/ (N11 + N21)
            = proportion of successes in sample 1
      p21 = N21/ (N11 + N21)
            = proportion of failures in sample 1
      p12 = N12/ (N12 + N22)
            = proportion of successes in sample 2
      p22 = N22/ (N12 + N22)
            = proportion of failures in sample 2
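
    As a minimal sketch of the definitions above (not Dataplot code), the following Python computes the odds ratio from the four counts and checks that the proportion form of the log odds ratio gives the same value. The function names and the counts are hypothetical and chosen only for illustration.

```python
import math

def odds_ratio(n11, n21, n12, n22):
    """Odds ratio o = (N11/N12)/(N21/N22) = (N11*N22)/(N12*N21)."""
    return (n11 * n22) / (n12 * n21)

def log_odds_ratio_from_proportions(n11, n21, n12, n22):
    """Log odds ratio l(o) computed from the within-sample proportions."""
    p11 = n11 / (n11 + n21)  # proportion of successes in sample 1
    p21 = n21 / (n11 + n21)  # proportion of failures in sample 1
    p12 = n12 / (n12 + n22)  # proportion of successes in sample 2
    p22 = n22 / (n12 + n22)  # proportion of failures in sample 2
    return math.log((p11 * p22) / (p12 * p21))

# Hypothetical counts: 30 successes / 10 failures in sample 1,
# 20 successes / 20 failures in sample 2.
o = odds_ratio(30, 10, 20, 20)
l_counts = math.log(o)
l_props = log_odds_ratio_from_proportions(30, 10, 20, 20)
print(o, l_counts, l_props)
```

    The two forms agree because the sample sizes in the denominators of the proportions cancel.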

    Success and failure can denote any binary response. Dataplot expects "success" to be coded as "1" and "failure" to be coded as "0".

    The bias corrected version of the statistic is:

      l'(o) = LOG[{(N11+0.5) (N22+0.5)}/ {(N12+0.5) (N21+0.5)}]

    In addition to reducing bias, this statistic has the advantage that it is still defined when one of the cell counts is zero. The uncorrected odds ratio is undefined when N12 or N21 is zero, and its logarithm is also undefined when N11 or N22 is zero.
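
    The bias corrected statistic can be sketched in Python as follows. The counts for the first call are those of group 1 in the sample program on this page (81 successes and 24 failures in sample 1, 34 successes and 71 failures in sample 2); the second call uses hypothetical counts with a zero cell. The function name is an assumption made for this illustration.

```python
import math

def corrected_log_odds_ratio(n11, n21, n12, n22):
    """Bias corrected log odds ratio: 0.5 is added to every cell count."""
    numerator = (n11 + 0.5) * (n22 + 0.5)
    denominator = (n12 + 0.5) * (n21 + 0.5)
    return math.log(numerator / denominator)

# Group 1 of the sample program: N11 = 81, N21 = 24, N12 = 34, N22 = 71.
l_group1 = corrected_log_odds_ratio(81, 24, 34, 71)

# Hypothetical table with N21 = 0: the uncorrected odds ratio would divide
# by zero here, but the corrected statistic stays finite.
l_zero_cell = corrected_log_odds_ratio(5, 0, 3, 4)
print(l_group1, l_zero_cell)
```

    For group 1 this gives a log odds ratio of about 1.9307, corresponding to an odds ratio of about 6.894.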

    Note that N11, N21, N12, and N22 define a 2x2 contingency table. These types of contingency tables are also referred to as fourfold tables.

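    Fleiss, Levin, and Paik also use the following formulation for the ith 2x2 table:

                        Outcome Variable
      Sample        Present      Absent        Total
      ------------------------------------------------
      1             Xi           ni1 - Xi      ni1
      2             mi - Xi      Xi - li       ni2
      ------------------------------------------------
      Total         mi           ni. - mi      ni.

    where li = mi + ni1 - ni. = mi - ni2, so that the entries in row 2 sum to ni2.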

    The Mantel-Haenszel test can be used to estimate the common odds ratio and to test whether the overall degree of association is significant. The Mantel-Haenszel estimator of the common odds ratio is consistent in the following two cases:

    1. The number of tables is fixed, and possibly small, but each table has large marginal frequencies.

    2. The number of tables is large. The marginal frequencies in the individual tables can be small.

    Define the following quantities

    • \( R_{i} = \frac{X_{i}(X_{i} - l_{i})} {n_{i.}} \)

    • \( S_{i} = \frac{(m_{i} - X_{i}) (n_{i1} - X_{i})} {n_{i.}} \)

    • \( R = \sum_{i=1}^{g}{R_{i}} \)

    • \( S = \sum_{i=1}^{g}{S_{i}} \)

    • \( P_{i} = \frac{X_{i} + X_{i} - l_{i}} {n_{i.}} \)

    • \( \begin{array}{lcl} Q_{i} & = & 1 - P_{i} \\ & = & \frac{m_{i} - X_{i} + n_{i1} - X_{i}} {n_{i.}} \end{array} \)

    The Mantel-Haenszel estimate of the common odds ratio is

      \( \hat{\omega}_{\mbox{MH}} = \frac{\sum_{i=1}^{g}{\frac{n_{i1}n_{i2}}{n_{i.}} p_{i1}(1 - p_{i2})}} {\sum_{i=1}^{g}{\frac{n_{i1}n_{i2}}{n_{i.}} p_{i2}(1 - p_{i1})}} = \frac{\sum_{i=1}^{g}{X_{i}(X_{i} - l_{i})/n_{i.}}} {\sum_{i=1}^{g}{(m_{i} - X_{i})(n_{i1} - X_{i})/n_{i.}}} \)

    where g denotes the number of groups.
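
    As an illustration of the second form of the estimate, the sketch below sums Ri and Si over the tables and takes their ratio. Each table is written as a tuple (a, b, c, d) = (Xi, ni1 - Xi, mi - Xi, Xi - li), a representation chosen for this example only. The counts are those of the three groups in the sample program on this page.

```python
import math

# (a, b, c, d) = (Xi, ni1 - Xi, mi - Xi, Xi - li) for each group
tables = [
    (81, 24, 34, 71),    # group 1
    (118, 74, 69, 105),  # group 2
    (82, 63, 52, 93),    # group 3
]

def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel estimate R/S of the common odds ratio."""
    R = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    S = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return R / S

omega_mh = mantel_haenszel_odds_ratio(tables)
log_omega_mh = math.log(omega_mh)
print(omega_mh, log_omega_mh)
```

    For these counts the ratio comes to about 3.0047 and its natural logarithm to about 1.100. The value 3.004650 is also the figure that the sample output on this page prints on its combined-estimate line, which suggests that line carries the odds ratio itself rather than its logarithm.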

    An estimate of the variance of \( \log(\hat{\omega}_{\mbox{MH}}) \) is

      \( \frac{1}{2} \left( \frac{\sum_{i=1}^{g}{P_i R_i}} {R^2} + \frac{\sum_{i=1}^{g}{(P_i S_i + Q_i R_i)}}{R S} + \frac{\sum_{i=1}^{g}{Q_i S_i}} {S^2} \right) \)
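
    The variance formula can be sketched directly from the quantities Ri, Si, Pi, and Qi defined above. As before, each table is a tuple (a, b, c, d) = (Xi, ni1 - Xi, mi - Xi, Xi - li), which is a convention of this example rather than of Dataplot, and the counts are those of the sample program on this page.

```python
import math

tables = [(81, 24, 34, 71), (118, 74, 69, 105), (82, 63, 52, 93)]

def mh_log_odds_ratio_variance(tables):
    """Estimated variance of log(omega_MH) from the formula above."""
    R = S = 0.0
    sum_PR = sum_PS_QR = sum_QS = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        R_i = a * d / n          # Xi (Xi - li) / ni.
        S_i = b * c / n          # (mi - Xi)(ni1 - Xi) / ni.
        P_i = (a + d) / n
        Q_i = 1.0 - P_i
        R += R_i
        S += S_i
        sum_PR += P_i * R_i
        sum_PS_QR += P_i * S_i + Q_i * R_i
        sum_QS += Q_i * S_i
    return 0.5 * (sum_PR / R**2 + sum_PS_QR / (R * S) + sum_QS / S**2)

se_log_omega = math.sqrt(mh_log_odds_ratio_variance(tables))
print(se_log_omega)
```

    For these counts the standard error comes to about 0.1408, in agreement with the standard error reported in the sample output.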

    A confidence interval for the log(odds ratio) is then

      \( \log(\hat{\omega}_{\mbox{MH}}) \pm z_{\alpha/2} \hat{\mbox{SE}}(\log(\hat{\omega}_{\mbox{MH}})) \)

    where \( z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2) \), \( \Phi^{-1} \) is the normal percent point function, and SE is the standard error of the estimate (= square root of the variance).
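
    A minimal sketch of this interval in Python uses the standard library's NormalDist for the normal percent point function. The inputs 1.10 and 0.14 are hypothetical round numbers, and exponentiating the limits to obtain an interval for the odds ratio itself is an assumption of this example.

```python
import math
from statistics import NormalDist

def log_odds_ratio_interval(log_omega, se, alpha=0.05):
    """Large sample interval log_omega +/- z_{alpha/2} * se."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # normal percent point function
    return log_omega - z * se, log_omega + z * se

# Hypothetical estimate and standard error on the log scale.
lower, upper = log_odds_ratio_interval(1.10, 0.14, alpha=0.05)

# Back-transform the limits to the odds ratio scale.
or_lower, or_upper = math.exp(lower), math.exp(upper)
print(lower, upper, or_lower, or_upper)
```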

    The Mantel-Haenszel chi-square statistic for the significance of the overall degree of association is

      \( \chi_{\mbox{MH}}^2 = \frac{\left( |\sum_{i=1}^{g}{\frac{n_{i1} n_{i2}}{n_{i.}} (p_{i1} - p_{i2})}| - 0.5 \right) ^2} {\sum_{i=1}^{g}{\frac{n_{i1} n_{i2}}{n_{i.} - 1} \bar{p}_{i} \bar{q}_{i}}} \)

    where

      \( p_{i1} = X_{i}/n_{i1} \) and \( p_{i2} = (m_{i} - X_{i})/n_{i2} \)
      \( \bar{p}_{i} = (n_{i1} p_{i1} + n_{i2} p_{i2})/n_{i.} \)
      \( \bar{q}_{i} = 1 - \bar{p}_{i} \)

    The test statistic is compared to a chi-square distribution with one degree of freedom.
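
    The chi-square statistic can be sketched as follows, again with each table written as a hypothetical tuple (a, b, c, d) = (Xi, ni1 - Xi, mi - Xi, Xi - li) and the counts taken from the sample program on this page. The cdf helper relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math

tables = [(81, 24, 34, 71), (118, 74, 69, 105), (82, 63, 52, 93)]

def mantel_haenszel_chi_square(tables):
    """Continuity corrected Mantel-Haenszel chi-square statistic."""
    numerator_sum = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n1, n2 = a + b, c + d          # ni1 and ni2
        n = n1 + n2                    # ni.
        p1, p2 = a / n1, c / n2        # pi1 and pi2
        p_bar = (n1 * p1 + n2 * p2) / n
        q_bar = 1.0 - p_bar
        numerator_sum += n1 * n2 / n * (p1 - p2)
        denominator += n1 * n2 / (n - 1) * p_bar * q_bar
    return (abs(numerator_sum) - 0.5) ** 2 / denominator

def chi_square_1df_cdf(x):
    """cdf of the chi-square distribution with one degree of freedom."""
    return math.erf(math.sqrt(x / 2.0))

chi2_mh = mantel_haenszel_chi_square(tables)
cdf_value = chi_square_1df_cdf(chi2_mh)
print(chi2_mh, cdf_value)
```

    For these counts the statistic is about 62.06, in agreement with the sample output, and the cdf is indistinguishable from 1 at the printed precision.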

    The MANTEL-HAENSZEL TEST generates the following output:

    1. A summary table of various statistics (odds ratio, log(odds ratio), standard error of log(odds ratio)) for each group.

    2. The estimates of the common log(odds ratio) and the standard error of the common log(odds ratio).

    3. A table for the Mantel-Haenszel chi-square test for the overall degree of association.

    4. A large sample confidence interval for the log(odds ratio).
Syntax 1:
    MANTEL-HAENSZEL TEST <y1> <y2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where <y1> and <y2> denote a series of 2x2 tables (i.e., rows 1 and 2 are group 1, rows 3 and 4 are group 2, and so on).

Syntax 2:
    MANTEL-HAENSZEL TEST <y1> <y2> <groupid>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
                <groupid> is a group id variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table). In this case, the two response variables have an equal number of cases for each group.

Syntax 3:
    MANTEL-HAENSZEL TEST <y1> <groupid1> <y2> <groupid2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <groupid1> is a group id variable corresponding to <y1>;
                <y2> is the second response variable;
                <groupid2> is a group id variable corresponding to <y2>;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table). In this case, the two response variables may have an unequal number of cases for each group, so <y1> and <y2> require different group id variables.
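
    To make "cross tabulated" concrete, the following Python sketch counts raw 0/1 responses into one 2x2 table per group, with a separate group id list for each response as in Syntax 3. It illustrates the idea only and is not Dataplot's implementation; the function name and the (a, b, c, d) tuple layout are assumptions of this example. The raw data reproduce the counts of the sample program on this page, where group 2 has 192 cases for the first response and 174 for the second.

```python
from collections import defaultdict

def tabulate(y1, groupid1, y2, groupid2):
    """Count successes (1) and failures (0) of each response within each group.

    Returns {group: (a, b, c, d)} where a, b are the successes and failures
    of y1 and c, d are the successes and failures of y2.
    """
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for y, g in zip(y1, groupid1):
        counts[g][0 if y == 1 else 1] += 1
    for y, g in zip(y2, groupid2):
        counts[g][2 if y == 1 else 3] += 1
    return {g: tuple(c) for g, c in sorted(counts.items())}

# First response: 105, 192, and 145 cases in groups 1, 2, and 3.
y1 = [1] * 81 + [0] * 24 + [1] * 118 + [0] * 74 + [1] * 82 + [0] * 63
g1 = [1] * 105 + [2] * 192 + [3] * 145
# Second response: 105, 174, and 145 cases in groups 1, 2, and 3.
y2 = [1] * 34 + [0] * 71 + [1] * 69 + [0] * 105 + [1] * 52 + [0] * 93
g2 = [1] * 105 + [2] * 174 + [3] * 145

tables = tabulate(y1, g1, y2, g2)
print(tables)
```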

Examples:
    MANTEL-HAENSZEL TEST Y1 Y2
    MANTEL-HAENSZEL TEST Y1 Y2 X
    MANTEL-HAENSZEL TEST Y1 X1 Y2 X2
Note:
    This test is similar to the odds ratio chi-square test. Fleiss, Levin, and Paik make the following recommendations in regard to these two tests (they include other tests in their comparison).

    1. If the number of groups is small or moderate and the sample sizes within each group are large, the log(odds ratio) test performs well.

    2. If the number of groups is large, but the sample sizes within the groups are small to moderate, then the Mantel-Haenszel test can be recommended. The log(odds ratio) test may perform poorly for this case.

    3. If the number of groups and the sample sizes within the groups are both small, exact methods may be required. Dataplot does not currently support any exact methods for this problem.
Note:
    The following information is written to the file dpst1f.dat (in the current directory):

      Column 1 = significance level
      Column 2 = lower confidence limit for common log(odds ratio)
      Column 3 = upper confidence limit for common log(odds ratio)
      Column 4 = lower confidence limit for common odds ratio
      Column 5 = upper confidence limit for common odds ratio

    To read this information into Dataplot, enter

      SET READ FORMAT F10.5,1X,4E15.7
      READ DPST1F.DAT SIGLEV LOGLOWCL LOGUPPCL
                                          ODDLOWCL ODDUPPCL
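
    Outside Dataplot, a line written with the format F10.5,1X,4E15.7 can be split by column position. The Python sketch below assumes one record per line laid out exactly as that format describes (a 10-character field, one blank, then four 15-character fields). The function name and the sample record are hypothetical, and the sample is built in the code rather than taken from an actual dpst1f.dat file.

```python
def parse_dpst1f_line(line):
    """Split one fixed-width record into its five numeric columns."""
    siglev = float(line[0:10])                       # F10.5
    rest = [float(line[11 + 15 * k: 26 + 15 * k])    # 1X, then 4E15.7
            for k in range(4)]
    log_low, log_upp, odd_low, odd_upp = rest
    return siglev, log_low, log_upp, odd_low, odd_upp

# Hypothetical record with the same field widths as the Dataplot format.
sample = f"{95.0:10.5f} {1.5:15.7E}{2.5:15.7E}{4.5:15.7E}{12.25:15.7E}"
record = parse_dpst1f_line(sample)
print(record)
```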

    Dataplot saves the following internal parameters:

      STATVAL = the Mantel-Haenszel test statistic
      STATCDFL = the cdf for the Mantel-Haenszel test statistic
Default:
    None
Synonyms:
    None
Related Commands:
Reference:
    Fleiss, Levin, and Paik (2003), Statistical Methods for Rates and Proportions, Third Edition, pp. 250-253.
Applications:
    Categorical Data Analysis
Implementation Date:
    2007/5
Program:
     
    let n1 = 105
    let n2 = 192
    let n3 = 145
    let n = n1 + n2 + n3
    let x = 3 for i = 1 1 n
    let istop = n1 + n2
    let x = 2 for i = 1 1 istop
    let x = 1 for i = 1 1 n1
    .
    set statistic missing value -99
    .
    .  Group 1 values
    .
    let y1 = 0 for i = 1 1 n
    let y2 = 0 for i = 1 1 n
    let y1 = 1 for i = 1 1  81
    let y2 = 1 for i = 1 1  34
    .
    .  Group 2 values (have unequal samples here, so fill
    .          with missing values)
    .
    let istrt = n1 + 1
    let istop1 = istrt + 118 - 1
    let istop2 = istrt + 69 - 1
    let y1 = 1 for i = istrt 1 istop1
    let y2 = 1 for i = istrt 1 istop2
    let istrt2 = n1 + 174 + 1
    let istop2 = n1 + n2
    let y2 = -99 for i = istrt2 1 istop2
    .
    .  Group 3 values
    .
    let istrt = n1 + n2 + 1
    let istop1 = istrt + 82 - 1
    let istop2 = istrt + 52 - 1
    let y1 = 1 for i = istrt 1 istop1
    let y2 = 1 for i = istrt 1 istop2
    .
    mantel haenszel test y1 y2 x
        
    The following output is generated.
               SUMMARY OF LOG(ODDS RATIO)
      
           |                    LOG OF        STANDARD
           |   ODDS RATIO     ODDS RATIO        ERROR
     GROUP |      O(i)          L(i)          SE(L(i))
     ==================================================
        1. |    6.894114       1.930668      0.3099319
        2. |    2.414514      0.8814980      0.2138429
        3. |    2.313836      0.8389067      0.2400251
      
      
               MANTEL-HAENSZEL TEST
      
     NUMBER OF GROUPS                                =        3
     M-H ESTIMATE OF COMBINED LOG(ODDS RATIO)        =    3.004650
     M-H STANDARD ERROR OF COMBINED LOG(ODDS RATIO)  =   0.1408284
      
     MANTEL-HAENSZEL TEST STATISTIC (ASSOCIATION) =    62.05933
     CHI-SQUARE DEGRESS OF FREEDOM                =        1
     CHI-SQUARE CDF OF TEST STATISTIC             =    1.000000
      
        MANTEL-HAENSZEL (CHI-SQUARE) TEST FOR OVERALL DEGREE OF ASSOCIATION
                                           NULL HYPOTHESIS   NULL
     NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
     HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
     ===================================================================
     NO ASSOCIATION   50.0%        0.45     (0,0.500)        REJECT
     NO ASSOCIATION   80.0%        1.64     (0,0.800)        REJECT
     NO ASSOCIATION   90.0%        2.71     (0,0.900)        REJECT
     NO ASSOCIATION   95.0%        3.84     (0,0.950)        REJECT
     NO ASSOCIATION   97.5%        5.02     (0,0.975)        REJECT
     NO ASSOCIATION   99.0%        6.63     (0,0.990)        REJECT
      
      
     LARGE SAMPLE CONFIDENCE INTERVAL FOR LOG(ODDS RATIO)
                               LOG(ODDS RATIO)                  ODDS RATIO
                              (   3.004650    )           (   20.17915    )
        CONFIDENCE           LOWER         UPPER         LOWER         UPPER
        VALUE (%)            LIMIT         LIMIT         LIMIT         LIMIT
     -----------------------------------------------------------------------
          50.000           2.90966       3.09964       18.3506       22.1899
          80.000           2.82417       3.18513       16.8470       24.1704
          90.000           2.77301       3.23629       16.0067       25.4392
          95.000           2.72863       3.28067       15.3119       26.5935
          97.500           2.68900       3.32030       14.7169       27.6687
          99.000           2.64190       3.36740       14.0399       29.0030
        
Date created: 10/10/2008
Last updated: 12/11/2023

Please email comments on this WWW page to alan.heckert@nist.gov.