Dataplot Vol 1 Auxiliary Chapter

FISHER EXACT TEST

Name:
    FISHER EXACT TEST (LET)
Type:
    Analysis Command
Purpose:
    Perform a Fisher exact test of independence for a two-way contingency table.
Description:
    If we have N observations with two variables where each observation can be classified into one of R mutually exclusive categories for variable one and one of C mutually exclusive categories for variable two, then a cross-tabulation of the data results in a two-way contingency table (also referred to as an RxC contingency table). The resulting contingency table has R rows and C columns.

    Conover identifies three distinct scenarios for contingency tables:

    1. Row totals are fixed, column totals are random (or alternatively, column totals are fixed and row totals are random).

      An example of this would be where the row totals are sample sizes (which may or may not be equal). Conover uses the example where we sample a population before and after a treatment and we are counting the presence or absence of some condition in each of the samples.

      In this case, the hypothesis being tested is homogeneity. In the 2x2 case, this means we are testing whether the probability of success is equal in two Bernoulli populations.

      The normal theory test of equality of proportions is the large sample approximation to the Fisher exact test.

    2. Both row totals and column totals are random.

      This model would apply when we have a single sample that is classified according to two properties where property one has R possible outcomes and property two has C possible outcomes.

      In this case, the hypothesis being tested is bivariate independence.

      The chi-square test of independence is the large sample approximation to the Fisher exact test (see case 3).

    3. Both row totals and column totals are fixed.

      In this case, the null hypothesis being tested is independence. By independence, we mean that the row and column variables are unassociated (i.e., knowing the value of the row variable will not help us predict the value of column variable and likewise knowing the value of the column variable will not help us predict the value of the row variable).

      A more technical definition for independence is that

        P(row i, column j) = P(row i)*P(column j) for all i,j

    Note that the Fisher exact test returns the same p-value for each of these models. What changes is the power of the test, which is highest when the row and column totals are both fixed. That is, when the row and column totals are fixed, the Fisher exact test really is exact. When either the row or the column totals are random, the test is still valid, but it may become too conservative.

    The Fisher exact test is based on the probability of obtaining a table at least as extreme as the observed table, given the marginal totals. For example, for the 2x2 case when both row and column totals are fixed, the test statistic is the frequency of the row 1, column 1 cell. Under the null hypothesis, this frequency follows a hypergeometric distribution.
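
    The 2x2 calculation can be sketched in a few lines of Python. This is only an illustration, not Dataplot's FEXACT code: the function name fisher_2x2 is hypothetical, and the two-sided rule used here (sum the probabilities of all tables that are no more probable than the observed one) is an assumption. For the table of Program 1 it gives values that agree with the PROBABILITY OF OBSERVED TABLE and P-VALUE lines printed there to about six significant digits.

```python
from math import comb


def fisher_2x2(n11, n12, n21, n22):
    """Return (probability of observed table, two-sided p-value).

    Margins are treated as fixed, so cell (1,1) is hypergeometric.
    """
    row1 = n11 + n12
    row2 = n21 + n22
    col1 = n11 + n21
    n = row1 + row2
    denom = comb(n, col1)  # ways to choose the column-1 observations

    def weight(x):
        # Hypergeometric numerator for x observations in cell (1,1)
        return comb(row1, x) * comb(row2, col1 - x)

    lo = max(0, col1 - row2)   # smallest feasible value of cell (1,1)
    hi = min(row1, col1)       # largest feasible value of cell (1,1)
    w_obs = weight(n11)
    # Assumed two-sided rule: keep tables no more probable than the observed one.
    # Integer weights are compared, so ties are handled exactly.
    w_tail = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= w_obs)
    return w_obs / denom, w_tail / denom


# Table from Program 1:  1 9 / 3 1
prob, p = fisher_2x2(1, 9, 3, 1)
print(prob, p)
```

    Here the five feasible tables have weights 1, 40, 270, 480 and 210 out of 1001, so the observed table has probability 40/1001 and the two-sided p-value is 41/1001.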

    The Fisher exact test is typically used when the row and column totals are small. When they are large, the chi-square independence test is sufficiently accurate. In addition, the computational burden of the Fisher exact test can become prohibitively high as the marginal totals get higher (and the values of R and C increase).

    Dataplot computes the Fisher exact test using ACM algorithm 643, the FEXACT routine, written by Mehta and Patel. This algorithm supports the RxC case (not just the 2x2 case) and is based on a network algorithm. See the Mehta and Patel articles given in the References section for details of the algorithm.
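
    To make the computational burden concrete, the following Python sketch enumerates every RxC table with the observed margins and sums the probabilities of those no more probable than the observed table. It is deliberately NOT the Mehta and Patel network algorithm, which exists precisely to avoid this full enumeration. The function names are hypothetical and the "no more probable than observed" rule is an assumption. The sketch is only practical for very small tables.

```python
from fractions import Fraction
from math import factorial


def compositions(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables(row_sums, col_left):
    """Yield every table (tuple of rows) with the given margins."""
    if len(row_sums) == 1:
        # The last row is forced: it uses up what is left in each column.
        yield (tuple(col_left),)
        return
    for row in compositions(row_sums[0], col_left):
        left = [c - x for c, x in zip(col_left, row)]
        for rest in tables(row_sums[1:], left):
            yield (row,) + rest


def weight(table):
    # With fixed margins, P(table) is proportional to 1 / product of n_ij!
    w = Fraction(1)
    for row in table:
        for n in row:
            w /= factorial(n)
    return w


def fisher_rxc_bruteforce(observed):
    """Exact p-value by full enumeration (assumed two-sided rule)."""
    row_sums = [sum(r) for r in observed]
    col_sums = [sum(c) for c in zip(*observed)]
    w_obs = weight(observed)
    total = tail = Fraction(0)
    for t in tables(row_sums, col_sums):
        w = weight(t)
        total += w
        if w <= w_obs:  # exact rational comparison, no rounding issues
            tail += w
    return float(tail / total)


print(fisher_rxc_bruteforce([[1, 9], [3, 1]]))
```

    The number of tables visited grows very quickly with N, R and C, so a table the size of the one in Program 2 should not be attempted with this approach.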

Syntax 1:
    FISHER EXACT TEST <y1> <y2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table).
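
    As an illustration of how raw data relate to the table form used by Syntax 2, the Python sketch below cross-tabulates two raw variables. The function name cross_tabulate and the data are hypothetical (the data were constructed to give the table of Program 1), and ordering the levels by sorted distinct value is an assumption, not a statement about how Dataplot orders them.

```python
def cross_tabulate(y1, y2):
    """Count observations for each (y1 level, y2 level) pair."""
    rows = sorted(set(y1))  # distinct levels of variable one
    cols = sorted(set(y2))  # distinct levels of variable two
    table = [[0] * len(cols) for _ in rows]
    for a, b in zip(y1, y2):
        table[rows.index(a)][cols.index(b)] += 1
    return rows, cols, table


# Hypothetical raw data: 14 observations, two levels per variable
y1 = [1] * 10 + [2] * 4
y2 = [1] * 1 + [2] * 9 + [1] * 3 + [2] * 1

rows, cols, table = cross_tabulate(y1, y2)
for r in table:
    print(r)
```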

Syntax 2:
    FISHER EXACT TEST <m>
                            <SUBSET/EXCEPT/FOR qualification>
    where <m> is a matrix containing the two-way table;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where the data have already been cross-tabulated into a two-way contingency table.

Syntax 3:
    FISHER EXACT TEST <n11> <n12> <n21> <n22>
    where <n11> is a parameter containing the value for row 1, column 1 of a 2x2 table;
    <n12> is a parameter containing the value for row 1, column 2 of a 2x2 table;
    <n21> is a parameter containing the value for row 2, column 1 of a 2x2 table;
    <n22> is a parameter containing the value for row 2, column 2 of a 2x2 table.

    This syntax is used for the special case where you have a 2x2 table. In this case, you can enter the 4 values directly, although you do need to be careful that the parameters are entered in the order expected above.

Examples:
    FISHER EXACT TEST Y1 Y2
    FISHER EXACT TEST M
    FISHER EXACT TEST N11 N12 N21 N22
Note:
    The FEXACT routine allows you to set the value of the following parameters:

      EXPECT - if EXPECT <= 0, then the exact test will be performed.

      If EXPECT > 0, then the PERCNT and EMIN parameters are examined.

      PERCNT - if EXPECT > 0, the value of PERCNT specifies the percentage of cells that must have estimated expected values > EXPECT before the asymptotic chi-square test will be used.
      EMIN - if EXPECT > 0, the value of EMIN identifies the minimum estimated expected value for a cell for the asymptotic chi-square test to be used.

    The default values are referred to as the Cochran conditions:

      EXPECT = 5.0
      PERCNT = 80.0
      EMIN = 1.0

    The following commands can be entered to change the default settings:

      SET FISHER EXACT TEST EXPECT <value>
      SET FISHER EXACT TEST PERCNT <value>
      SET FISHER EXACT TEST EMIN <value>
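
    The way the three settings interact can be sketched as a simple decision rule in Python. This is a simplification under stated assumptions: the function name use_asymptotic is hypothetical, expected values are estimated as (row total)*(column total)/N, and the rule is applied once to the whole table. How FEXACT applies the conditions internally is not reproduced here.

```python
def use_asymptotic(table, expect=5.0, percnt=80.0, emin=1.0):
    """Return True if the chi-square approximation would be allowed."""
    if expect <= 0:
        return False  # EXPECT <= 0: always perform the exact test
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    n = sum(row_sums)
    # Estimated expected count for every cell under independence
    expected = [r * c / n for r in row_sums for c in col_sums]
    # Percentage of cells whose expected count exceeds EXPECT
    share = 100.0 * sum(e > expect for e in expected) / len(expected)
    return share >= percnt and min(expected) >= emin


print(use_asymptotic([[1, 9], [3, 1]]))        # small counts
print(use_asymptotic([[20, 30], [25, 25]]))    # larger counts
```

    With the default settings, only one of the four cells in the first table has an estimated expected value above 5, so the exact test is retained. In the second table every expected value is above 5.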
Default:
    None
Synonyms:
    None
Related Commands:
Reference:
    Mehta and Patel (1986), "ALGORITHM 643: FEXACT: A FORTRAN Subroutine for Fisher's Exact Test on Unordered RxC Contingency Tables", ACM Transactions on Mathematical Software, Vol. 12, No. 2, pp. 154-161.

    Mehta and Patel (1983), "A Network Algorithm for Performing Fisher's Exact Test in rxc Contingency Tables", Journal of the American Statistical Association, Vol. 78, No. 382, pp. 427-434.

    Conover (1999), "Practical Nonparametric Statistics", Third Edition, Wiley, pp. 204-216.

Applications:
    Categorical Data Analysis
Implementation Date:
    2007/3
Program 1:
     
    . Example from page 190 of Conover
    read matrix m
    1  9
    3  1
    end of data
    .
    fisher exact test m
        
    The following output is generated:
               FISHER EXACT TEST FOR INDEPENDENCE (RXC TABLE)
      
     NULL HYPOTHESIS: THE TWO VARIABLES ARE INDEPENDENT
     ALTERNATIVE HYPOTHESIS: THE TWO VARIABLES ARE NOT INDEPENDENT
      
     SAMPLE 1:
     NUMBER OF OBSERVATIONS                    =       14
     NUMBER OF LEVELS (ROWS)                   =        2
      
     SAMPLE 2:
     NUMBER OF OBSERVATIONS                    =       14
     NUMBER OF LEVELS (COLUMNS)                =        2
      
     PROBABILITY OF OBSERVED TABLE            =   0.3996005E-01
     P-VALUE                                  =   0.4095904E-01
     CDF VALUE OF TEST STATISTIC              =   0.9590409
      
     TWO-SIDED TEST:
                                 NULL HYPOTHESIS   NULL
     NULL          CONFIDENCE    ACCEPTANCE        HYPOTHESIS
     HYPOTHESIS    LEVEL         INTERVAL          CONCLUSION
     =========================================================
     INDEPENDENT      50.0%       (0.250,0.750)        REJECT
     INDEPENDENT      80.0%       (0.100,0.900)        REJECT
     INDEPENDENT      90.0%       (0.050,0.950)        REJECT
     INDEPENDENT      95.0%       (0.025,0.975)        ACCEPT
     INDEPENDENT      99.0%       (0.005,0.995)        ACCEPT
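
    The conclusion column in the output can be reproduced with the short Python sketch below. The rule is inferred from the printed output rather than from Dataplot's source, so treat it as an assumption: the CDF value is taken to be 1 minus the p-value, and the null hypothesis is accepted when the CDF value lies inside the acceptance interval (alpha/2, 1 - alpha/2). The function name conclusions is hypothetical.

```python
def conclusions(p_value, levels=(50.0, 80.0, 90.0, 95.0, 99.0)):
    """Return (level, lower, upper, verdict) for each confidence level."""
    cdf = 1.0 - p_value  # assumed relation between CDF value and p-value
    out = []
    for level in levels:
        alpha = 1.0 - level / 100.0
        lower, upper = alpha / 2.0, 1.0 - alpha / 2.0
        verdict = "ACCEPT" if lower <= cdf <= upper else "REJECT"
        out.append((level, round(lower, 3), round(upper, 3), verdict))
    return out


# P-VALUE printed by Program 1
for row in conclusions(0.04095904):
    print(row)
```

    Under this reading, the table rejects at a given confidence level only when the p-value is below alpha/2, which is why a p-value of about 0.041 is reported as ACCEPT at the 95% level.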
        
Program 2:
     
    . Example from page 160 of Mehta and Patel ACM paper
    read matrix m
    1  2  2  1  1  0
    2  0  0  2  3  0
    0  1  1  1  2  7
    1  1  2  0  0  0
    0  1  1  1  1  0
    end of data
    .
    fisher exact test m
        
    The following output is generated:
               FISHER EXACT TEST FOR INDEPENDENCE (RXC TABLE)
      
     NULL HYPOTHESIS: THE TWO VARIABLES ARE INDEPENDENT
     ALTERNATIVE HYPOTHESIS: THE TWO VARIABLES ARE NOT INDEPENDENT
      
     SAMPLE 1:
     NUMBER OF OBSERVATIONS                    =       34
     NUMBER OF LEVELS (ROWS)                   =        5
      
     SAMPLE 2:
     NUMBER OF OBSERVATIONS                    =       34
     NUMBER OF LEVELS (COLUMNS)                =        6
      
     PROBABILITY OF OBSERVED TABLE            =   0.7752854E-10
     P-VALUE                                  =   0.2583887E-01
     CDF VALUE OF TEST STATISTIC              =   0.9741611
      
     TWO-SIDED TEST:
                                 NULL HYPOTHESIS   NULL
     NULL          CONFIDENCE    ACCEPTANCE        HYPOTHESIS
     HYPOTHESIS    LEVEL         INTERVAL          CONCLUSION
     =========================================================
     INDEPENDENT      50.0%       (0.250,0.750)        REJECT
     INDEPENDENT      80.0%       (0.100,0.900)        REJECT
     INDEPENDENT      90.0%       (0.050,0.950)        REJECT
     INDEPENDENT      95.0%       (0.025,0.975)        ACCEPT
     INDEPENDENT      99.0%       (0.005,0.995)        ACCEPT
        

Date created: 10/21/2008
Last updated: 10/21/2008
Please email comments on this WWW page to alan.heckert@nist.gov.