
ASSOCIATION PLOT

Name:
    ASSOCIATION PLOT (LET)
Type:
    Graphics Command
Purpose:
    Generate an association plot for a two-way contingency table.
Description:
    The chi-square test of independence for a two-way contingency table is based on the following test statistic:

      \( \chi^2 = \sum_{i=1}^{r}{\sum_{j=1}^{c}{\frac{(O_{ij} - E_{ij})^2} {E_{ij}}}} \)

    where

      r = the number of rows in the contingency table
      c = the number of columns in the contingency table
      \( O_{ij} \) = the observed frequency in row i and column j
      \( E_{ij} \) = the expected frequency in row i and column j
        = \( \frac{R_i C_j}{N} \)
      \( R_i \) = the sum of the observed frequencies for row i
      \( C_j \) = the sum of the observed frequencies for column j
      N = the total sample size

    This test statistic can also be formulated as

      \( \sum_{i=1}^{r}{\sum_{j=1}^{c}{d_{ij}^2}} \)

    where

      \( d_{ij} = \frac{O_{ij} - E_{ij}} {\sqrt{E_{ij}}} \)

    The \( d_{ij} \) are referred to as the standardized residuals. The square of each residual, \( d_{ij}^2 \), is the contribution of that cell to the chi-square test statistic.
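
    As an illustration of the quantities defined above, the following Python sketch computes the expected frequencies, the standardized residuals, and the chi-square statistic for the hair/eye color table used in the sample program. It is not Dataplot code and makes no claim about how Dataplot performs the computation internally; the names observed and standardized_residuals are chosen here for illustration only.

```python
from math import sqrt

# Two-way table from the sample program
# (rows: Green, Hazel, Blue, Brown eyes; columns: Black, Brown, Red, Blond hair).
observed = [
    [5, 29, 14, 16],
    [15, 54, 14, 10],
    [20, 84, 17, 94],
    [68, 119, 26, 7],
]


def standardized_residuals(table):
    """Return d_ij = (O_ij - E_ij) / sqrt(E_ij) for every cell of the table."""
    row_totals = [sum(row) for row in table]        # R_i
    col_totals = [sum(col) for col in zip(*table)]  # C_j
    n = sum(row_totals)                             # N
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # E_ij = R_i * C_j / N
            residual_row.append((o - e) / sqrt(e))
        residuals.append(residual_row)
    return residuals


d = standardized_residuals(observed)

# The chi-square statistic is the sum of the squared standardized residuals.
chi_square = sum(x * x for row in d for x in row)
print(round(chi_square, 2))
```

    For this table the statistic is roughly 138.3. The largest positive residual belongs to the blue eyes/blond hair cell, and the largest negative residual to the brown eyes/blond hair cell.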

    The association plot is a graphical way of showing these standardized residuals. Specifically, the cells are laid out in a two-way grid and each cell is plotted with a rectangle where

    1. The width of the rectangle is proportional to the square root of the expected frequency.

    2. The height of the rectangle is proportional to the standardized residual \( d_{ij} \).

    For visual clarity, positive and negative standardized residuals are typically drawn in different colors.
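
    One consequence of this construction is that the signed area of each rectangle (width times height) is proportional to the raw difference between the observed and expected frequency. The Python sketch below checks this for the table from the sample program. It works in unscaled units and says nothing about the scale factors or cell spacing Dataplot actually applies; the function name association_rectangles and the dictionary keys are hypothetical.

```python
from math import sqrt

# Two-way table from the sample program.
observed = [
    [5, 29, 14, 16],
    [15, 54, 14, 10],
    [20, 84, 17, 94],
    [68, 119, 26, 7],
]


def association_rectangles(table):
    """Return unscaled rectangle dimensions for every cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    rects = []
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            rects.append({
                "row": i,
                "col": j,
                "width": sqrt(e),              # proportional to sqrt(E_ij)
                "height": (o - e) / sqrt(e),   # signed standardized residual d_ij
                "excess": o - e,               # O_ij - E_ij
            })
    return rects


rects = association_rectangles(observed)

# Signed area = width * height = O_ij - E_ij for every cell.
areas = [r["width"] * r["height"] for r in rects]
```

    Because the observed and expected frequencies have the same total, the signed areas sum to zero over the whole table.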

Syntax 1:
    ASSOCIATION PLOT <y1> <y2>             <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where you have raw data (i.e., the data have not yet been cross-tabulated into a two-way table).
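
    To make the distinction between raw and cross-tabulated data concrete, the Python sketch below counts the occurrences of each pair of values in two raw variables and arranges the counts in a two-way table. The sample values are made up for illustration, and ordering the rows and columns by sorted distinct value is an assumption of this sketch, not a statement about how Dataplot orders the cells.

```python
from collections import Counter


def cross_tabulate(y1, y2):
    """Cross-tabulate two equal-length raw variables into a two-way table of counts."""
    if len(y1) != len(y2):
        raise ValueError("y1 and y2 must have the same number of observations")
    row_levels = sorted(set(y1))   # assumed ordering: sorted distinct values
    col_levels = sorted(set(y2))
    counts = Counter(zip(y1, y2))  # number of observations for each (y1, y2) pair
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Hypothetical raw data: one entry per observation.
y1 = [1, 1, 2, 2, 2, 1, 2]
y2 = [1, 2, 1, 2, 2, 1, 2]

row_levels, col_levels, table = cross_tabulate(y1, y2)
print(table)
```

    The resulting table is the kind of input that the matrix form of the command expects.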

Syntax 2:
    ASSOCIATION PLOT <m>             <SUBSET/EXCEPT/FOR qualification>
    where <m> is a matrix containing the two-way table;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for the case where the data have already been cross-tabulated into a two-way contingency table.

Syntax 3:
    ASSOCIATION PLOT <n11> <n12> <n21> <n22>
    where <n11> is a parameter containing the value for row 1, column 1 of a 2x2 table;
                <n12> is a parameter containing the value for row 1, column 2 of a 2x2 table;
                <n21> is a parameter containing the value for row 2, column 1 of a 2x2 table;
                and <n22> is a parameter containing the value for row 2, column 2 of a 2x2 table.

    This syntax is used for the special case where you have a 2x2 table. In this case, you can enter the 4 values directly, although you do need to be careful that the parameters are entered in the order expected above.
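
    The Python sketch below illustrates why the order matters. It uses the standard shortcut formula for the chi-square statistic of a 2x2 table (without continuity correction) and made-up counts; the function name chi_square_2x2 is hypothetical. Exchanging the second and fourth values describes a different table and gives a different statistic.

```python
def chi_square_2x2(n11, n12, n21, n22):
    """Chi-square statistic of a 2x2 table whose cells are given in row-major order."""
    r1, r2 = n11 + n12, n21 + n22   # row totals
    c1, c2 = n11 + n21, n12 + n22   # column totals
    n = r1 + r2                     # total sample size
    return n * (n11 * n22 - n12 * n21) ** 2 / (r1 * r2 * c1 * c2)


# Hypothetical counts entered in the documented order: n11, n12, n21, n22.
intended = chi_square_2x2(10, 20, 30, 40)

# The same four counts with n12 and n22 exchanged by mistake.
misordered = chi_square_2x2(10, 40, 30, 20)

print(intended, misordered)
```

    Exchanging n12 and n21 instead would transpose the table. That leaves the chi-square statistic unchanged but swaps the roles of rows and columns.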

Examples:
    ASSOCIATION PLOT Y1 Y2
    ASSOCIATION PLOT M
    ASSOCIATION PLOT N11 N12 N21 N22
Note:
    The example program below demonstrates how to specify the different colors for positive and negative standardized residuals. It also demonstrates how to label the rows and columns of the plot.
Default:
    None
Synonyms:
    None
Related Commands:
Reference:
    Friendly (2000), "Visualizing Categorical Data," SAS Institute Inc., p. 90.

    Cohen (1980), "On the Graphical Display of the Significant Components in a Two-Way Contingency Table," Communications in Statistics--Theory and Methods, A9:1025-1041.

Applications:
    Graphical Analysis of Categorical Data
Implementation Date:
    2007/6
Program:
    . Example from page 61 of Friendly
    read matrix m
     5  29 14 16
    15  54 14 10
    20  84 17 94
    68 119 26 7
    end of data
    . Preserve the case of the labels, tic mark labels, and title as typed
    label case asis
    tic mark label case asis
    title case asis
    title offset 2
    . Title, axis labels, and category names for the rows and columns
    x3label
    title Association Plot
    y1label displacement 12
    y1label Eye Color
    x1label Hair Color
    tic offset units data
    xlimits 1 4
    major xtic mark number 4
    minor xtic mark number 0
    xtic mark offset 1 1
    x1tic mark label format alpha
    x1tic mark label content Black Brown Red Blond
    ylimits 1 4
    major ytic mark number 4
    minor ytic mark number 0
    ytic mark offset 1 1
    y1tic mark label format alpha
    y1tic mark label content Green Hazel Blue Brown
    y1tic mark label justification right
    . Fill colors that distinguish positive from negative standardized residuals
    line solid
    region fill color blue green
    . Generate the association plot from the two-way table in m
    association plot m
        
    plot generated by sample program

Date created: 07/25/2007
Last updated: 12/01/2023

Please email comments on this WWW page to alan.heckert@nist.gov.