Dataplot Vol 1 Auxiliary Chapter

CROSS TABULATE

Name:
    CROSS TABULATE
Type:
    Analysis Command
Purpose:
    Generates a cross tabulation of a response variable for between two and six independent variables.
Description:
    The independent variables (also referred to as group-id variables) define mutually exclusive categories that form a two-way table (or a multi-way table when more than two are given). Each observation of the response variable falls into exactly one cell of this table. By default, this command calculates the counts for each row and column combination. Alternatively, it can calculate a specified statistic for each row and column combination.

    The 3/2008 version extended support to up to six independent variables.

Syntax 1:
    CROSS TABULATE <tag1> ... <tagk>
                            <SUBSET/EXCEPT/FOR qualification>
    CROSS TABULATE COUNTS <tag1> ... <tagk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <tag1> ... <tagk> is a list of two to six group-id variables;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax generates a count of the number of elements in each row and column combination.
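
    As a rough illustration of what this syntax tabulates, the Python sketch below counts the observations that share each combination of group-id values. It is not Dataplot's implementation; the function name cross_tab_counts and the sample data are hypothetical.

```python
from collections import Counter


def cross_tab_counts(*tags):
    """Count observations for each combination of group-id values."""
    if not 2 <= len(tags) <= 6:
        raise ValueError("expected two to six group-id variables")
    # zip(*tags) pairs the i-th value of every group-id variable,
    # giving one cell key per observation.
    return Counter(zip(*tags))


# Hypothetical data: two group-id variables, six observations.
tag1 = [1, 1, 2, 2, 2, 3]
tag2 = [1, 2, 1, 1, 2, 1]
counts = cross_tab_counts(tag1, tag2)
for cell in sorted(counts):
    print(cell, counts[cell])
```

    Passing a third list adds a third group-id variable; the cell key simply becomes a three-element tuple.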

Syntax 2:
    CROSS TABULATE <STAT> <y> <tag1> ... <tagk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y> is a response variable;
                <tag1> ... <tagk> is a list of two to six group-id variables;
                <stat> is one of the following statistics:
        MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
        GEOMETRIC MEAN, HARMONIC MEAN, HODGES LEHMAN,
        BIWEIGHT LOCATION, LP LOCATION,
        SUM, PRODUCT, SIZE (or NUMBER),
        STANDARD DEVIATION, STANDARD DEVIATION OF MEAN,
        VARIANCE, VARIANCE OF THE MEAN,
        VARIANCE OF LP LOCATION,
        SD OF LP LOCATION,
        TRIMMED MEAN STANDARD ERROR,
        AVERAGE ABSOLUTE DEVIATION (or AAD),
        MEDIAN ABSOLUTE DEVIATION (or MAD),
        IQ RANGE, BIWEIGHT MIDVARIANCE, BIWEIGHT SCALE,
        PERCENTAGE BEND MIDVARIANCE, SN SCALE, QN SCALE,
        WINSORIZED VARIANCE, WINSORIZED STANDARD DEVIATION,
        RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE (or
        COEFFICIENT OF VARIATION),
        RANGE, MIDRANGE, MAXIMUM, MINIMUM, EXTREME,
        LOWER HINGE, UPPER HINGE,
        LOWER QUARTILE, UPPER QUARTILE,
        <FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/
        NINTH/TENTH> DECILE,
        PERCENTILE, QUANTILE, QUANTILE STANDARD ERROR,
        SKEWNESS, KURTOSIS, NORMAL PPCC,
        AUTOCORRELATION, AUTOCOVARIANCE,
        CP, CPK, CNPK, CPM, CC,
        EXPECTED LOSS, PERCENT DEFECTIVE,
        SINE FREQUENCY, SINE AMPLITUDE,
        TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL),
        TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2);

    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax computes the value of the specified statistic of the elements in the response variable (<y>) for each row and column combination.
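
    The per-cell computation can be sketched in Python as follows: the response values are grouped by their combination of group-id values, and the chosen statistic is applied to each group. This is an assumption-laden illustration, not Dataplot code; cross_tab_stat and the data are hypothetical, and statistics.mean stands in for the MEAN statistic.

```python
from collections import defaultdict
from statistics import mean


def cross_tab_stat(stat, y, *tags):
    """Apply `stat` to the response values falling in each cell."""
    cells = defaultdict(list)
    for value, key in zip(y, zip(*tags)):
        cells[key].append(value)
    # Sort the cells so the table is listed in a stable order.
    return {key: stat(values) for key, values in sorted(cells.items())}


# Hypothetical data: one response and two group-id variables.
y = [0.2, 0.4, 0.1, 0.3, 0.5, 0.7]
x1 = [1, 1, 1, 2, 2, 2]
x2 = [1, 1, 2, 1, 2, 2]
table = cross_tab_stat(mean, y, x1, x2)
for cell, value in table.items():
    print(cell, value)
```

    Any function of a list of numbers (for example statistics.median or max) can be passed in place of mean.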

Syntax 3:
    CROSS TABULATE <STAT> <y1> <y2> <tag1> ... <tagk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
                <tag1> ... <tagk> is a list of two to six group-id variables;
                <stat> is one of the following statistics:
        LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD,
        LINEAR CORRELATION,
        CORRELATION, RANK CORRELATION,
        COVARIANCE, RANK COVARIANCE,
        COMOVEMENT, RANK COMOVEMENT,
        WINSORIZED COVARIANCE, WINSORIZED CORRELATION,
        BIWEIGHT MIDCOVARIANCE, BIWEIGHT MIDCORRELATION,
        PERCENTAGE BEND CORRELATION,
        RATIO;

    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax computes the value of the specified statistic of the elements in the response variables (<y1> and <y2>) for each row and column combination.
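
    For a two-response statistic, both responses are grouped by cell and the statistic is computed from the paired values in each cell. The Python sketch below uses the ordinary Pearson correlation as a stand-in for CORRELATION; the helper names and data are hypothetical, and Dataplot's own definitions may differ in detail.

```python
from collections import defaultdict
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)


def cross_tab_pair_stat(stat, y1, y2, *tags):
    """Apply a two-argument statistic to the paired responses in each cell."""
    cells = defaultdict(lambda: ([], []))
    for a, b, key in zip(y1, y2, zip(*tags)):
        cells[key][0].append(a)
        cells[key][1].append(b)
    return {key: stat(a, b) for key, (a, b) in sorted(cells.items())}


# Hypothetical data: two cells with three paired observations each.
y1 = [1, 2, 3, 1, 2, 3]
y2 = [2, 4, 6, 3, 2, 1]
x1 = [1, 1, 1, 2, 2, 2]
x2 = [1, 1, 1, 1, 1, 1]
corr = cross_tab_pair_stat(pearson, y1, y2, x1, x2)
for cell, value in corr.items():
    print(cell, value)
```

    Each cell needs at least two observations with non-constant values, otherwise the correlation is undefined and this sketch divides by zero.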

Syntax 4:
    CROSS TABULATE WEIGHTED <STAT> <y1> <wt> <tag1> ... <tagk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the response variable;
                <wt> is the weights variable;
                <tag1> ... <tagk> is a list of two to six group-id variables;
                <stat> is one of the following statistics:
        MEAN, STANDARD DEVIATION (or SD), VARIANCE;

    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax computes the value of the specified weighted statistic of the elements in the response variable (<y1>) for each row and column combination.
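
    A minimal Python sketch of the weighted case, assuming the usual definition of a weighted mean (sum of weight times value, divided by the sum of the weights). The function name and data are hypothetical, and the formulas Dataplot uses for the weighted standard deviation and variance are not shown here.

```python
from collections import defaultdict


def cross_tab_weighted_mean(y, wt, *tags):
    """Weighted mean of the response in each cell: sum(w*y) / sum(w)."""
    sums = defaultdict(lambda: [0.0, 0.0])  # [sum of w*y, sum of w]
    for value, weight, key in zip(y, wt, zip(*tags)):
        sums[key][0] += weight * value
        sums[key][1] += weight
    return {key: wy / w for key, (wy, w) in sorted(sums.items())}


# Hypothetical data: two cells with two weighted observations each.
y = [1.0, 3.0, 10.0, 20.0]
wt = [1.0, 3.0, 1.0, 1.0]
x1 = [1, 1, 2, 2]
x2 = [1, 1, 1, 1]
wmeans = cross_tab_weighted_mean(y, wt, x1, x2)
for cell, value in wmeans.items():
    print(cell, value)
```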

Syntax 5:
    CROSS TABULATE DIFFERENCE OF <STAT> <y1> <y2> <tag1> ... <tagk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
                <tag1> ... <tagk> is a list of two to six group-id variables;
                <stat> is one of the following statistics:
        MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
        GEOMETRIC MEAN, HARMONIC MEAN, HODGES LEHMAN,
        MIDRANGE, BIWEIGHT LOCATION, LP LOCATION, SUM,
        STANDARD DEVIATION, STANDARD DEVIATION OF MEAN,
        VARIANCE, VARIANCE OF THE MEAN,
        VARIANCE OF LP LOCATION,
        SD OF LP LOCATION,
        AVERAGE ABSOLUTE DEVIATION (or AAD),
        MEDIAN ABSOLUTE DEVIATION (or MAD),
        IQ RANGE, BIWEIGHT MIDVARIANCE, BIWEIGHT SCALE,
        PERCENTAGE BEND MIDVARIANCE, SN SCALE, QN SCALE,
        WINSORIZED VARIANCE, WINSORIZED STANDARD DEVIATION,
        RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE,
        COEFFICIENT OF VARIATION, RANGE,
        MAXIMUM, MINIMUM, EXTREME, QUANTILE,
        SKEWNESS, KURTOSIS;

    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax computes the specified statistic for each of the two response variables and reports the difference between the two values for each row and column combination.
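
    The DIFFERENCE OF case can be sketched in Python as the statistic of the first response minus the statistic of the second, cell by cell. The order of subtraction is an assumption, as are the function name and data; statistics.median stands in for MEDIAN.

```python
from collections import defaultdict
from statistics import median


def cross_tab_difference(stat, y1, y2, *tags):
    """Per cell, compute stat(y1 values) minus stat(y2 values)."""
    cells = defaultdict(lambda: ([], []))
    for a, b, key in zip(y1, y2, zip(*tags)):
        cells[key][0].append(a)
        cells[key][1].append(b)
    # Assumed order: first response minus second response.
    return {key: stat(a) - stat(b) for key, (a, b) in sorted(cells.items())}


# Hypothetical data: a three-observation cell and a two-observation cell.
y1 = [5, 7, 9, 2, 4]
y2 = [1, 2, 3, 2, 8]
x1 = [1, 1, 1, 2, 2]
x2 = [1, 1, 1, 1, 1]
diffs = cross_tab_difference(median, y1, y2, x1, x2)
for cell, value in diffs.items():
    print(cell, value)
```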

Examples:
    CROSS TABULATE Y1 TAG1 TAG2
    CROSS TABULATE Y1 TAG1 TAG2 SUBSET TAG2 = 2 TO 4
Note:
    The output is also written to the file DPST1F.DAT, so the tabulated values can be read back into Dataplot variables. For example,

      CROSS TABULATE MEAN Y X1 X2
      SKIP 1
      READ DPST1F.DAT Z1 Z2 ZMEAN
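
    If the file is to be processed outside Dataplot, a parser along the following lines may serve as a starting point. This Python sketch assumes one header line followed by whitespace-separated numeric columns, which is inferred only from the SKIP 1 and three-variable READ above; the actual layout of DPST1F.DAT may differ, and the header text in the sample is a placeholder.

```python
# Hypothetical file contents: one header line (skipped, as SKIP 1 does)
# followed by rows holding X1, X2 and the cell mean.
sample = """\
X1 X2 MEAN
1.00000 1.00000 0.278500
1.00000 2.00000 0.355750
2.00000 1.00000 0.328250
"""


def read_cells(text, skip=1):
    """Parse whitespace-separated numeric rows after skipping header lines."""
    rows = []
    for line in text.splitlines()[skip:]:
        if line.strip():  # ignore blank lines
            rows.append(tuple(float(field) for field in line.split()))
    return rows


rows = read_cells(sample)
# Transpose the rows into one list per column.
z1, z2, zmean = (list(column) for column in zip(*rows))
print(z1, z2, zmean)
```

    Python's float() also accepts exponential notation such as 0.170367E-01, which is the default output format unless SET WRITE DECIMALS is used.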
Note:
    By default, values are printed in exponential format. To specify instead the number of digits to print to the right of the decimal point, enter the command

      SET WRITE DECIMALS <value>
Note:
    Since there is now a separate CHI-SQUARE INDEPENDENCE TEST, the CHI-SQUARE option available in prior versions of Dataplot is no longer supported.
Note:
    If <stat> is BINOMIAL PROPORTION or DIFFERENCE OF BINOMIAL PROPORTIONS, then a few extra columns are printed.

    In these cases, the response variable is assumed to consist of 1's or 0's (to denote success or failure, respectively). In addition to the proportion of successes, a column will be printed for the number of trials and for the lower and upper Agresti-Coull confidence limits. To specify whether lower tailed, upper tailed, or two-tailed confidence limits are desired, enter the command (two-tailed is the default)

      SET BINOMIAL TAIL <LOWER/UPPER/TWO-TAILED>

    To specify the significance level to use for the confidence limits, enter (0.05 is the default)

      LET ALPHA = <value>
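
    For reference, the two-tailed Agresti-Coull limits can be sketched in Python as below. This follows the standard textbook form of the interval (add z squared to the number of trials and half of z squared to the number of successes, then apply the normal approximation); it is not taken from Dataplot's source, the function name and data are hypothetical, and clipping the limits to the range 0 to 1 is a choice made here.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull(successes, trials, alpha=0.05):
    """Two-tailed Agresti-Coull confidence limits for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # normal critical value
    n_adj = trials + z * z                   # adjusted number of trials
    p_adj = (successes + z * z / 2) / n_adj  # adjusted proportion
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1] so the limits stay valid proportions.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Hypothetical cell: 15 successes (1's) and 35 failures (0's).
y = [1] * 15 + [0] * 35
lower, upper = agresti_coull(sum(y), len(y))
print(lower, upper)
```

    A one-tailed limit would use 1 - alpha in place of 1 - alpha / 2 when computing z and keep only the lower or the upper value.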
Default:
    None
Synonyms:
    None
Related Commands:
    TABULATE = Generates a tabulation for a one-way table.
    ANOVA = Performs an analysis of variance.
    FLUCTUATION PLOT = Generates a fluctuation plot.
Applications:
    Exploratory Data Analysis, Categorical Data Analysis
Implementation Date:
    1989/12
    2002/8: List of supported statistics greatly expanded.
    2003/3: Support for "WEIGHTED" and "DIFFERENCE OF" statistics added.
    2003/5: Added support for SN SCALE, QN SCALE, DIFFERENCE OF SN, DIFFERENCE OF QN
    2008/3: Added support for more than 2 group-id variables
    2008/3: Added support for SET WRITE DECIMALS command
Program:
     
    SKIP 25
    READ RIPKEN.DAT Y X1 X2
    CROSS TABULATE X1 X2
    CROSS TABULATE MEAN Y X1 X2
    CROSS TABULATE SD Y X1 X2
    CROSS TABULATE RANGE Y X1 X2
        
    The following output is generated.
     
            X1             X2       *    COUNT
     **************************************************************
         1.00000        1.00000     *     4.00000
         1.00000        2.00000     *     4.00000
         1.00000        3.00000     *     4.00000
         2.00000        1.00000     *     4.00000
         2.00000        2.00000     *     4.00000
         2.00000        3.00000     *     4.00000
         3.00000        1.00000     *     4.00000
         3.00000        2.00000     *     4.00000
         3.00000        3.00000     *     4.00000
      
                                    *    Y
            X1             X2       *    MEAN
     **************************************************************
         1.00000        1.00000     *    0.278500
         1.00000        2.00000     *    0.355750
         1.00000        3.00000     *    0.236000
         2.00000        1.00000     *    0.328250
         2.00000        2.00000     *    0.522000
         2.00000        3.00000     *    0.495000
         3.00000        1.00000     *    0.137500
         3.00000        2.00000     *    0.225750
         3.00000        3.00000     *    0.213750
      
                                    *    Y
            X1             X2       *    STANDARD DEVIATION
     **************************************************************
         1.00000        1.00000     *    0.130112
         1.00000        2.00000     *    0.170367E-01
         1.00000        3.00000     *    0.171637
         2.00000        1.00000     *    0.378451E-01
         2.00000        2.00000     *    0.239155
         2.00000        3.00000     *    0.339815
         3.00000        1.00000     *    0.368194E-01
         3.00000        2.00000     *    0.101411
         3.00000        3.00000     *    0.150635
      
                                    *    Y
            X1             X2       *    RANGE
     **************************************************************
         1.00000        1.00000     *    0.234000
         1.00000        2.00000     *    0.400000E-01
         1.00000        3.00000     *    0.388000
         2.00000        1.00000     *    0.800000E-01
         2.00000        2.00000     *    0.525000
         2.00000        3.00000     *    0.715000
         3.00000        1.00000     *    0.800000E-01
         3.00000        2.00000     *    0.232000
         3.00000        3.00000     *    0.333000
        

Date created: 1/23/2009
Last updated: 1/23/2009
Please email comments on this WWW page to alan.heckert@nist.gov.