STATISTIC PLOT

Name:
    ... STATISTIC PLOT
Type:
    Graphics Command
Purpose:
    Generates a statistic versus index plot for a given statistic.
Description:
    A statistic plot shows a subsample statistic versus the subsample index. The subsample statistic is the value of some statistic computed from the data in that subsample. The statistic plot is used to answer the question: "Does the subsample statistic change over different subsamples?" The plot consists of:

      Vertical axis: subsample statistic;
      Horizontal axis: subsample index.

    The statistic plot yields 2 traces:

    1. a subsample statistic trace; and
    2. a full-sample statistic reference line.

    The appearance of these two traces is controlled by the first two settings of the LINES, CHARACTERS, SPIKES, BARS, and associated attribute setting commands.
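
    For instance, the settings used in Program 1 below draw the subsample statistic as unconnected X characters and the reference line as a solid line. Y and X here are placeholder variable names.

```dataplot
. Trace 1 (subsample statistic): no line, X plot character
. Trace 2 (reference line): solid line, no plot character
LINE BLANK SOLID
CHARACTER X BLANK
MEAN PLOT Y X
```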

Syntax 1:
    <stat> STATISTIC PLOT <y1> ... <yk> <x>
                            <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of Dataplot's supported statistics;
                <y1> ... <yk> is a list of 1 to 3 response variables (<stat> determines how many response variables);
                <x> is the subsample identifier variable (this variable appears on the horizontal axis);
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    For a list of supported statistics, enter

Syntax 2:
    <stat> STATISTIC PLOT <y1> ... <yk> <x>
                            <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of Dataplot's supported statistics;
                <y1> ... <yk> is a list of 1 to 30 response variables;
                <x> is the subsample identifier variable (this variable appears on the horizontal axis);
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for multiple response variables. See the Note section below for details on this syntax.

    For a list of supported statistics, enter

Examples:
    MEAN PLOT Y X
    STANDARD DEVIATION PLOT Y X1
    MEAN PLOT Y1 TO Y5 X
Note:
    A number of the subcommands (e.g., MEAN PLOT) are documented individually.
Note:
    Although DATAPLOT supports this command for a large number of statistics, there may be cases where you want it for a statistic that is not supported. The following example shows how to build the equivalent plot by hand for the rank correlation (assume Y1 and Y2 are the response variables and TAG is the group identifier).

      LET TAGDIST = DISTINCT TAG
      LET NGROUP = SIZE TAGDIST
      LOOP FOR K = 1 1 NGROUP
        LET IGROUP = TAGDIST(K)
        LET A = RANK CORRELATION Y1 Y2 SUBSET TAG = IGROUP
        LET YNEW(K) = A
        LET XNEW(K) = K
      END OF LOOP
      LET A = RANK CORRELATION Y1 Y2
      LET YNEW2 = DATA A A
      LET XNEW2 = DATA 1 NGROUP
      PLOT YNEW XNEW AND
      PLOT YNEW2 XNEW2

    This basic idea can be easily adapted to other statistics (even ones that are not built-in to DATAPLOT). It can also be adapted to statistics requiring any arbitrary number of variables to compute.
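
    As a minimal sketch of that adaptation, the loop below plots the ratio of the group mean to the group median. This statistic is chosen purely for illustration, and Y, TAG, AMEAN, and AMED are placeholder names that do not appear in the original example.

```dataplot
. Illustrative statistic (an assumption): group mean divided by group median
LET TAGDIST = DISTINCT TAG
LET NGROUP = SIZE TAGDIST
LOOP FOR K = 1 1 NGROUP
    LET IGROUP = TAGDIST(K)
    LET AMEAN = MEAN Y SUBSET TAG = IGROUP
    LET AMED = MEDIAN Y SUBSET TAG = IGROUP
    LET A = AMEAN/AMED
    LET YNEW(K) = A
    LET XNEW(K) = K
END OF LOOP
. Full-sample value of the same statistic for the reference line
LET AMEAN = MEAN Y
LET AMED = MEDIAN Y
LET A = AMEAN/AMED
LET YNEW2 = DATA A A
LET XNEW2 = DATA 1 NGROUP
PLOT YNEW XNEW AND
PLOT YNEW2 XNEW2
```

    Only the lines that compute A change from one statistic to the next; the bookkeeping for the group index and the reference line stays the same.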

Note:
    The 5/2009 version of Dataplot updated the STATISTIC PLOT command to support multiple response variables (Syntax 2). For example,

      MEAN PLOT Y1 TO Y4 X

    That is, for each distinct value of X, there are now 4 means plotted instead of just one.

    The following commands can be used to control the appearance of the plot:

      SET STATISTIC PLOT FORMAT <DEX/OVERLAY>
      SET STATISTIC PLOT SUMMARY <VARIABLE/GROUP>

    If the FORMAT option is set to OVERLAY and the SUMMARY option is set to VARIABLE, this is equivalent to the following:

      YLIMITS ...
      PRE-ERASE OFF
      ERASE
      MEAN PLOT Y1 X
      MEAN PLOT Y2 X
      MEAN PLOT Y3 X
      MEAN PLOT Y4 X
      PRE-ERASE ON

    That is, there will be a curve corresponding to each response variable and there will be a reference line corresponding to each variable.

    If the FORMAT option is set to DEX, then this plot uses a format similar to the DEX <stat> PLOT command. That is, for each distinct value of X, there will be a curve connecting the mean values for the 4 response variables.

    If the SUMMARY option is set to GROUP, there will be a single reference curve. At each distinct value of X, a single overall mean is computed for all 4 of the response variables.

    In addition, the following option is added to this command:

      <stat> <zscore/uscore> PLOT

    If ZSCORE is given, then a z-score transformation (subtract the mean and then divide by the standard deviation) is computed on each response variable first. If USCORE is given, then a u-score transformation (subtract the minimum and divide by the range) is computed on each column. Note these z-score and u-score transformations apply to the entire response variable, not to each distinct group within the response variable.
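
    Reading that template literally, a mean plot of standardized responses might be requested as sketched below. The variable names are placeholders, and the exact keyword placement is an assumption based on the template above rather than a tested command.

```dataplot
. z-score each of Y1 to Y4 over the full variable, then plot the group means
MEAN ZSCORE PLOT Y1 TO Y4 X
. u-score variant: (value - minimum)/range, then plot the group means
MEAN USCORE PLOT Y1 TO Y4 X
```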

Note:
    By default, Dataplot draws a reference line where the vertical axis coordinate is the value of the statistic for all of the data.

    For some statistics (e.g., STANDARD DEVIATION and other scale statistics), this may not be particularly meaningful. Alternatively, you can draw the reference line at either the mean or the median value of the statistic over the groups. For example, if you are generating a standard deviation plot and you have 10 groups, you can specify that the reference line be drawn at the mean (or the median) of the 10 computed standard deviations.

    To specify what reference line is drawn, enter

      SET STATISTIC PLOT REFERENCE LINE <OVERALL/AVERAGE/MEDIAN>

    where OVERALL is the value of the statistic for all of the data, AVERAGE is the mean of the statistic over the groups, and MEDIAN is the median of the statistic over the groups.

    The default is OVERALL.
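
    For example, a standard deviation plot with the reference line drawn at the median of the group standard deviations could be sketched as follows. Y and X are placeholder variable names, and the last line restores the default.

```dataplot
SET STATISTIC PLOT REFERENCE LINE MEDIAN
STANDARD DEVIATION PLOT Y X
SET STATISTIC PLOT REFERENCE LINE OVERALL
```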

Default:
    None
Synonyms:
    On most of the commands, the word STATISTIC is optional and is usually omitted (e.g., the mean plot is documented under MEAN PLOT rather than MEAN STATISTIC PLOT).
Related Commands:
    CHARACTERS = Sets the type for plot characters.
    LINES = Sets the type for plot lines.
    BOX PLOT = Generates a box plot.
    CONTROL CHART = Generates a control chart.
    PLOT = Generates a data or function plot.
    SUMMARY = Computes various statistics for a variable.
Applications:
    Exploratory Data Analysis
Implementation Date:
    1988/2
    2009/4: support for multiple response variables
    2015/4: added SET STATISTIC PLOT REFERENCE LINE

    The list of supported statistics has been regularly updated since the original 1988/2 implementation.

Program 1:
     
    SKIP 25
    READ GEAR.DAT DIAMETER BATCH
    .
    TITLE AUTOMATIC
    TITLE OFFSET 2
    MULTIPLOT 2 2
    MULTIPLOT CORNER COORDINATES 3 0 100 100
    MULTIPLOT SCALE FACTOR 2
    X1LABEL DISPLACEMENT 14
    Y1LABEL DISPLACEMENT 12
    TIC MARK LABEL SIZE 1.8
    .
    XTIC OFFSET 1 1
    X1LABEL BATCH
    LINE BLANK SOLID
    CHARACTER X BLANK
    Y1LABEL MEAN
    TITLE MEAN PLOT
    MEAN PLOT DIAMETER BATCH
    Y1LABEL STANDARD DEVIATION
    TITLE SD PLOT
    STANDARD DEVIATION PLOT DIAMETER BATCH
    Y1LABEL RELATIVE STANDARD DEVIATION
    TITLE RELSD PLOT
    RELSD PLOT DIAMETER BATCH
    Y1LABEL RANGE
    TITLE RANGE PLOT
    RANGE PLOT DIAMETER BATCH
    .
    END OF MULTIPLOT
        
    plot generated by sample program
Program 2:
     
    skip 25
    read iris.dat y1 to y4 x
    .
    title case asis
    title offset 2
    label case asis
    y1label Mean
    x1label Group-ID
    xlimits 1 3
    major xtic mark number 3
    minor xtic mark number 0
    xtic offset 0.6 0.6
    ytic offset 1 1
    .
    set stat plot format  dex
    set stat plot summary vari
    title sp()Case 1: Format = DEX, Summary = Variable
    line color black black black blue red green cyan
    mean plot y1 to y4 x
    .
    set stat plot format  dex
    set stat plot summary group
    title sp()Case 2: Format = DEX, Summary = Group
    mean plot y1 to y4 x
    .
    set stat plot format overlay
    set stat plot summary group
    line color blue red green cyan
    line so so so so bl
    char bl bl bl bl x
    title sp()Case 3: Format = Overlay, Summary = Group
    mean plot y1 to y4 x
    .
    set stat plot format overlay
    set stat plot summary variable
    line so all
    char bl all
    line color blue red green cyan blue red green cyan
    title sp()Case 4: Format = Overlay, Summary = Variable
    mean plot y1 to y4 x
        
    plots generated by sample program (Cases 1 to 4)

Date created: 09/22/2011
Last updated: 10/19/2015

Please email comments on this WWW page to alan.heckert@nist.gov.