AVERAGE SHIFTED HISTOGRAM

Name:
    AVERAGE SHIFTED HISTOGRAM
Type:
    Graphics Command
Purpose:
    Generates an average shifted histogram.
Description:
    In addition to providing a convenient summary of a univariate set of data, the histogram can also be thought of as a simple kernel density estimator.

    David Scott proposed the average shifted histogram as a kernel density estimator that retains the computational simplicity of the histogram while performing comparably to the more computationally intensive kernel density plot (see chapter 5 of "Multivariate Density Estimation: Theory, Practice, and Visualization", listed in the Reference section below). Enter HELP KERNEL DENSITY PLOT for details on the kernel density plot.

    The basic algorithm for the average shifted histogram is:

    1. Choose a class width h. In Dataplot, you can set this class width with either the CLASS WIDTH or the SET HISTOGRAM CLASS WIDTH command; otherwise, a default class width of 0.3 times the sample standard deviation is used.

    2. Choose m, the number of shifted histograms to construct. Each histogram has a class width of h, and the start points are t0 = 0, h/m, 2h/m, ..., (m-1)h/m.

      In Dataplot, the value of m is set by entering the command

        LET M = <value>

      before entering the AVERAGE SHIFTED HISTOGRAM command.

      If M is not set, the default depends on the number of points n: 4 if n <= 100, 8 if 100 < n <= 1,000, and 16 if n > 1,000.

      Dataplot sets values of m < 1 to 1 and values of m > 64 to 64.

    3. Average the m histograms. This results in a "smoothed" histogram with a bin width of delta = h/m. Higher values of m result in a smoother estimate. Values of m are typically in the range 4 to 32.

    This is the algorithm given on page 117 of Scott. It effectively applies an isosceles triangle weighting function. Scott also gives a generalization, the ASH1 algorithm on page 118, that supports a biweight weighting function. The biweight weighting generates a smoother curve with less local noise than the triangular weighting.

    To use this biweight weighting function, enter the command

      SET AVERAGE SHIFTED HISTOGRAM WEIGHT BIWEIGHT

    To restore the default triangular weighting, enter

      SET AVERAGE SHIFTED HISTOGRAM WEIGHT TRIANGULAR
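
    The three steps of the basic algorithm can be illustrated outside of Dataplot. The following Python sketch is not Dataplot's implementation; it builds the m shifted histograms directly and averages them on a grid of width delta = h/m. The function names (default_m, clamp_m, ash_naive) are hypothetical, the defaults follow the rules described above, and aligning the narrow bins at multiples of delta is an assumption made for simplicity.

```python
import math
import random
import statistics


def default_m(n):
    """Default number of shifted histograms, chosen by sample size n."""
    if n <= 100:
        return 4
    if n <= 1000:
        return 8
    return 16


def clamp_m(m):
    """Restrict m to the range 1 to 64."""
    return max(1, min(64, int(m)))


def ash_naive(x, h=None, m=None):
    """Average m histograms of class width h whose origins differ by h/m.

    Returns (centers, density) on a grid of narrow bins of width h/m.
    """
    n = len(x)
    if h is None:
        h = 0.3 * statistics.stdev(x)  # default class width
    m = clamp_m(default_m(n) if m is None else m)
    delta = h / m

    # Narrow-bin index of each observation; pad m empty bins on each side.
    k = [math.floor(v / delta) for v in x]
    k0 = min(k) - m
    nfine = max(k) - k0 + m + 1
    fine = [0] * nfine
    for ki in k:
        fine[ki - k0] += 1

    density = [0.0] * nfine
    for j in range(m):
        # Histogram j has class boundaries at (j + i*m) * delta.
        counts = {}
        for p in range(nfine):
            c = (p + k0 - j) // m  # class of narrow bin p in histogram j
            counts[c] = counts.get(c, 0) + fine[p]
        for p in range(nfine):
            c = (p + k0 - j) // m
            density[p] += counts[c] / (n * h * m)  # divide by m to average

    centers = [(p + k0 + 0.5) * delta for p in range(nfine)]
    return centers, density


# Example with simulated data (values are illustrative only).
random.seed(0)
y = [random.gauss(0.0, 1.0) for _ in range(1000)]
centers, density = ash_naive(y)
```

    Direct averaging makes m passes over the narrow bins, so this sketch is meant to show what is being estimated rather than how to compute it efficiently.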
Syntax:
    AVERAGE SHIFTED HISTOGRAM <x>
                            <SUBSET/EXCEPT/FOR qualification>
    where <x> is the variable of raw data values;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    AVERAGE SHIFTED HISTOGRAM TEMP
    AVERAGE SHIFTED HISTOGRAM Y SUBSET TAG = 2
    AVERAGE SHIFTED HISTOGRAM Y FOR I = 1 1 800
Note:
    Dataplot implements average shifted histograms using the algorithms BIN1 and ASH1 given on pages 117-118 of the Scott book.
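
    As a rough illustration of this two-stage approach, the following Python sketch bins the data once into narrow bins of width delta = h/m and then smooths the bin counts with normalized weights. It is modeled on the description above, not taken from Dataplot's source. The function names and the weight normalization (weights summing to m, so that the estimate integrates to one) are assumptions.

```python
import math
import random


def bin1(x, delta, m):
    """Count observations in narrow bins of width delta.

    Pads m empty bins on each side. Returns (index of first bin, counts).
    """
    k = [math.floor(v / delta) for v in x]
    k0 = min(k) - m
    counts = [0] * (max(k) - k0 + m + 1)
    for ki in k:
        counts[ki - k0] += 1
    return k0, counts


def kernel_weights(m, kind="triangular"):
    """Weights for offsets 1-m .. m-1, scaled so that they sum to m."""
    if kind == "triangular":
        kernel = lambda t: 1.0 - abs(t)
    elif kind == "biweight":
        # The leading constant of the biweight kernel cancels on normalization.
        kernel = lambda t: (1.0 - t * t) ** 2
    else:
        raise ValueError("kind must be 'triangular' or 'biweight'")
    raw = {i: kernel(i / m) for i in range(1 - m, m)}
    total = sum(raw.values())
    return {i: m * w / total for i, w in raw.items()}


def ash1(x, h, m, kind="triangular"):
    """Smooth the narrow-bin counts with the chosen weights."""
    n = len(x)
    delta = h / m
    k0, counts = bin1(x, delta, m)
    weights = kernel_weights(m, kind)
    density = [0.0] * len(counts)
    for k, vk in enumerate(counts):
        if vk == 0:
            continue
        # The weights are symmetric, so spreading each count to its
        # neighbors is equivalent to a weighted sum around each bin.
        for i, wi in weights.items():
            density[k + i] += wi * vk / (n * h)
    centers = [(k0 + k + 0.5) * delta for k in range(len(counts))]
    return centers, density


# Example with simulated data (values are illustrative only).
random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]
centers, tri = ash1(sample, h=0.3, m=8, kind="triangular")
_, biw = ash1(sample, h=0.3, m=8, kind="biweight")
```

    Because the data are binned only once, changing the weighting function requires only a second pass over the bin counts, not over the raw data.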
Note:
    The average shifted histogram can be adapted to higher dimensional data. It is in this multivariate case that the computational simplicity (relative to the kernel density plot) is particularly attractive. The multivariate case is not currently implemented, although we plan to implement it in a future release.
Note:
    The AVERAGE SHIFTED HISTOGRAM command generates an estimate of the underlying density function. You can convert this to an estimate of the cumulative distribution function by integrating the density estimate. The following shows an example of doing this in Dataplot.

      LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 1000
      AVERAGE SHIFTED HISTOGRAM Y
      LET YPDF = YPLOT
      LET XPDF = XPLOT
      LET YCDF = CUMULATIVE INTEGRAL YPDF XPDF
      TITLE ESTIMATE OF UNDERLYING CUMULATIVE DISTRIBUTION
      PLOT YCDF XPDF

    You can also obtain an estimate of the percent point function (inverse cdf) with the following additional commands:

      LET YPPF = XPDF
      LET XPPF = YCDF
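
    The same two conversions can be sketched in Python for readers who want to check the arithmetic. This sketch assumes the cumulative integral is computed with the trapezoidal rule (Dataplot's CUMULATIVE INTEGRAL command may use a different rule), and the function names are hypothetical. A standard normal density stands in for the XPLOT and YPLOT values saved from the plot.

```python
import bisect
import math


def cumulative_integral(y, x):
    """Running trapezoidal integral of y over x, starting at zero."""
    out = [0.0]
    for i in range(1, len(x)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1]))
    return out


def ppf_from_cdf(p, x, cdf):
    """Invert a nondecreasing cdf by swapping axes and interpolating linearly."""
    i = bisect.bisect_left(cdf, p)
    if i <= 0:
        return x[0]
    if i >= len(cdf):
        return x[-1]
    c0, c1 = cdf[i - 1], cdf[i]
    if c1 == c0:
        return x[i]
    t = (p - c0) / (c1 - c0)
    return x[i - 1] + t * (x[i] - x[i - 1])


# Stand-in for the density estimate: a standard normal pdf on a grid.
xpdf = [-5.0 + 0.01 * i for i in range(1001)]
ypdf = [math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi) for v in xpdf]

ycdf = cumulative_integral(ypdf, xpdf)    # estimate of the cdf
median = ppf_from_cdf(0.5, xpdf, ycdf)    # estimate of the 50th percentile
```

    With a density estimate that does not integrate exactly to one on the plotted grid, the final value of the cumulative integral will differ slightly from one; dividing by that final value is one way to renormalize.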
Default:
    None
Synonyms:
    ASH is a synonym for the AVERAGE SHIFTED HISTOGRAM command.
Related Commands:
Reference:
    David Scott (1992), "Multivariate Density Estimation: Theory, Practice, and Visualization," John Wiley (chapter 5 in particular).

    B. W. Silverman (1986), "Density Estimation for Statistics and Data Analysis," Chapman & Hall.

Applications:
    Density Estimation
Implementation Date:
    2004/9
Program:
     
    LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 1000
    TITLE OFFSET 2
    MULTIPLOT CORNER COORDINATES 0 0 100 100
    MULTIPLOT SCALE FACTOR 2
    MULTIPLOT 2 2
    LET M = 1
    TITLE ASH (M=1)
    AVERAGE SHIFTED HISTOGRAM Y
    LET M = 4
    TITLE ASH (M=4)
    AVERAGE SHIFTED HISTOGRAM Y
    LET M = 16
    TITLE ASH (M=16)
    AVERAGE SHIFTED HISTOGRAM Y
    LET M = 32
    TITLE ASH (M=32)
    AVERAGE SHIFTED HISTOGRAM Y
    END OF MULTIPLOT
        
    plot generated by sample program
Date created: 12/05/2005
Last updated: 12/01/2023

Please email comments on this WWW page to alan.heckert@nist.gov.