Dataplot News

Introduction

This web page contains a copy of the Dataplot news file. This file provides a chronological listing of Dataplot enhancements since June, 1994. Updates prior to that date are not included as these are incorporated into the published copy of the Dataplot Reference Manual.

We try to keep the online help files (i.e., the Dataplot HELP command) and the online Dataplot Reference Manual up to date with recent enhancements. However, sometimes there is a bit of a lag between the enhancement and the incorporation into the Dataplot online documentation. In that case, the news file may provide the only documentation for the command.

In addition, if enhancements are made to a currently existing command, these enhancements may not be available in the Dataplot Reference Manual. However, they will be documented in the news file.

A copy of the news file is also available in the Dataplot auxiliary directory in the file DPNEWF.TEX. The default auxiliary directory is "/usr/local/lib/dataplot" for Unix platforms and "C:\Program Files\NIST\DATAPLOT" for Windows platforms. However, your local installation may use a different directory.

Useful for Determining Whether to Upgrade

The chronological order of the news file is also useful in determining if you should upgrade your local implementation of Dataplot. That is, examine the entries in the news file since your most recent upgrade to determine if these recent enhancements are of interest to you.
                                                         November  2010
This is the DATAPLOT News file DPNEWF.TEX.  This NEWS file contains a
list of DATAPLOT enhancements over the last few years.  This is
typically the only place that the most recent enhancements are
documented.

To get a hardcopy off-line listing of this file, exit DATAPLOT and
enter:

    IBM PC:     PRINT C:\DATAPLOT\DPNEWF.TEX
    UNIX:       lpr /usr/local/lib/dataplot/dpnewf.tex
    VAX:        PRINT DATAPLO$:DPNEWF.TEX  (where DATAPLO$ defines the
                  directory where DATAPLOT auxiliary files are kept)
    other:      Check with your local DATAPLOT installer;
                at NIST:  Alan Heckert (301-975-2899)
                          Jim Filliben (301-975-2855)

Your installation may define the directory where the DATAPLOT
auxiliary files are stored differently from the list above.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT
December 2010 - October 2013.
-----------------------------------------------------------------------

 1) The following library functions were added:

        LET YOUT = MERGE(Y1,Y2,TAG)
        LET YOUT = MERGE3(Y1,Y2,Y3,TAG)
        LET YOUT = RELDIF(Y1,Y2)
        LET YOUT = RELERR(Y1,Y2)
        LET YOUT = PERCDIF(Y1,Y2)
        LET YOUT = PERCERR(Y1,Y2)
        LET YOUT = ANGRAD(X1,Y1,X2,Y2,X3,Y3)
        LET YOUT = DPNTLINE(X1,Y1,X2,Y2,SLOPE)
        LET YOUT = SLOPE(X1,Y1,X2,Y2)
        LET YOUT = LININTER(X1,Y1,X2,Y2,X3)

    In addition, the MIN and MAX functions were updated to handle up to
    eight input arguments (previously limited to two arguments).

 2) The following string commands were added:

    a) LET IVAL = STRING COMPARE A B

       IVAL will be set to 1 if strings A and B are identical and
       set to 0 if they are not identical.

    b) LET SBASE = GROUP LABEL TO STRINGS IG

       This command will convert the group labels in IG to
       the strings SBASE1, SBASE2, ..., SBASEN (where N is the
       number of group labels in IG).

    c) Several enhancements were made to the use of row labels.

       The command 

           LET ROWLABEL = <ix>

       previously would convert a character variable, <ix>,
       found in the character data file (dpzchf.dat), to row
       labels.  This command was expanded to include the
       numeric variables as well.  One example where this can
       be useful is when lab-id's are used to label plot points.
       Note that the dpzchf.dat file is searched first.  If no
       match is found there, then Dataplot will check the list
       of currently defined numeric variables.

       The command

          LET ROWLABEL = STRING TO ROW LABEL <irow> <s>

       was added.  This will set the <irow>th row label to <s>.
       If <s> is a previously defined string, then the contents
       of that string will be used.  If no previously defined string
       is found, then <s> is treated as literal text.

       The command

           LET ROWLABEL <ivalue> = <string>

       will define the <ivalue>-th row of the row labels to
       <string>.

       The commands

           LET ROWLABEL = SHIFT LEFT  <ivalue>
           LET ROWLABEL = SHIFT RIGHT <ivalue>

       will shift the row labels by <ivalue> rows either left
       (= down) or right (= up).  The vacated rows will be set to
       blank.

       The command

           LET ROWLABEL = DELETE

       will re-initialize all row labels to blank.
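
       The lookup rules described above for LET ROWLABEL = <ix> and
       for STRING TO ROW LABEL can be sketched in Python.  This is
       only an illustration of the stated search order, not Dataplot
       code.  The dictionaries, the function names, and the use of
       str() to turn numeric values into label text are all
       assumptions made for the example.

```python
# Illustrative sketch only: Python stand-ins for the lookup rules above.
# The dictionaries and function names are hypothetical, not Dataplot APIs.

def resolve_rowlabel_source(name, char_vars, numeric_vars):
    """Return row labels for LET ROWLABEL = <name>.

    The character data (standing in for dpzchf.dat) is searched first;
    numeric variables are consulted only if no character variable matches.
    """
    if name in char_vars:
        return list(char_vars[name])
    if name in numeric_vars:
        # Assumption: numeric values are rendered with str().
        return [str(v) for v in numeric_vars[name]]
    raise KeyError(f"{name} is neither a character nor a numeric variable")


def string_to_row_label(labels, irow, s, defined_strings):
    """Set the irow-th (1-based) row label.

    If s names a previously defined string, its contents are used;
    otherwise s is taken as literal text.
    """
    text = defined_strings.get(s, s)
    updated = list(labels)
    updated[irow - 1] = text
    return updated


char_vars = {"LABID": ["A01", "B07", "C12"]}
numeric_vars = {"LABID": [1, 2, 3], "LAB": [101, 102, 103]}
defined_strings = {"S1": "Reference Lab"}

# LAB exists only as a numeric variable, so the numeric list is used.
labels = resolve_rowlabel_source("LAB", char_vars, numeric_vars)
# S1 is a defined string; OUTLIER is not, so it is used literally.
labels = string_to_row_label(labels, 2, "S1", defined_strings)
labels = string_to_row_label(labels, 3, "OUTLIER", defined_strings)
print(labels)
```

       Note that LABID is defined in both dictionaries, so a lookup
       of LABID returns the character values, mirroring the rule that
       the character data file is searched first.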

 3) The following enhancements were made to the LET subcommands.

    a) The following new statistic LET subcommands were added:

          LET A = SHANNON DIVERSITY INDEX Y
          LET A = SIMPSON DIVERSITY INDEX Y
          LET A = ROBUST POOLED STANDARD DEVIATION Y
          LET A = ROBUST POOLED RANGE Y
          LET A = UNIQUE X
          LET A = EXCESS KURTOSIS Y
          LET A = SUM OF SQUARES Y
          LET A = DIFFERENCE OF SUM OF SQUARES Y1 Y2
          LET A = SUM OF SQUARES FROM MEAN Y
          LET A = DIFFERENCE OF SUM OF SQUARES FROM MEAN Y1 Y2
          LET A = RESCALED SUM Y
          LET A = RLP Y

          LET QUANT = 
          LET A = Q QUANTILE RANGE Y

          LET A = PERCENT AGREE Y1 Y2
          LET A = PERCENT DISAGREE Y1 Y2
          LET A = CORRELATION PVALUE Y1 Y2
          LET A = CORRELATION CDF Y1 Y2
          LET A = CORRELATION ABSOLUTE VALUE Y1 Y2
          LET A = RANK CORRELATION ABSOLUTE VALUE Y1 Y2
          LET A = RANK CORRELATION CDF Y1 Y2
          LET A = RANK CORRELATION PVALUE Y1 Y2
          LET A = RANK CORRELATION LOWER TAILED PVALUE Y1 Y2
          LET A = RANK CORRELATION UPPER TAILED PVALUE Y1 Y2
          LET A = KENDALL TAU ABSOLUTE VALUE Y1 Y2
          LET A = KENDALL TAU CDF Y1 Y2
          LET A = KENDALL TAU PVALUE Y1 Y2
          LET A = KENDALL TAU LOWER TAILED PVALUE Y1 Y2
          LET A = KENDALL TAU UPPER TAILED PVALUE Y1 Y2
          LET A = PARTIAL CORRELATION Y1 Y2 Y3
          LET A = PARTIAL CORRELATION PVALUE Y1 Y2 Y3
          LET A = PARTIAL CORRELATION CDF Y1 Y2 Y3
          LET A = PARTIAL CORRELATION ABSOLUTE VALUE Y1 Y2 Y3
          LET A = PARTIAL RANK CORRELATION Y1 Y2 Y3
          LET A = PARTIAL RANK CORRELATION ABSOLUTE VALUE Y1 Y2 Y3
          LET A = PARTIAL KENDALL TAU CORRELATION Y1 Y2 Y3
          LET A = PARTIAL KENDALL TAU ABSOLUTE VALUE Y1 Y2 Y3
          LET A = INDEX FIRST MATCH Y1 Y2
          LET A = INDEX LAST  MATCH Y1 Y2
          LET A = INDEX FIRST NOT MATCH Y1 Y2
          LET A = INDEX LAST  NOT MATCH Y1 Y2

          LET A = WEIGHTED ORDER STATISTIC MEAN  Y W
          LET A = WEIGHTED SUM   Y W
          LET A = WEIGHTED SUM OF SQUARES   Y W
          LET A = WEIGHTED SUM OF ABSOLUTE VALUES   Y W
          LET A = WEIGHTED AVERAGE OF ABSOLUTE VALUES   Y W
          LET A = WEIGHTED SUM OF DEVIATIONS FROM THE MEAN   Y W
          LET A = WEIGHTED SUM OF SQUARED DEVIATIONS FROM THE MEAN   Y W

          LET A = A BASIS NORMAL                                Y
          LET A = A BASIS LOGNORMAL                             Y
          LET A = A BASIS WEIBULL                               Y
          LET A = A BASIS NONPARAMETRIC                         Y
          LET A = B BASIS NORMAL                                Y
          LET A = B BASIS LOGNORMAL                             Y
          LET A = B BASIS WEIBULL                               Y
          LET A = B BASIS NONPARAMETRIC                         Y
          LET A = LOWER CONFIDENCE LIMIT                        Y
          LET A = UPPER CONFIDENCE LIMIT                        Y
          LET A = ONE SIDED LOWER CONFIDENCE LIMIT              Y
          LET A = ONE SIDED UPPER CONFIDENCE LIMIT              Y
          LET A = LOWER PREDICTION LIMIT                        Y
          LET A = UPPER PREDICTION LIMIT                        Y
          LET A = ONE SIDED LOWER PREDICTION LIMIT              Y
          LET A = ONE SIDED UPPER PREDICTION LIMIT              Y
          LET A = LOWER PREDICTION BOUND                        Y
          LET A = UPPER PREDICTION BOUND                        Y
          LET A = ONE SIDED LOWER PREDICTION BOUND              Y
          LET A = ONE SIDED UPPER PREDICTION BOUND              Y
          LET A = LOWER SD CONFIDENCE LIMIT                     Y
          LET A = UPPER SD CONFIDENCE LIMIT                     Y
          LET A = ONE SIDED LOWER SD CONFIDENCE LIMIT           Y
          LET A = ONE SIDED UPPER SD CONFIDENCE LIMIT           Y
          LET A = LOWER SD PREDICTION LIMIT                     Y
          LET A = UPPER SD PREDICTION LIMIT                     Y
          LET A = ONE SIDED LOWER SD PREDICTION LIMIT           Y
          LET A = ONE SIDED UPPER SD PREDICTION LIMIT           Y
          LET A = CUMULATIVE SUM FORWARD TEST                   Y
          LET A = CUMULATIVE SUM FORWARD TEST PVALUE            Y
          LET A = CUMULATIVE SUM BACKWARD TEST                  Y
          LET A = CUMULATIVE SUM BACKWARD TEST PVALUE           Y
          LET A = DIXON TEST                                    Y
          LET A = DIXON MINIMUM TEST                            Y
          LET A = DIXON MAXIMUM TEST                            Y
          LET A = EXTREME STUDENTIZED DEVIATE TEST              Y
          LET A = FREQUENCY TEST CDF                            Y
          LET A = FREQUENCY TEST                                Y
          LET A = FREQUENCY WITHIN A BLOCK TEST CDF             Y
          LET A = FREQUENCY WITHIN A BLOCK TEST                 Y
          LET A = GRUBB TEST                                    Y
          LET A = GRUBB TEST CDF                                Y
          LET A = GRUBB TEST DIRECTION                          Y
          LET A = GRUBB TEST INDEX                              Y
          LET A = JARQUE BERA TEST                              Y
          LET A = JARQUE BERA TEST CDF                          Y
          LET A = JARQUE BERA TEST PVALUE                       Y
          LET A = MEAN SUCCESSIVE DIFFERENCE TEST               Y
          LET A = MEAN SUCCESSIVE DIFFERENCE TEST NORMALIZED    Y
          LET A = MEAN SUCCESSIVE DIFFERENCE TEST CDF           Y
          LET A = MEAN SUCCESSIVE DIFFERENCE TEST PVALUE        Y
          LET A = NORMAL TOLERANCE K FACTOR                     Y
          LET A = NORMAL TOLERANCE LOWER LIMIT                  Y
          LET A = NORMAL TOLERANCE UPPER LIMIT                  Y
          LET A = NORMAL TOLERANCE ONE SIDED K FACTOR           Y
          LET A = NORMAL TOLERANCE ONE SIDED LOWER LIMIT        Y
          LET A = NORMAL TOLERANCE ONE SIDED UPPER LIMIT        Y
          LET A = ONE SAMPLE SIGN TEST                          Y
          LET A = ONE SAMPLE SIGN TEST CDF                      Y
          LET A = ONE SAMPLE SIGN TEST PVALUE                   Y
          LET A = ONE SAMPLE SIGN TEST LOWER TAIL PVALUE        Y
          LET A = ONE SAMPLE SIGN TEST UPPER TAIL PVALUE        Y
          LET A = ONE SAMPLE T TEST                             Y
          LET A = ONE SAMPLE T TEST CDF                         Y
          LET A = ONE SAMPLE T TEST PVALUE                      Y
          LET A = ONE SAMPLE T TEST LOWER TAIL PVALUE           Y
          LET A = ONE SAMPLE T TEST UPPER TAIL PVALUE           Y
          LET A = ONE SAMPLE WILCOXON SIGNED RANK TEST          Y
          LET A = ONE SAMPLE WILCOXON SIGNED RANK TEST CDF      Y
          LET A = ONE SAMPLE WILCOXON SIGNED RANK TEST PVALUE   Y
          LET A = ONE SAMPLE WILCOXON TEST LOWER TAILED PVALUE  Y
          LET A = ONE SAMPLE WILCOXON TEST UPPER TAILED PVALUE  Y
          LET A = SUMMARY NORMAL TOLERANCE K FACTOR             MEAN SD N
          LET A = SUMMARY NORMAL TOLERANCE LOWER LIMIT          MEAN SD N
          LET A = SUMMARY NORMAL TOLERANCE UPPER LIMIT          MEAN SD N
          LET A = SUMMARY NORMAL TOLERANCE ONE SIDED K FACTOR   MEAN SD N
          LET A = SUMMARY NORMAL TOLERANCE ONE SIDED LOWER LIMI MEAN SD N
          LET A = SUMMARY NORMAL TOLERANCE ONE SIDED UPPER LIMI MEAN SD N
          LET A = SUMMARY LOWER PREDICTION LIMIT                MEAN SD N
          LET A = SUMMARY UPPER PREDICTION LIMIT                MEAN SD N
          LET A = SUMMARY ONE SIDED LOWER PREDICTION LIMIT      MEAN SD N
          LET A = SUMMARY ONE SIDED UPPER PREDICTION LIMIT      MEAN SD N
          LET A = SUMMARY LOWER PREDICTION BOUND                MEAN SD N
          LET A = SUMMARY UPPER PREDICTION BOUND                MEAN SD N
          LET A = SUMMARY ONE SIDED LOWER PREDICTION BOUND      MEAN SD N
          LET A = SUMMARY ONE SIDED UPPER PREDICTION BOUND      MEAN SD N
          LET A = SUMMARY LOWER SD CONFIDENCE LIMIT             SD N
          LET A = SUMMARY UPPER SD CONFIDENCE LIMIT             SD N
          LET A = SUMMARY ONE SIDED LOWER SD CONFIDENCE LIMIT   SD N
          LET A = SUMMARY ONE SIDED UPPER SD CONFIDENCE LIMIT   SD N
          LET A = SUMMARY LOWER SD PREDICTION LIMIT             SD N
          LET A = SUMMARY UPPER SD PREDICTION LIMIT             SD N
          LET A = SUMMARY ONE SIDED LOWER SD PREDICTION LIMIT   SD N
          LET A = SUMMARY ONE SIDED UPPER SD PREDICTION LIMIT   SD N
          LET A = TIETJEN MOORE TEST                            Y
          LET A = TIETJEN MOORE MINIMUM TEST                    Y
          LET A = TIETJEN MOORE MAXIMUM TEST                    Y
          LET A = WILK SHAPIRO TEST                             Y
          LET A = WILK SHAPIRO TEST PVALUE                      Y

          LET A = ANGLIT PPCC                       Y
          LET A = ANGLIT PPCC LOCATION              Y
          LET A = ANGLIT PPCC SCALE                 Y
          LET A = ARCSINE PPCC                      Y
          LET A = ARCSINE PPCC LOCATION             Y
          LET A = ARCSINE PPCC SCALE                Y
          LET A = CAUCHY PPCC                       Y
          LET A = CAUCHY PPCC LOCATION              Y
          LET A = CAUCHY PPCC SCALE                 Y
          LET A = COSINE PPCC                       Y
          LET A = COSINE PPCC LOCATION              Y
          LET A = COSINE PPCC SCALE                 Y
          LET A = DOUBLE EXPONENTIAL PPCC           Y
          LET A = DOUBLE EXPONENTIAL PPCC LOCATION  Y
          LET A = DOUBLE EXPONENTIAL PPCC SCALE     Y
          LET A = FATIGUE LIFE PPCC STATISTIC       Y
          LET A = FATIGUE LIFE PPCC LOCATION        Y
          LET A = FATIGUE LIFE PPCC SCALE           Y
          LET A = FATIGUE LIFE PPCC SHAPE           Y
          LET A = GAMMA PPCC STATISTIC              Y
          LET A = GAMMA PPCC LOCATION               Y
          LET A = GAMMA PPCC SCALE                  Y
          LET A = GAMMA PPCC SHAPE                  Y
          LET A = GH PPCC STATISTIC                 Y
          LET A = GH PPCC LOCATION                  Y
          LET A = GH PPCC SCALE                     Y
          LET A = GH PPCC SHAPE ONE                 Y
          LET A = GH PPCC SHAPE TWO                 Y
          LET A = GENERALIZED PARETO PPCC STATISTIC Y
          LET A = GENERALIZED PARETO PPCC LOCATION  Y
          LET A = GENERALIZED PARETO PPCC SCALE     Y
          LET A = GENERALIZED PARETO PPCC SHAPE     Y
          LET A = EXPONENTIAL PPCC                  Y
          LET A = EXPONENTIAL PPCC LOCATION         Y
          LET A = EXPONENTIAL PPCC SCALE            Y
          LET A = HALF CAUCHY PPCC                  Y
          LET A = HALF CAUCHY PPCC LOCATION         Y
          LET A = HALF CAUCHY PPCC SCALE            Y
          LET A = HALF NORMAL PPCC                  Y
          LET A = HALF NORMAL PPCC LOCATION         Y
          LET A = HALF NORMAL PPCC SCALE            Y
          LET A = HYPERBOLIC SECANT PPCC            Y
          LET A = HYPERBOLIC SECANT PPCC LOCATION   Y
          LET A = HYPERBOLIC SECANT PPCC SCALE      Y
          LET A = INVERTED WEIBULL PPCC STATISTIC   Y
          LET A = INVERTED WEIBULL PPCC LOCATION    Y
          LET A = INVERTED WEIBULL PPCC SCALE       Y
          LET A = INVERTED WEIBULL PPCC SHAPE       Y
          LET A = LOGISTIC PPCC                     Y
          LET A = LOGISTIC PPCC LOCATION            Y
          LET A = LOGISTIC PPCC SCALE               Y
          LET A = LOGNORMAL PPCC STATISTIC          Y
          LET A = LOGNORMAL PPCC LOCATION           Y
          LET A = LOGNORMAL PPCC SCALE              Y
          LET A = LOGNORMAL PPCC SHAPE              Y
          LET A = MAXWELL PPCC                      Y
          LET A = MAXWELL PPCC LOCATION             Y
          LET A = MAXWELL PPCC SCALE                Y
          LET A = MAXIMUM GUMBEL PPCC               Y
          LET A = MAXIMUM GUMBEL PPCC LOCATION      Y
          LET A = MAXIMUM GUMBEL PPCC SCALE         Y
          LET A = MINIMUM GUMBEL PPCC               Y
          LET A = MINIMUM GUMBEL PPCC LOCATION      Y
          LET A = MINIMUM GUMBEL PPCC SCALE         Y
          LET A = NORMAL PPCC LOCATION              Y
          LET A = NORMAL PPCC SCALE                 Y
          LET A = RAYLEIGH PPCC                     Y
          LET A = RAYLEIGH PPCC LOCATION            Y
          LET A = RAYLEIGH PPCC SCALE               Y
          LET A = SEMICIRCULAR PPCC                 Y
          LET A = SEMICIRCULAR PPCC LOCATION        Y
          LET A = SEMICIRCULAR PPCC SCALE           Y
          LET A = SLASH PPCC                        Y
          LET A = SLASH PPCC LOCATION               Y
          LET A = SLASH PPCC SCALE                  Y
          LET A = TUKEY LAMBDA PPCC STATISTIC       Y
          LET A = TUKEY LAMBDA PPCC LOCATION        Y
          LET A = TUKEY LAMBDA PPCC SCALE           Y
          LET A = TUKEY LAMBDA PPCC SHAPE           Y
          LET A = UNIFORM PPCC LOCATION             Y
          LET A = UNIFORM PPCC SCALE                Y
          LET A = WALD PPCC STATISTIC               Y
          LET A = WALD PPCC LOCATION                Y
          LET A = WALD PPCC SCALE                   Y
          LET A = WALD PPCC SHAPE                   Y
          LET A = WEIBULL PPCC STATISTIC            Y
          LET A = WEIBULL PPCC LOCATION             Y
          LET A = WEIBULL PPCC SCALE                Y
          LET A = WEIBULL PPCC SHAPE                Y

          LET SIGMA = 
          LET A = CHI-SQUARE SD TEST                                 Y
          LET A = CHI-SQUARE SD TEST CDF                             Y
          LET A = CHI-SQUARE SD TEST PVALUE                          Y
          LET A = CHI-SQUARE SD TEST LOWER TAIL PVALUE               Y
          LET A = CHI-SQUARE SD TEST UPPER TAIL PVALUE               Y

          LET A = F TEST                                             Y1 Y2
          LET A = F TEST CDF                                         Y1 Y2
          LET A = F TEST PVALUE                                      Y1 Y2
          LET A = KLOTZ TEST                                         Y1 Y2
          LET A = KLOTZ TEST CDF                                     Y1 Y2
          LET A = KLOTZ TEST PVALUE                                  Y1 Y2
          LET A = KLOTZ TEST LOWER TAILED PVALUE                     Y1 Y2
          LET A = KLOTZ TEST UPPER TAILED PVALUE                     Y1 Y2
          LET A = KRUSKAL WALLIS TEST                                Y  X
          LET A = KRUSKAL WALLIS TEST CDF                            Y  X
          LET A = KRUSKAL WALLIS TEST PVALUE                         Y  X
          LET A = MANN WHITNEY RANK SUM TEST                         Y1 Y2
          LET A = MANN WHITNEY RANK SUM TEST CDF                     Y1 Y2
          LET A = MANN WHITNEY RANK SUM TEST PVALUE                  Y1 Y2
          LET A = MANN WHITNEY RANK SUM LOWER TAIL PVALUE            Y1 Y2
          LET A = MANN WHITNEY RANK SUM UPPER TAIL PVALUE            Y1 Y2
          LET A = MANN WHITNEY U STATISTIC                           Y1 Y2
          LET A = TWO SAMPLE CHI SQUARE TEST                         Y1 Y2
          LET A = TWO SAMPLE CHI SQUARE TEST CDF                     Y1 Y2
          LET A = TWO SAMPLE CHI SQUARE TEST PVALUE                  Y1 Y2
          LET A = TWO SAMPLE KOLMOGOROV SMIRNOV TEST                 Y1 Y2
          LET A = TWO SAMPLE KOLMOGOROV SMIRNOV TEST CRITICAL VALUE  Y1 Y2
          LET A = TWO SAMPLE SIGN TEST                               Y1 Y2
          LET A = TWO SAMPLE SIGN TEST CDF                           Y1 Y2
          LET A = TWO SAMPLE SIGN TEST PVALUE                        Y1 Y2
          LET A = TWO SAMPLE SIGN TEST LOWER TAIL PVALUE             Y1 Y2
          LET A = TWO SAMPLE SIGN TEST UPPER TAIL PVALUE             Y1 Y2
          LET A = TWO SAMPLE T TEST                                  Y1 Y2
          LET A = TWO SAMPLE T TEST CDF                              Y1 Y2
          LET A = TWO SAMPLE T TEST PVALUE                           Y1 Y2
          LET A = TWO SAMPLE T TEST LOWER TAILED PVALUE              Y1 Y2
          LET A = TWO SAMPLE T TEST UPPER TAILED PVALUE              Y1 Y2
          LET A = TWO SAMPLE WILCOXON SIGNED RANK TEST               Y1 Y2
          LET A = TWO SAMPLE WILCOXON SIGNED RANK TEST CDF           Y1 Y2
          LET A = TWO SAMPLE WILCOXON SIGNED RANK TEST PVALUE        Y1 Y2
          LET A = TWO SAMPLE WILCOXON TEST LOWER TAILED PVALUE       Y1 Y2
          LET A = TWO SAMPLE WILCOXON TEST UPPER TAILED PVALUE       Y1 Y2

          LET A = ANDERSON DARLING K SAMPLE TEST                     Y X
          LET A = ANDERSON DARLING K SAMPLE TEST CRITICAL VALUE      Y X
          LET A = MEDIAN TEST                                        Y X
          LET A = MEDIAN TEST CDF                                    Y X
          LET A = MEDIAN TEST PVALUE                                 Y X
          LET A = SQUARED RANK TEST                                  Y X
          LET A = SQUARED RANK TEST CDF                              Y X
          LET A = SQUARED RANK TEST PVALUE                           Y X
          LET A = SQUARED RANK TEST LOWER TAILED PVALUE              Y X
          LET A = SQUARED RANK TEST UPPER TAILED PVALUE              Y X

    b) The statistic LET subcommands now support matrix arguments.
       For example,

            LET A = MEAN M
            LET A = DIFFERENCE OF MEANS M N

       where M and N are matrices.  Note that the matrix will be converted
       to a variable (in a columnwise order) before applying the command.
       This means that the number of rows times the number of columns must
       be less than or equal to the maximum number of rows per variable
       (this is set to 1,000,000 on most current implementations).
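
       As a rough illustration of the columnwise conversion and the
       size check, the following Python sketch flattens a small
       matrix before computing a mean.  It is not Dataplot's
       implementation; the function names and the list-of-rows matrix
       representation are assumptions made for the example.

```python
# Illustrative sketch only; not Dataplot's implementation.
MAX_ROWS = 1_000_000  # limit quoted above for most current implementations


def matrix_to_variable(matrix, max_rows=MAX_ROWS):
    """Flatten a list-of-rows matrix into one variable, column by column."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    if n_rows * n_cols > max_rows:
        raise ValueError("rows x columns exceeds the maximum rows per variable")
    # Outer loop over columns, inner loop over rows: columnwise order.
    return [matrix[i][j] for j in range(n_cols) for i in range(n_rows)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


m = [[1.0, 4.0],
     [2.0, 5.0],
     [3.0, 6.0]]
y = matrix_to_variable(m)
print(y)        # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(mean(y))  # 3.5
```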

       Be aware that Dataplot distinguishes between "statistic" and "math"
       LET subcommands.  The statistic LET subcommands work with variables
       on the right hand side and always return a parameter (i.e., scalar)
       value.  The math LET subcommands may have a mix of parameters and
       variables on both the left and right hand sides.  At the current
       time, only those math LET subcommands that explicitly work with
       matrices support matrix arguments.  Volume II of the online Reference
       Manual provides separate chapters for the "statistic", "math", and
       "matrix" LET subcommands.  It is only those commands in the
       "statistic" chapter that are affected by this update.

    c) The following new math LET subcommands were added:

          LET X FREQ CDF = MANN WHITNEY U STATISTIC FREQUENCY N1 N2
          LET TAG = KEEP X XKEEP
          LET TAG = OMIT X XOMIT
          LET Y2 TAG = THRESHOLD MINIMUM Y TVAL
          LET Y2 TAG = THRESHOLD MAXIMUM Y TVAL

          LET Y = CUMULATIVE <STAT> Y
          LET Y = CROSS TABULATE CUMULATIVE <STAT> Y X

          LET Y = WEIBULL MOMENT ESTIMATORS X

          LET Y = PERCENTAGE RANK X
          LET Y = EXPAND XLAB XVAL

          LET Y2 = JSCORE Z ROUND

          LET Y2 = ISO 13528 ZSCORE         Y XREF SIGMA
          LET Y2 = ISO 13528 ZPRIME         Y XREF SIGMA
          LET Y2 = ISO 13528 EN SCORE       Y ULAB XREF UREF
          LET Y2 = ISO 13528 ZETA SCORE     Y ULAB XREF UREF
          LET Y2 = ISO 13528 EZMINUS SCORE  Y ULAB XREF UREF
          LET Y2 = ISO 13528 EZPLUS  SCORE  Y ULAB XREF UREF

          LET MOUT = MATRIX COMBINE COLUMNS M N
          LET MOUT = MATRIX COMBINE ROWS    M N

          LET MOUT = PARTIAL CORRELATION MATRIX  M
          LET MOUT = PARTIAL CORRELATION CDF MATRIX  M
          LET MOUT = PARTIAL CORRELATION PVALUE MATRIX  M
          LET MOUT = CORRELATION CDF MATRIX  M
          LET MOUT = CORRELATION PVALUE MATRIX  M

          LET YOUT = LOW PASS FILTER Y
          LET YOUT = HIGH PASS FILTER Y

          LET TAG = POINTS IN POLYGON XVAL YVAL XPOLY YPOLY
          LET Y2 X2 = TRANSFORM POINTS Y X TX TY SX SY THETA
          LET Y2 X2 = EXTREME POINTS Y X
          LET Y2 X2 = LINE INTERSECTIONS X1 Y1 X2 Y2 X3 Y3 X4 Y4
          LET Y2 X2 = PARALLEL LINES X1 Y1 X2 Y2 X3 Y3
          LET Y2 X2 = PERPINDICULAR LINES X1 Y1 X2 Y2 X3 Y3
          LET YINDEX = NEAREST NEIGHBOR INDEX Y1 X1
          LET YDIST  = NEAREST NEIGHBOR DISTANCE Y1 X1
          LET YINDEX = NEAREST NEIGHBOR Y1 X1
          LET YINDEX YDIST  = NEAREST NEIGHBOR Y1 X1
          LET Y3 X3 TAG3 = JOIN Y1 X1 YINDEX

          LET Y2 X2 YCODED = BINNED CODED Y

       The INTEGRAL command was updated to allow indefinite integrals
       (i.e., either the lower limit or the upper limit is infinity).
       If you specify the lower limit as CPUMIN or -INFINITY or you
       specify the upper limit as CPUMAX or INFINITY, then the
       indefinite integration code will automatically be invoked.
       You do not have to define CPUMIN/CPUMAX/INFINITY (Dataplot
       checks for the literal text, not the value of any parameter
       that may be defined with these names).
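
       The dispatch on literal limit text can be sketched as follows.
       This Python example is only an illustration under stated
       assumptions: the change of variable and the midpoint rule are
       stand-ins chosen for brevity, not the algorithm Dataplot uses,
       and the case-insensitive token match is also an assumption.

```python
import math

# Literal tokens that select the infinite-limit code path (from the text above).
LOWER_INFINITE = {"CPUMIN", "-INFINITY"}
UPPER_INFINITE = {"CPUMAX", "INFINITY"}


def midpoint(g, n=20000):
    """Midpoint rule for g on the open interval (0, 1)."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def integral(f, lower, upper):
    """Integrate f between two limit tokens such as "0" or "INFINITY".

    The tokens are compared as text, so no parameter named INFINITY
    needs to exist.  Upper-casing the token is an assumption.
    """
    lo_inf = lower.upper() in LOWER_INFINITE
    hi_inf = upper.upper() in UPPER_INFINITE

    def tail_up(a):
        # Substitute x = a + t/(1-t), dx = dt/(1-t)^2, t in (0, 1).
        return midpoint(lambda t: f(a + t / (1 - t)) / (1 - t) ** 2)

    def tail_down(b):
        # Substitute x = b - t/(1-t), dx = dt/(1-t)^2, t in (0, 1).
        return midpoint(lambda t: f(b - t / (1 - t)) / (1 - t) ** 2)

    if lo_inf and hi_inf:
        return tail_down(0.0) + tail_up(0.0)
    if hi_inf:
        return tail_up(float(lower))
    if lo_inf:
        return tail_down(float(upper))
    a, b = float(lower), float(upper)
    return (b - a) * midpoint(lambda t: f(a + (b - a) * t))


# exp(-x) on [0, infinity) and the Gaussian on the whole real line.
print(integral(lambda x: math.exp(-x), "0", "INFINITY"))
print(integral(lambda x: math.exp(-x * x), "CPUMIN", "CPUMAX"))
```

       The first call should be close to 1 and the second close to
       the square root of pi; the midpoint rule works here because
       both integrands decay quickly, and slowly decaying integrands
       would need a more careful method.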

 4) The following enhancements were made to the graphics commands.

    a) The following graphics commands were added:

           ISO 13528 PLOT Y Z ROUND LABID LAB

           ISO 13528 ZSCORE PLOT Z MATID ROUNDID
           ISO 13528 JSCORE PLOT Z MATID ROUNDID

           ISO 13528 RLP PLOT Z LABID MATID

    b) The following updates were made to the HOMOSCEDASTICITY PLOT:

          i) Added support for the MULTIPLE option.
         ii) Added support for the SUBSET (or HIGHLIGHT) option.
        iii) Allow more than one group-id variable.
         iv) Allow alternate measures for location and scale.
          v) Added support for summary data.
         vi) Added support for the "circle technique" to identify
             non-homogeneous labs.
        vii) Added support for the TO syntax.

       Enter HELP HOMOSCEDASTICITY PLOT for details.

    c) Added several options to the BLOCK PLOT.  Enter HELP BLOCK PLOT
       for details.

    d) Continued to add support for the MULTIPLE, REPLICATION, and
       HIGHLIGHT options and support for MATRIX arguments.  Specifically,

          i) Support for MULTIPLE option:

             ANOP PLOT, I PLOT, INFLUENCE CURVE, PERCENT POINT PLOT,
             RUN SEQUENCE PLOT, VIOLIN PLOT

         ii) Support for REPLICATION option:

             I PLOT, PERCENT POINT PLOT, RUN SEQUENCE PLOT, VIOLIN PLOT

        iii) Support for HIGHLIGHT option:

             BIHISTOGRAM, LAG PLOT, NORMAL PLOT, PERCENT POINT PLOT,
             QUANTILE-QUANTILE PLOT, RUN SEQUENCE PLOT, SHIFT PLOT,
             TUKEY MEAN DIFFERENCE PLOT, WEIBULL PLOT, 4-PLOT

         iv) Support for matrix arguments (not supported when the
             REPLICATION option is used):

             ANOP PLOT, BIHISTOGRAM, COMPLEX DEMODULATION, DUANE PLOT,
             I PLOT, INFLUENCE CURVE, LAG PLOT, NORMAL PLOT, 
             PARALLEL COORDINATES PLOT, PERCENT POINT PLOT,
             QUANTILE-QUANTILE PLOT, RUN SEQUENCE PLOT, SHIFT PLOT,
             STEM AND LEAF PLOT, TUKEY MEAN DIFFERENCE PLOT,
             VIOLIN PLOT, WEIBULL PLOT, 4-PLOT

             Note that matrix arguments will be converted to a variable
             in a column-wise fashion.  So the number of rows times the
             number of columns must be less than the maximum number of
             rows for a variable (this is set to 1,000,000 on most
             systems, but it may vary from this).

 5) The following updates were made to the Analysis commands.

    a) The following non-parametric tests were added:

           COX STUART TEST Y         - sign test for trend
           KLOTZ TEST Y1 Y2          - two-sample test for equal variances
           MEDIAN TEST Y X           - test for equal medians for k groups
           SQUARED RANKS TEST Y X    - test for equal variances for k groups
           FISHER TWO SAMPLE RAND    - two sample Fisher randomization test
                  TEST Y1 Y2           for equal location
           PAGE TEST Y X1 X2         - Page test for two factor ANOVA
           QUADE TEST Y X1 X2        - Quade test for two factor ANOVA

           KENDALL TAU INDEPENDENCE TEST Y1 Y2       - two sample
                                                       independence test
           RANK CORRELATION INDEPENDENCE TEST Y1 Y2  - two sample
                                                       independence test

    b) Added the commands

           PREDICTION LIMITS               - prediction limits for the
                                             mean of new observations
           PREDICTION BOUNDS               - prediction limits to cover
                                             all new observations

           SD CONFIDENCE LIMITS            - confidence limits for the
                                             standard deviation
           SD PREDICTION LIMITS            - prediction limits for the
                                             standard deviation

           CORRELATION CONFIDENCE LIMITS   - confidence limits for the
                                             correlation coefficient based on
                                             Fisher's normal approximation

    c) Added the commands

           JARQUE BERA NORMALITY TEST  - perform a Jarque-Bera test for
                                         normality

           MEAN SUCCESSIVE DIFF TEST   - perform a mean successive
                                         differences test for randomness

           BEST DISTRIBUTIONAL FIT Y   - search for best fitting distribution
                                         (univariate data, continuous
                                         distributions, no censoring)

           MCCOOL WEIBULL LOCATION TEST - test for samples of size 10 to 100
                                          to distinguish between a
                                          3-parameter and a 2-parameter
                                          Weibull distribution

    e) The REPLICATED and MULTIPLE options, support for matrix arguments
       and support for the TO syntax were added to additional analysis
       commands.  In addition, the output was reformatted for many of these
       commands.  Specifically,

          i) Support for MULTIPLE option:

             ABASIS, ANDERSON-DARLING K-SAMPLE, BARTLETT TEST, BBASIS,
             CAPABILITY ANALYSIS, CHI-SQUARE SD TEST, CUMULATIVE SUM,
             F LOCATION TEST, F TEST, FREQUENCY TEST, GOODNESS OF FIT,
             KOLM SMIR TWO SAMPLE TEST, KRUSKAL WALLIS, LEVENE TEST,
             LJUNG BOX TEST, MANN WHITNEY RANK SUM TEST, RUNS, SIGN TEST,
             SUMMARY, T TEST, TOLERANCE LIMITS, TWO SAMPLE CHI-SQUARE,
             VAN DER WAERDEN, WILCOXON SIGNED RANK TEST, WILK-SHAPIRO

             The interpretation of the MULTIPLE option depends on the
             data expected.

             a) When a single response variable is expected (e.g., the
                SUMMARY command), the MULTIPLE option means the test will
                be applied to each response variable independently.

             b) When two variables are expected where the first variable
                is the response variable and the second variable is a
                group-id variable (e.g., the KRUSKAL WALLIS TEST), the
                MULTIPLE option means that each variable is treated as
                a distinct group, no group-id variable is entered, and a
                single test is performed.

             c) When two response variables are expected (e.g., the
                F TEST), the MULTIPLE option will perform the test on all
                the pairwise combinations of response variables.  That is

                    F TEST Y1 TO Y4

                is equivalent to entering

                    F TEST Y1 Y2
                    F TEST Y1 Y3
                    F TEST Y1 Y4
                    F TEST Y2 Y3
                    F TEST Y2 Y4
                    F TEST Y3 Y4

         ii) Support for REPLICATION option:

             ABASIS, BBASIS, CAPABILITY ANALYSIS, CUMULATIVE SUM,
             FREQUENCY TEST, GOODNESS OF FIT, LJUNG-BOX, RUNS, SUMMARY,
             TOLERANCE LIMITS, WILK-SHAPIRO

        iii) Support for matrix arguments:

             All of the commands listed above for the MULTIPLE option
             now support matrix arguments.  The following additional
             commands also support matrix arguments

             BINOMIAL PROPORTION TEST,
             DIFFERENCE OF PROPORTION CONFIDENCE LIMITS,
             PROPORTION CONFIDENCE LIMITS

             Matrix arguments are not supported when the REPLICATION
             option is used.

         iv) Support for RTF formatted output has been extended to
             all cases where previously only HTML/LATEX formatted
             output was supported.  The HTML/LATEX/RTF support was
             extended to a number of additional commands.

          v) Many of the commands have reformatted the output for better
             clarity and readability.  These are not listed individually.
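
    The pairwise expansion described above for the MULTIPLE option
    with two response variables (the F TEST Y1 TO Y4 example) can be
    illustrated outside Dataplot.  The following Python fragment is
    only a sketch, not Dataplot code: the function name is
    hypothetical, and it assumes that Y1 TO Y4 expands to
    Y1 Y2 Y3 Y4 as described for the TO syntax.

```python
from itertools import combinations


def pairwise_commands(command, variables):
    """Return one command string per unordered pair of variables."""
    return [f"{command} {a} {b}" for a, b in combinations(variables, 2)]


# Y1 TO Y4 is assumed to expand to Y1 Y2 Y3 Y4
variables = [f"Y{i}" for i in range(1, 5)]
commands = pairwise_commands("F TEST", variables)
for line in commands:
    print(line)
```

    Four response variables give 4*3/2 = 6 pairs, in the same order
    as the six F TEST commands listed above.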

 6) The following miscellaneous commands were added.

    a) Added the command PWD to retrieve the current working
       directory.

    b) Added the command PSVIEW.  This command will preview the
       current plot (i.e., the dppl2f.dat file) with a Postscript
       viewer.  You can specify the program to use as the Postscript
       viewer with the command

           SET POSTSCRIPT VIEWER  /usr/bin/evince

       The default is Ghostview for both Windows and Linux/Unix.

    c) Added the following option to the CAPTURE command:

           CAPTURE SCRIPT <filename>

       This option saves the subsequent commands to a file
       without executing them.  The intended purpose of this
       is to allow scripts for external programs (e.g., Python,
       Perl, and so on) to be created within a Dataplot
       macro.  You can subsequently use the SYSTEM command to
       execute the script.

 7) The following colors were added:

      R0 - R255  - turns on RED with an intensity level from 0
                   to 255
      Z0 - Z255  - turns on GREEN with an intensity level from 0
                   to 255
      B0 - B255  - turns on BLUE with an intensity level from 0
                   to 255

    Note the "Z" was used for GREEN because Gxxx is already used
    for gray scale colors.

    For devices that don't support full RGB colors, these will
    be mapped to RED, GREEN, and BLUE.
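
    As an illustration of the naming scheme only, the following
    Python sketch decodes a color name such as Z128 into a
    red/green/blue triple.  It is not Dataplot code: the function
    name and the choice to return None for unrecognized names are
    assumptions made for this example.

```python
import re

# Channel letters as described above: Z (not G) selects green.
CHANNELS = {"R": 0, "Z": 1, "B": 2}


def parse_color(name):
    """Map a name such as 'Z128' to an (r, g, b) tuple, or None if invalid."""
    match = re.fullmatch(r"([RZB])(\d{1,3})", name.upper())
    if match is None:
        return None
    level = int(match.group(2))
    if level > 255:
        return None  # intensity levels run from 0 to 255
    rgb = [0, 0, 0]
    rgb[CHANNELS[match.group(1)]] = level
    return tuple(rgb)


print(parse_color("R255"), parse_color("Z128"), parse_color("B0"))
```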

 8) A number of bugs have been fixed.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT
August 2009 - November    2010.
-----------------------------------------------------------------------

 1) The following enhancements were made to the graphics commands.

    a) Several features are being developed for general implementation
       for the graphics commands.  These will be phased in over
       the next several releases.  Implementation of each of these
       features will be considered for each of the graphics commands
       on a case by case basis.

          i) REPLICATION - for this option, one or more group-id
             variables can be specified.  These group-id variables
             are cross tabulated and the plot is generated for
             each combination of the cross tabulated values.

             The LINE and CHARACTERS commands (and associated attribute
             setting commands) can be used to distinguish the
             various curves.

         ii) MULTIPLE - many Dataplot commands expect syntax like

                 BOX PLOT Y X

             where Y is the response variable and X is a group-id
             variable.

             In many cases, the groups may be in separate
             columns.  The MULTIPLE option will support the
             following syntax

                 MULTIPLE BOX PLOT Y1 Y2 Y3 Y4

             Although you can use the LET ... = STACK command to
             put the data in the Y X form, the MULTIPLE option
             makes that step unnecessary.

             For commands that accept a single response (e.g.,
             BOX COX NORMALITY PLOT), the MULTIPLE option can be
             used to overlay several curves on the same plot.

             The MULTIPLE option cannot be used with the
             REPLICATION option.

        iii) SUBSET (highlighting) - for many plots, it may be
             useful to highlight certain points.  The SUBSET option
             typically specifies a group variable.  Based on this
             group variable, you can use the LINE and CHARACTER
             (and related) commands to highlight certain points.
             For example, the highlighted points might be drawn
             in a different color.

             Although this command is similar to REPLICATION, it
             is different.  For example, if you use REPLICATION to
             define two groups for a normal probability plot,
             two distinct probability plots are generated.  On the
             other hand, if you use SUBSET to define two groups for
             the normal probability plot, there is only one probability
             plot generated.  However, the two groups can be plotted
             with different attributes.

         iv) TO syntax - several Dataplot commands support a TO syntax.
             For example, READ X1 TO X4 is equivalent to
             READ X1 X2 X3 X4.  This syntax will be extended to more
             commands.

          v) Dataplot supports a matrix type.  Previously, matrix
             arguments were restricted to commands that specifically
             operated on matrices.  Commands that expect a
             univariate argument or that support the MULTIPLE 
             option are good candidates for adding support for
             matrix arguments.  Matrix arguments will not be
             supported for the REPLICATION option or for the case
             where multiple response variables are expected.
             Note that a matrix argument will be treated as a
             variable argument.  For example, NORMAL PROBABILITY PLOT M
             (where M is the name of a matrix) will generate a single 
             probability plot for all values in the matrix.

    b) Added the command

          TABULATION  PLOT Y X1 ... X4  YLEVEL

       This plot is a bit of a mix between a fluctuation plot
       and a contour plot.  Enter HELP TABULATION PLOT for
       details.

       Some enhancements were made to the FLUCTUATION PLOT
       as well.  Enter HELP FLUCTUATION PLOT for details.

    c) Added a "BATCH MULTIPLE" option for the STRIP PLOT
       command.  Enter HELP STRIP PLOT for details.

    d) Made several changes to the HISTOGRAM command.

         i) By default, Dataplot sets the lower and upper class limits
            to xbar -/+ 6*s (with xbar and s denoting the sample mean
            and standard deviation), respectively.  This can
            occasionally result in a few outlying points being excluded
            from the histogram.  To adjust the lower and upper class
            limits so that these outlying points are included, enter the
            command

                 SET HISTOGRAM OUTLIERS ON

            To revert to the default, enter

                 SET HISTOGRAM OUTLIERS OFF

        ii) By default, the histogram draws all cells, even those with
            zero frequency.  To suppress these zero frequency cells,
            enter

                  SET HISTOGRAM EMPTY BINS OFF

            To restore the default, enter

                  SET HISTOGRAM EMPTY BINS ON

       iii) Previously, Dataplot only generated histograms for the case
            where the bin widths were equal.  This has been extended
            to the case with unequal bin widths.  The syntax is

                HISTOGRAM Y XLOW XHIGH

            with XLOW containing the values for the lower bin limit
            and XHIGH containing the values for the upper bin limit.

        iv) Added the following option

                SUBSET HISTOGRAM Y X

            In this case, X is a group-id variable.  This syntax
            can be used to highlight the contribution to the
            histogram for particular subsets of the data.

         v) Fixed a bug in the CUMULATIVE RELATIVE HISTOGRAM
            for the AREA case.  If SET RELATIVE AREA HISTOGRAM
            is set to AREA (the default), relative histograms
            are normalized so that the area is equal to 1 and
            if it is set to PERCENT the sum of the bar heights is
            equal to 1.  The PERCENT case did not have a bug.

       For details, enter HELP HISTOGRAM.
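
       To see why the default class limits of xbar -/+ 6*s can exclude
       points, the following Python sketch computes those limits for a
       made-up sample.  It is an illustration only: the data are
       invented, the function name is hypothetical, and s is assumed to
       be the usual sample standard deviation (n-1 divisor).

```python
import statistics


def default_class_limits(y):
    """Return (lower, upper) = xbar -/+ 6*s for the sample y."""
    xbar = statistics.mean(y)
    s = statistics.stdev(y)  # sample standard deviation (assumed n-1 divisor)
    return xbar - 6 * s, xbar + 6 * s


# 200 tightly clustered values plus one extreme point (invented data)
y = [10.0 + 0.1 * (i % 5) for i in range(200)] + [25.0]
lower, upper = default_class_limits(y)
excluded = [v for v in y if v < lower or v > upper]
print(lower, upper, excluded)
```

       In this sample the single extreme value lies above the upper
       limit, which is the situation SET HISTOGRAM OUTLIERS ON is meant
       to address.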

    e) Made several enhancements to the FREQUENCY PLOT and
       KERNEL DENSITY commands.

          i) The SET HISTOGRAM OUTLIERS option applies to the
             FREQUENCY PLOT.

         ii) As with the HISTOGRAM, non-equispaced bins are
             supported for the FREQUENCY PLOT:

                 FREQUENCY PLOT Y XLOW XHIGH

        iii) The REPLICATED and MULTIPLE options were added
             to these commands.  For the REPLICATED case, either
             one or two replication variables can be specified.
             Support was also added for matrix arguments and
             for the TO syntax.

    f) Made several enhancements to the BOX PLOT command.

       Support was added for the MULTIPLE and REPLICATED
       options.  Up to six replication variables can be
       specified.  The word REPLICATION is optional.

       For the REPLICATED case, you can control the spacing between
       groups.  Internally, Dataplot uses the CODE CROSS TABULATE
       command to generate a single combined group-id variable.  Enter
       HELP CODE CROSS TABULATE for details on how to control the
       spacing (the SET commands used by CODE CROSS TABULATE are
       supported for the BOX PLOT command).

       Support was added for matrix arguments for the MULTIPLE case or
       for the case where only a single argument is given.

    g) Made several enhancements to the

           BOX COX NORMALITY PLOT Y
           BOX COX HOMOGENEITY PLOT Y
           BOX COX LINEARITY PLOT Y X

       commands.

       The REPLICATED option is supported for all 3 plots.  Either one or
       two replication variables can be specified.  The MULTIPLE option is
       supported for the BOX COX NORMALITY PLOT.

       The BOX COX NORMALITY plot supports matrix arguments for the MULTIPLE
       case or for the case where only a single argument is given.  The
       TO syntax is supported for all of these commands.

    h) For the PROBABILITY PLOT, added support for the MULTIPLE and
       REPLICATED (for up to 2 replication variables) options,
       support for matrix arguments, and support for the
       HIGHLIGHT option.

       In addition, you can enter the commands

           LET PPLOC   = 
           LET PPSCALE = 

       before entering the PROBABILITY PLOT command.  This adds location
       and scale parameters to the theoretical distribution.  This is
       intended for the case where the distribution parameters are
       estimated by a non-PPCC method (e.g., maximum likelihood) and
       you want to generate the probability plot using the estimated
       parameters.

    i) For the PPCC PLOT, added support for the MULTIPLE and
       REPLICATED (for up to 2 replication variables) options
       and support for matrix arguments.

    j) The BOOTSTRAP PLOT and JACKNIFE PLOT commands were updated
       to include reports in addition to the plots.

        i) If the BOOTSTRAP/JACKNIFE plot is applied to a statistic
           (e.g., BOOTSTRAP MEAN PLOT Y), the following tables are
           generated:

           1) An initial summary table.

           2) A table containing percent points for the computed
              statistic.

           3) A table containing percentile confidence limits for
              the statistic for various values of alpha.

       ii) If the BOOTSTRAP/JACKNIFE plot is applied to a
           distributional fit (e.g., BOOTSTRAP WEIBULL PPCC PLOT Y
           or BOOTSTRAP WEIBULL MLE PLOT Y), the following tables are
           generated:

           1) An initial summary table.

           2) A table containing percentile confidence limits for
              each of the parameters of the distribution for various
              levels of alpha.

           3) If the SET MAXIMUM LIKELIHOOD PERCENTILES command was
              given, a table containing confidence limits for the
              specified percentiles will be generated.

       For both cases, the SET WRITE DECIMALS command can be
       used to specify the number of decimals to use in the
       tables and the CAPTURE HTML, CAPTURE LATEX, and
       CAPTURE RTF options are supported.
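
       The percentile confidence limits mentioned in the bootstrap
       tables above can be sketched generically.  The Python fragment
       below shows the standard bootstrap percentile method applied to
       the mean.  It is not Dataplot's implementation: the data, the
       number of resamples, the seed, and the way the percentile
       indices are chosen are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_percentile_limits(y, alpha=0.05, nboot=2000, seed=12345):
    """Percentile confidence limits for the mean from nboot resamples."""
    rng = random.Random(seed)
    n = len(y)
    # Resample with replacement and record the statistic each time.
    means = sorted(
        statistics.mean(rng.choices(y, k=n)) for _ in range(nboot)
    )
    lo_index = int((alpha / 2) * nboot)
    hi_index = int((1 - alpha / 2) * nboot) - 1
    return means[lo_index], means[hi_index]


y = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]  # invented data
lower, upper = bootstrap_percentile_limits(y)
print(lower, upper)
```

       Repeating the call with several values of alpha gives a table of
       limits of the kind described above.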

 2) Added or enhanced the following analysis commands:

    a) Similar to the graphics commands, the REPLICATED and MULTIPLE
       options and support for matrix arguments will be added to the
       analysis commands over the next several releases on a case by
       case basis.

    b) The output for the GRUBBS command was modified.

       Support was added for the MULTIPLE and REPLICATED options.
       Up to six replication variables can be specified.

       In addition, the following capability was added:

           GRUBB TEST Y LABID

       The LABID variable is used for identification purposes
       only in the output.

    c) In addition, the following new outlier tests were added:

           DIXON Y
           TIETJEN MOORE Y
           EXTREME STUDENTIZED DEVIATE Y

       The Dixon test is a small sample test for a single outlier.
       The Tietjen-Moore test is a generalization of the Grubbs
       test to the case of more than one outlier where the number
       of outliers must be specified exactly.  The extreme studentized
       deviate test is a generalization of the Grubbs test to the
       case of more than one outlier where only the upper bound on
       the number of outliers must be specified.

       These commands support the MULTIPLE and REPLICATED options
       in a similar manner as the GRUBBS command.

    d) For the following commands

           CONFIDENCE LIMITS
           BIWEIGHT CONFIDENCE LIMITS
           TRIMMED MEAN CONFIDENCE LIMITS
           MEDIAN CONFIDENCE LIMITS
           QUANTILE CONFIDENCE LIMITS
           DIFFERENCE OF MEANS CONFIDENCE LIMITS

       added support for the MULTIPLE and REPLICATION options.

       In addition, matrix arguments are now supported (except for
       the REPLICATION case).  The MULTIPLE option is not supported
       for the DIFFERENCE OF MEANS CONFIDENCE LIMITS case.
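
    For readers unfamiliar with the Grubbs test that the new outlier
    tests generalize, the following Python sketch computes the
    two-sided Grubbs statistic G = max|x(i) - xbar| / s and the
    position of the most extreme point.  It is an illustration only
    and not Dataplot's implementation: the data are invented, the
    function name is hypothetical, and the critical value step is
    omitted.

```python
import statistics


def grubbs_statistic(y):
    """Return (G, index) where index marks the point farthest from the mean."""
    xbar = statistics.mean(y)
    s = statistics.stdev(y)  # sample standard deviation
    deviations = [abs(v - xbar) for v in y]
    index = max(range(len(y)), key=deviations.__getitem__)
    return deviations[index] / s, index


y = [9.8, 10.1, 10.0, 9.9, 10.2, 14.5]  # invented data
g, suspect = grubbs_statistic(y)
print(g, suspect)
```

    G can never exceed (n-1)/sqrt(n), so for very small samples the
    statistic has little room to signal an outlier.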

 3) Added the following miscellaneous commands:

    a) CPU TIME - this command prints the current CPU time
       used by the current Dataplot session.

    b) The CHARACTER command now accepts up to 16 characters for
       the plot symbol.  The previous limit was 4 characters.

       This capability is most useful for the case where the
       CHARACTER command is used to label specific points.  In
       particular, it can be useful for the CHARACTER AUTOMATIC
       command.

    c) The command

           CALL filename.dp

       can now also be run by entering

           filename.dp

       That is, the CALL is optional.

 4) Significant restructuring is being performed for the
    goodness of fit and goodness of fit plots for probability
    distributions.  Much of this change is to reduce duplicate code,
    to make various goodness of fit commands more consistent,
    and to make some planned future updates easier to implement.

    Although much of this change is primarily internal and should
    be transparent to users, the following updates were made.

    a) The Anderson-Darling option was added as an alternative
       to the PPCC PLOT and KS PLOT:

            ANDERSON DARLING PLOT Y

       The Anderson-Darling is currently supported for ungrouped
       and uncensored data (i.e., the same as the KS test).

    b) The Anderson-Darling syntax was changed to

            ANDERSON DARLING GOODNESS OF FIT Y

    c) The output format for the Anderson-Darling, Kolmogorov-Smirnov,
       and chi-square goodness of fit was modified.

    d) The following command was added

            PPCC GOODNESS OF FIT Y

       This is currently supported for the raw data case (i.e.,
       ungrouped data) without censoring.  Currently, distributions
       with more than one shape parameter are not supported.

    e) The Anderson-Darling, Kolmogorov-Smirnov, and PPCC
       goodness of fit commands were updated to generate appropriate
       critical values dynamically via Monte Carlo simulation.  A
       few comments on this.

          i) There are 2 distinct cases.  In the first case, we
             assume the distribution parameters are known.  This
             is referred to as the "fully specified" case.  In this
             case, the simulations are performed using the specified
             parameters.

             In the second case, the distribution parameters are
             estimated from the data.  For this case, the 
             distribution parameters are estimated for each
             Monte Carlo sample using maximum likelihood (for the
             PPCC case, the PPCC plot is used instead of maximum
             likelihood).
  
             The following command is used to specify which method
             is used for the Monte Carlo simulations

                SET GOODNESS OF FIT 

         ii) Although the second case (i.e., estimate the parameters
             from the data) is the more realistic case, Dataplot
             does not support this for all distributions.  For the
             K-S and Anderson-Darling cases, a maximum likelihood
             estimation needs to be performed for each set of
             simulated values.  So if Dataplot supports maximum
             likelihood estimation for the specified distribution,
             then the "ESTIMATE" option is likely to be supported.
             The PPCC option is limited to distributions where
             there is at most one shape parameter.

        iii) Published tables are available for a number of
             distributions for the Anderson-Darling test and for
             the fully specified case of the Kolmogorov-Smirnov
             test.  The advantage of using the published tables
             is speed since the simulation step does not need to
             be performed.  Simulation allows the Anderson-Darling
             and Kolmogorov-Smirnov critical values to be generated
             for cases where published tables are not available and
             also permits p-values to be returned.  Note that the
             critical values returned by Dataplot simulations may
             differ slightly from the published values due to a
             different random number generator being used.

             To specify whether "tabled" critical values or
             simulated critical values will be used, enter

                 SET KOLMOGOROV SMIRNOV CRITICAL VALUE 
                 SET ANDERSON DARLING CRITICAL VALUE 
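
       The idea of generating critical values by Monte Carlo simulation
       for the fully specified case can be sketched as follows.  This
       Python fragment is an illustration only and is not Dataplot's
       algorithm: the number of simulations, the seed, and the function
       names are assumptions.  It uses the uniform(0,1) distribution
       because, for a fully specified continuous distribution, applying
       the hypothesized CDF to the data reduces the problem to the
       uniform case.

```python
import random


def ks_statistic_uniform(sample):
    """KS distance between the sample ECDF and the uniform(0,1) CDF."""
    u = sorted(sample)
    n = len(u)
    return max(
        max((i + 1) / n - x, x - i / n) for i, x in enumerate(u)
    )


def simulated_critical_value(n, alpha=0.05, nsim=2000, seed=2010):
    """Estimate the upper-alpha critical value of the KS statistic."""
    rng = random.Random(seed)
    stats = sorted(
        ks_statistic_uniform([rng.random() for _ in range(n)])
        for _ in range(nsim)
    )
    return stats[int((1 - alpha) * nsim)]


cv = simulated_critical_value(20)
print(cv)
```

       A p-value can be estimated from the same simulated statistics as
       the fraction that exceed the observed value.  As noted above,
       simulated critical values will differ slightly from published
       tables.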

    f) The maximum likelihood estimation routines for the parameters
       of a distribution are being reviewed and updated.  Some of the
       change is cosmetic (i.e., more consistent appearance), but in
       some cases the fitting algorithms are being improved.  We are
       also reviewing the computational algorithms for some of the
       probability distributions.

    g) Made the following enhancements to the CONSENSUS MEANS command.

          i) Added Horn-Horn-Duncan and MINMAX estimates for the
             standard error to the DerSimonian Laird estimate.

         ii) Many of the methods require that the standard deviations
             be positive for all labs.  Zero standard deviations can
             result when a lab has a single observation or when all
             observations are equal.  Previously, Dataplot omitted
             all labs with zero standard deviations from the analysis
             (they were included in the initial summary table).
             However, some of the methods (specifically, the grand
             mean, mean of means, BOB, Bayesian Consensus Mean, and
             generalized confidence interval methods) can handle
             labs with zero standard deviations.

        iii) The following consensus mean statistics can be computed

               LET A = SUMMARY DERSIMONIAN LAIRD MEAN SD N
               LET A = SUMMARY DERSIMONIAN LAIRD STANDARD ERROR MEAN SD N
               LET A = SUMMARY DERSIMONIAN LAIRD HHD MEAN SD N
               LET A = SUMMARY DERSIMONIAN LAIRD MINMAX MEAN SD N
               LET A = SUMMARY MANDEL PAULE MEAN SD N
               LET A = SUMMARY MANDEL PAULE STANDARD ERROR MEAN SD N
               LET A = SUMMARY MODIFIED MANDEL PAULE MEAN SD N
               LET A = SUMMARY MODIFIED MANDEL PAULE STANDARD ERROR MEAN SD N
               LET A = SUMMARY VANGEL RUKHIN MEAN SD N
               LET A = SUMMARY VANGEL RUKHIN STANDARD ERROR MEAN SD N
               LET A = SUMMARY GENERALIZED CONFIDENCE INTERVAL MEAN SD N
               LET A = SUMMARY GENERALIZED CONFIDENCE INTERVAL STANDARD ERROR MEAN SD N
               LET A = SUMMARY BOB MEAN SD N
               LET A = SUMMARY BOB STANDARD ERROR MEAN SD N
               LET A = SUMMARY BCP MEAN SD N
               LET A = SUMMARY BCP STANDARD ERROR MEAN SD N
               LET A = SUMMARY MEAN OF MEANS MEAN SD N
               LET A = SUMMARY MEAN OF MEANS STANDARD ERROR MEAN SD N
               LET A = SUMMARY FAIRWEATHER MEAN SD N
               LET A = SUMMARY FAIRWEATHER STANDARD ERROR MEAN SD N
               LET A = SUMMARY SCHILLER-EBERHARDT MEAN SD N
               LET A = SUMMARY SCHILLER-EBERHARDT STANDARD ERROR MEAN SD N
               LET A = SUMMARY GRAYBILL DEAL MEAN SD N
               LET A = SUMMARY GRAYBILL DEAL SINHA STANDARD ERROR MEAN SD N
               LET A = SUMMARY GRAYBILL DEAL NAIVE STANDARD ERROR MEAN SD N
               LET A = SUMMARY GRAYBILL DEAL ZHANG ONE STANDARD ERROR MEAN SD N
               LET A = SUMMARY GRAYBILL DEAL ZHANG TWO STANDARD ERROR MEAN SD N

               LET A = DERSIMONIAN LAIRD Y X
               LET A = DERSIMONIAN LAIRD STANDARD ERROR Y X
               LET A = DERSIMONIAN LAIRD HHD Y X
               LET A = DERSIMONIAN LAIRD MINMAX Y X
               LET A = MANDEL PAULE Y X
               LET A = MANDEL PAULE STANDARD ERROR Y X
               LET A = MODIFIED MANDEL PAULE Y X
               LET A = MODIFIED MANDEL PAULE STANDARD ERROR Y X
               LET A = VANGEL RUKHIN Y X
               LET A = VANGEL RUKHIN STANDARD ERROR Y X
               LET A = GENERALIZED CONFIDENCE INTERVAL Y X
               LET A = GENERALIZED CONFIDENCE INTERVAL STANDARD ERROR Y X
               LET A = BOB Y X
               LET A = BOB STANDARD ERROR Y X
               LET A = BCP Y X
               LET A = BCP STANDARD ERROR Y X
               LET A = MEAN OF MEANS Y X
               LET A = MEAN OF MEANS STANDARD ERROR Y X
               LET A = FAIRWEATHER Y X
               LET A = FAIRWEATHER STANDARD ERROR Y X
               LET A = SCHILLER-EBERHARDT Y X
               LET A = SCHILLER-EBERHARDT STANDARD ERROR Y X
               LET A = GRAYBILL DEAL Y X
               LET A = GRAYBILL DEAL SINHA STANDARD ERROR Y X
               LET A = GRAYBILL DEAL NAIVE STANDARD ERROR Y X
               LET A = GRAYBILL DEAL ZHANG ONE STANDARD ERROR Y X
               LET A = GRAYBILL DEAL ZHANG TWO STANDARD ERROR Y X

             The SUMMARY version of these commands uses the summary
             statistics (means, standard deviations, sample size)
             while the other cases expect a response variable and a
             lab-id variable.

             These statistics can also be used with the various
             commands that support more than one response variable.
             Enter HELP STATISTICS for details.

 5) The following statistics are now supported

        LET A = LOCATION Y
        LET A = SCALE Y
        LET A = DIFFERENCE OF LOCATION Y
        LET A = DIFFERENCE OF SCALE Y
        LET A = TIETJEN MOORE TEST Y
        LET A = TIETJEN MOORE MINIMUM TEST Y
        LET A = TIETJEN MOORE MAXIMUM TEST Y
        LET A = DIXON TEST Y
        LET A = DIXON MINIMUM TEST Y
        LET A = DIXON MAXIMUM TEST Y
        LET A = EXTREME STUDENTIZED DEVIATE TEST Y
        LET A = BINOMIAL RATIO NSUCC NTRIAL
        LET A = ROOT MEAN SQUARE ERROR Y
        LET A = DIFFERENCE OF ROOT MEAN SQUARE ERROR Y

 6) The following LET sub-commands were added:

        LET LOWLIM UPPLIM = BINOMIAL RATIO CONFIDENCE LIMITS P1 N1 P2 N2 ALPHA
        LET XCODE = CODE CROSS TABULATE X1 X2
        LET XCODE = CODE CROSS TABULATE X1 X2 X3
        LET XCODE = CODE CROSS TABULATE X1 X2 X3 X4
        LET XCODE = CODE CROSS TABULATE X1 X2 X3 X4 X5
        LET XCODE = CODE CROSS TABULATE X1 X2 X3 X4 X5 X6
        LET A0 A0SD A1 A1SD = MATRIX FIT M X
        LET Y = RANK INDEX X
        LET Y = COMBINE X1 ... XK
        LET M = VARIABLE TO MATRIX M NROW
        LET S = STRING WORD IWORD
        LET NWORD = NUMBER OF WORDS S
        LET YOUT = MOVING Y1 ... YK

 7) The SEQUENCE command was updated to allow variables for the
    arguments on the right hand side of the equal sign (previously
    these were restricted to parameters or constants).

    For example,

        LET REP = DATA 3 3 2 3
        LET Y = SEQUENCE 1 REP 1 5

    would generate

        1 1 1 2 2 2 3 3 4 4 4 5 5 5

    You can combine any mixture of constants, parameters, or
    variables for these arguments.  However, if more than one is a
    variable, all of these variables must be of the same length.

    In addition, the following syntax is now supported

        LET YVAL = DATA 1 2 3
        LET REP = DATA 3 2 5
        LET Y = SEQUENCE YVAL REP

    would generate

        1 1 1 2 2 3 3 3 3 3

 8) The Windows version was upgraded to use version 11 of the Intel
    compiler (previously version 9 was used).  A number of Fortran 77
    constructs are no longer supported by this version of the
    compiler.  A large number of coding changes were made to make the
    source code compatible with version 11.  These should be
    transparent (i.e., no change in how commands are used, although a
    number of potential bugs were corrected) to users.

 9) The following bugs were fixed:

    a) Fixed MOVE RELATIVE command.

    b) The standard deviation for the location parameter from
       linear fits (FIT Y X, QUADRATIC FIT Y X, etc.) was corrected.

    c) Corrected a bug when using the SET CONVERT CHARACTER command.

    d) Corrected the MEDIAN CONFIDENCE LIMITS for the Maritz-Jarrett
       method.

    e) A number of other miscellaneous bug fixes were made.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT
March 2008 - July 2009.
-----------------------------------------------------------------------

 1) For Unix/Linux platforms and the gfortran compiler, added the
    command

        SET PROMPT ADVANCE

    This controls whether the Dataplot prompt appears as (for the
    OFF case)

        > enter new command here

    or (for the ON case)

        >enter new command here

    Although the ON case is preferred when running the command line
    version, it causes problems when running the GUI version with
    Tcl/Tk only (i.e., not using Expect).  For this reason, OFF is
    the default.

    If you typically only run the command line version, you may want
    to add a SET PROMPT ADVANCE ON command to your dplogf.tex file
    (although you will need to comment this line out if you want to
    run the GUI).  Alternatively, you can enter the command manually
    when you initiate Dataplot.

 2) The following enhancements were made to the READ/WRITE commands.

    a) When reading numeric fields, Dataplot will check for the
       string

           NaN

       The NaN string is used to denote "not a number" on some
       systems.  If Dataplot encounters this string, it will insert
       the missing value.  This can be set with the command (the
       default is 0)

           SET READ MISSING VALUE <value>

       The search for NaN is not case sensitive.

    b) Many programs will have a specific alpha string to denote a
       missing value.  You can set a character string that denotes
       a missing value with the command

           SET DATA MISSING VALUE <value>

       Currently, a maximum of 4 characters is allowed for the
       missing value string.

       When reading numeric fields, Dataplot will check for the
       missing value string (if specified).  If Dataplot encounters
       this string, it will insert the missing value.  This can be
       set with the command (the default is 0)

           SET READ MISSING VALUE <value>

       The search for the missing value string is not case
       sensitive.

    c) The maximum number of characters allowed for a single
       command line was increased to 255.  This applies to both
       reading commands from the terminal and reading commands from
       a file.

       In addition, you can now enter multiple continuation lines
       (previously, only a single continuation line was allowed).
       However, the combined command still has a maximum limit of
       255 characters.

    d) On Linux/Unix platforms, support has been added for the GNU
       readline library.  The readline library allows command line
       editing and history recall.  To use the readline capability,
       enter the command

           SET READ LINE ON

       Note that this capability only applies to commands that are
       entered from the terminal.

       The editing/history capabilities supported by readline are
       documented at the readline web site:

           http://tiswww.case.edu/php/chet/readline/rltop.html

       Dataplot requires version 6.x of the readline library.  This
       is the current production version, although many systems may
       still be using version 5.2.

    e) Added the commands

           READ MATRIX TO VARIABLES FILE.DAT Z ROWID COLID

       In many cases

       Whether you read a matrix as a matrix or as

    f) Added the commands

           READ STACKED VARIABLES FILE.DAT Z GROUPID

    g) Added the commands

           READ IMAGE TO VARIABLES FILE.DAT RED GREEN BLUE ROWID COLID
           READ IMAGE TO VARIABLES FILE.DAT GREY ROWID COLID
           READ IMAGE FILE.DAT GREY
           READ IMAGE RED FILE.DAT RED
           READ IMAGE GREEN FILE.DAT GREEN
           READ IMAGE BLUE FILE.DAT BLUE

       These commands allow you to read image data into Dataplot.
       The GD library is used to read the images in the following
       formats:

           1) jpeg
           2) png
           3) gif

       Note: This update is not currently available for the Windows
       implementation.

    h) The WRITE RTF command will print variables in an RTF table
       (support was previously added for WRITE LATEX and WRITE
       HTML).

       Note that HTML format is restricted to 15 variables or less
       and LATEX and RTF format is restricted to 7 variables or
       less.

    i) Added the command

           TABLE WIDTH <totwidth> <nright>

       where <totwidth> and <nright> are variables that specify the
       total width of the field and the number of digits to the
       right of the decimal point.  Row one applies to variable one,
       row two applies to variable two, and so on.

       This is an alternative to the SET WRITE DECIMALS and
       SET WRITE FORMAT commands.  The SET WRITE DECIMALS command
       requires that all variables be printed with the same format.
       Although the SET WRITE FORMAT allows more flexibility, it
       cannot be used for WRITE <RTF/LATEX/HTML>.

       Up to 200 rows can be specified (if the number of variables
       being printed is greater than 200, it is recommended that you
       use the SET WRITE FORMAT command).

       A few comments on what can be specified for <totwidth> and
       <nright>.  If NTOT and NRIGHT are the values for a given row,
       then the following apply:

           1) A value of -99 indicates that the default value should
              be used (this is 15 for NTOT and 7 for NRIGHT).

           2) If NRIGHT is a positive integer, then Fortran F format
              will be used (e.g., "3.26").

           3) If NRIGHT is 0, then an integer format will be used.

           4) If NRIGHT is -2, then a G15.7 format will be used.  In
              this case, the Fortran compiler will decide between an
              F-based format or an E-based format depending on the
              particular number.

              If NRIGHT is between -3 and -20, then a Fortran E-based
              format (Eyy.xx where the absolute value of NRIGHT
              specifies the "xx") will be used.

 3) Added the following graphics commands.

    a) Added the command

           DISCRETE CONTOUR PLOT Z ROW COL Z0

    b) Added the commands

           IMAGE PLOT GREY ROWID COLID
           IMAGE PLOT RED GREEN BLUE ROWID COLID

       This command allows you to render images.  The ability to
       support image plots is dependent on the capabilities of the
       specific graphics device and is currently supported on the
       following devices:

           1) Quickwin - i.e., the command line version of Dataplot
              under Microsoft Windows

           2) X11 - currently, only grey scale is supported.

           3) GD - the GD device is used to generate jpeg, PNG, and
              gif format files

           4) Postscript

    c) Added the command

           FLUCTUATION <stat> PLOT Y X1 X2
           FLUCTUATION <stat> PLOT Y X1 X2 X3
           FLUCTUATION <stat> PLOT Y X1 X2 X3 X4
           FLUCTUATION <stat> PLOT Y X1 X2 X3 X4 X5
           FLUCTUATION <stat> PLOT Y X1 X2 X3 X4 X5 X6

       to generate a fluctuation plot (this is a variant of the
       mosaic plot).  Enter HELP FLUCTUATION PLOT for details.

    d) Added the command

           STRIP PLOT Y
           STRIP PLOT Y2 X2
           BATCH STRIP PLOT Y TAG
           BATCH STRIP PLOT Y2 X2 TAG

       The strip plot is also known as a dot plot.  There are a
       number of variations of this plot.  Enter HELP STRIP PLOT for
       details.

    e) The <stat> PLOT command was updated to support multiple
       response variables.
For example, MEAN PLOT Y1 TO Y4 X That is, for each distinct value of X, there are now 4 means plotted instead of just one. The following commands can be used to control the appearance of the plot: SET STATISTIC PLOT FORMAT <DEX/OVERLAY> SET STATISTIC PLOT SUMMARY <VARIABLE/GROUP> If the FORMAT option is set to OVERLAY and the SUMMARY option is set to VARIABLE, this is equivalent to the following: YLIMITS ... PRE-ERASE OFF ERASE MEAN PLOT Y1 X MEAN PLOT Y2 X MEAN PLOT Y3 X MEAN PLOT Y4 X PRE-ERASE ON That is, there will be a curve corresponding to each response variable and there will be a reference line corresponding to each variable. If the FORMAT option is set to DEX, then this plot uses a format similar to the DEX <stat> PLOT command. That is, for each distinct value of X, there will be a curve connecting the mean values for the 4 response variables. If the SUMMARY option is set to GROUP, there will be a single reference curve. At each distinct value of X, a single overall mean is computed for all 4 of the response variables. In addition, the following option is added to this command: <stat> <zscore/uscore> PLOT If ZSCORE is given, then a z-score transformation (subtract the mean and then divide by the standard deviation) is computed on each response variable first. If USCORE is given, then a u-score transformation (subtract the minimum and divide by the range) is computed on each response variable. Note that these z-score and u-score transformations apply to the entire response variable, not to each distinct group within the response variable. 4) The following updates were made to the graphics output devices. a) For the SVG (Scalable Vector Graphics) device, graphics elements are now assigned "layers". This may be useful if you import the SVG graphic into a graphics editing program (i.e., it may allow individual elements of the plot to be edited). 
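As an illustration of the ZSCORE and USCORE options described under 3e above, the two transformations can be sketched in Python. This is not Dataplot code; the function names are hypothetical, and the use of the sample standard deviation (n - 1 divisor) is an assumption, since the text does not say which form Dataplot uses.

```python
from statistics import mean, stdev


def z_score(values):
    """Subtract the mean, then divide by the standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1 divisor)
    return [(v - m) / s for v in values]


def u_score(values):
    """Subtract the minimum, then divide by the range, giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    y1 = [1.0, 2.0, 3.0]
    # Applied to the whole response variable, not group by group.
    print(z_score(y1))
    print(u_score(y1))
```

Both functions operate on the entire response variable at once, which matches the note that the transformations are not applied separately within each group.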
b) The GD driver was enhanced to support hardware text (the previous implementation drew all characters using one of Dataplot's software fonts). There are two types of hardware characters supported: 1) The GD library supports 5 built-in fonts: small, large, mediumbold, tiny, and huge. These are fixed size fonts. 2) In addition, the GD library supports True Type Fonts (TTF). This is the font type supported on the Microsoft Windows operating system. These fonts are scalable. Although these fonts were originally developed for Microsoft Windows, they can be used on Linux/Unix systems as well. Note that neither Dataplot nor the GD web site provides any of these fonts. However, there are a number of web sites that provide these types of fonts (some are freely downloadable while others are proprietary). c) The Postscript driver was updated in the following ways. 1) For presentation and publication graphs, it is desirable to use the Postscript typeset quality fonts. However, the use of special characters (with the limited exception of the SP(), CR(), UC(), and LC() options) has required the use of one of the software Hershey fonts (e.g., SIMPLEX or DUPLEX). The Postscript device was upgraded to handle most of Dataplot's supported special characters. Specifically, the following are supported: i) subscripts and superscripts ii) Greek characters iii) A subset of the mathematical symbols and other special characters. This is based on what is available in the Postscript symbol font. Note that there is not a 1-to-1 correspondence between the characters available in the Postscript symbol font and the special characters supported by Dataplot. 
The following is the list of Dataplot special characters that will be translated to equivalent characters in the Postscript symbol font: INTE(), SUMM(), PROD(), INFI(), DOTP(), DEL(), DIVI(), LT(), GT(), LTEQ(), GTEQ(), NOT(), +-(), APPR(), TILD(), EQUI(), VARI(), CARA(), TIME(), PART(), RADI(), SUBS(), SUPE(), UNIO(), INTR(), ELEM(), THEX(), THFO(), RAPO(), LBRA(), RBRA(), LCBR(), RCBR(), LELB(), RELB(), RARR(), UARR(), DARR(), VBAR(), HBAR(), DEGR() The full set of special symbols supported by Dataplot is documented in chapter 13 of Volume I of the Reference Manual: http://www.itl.nist.gov/div898/software/dataplot/refman1/ch13/homepage.pdf d) Added support for the Unix/Linux libplot library. The libplot library is part of the "plotutils" package which includes the plot, tek2plot, pic2plot, plotfont, spline, and ode programs. This driver may not be available on some platforms. For further information on using this device driver, enter HELP LIBPLOT 5) The following updates were made to the analysis commands. a) The following updates were made to the CROSS TABULATE command: 1) The number of cross-tabulation variables was increased to six (from two). That is, you can cross-tabulate on a minimum of two variables and a maximum of six variables. 2) The output can be generated in RTF format (support was previously added for Latex and HTML format). RTF can be used to import the output into Microsoft Word. This enhancement was also made to the TABULATE command. 3) You can use the SET WRITE DECIMALS command to specify the number of digits to the right of the decimal point in the output. This enhancement was also made to the TABULATE command. 4) Since there is now a separate CHI-SQUARE INDEPENDENCE TEST command, the chi-square test option in the CROSS TABULATE command has been removed. 5) For the LET <resp> = CROSS TABULATE <stat> ... 
command, added the following option: SET LET CROSS TABULATE <EXPAND/COLLAPSE> If EXPAND is specified, the number of rows in the output variable is equal to the number of rows in the input variables. If COLLAPSE is specified, the number of rows in the output variable is equal to the number of distinct cross tabulation cells. If the COLLAPSE option is used, the following commands may be helpful: LET X1D = CROSS TABULATE GROUP ONE <var-list> LET X2D = CROSS TABULATE GROUP TWO <var-list> LET X3D = CROSS TABULATE GROUP THREE <var-list> LET X4D = CROSS TABULATE GROUP FOUR <var-list> For example, if you want to cross-tabulate the means for three classification variables, you could do something like LET YMEAN = CROSS TABULATE MEAN Y X1 X2 X3 LET X1D = CROSS TABULATE GROUP ONE X1 X2 X3 LET X2D = CROSS TABULATE GROUP TWO X1 X2 X3 LET X3D = CROSS TABULATE GROUP THREE X1 X2 X3 PRINT YMEAN X1D X2D X3D b) Added the command BINOMIAL PROPORTION TEST P1 N1 P2 N2 BINOMIAL PROPORTION TEST Y1 Y2 c) Added the following commands to perform one-sample and two-sample proficiency analyses based on the ASTM E2489 - 06 standard: ONE SAMPLE PROFICIENCY TEST Y LABID TWO SAMPLE PROFICIENCY TEST Y LABID 6) The following updates were made to the probability distributions. 
a) Added support for the following distributions 1) 3-Parameter Logistic-Exponential LE3CDF(X,BETA) - cdf function LE3CHAZ(X,BETA) - cumulative hazard function LE3HAZ(X,BETA) - hazard function LE3PDF(X,BETA) - pdf function LE3PPF(P,BETA) - ppf function 2) Truncated Pareto TNPCDF(X,GAMMA,A,NU) - cdf function TNPPDF(X,GAMMA,A,NU) - pdf function TNPPPF(P,GAMMA,A,NU) - ppf function 3) Brittle Fracture BFRCDF(X,ALPHA,BETA,R) - cdf function BFRPDF(X,ALPHA,BETA,R) - pdf function BFRPPF(P,ALPHA,BETA,R) - ppf function 4) Pearson Type III PE3CDF(X,GAMMA) - cdf function PE3PDF(X,GAMMA) - pdf function PE3PPF(P,GAMMA) - ppf function 5) The Mielke's Beta-Kappa distribution was renamed to MIECDF(X,K,THETA) - cdf function MIEPDF(X,K,THETA) - pdf function MIEPPF(P,K,THETA) - ppf function Note also that the BETA parameter was in fact a scale parameter and is now explicitly treated as such. 6) Kappa KAPCDF(X,K,H) - cdf function KAPPDF(X,K,H) - pdf function KAPPPF(P,K,H) - ppf function Note that the Mielke's Beta-Kappa is a re-parameterized special case of the Kappa distribution. b) Added support for maximum likelihood estimation for the following distributions: reflected power Weibull (maximum case) Frechet (minimum case) generalized Pareto (minimum case) generalized extreme value (minimum case) Kappa Pearson type III Note that the Kappa and Pearson type III actually implement L-moment estimates rather than maximum likelihood estimates. 7) The following LET commands were added. a) Two-dimensional convex hulls can be computed with the command LET Y2 X2 = 2D CONVEX HULL Y X b) Two-dimensional minimum spanning trees can be computed with the commands LET Y2 X2 TAG = MINIMUM SPANNING TREE Y X LET Y2 X2 = MINIMUM SPANNING TREE D The first syntax is used when the input is a set of vertices. The second syntax is used when the input is a distance matrix. 
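To make the distance-matrix form of MINIMUM SPANNING TREE more concrete, here is a minimal Python sketch of Prim's algorithm. It is only an illustration of the idea: the text does not say which algorithm Dataplot uses or how it lays out the returned variables, and the function name and the (i, j, weight) edge format are hypothetical. The sketch assumes a full symmetric distance matrix, so the graph is connected.

```python
def minimum_spanning_tree(dist):
    """Return MST edges as (i, j, weight) tuples using Prim's algorithm.

    dist is a symmetric square matrix (list of lists) of pairwise distances.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known link from each vertex into the tree
    parent = [-1] * n          # tree vertex that provides that cheapest link
    best[0] = 0.0              # start growing the tree from vertex 0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree with the cheapest link into it.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Update the cheapest links for the remaining vertices.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


if __name__ == "__main__":
    d = [[0, 1, 4],
         [1, 0, 2],
         [4, 2, 0]]
    print(minimum_spanning_tree(d))
```

For a set of vertices rather than a distance matrix, the same routine could be applied after computing the pairwise Euclidean distances.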
c) Two-dimensional spanning forests can be computed with the commands LET Y2 X2 TAG = SPANNING FOREST EDGE1 EDGE2 Y X LET EDGE1 EDGE2 TAG NV = SPANNING FOREST EDGE1 EDGE2 NVERT The first syntax is most useful when you want to plot the spanning forest. d) The command LET Y2 X2 TAG = EDGES TO VERTICES EDGE1 EDGE2 Y X can be used to convert a list of edges in a graph to a list of vertices. This is a convenience command to make plotting the graph easier. e) The commands LET X2 Y2 = SORT2 X Y LET Z1 Z2 Y2 = SORT3 X1 X2 Y LET Z1 Z2 Z3 Y2 = SORT4 X1 X2 X3 Y can be used to sort based on multiple fields. f) The commands LET Y = GATHER X INDEX LET Y = SCATTER X INDEX can be used to extract (or insert) data from one array into another based on a variable that contains row id's. g) Added the following matrix commands: The following is used to compute the permanent of a matrix. LET MOUT = MATRIX PERMANENT M The following is used to generate an adjacency matrix from a list of edges. LET ADJ = EDGES TO ADJACENCY MATRIX EDGE1 EDGE2 NVERT The following is used to permute the rows and columns of a matrix. LET MOUT = MATRIX RENUMBER M VROW VCOL The following is used to compute the pseudo inverse of a matrix. LET MINV = PSEUDO INVERSE M The following is used to compute the coordinates for a biplot. 
LET Y X TAG = BIPLOT M h) Added the following commands: LET N = <value> LET Y = RANDOM SUBSET FOR I = 1 1 K LET N = <value> LET Y = RANDOM K-SET OF N-SET FOR I = 1 1 K LET N = <value> LET Y = RANDOM COMPOSITION FOR I = 1 1 K LET N = <value> LET K = <value> LET Y = RANDOM PARTITION LET N = <value> LET Y = RANDOM EQUIVALENCE RELATION LET N = <value> LET LAMBDA = DATA <values> LET Y = RANDOM YOUNG TABLEAUX LAMBDA LET Y = NEXT SUBSET 0 LET Y = NEXT SUBSET N YPREV LET Y = NEXT K-SET OF N-SET 0 LET Y = NEXT SUBSET N K YPREV LET Y = NEXT PERMUTATION 0 LET Y = NEXT PERMUTATION N YPREV LET Y = NEXT COMPOSITION 0 LET Y = NEXT COMPOSITION N K YPREV LET Y1 Y2 = NEXT EQUIVALENCE RELATION 0 LET Y1 Y2 = NEXT EQUIVALENCE RELATION N YPREV YREPPREV LET Y = NEXT YOUNG TABLEAUX N LAMBDA LET Y = NEXT YOUNG TABLEAUX N LAMBDA Y LET VAL ROWID = CONVERT YOUNG TABLEAUX Y LET HOOK = YOUNG TABLEAUX HOOK LENGTH VAL ROWID LET Y2 X2 = PEAKS OF FREQUENCY TABLE Y LET AL AU = DIFFERENCE OF PROPORTION CONFIDENCE LIMITS P1 N1 P2 N2 ALPHA LET PVAL = DIFFERENCE OF PROPORTION HYPOTHESIS TEST P1 N1 P2 N2 ALPHA LET PVAL = DIFFERENCE OF PROPORTION LOWER TAIL HYPOTHESIS TEST P1 N1 P2 N2 ALPHA LET PVAL = DIFFERENCE OF PROPORTION UPPER TAIL HYPOTHESIS TEST P1 N1 P2 N2 ALPHA i) Added the following commands for working with strings: LET SOUT = STRING MERGE SORG SADD NSTART LET SOUT = STRING REPLACE SORG SADD NSTART LET SOUT = STRING EDIT SORG SOLD SNEW LET SOUT = STRING CONCATENATE S1 S2 ... SK LET SOUT = SUBSTRING SORG NSTART NSTOP LET SOUT = UPPER CASE SORG LET SOUT = LOWER CASE SORG LET IVAL = ICHAR SORG LET NLEN = STRING LENGTH SORG LET NSTART NSTOP = STRING INDEX SORG SMATCH j) Added the following command for merging two sets of data LET ... = MERGE ... The merge is performed by matching columns in two sets of data and then carrying other variables of interest. For details, enter HELP MERGE. 
k) Added the following commands for shifting the elements of a vector up (right) or down (left) LET Y2 = SHIFT Y NSHIFT LET Y2 = CIRCULAR SHIFT Y NSHIFT l) It is sometimes convenient to extract the index of the minimum, maximum, or extreme value (i.e., the largest absolute value) of a vector. This can be done with the following commands LET A = INDEX MINIMUM Y LET A = INDEX MAXIMUM Y LET A = INDEX EXTREME Y 8) The following miscellaneous updates were made. a) Added SAVE/RESTORE options to the FEEDBACK command. This is primarily useful for general purpose macros where you want to use the FEEDBACK OFF command and you want to restore the setting that was in place when the macro was called. 9) Added the following command DIRECTION <HORIZONTAL/VERTICAL> For the TEXT command when a hardware font is being used, this specifies whether the text is drawn horizontally (the default) or vertically. 10) Fixed a number of bugs. ----------------------------------------------------------------------- The following enhancements were made to DATAPLOT March 2007 - July 2007. ----------------------------------------------------------------------- 1) We have made the following updates for categorical data analysis. There are two basic types of data that the following commands address. a) We have two variables, each with n observations, where the first can have one of r mutually exclusive values and the second can have one of c mutually exclusive values. So each observation will fit into exactly one of the r levels of variable one and exactly one of the c levels of variable two. Your data can be either in raw form (two columns of data each with n rows) or summary form (an rxc table which will typically be read into Dataplot as a matrix). Each entry in the summary table is a count of how many times that particular combination occurred. b) If each variable can have exactly two outcomes (typically coded as 1/0), then we have the 2x2 special case. 
There are a number of specialized methods for dealing with this type of data. For this type of data, the number of observations for the two variables need not be equal. Some examples of this type of data are: i) We have a diagnostic test to detect a disease. Variable one specifies whether the patient in fact has the disease (coded as 1) or not (coded as 0). Variable two specifies whether the test detected the disease (coded as 1) or not (coded as 0). ii) We are testing instruments to determine whether or not they can detect a particular substance. Variable one is the ground truth (coded as 1 when the substance is present and coded as 0 when it is not). Variable two denotes whether the instrument detected the substance (1 for detected, 0 for not detected). The following capabilities have been added to Dataplot for analyzing these types of data. a) The following statistical tests were added: ODDS RATIO INDEPENDENCE TEST N11 N21 N12 N22 ODDS RATIO INDEPENDENCE TEST Y1 Y2 ODDS RATIO INDEPENDENCE TEST M CHI-SQUARE INDEPENDENCE TEST N11 N21 N12 N22 CHI-SQUARE INDEPENDENCE TEST Y1 Y2 CHI-SQUARE INDEPENDENCE TEST M FISHER EXACT TEST N11 N21 N12 N22 FISHER EXACT TEST Y1 Y2 FISHER EXACT TEST M MCNEMAR TEST N11 N21 N12 N22 MCNEMAR TEST Y1 Y2 MCNEMAR TEST M ODDS RATIO CHI-SQUARE TEST Y1 Y2 ODDS RATIO CHI-SQUARE TEST Y1 Y2 X ODDS RATIO CHI-SQUARE TEST Y1 X1 Y2 X2 MANTEL-HAENSZEL TEST Y1 Y2 MANTEL-HAENSZEL TEST Y1 Y2 X MANTEL-HAENSZEL TEST Y1 X1 Y2 X2 b) Added the following statistics: LET A = ODDS RATIO X1 X2 LET A = ODDS RATIO STANDARD ERROR X1 X2 LET A = LOG ODDS RATIO X1 X2 LET A = LOG ODDS RATIO STANDARD ERROR X1 X2 LET A = RELATIVE RISK X1 X2 LET A = CRAMER CONTINGENCY COEFFICIENT X1 X2 LET A = MATRIX GRAND CRAMER CONTINGENCY COEFFICIENT M LET A = PEARSON CONTINGENCY COEFFICIENT X1 X2 LET A = MATRIX GRAND PEARSON CONTINGENCY COEFFICIENT M LET A = FALSE POSITIVE Y1 Y2 LET A = FALSE NEGATIVE Y1 Y2 LET A = TRUE POSITIVE Y1 Y2 LET A = TRUE NEGATIVE Y1 Y2 LET A = TEST 
SENSITIVITY Y1 Y2 LET A = TEST SPECIFICITY Y1 Y2 LET A = POSITIVE PREDICTIVE VALUE Y1 Y2 LET A = NEGATIVE PREDICTIVE VALUE Y1 Y2 These statistics are supported by the following commands: PLOT TABULATE CROSS TABULATE CROSS TABULATE PLOT BOOTSTRAP PLOT JACKNIFE PLOT c) Added the following graphics: ROC CURVE Y1 Y2 X - generate a ROC curve ROSE PLOT Y - generate a rose plot (also known as a four-fold plot) ROSE PLOT Y1 Y2 BINARY TABULATION PLOT Y1 Y2 X1 X2 BINARY PLOT Y1 Y2 X1 where the statistic is one of: CORRECT MATCH, FALSE POSITIVE, FALSE NEGATIVE, TRUE POSITIVE, TRUE NEGATIVE. These "binary" plots are used to generate summary plots of "1/0" type data across groups. ASSOCIATION PLOT M - generate an association plot ASSOCIATION PLOT Y1 Y2 ASSOCIATION PLOT N11 N21 N12 N22 SIEVE PLOT M - generate a sieve plot SIEVE PLOT Y1 Y2 SIEVE PLOT N11 N21 N12 N22 2) We have made the following updates for probability distributions. a) Maximum likelihood estimates were added for the following distributions: Katz (generates moment estimates) slash triangular four parameter beta (generates moment estimates) log beta beta normal The maximum likelihood for the two-sided power distribution was generalized to include the lower and upper limit parameters. The slash and triangular distributions have also been added to the BOOTSTRAP/JACKNIFE MLE PLOT command: BOOTSTRAP TRIANGULAR MLE PLOT Y JACKNIFE TRIANGULAR MLE PLOT Y BOOTSTRAP SLASH MLE PLOT Y JACKNIFE SLASH MLE PLOT Y The maximum likelihood estimation for the two-sided power distribution was updated from the standard case (lower and upper limits = 0 and 1) to the general case (lower and upper limits will be estimated from the data). Also, the ML procedure for this distribution only applies if the N shape parameter is > 1. 
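For the 2x2 diagnostic statistics listed under item 1b above (TRUE POSITIVE, FALSE NEGATIVE, TEST SENSITIVITY, and so on), the following Python sketch shows the usual textbook definitions. It is an illustration only, under these assumptions: the first variable is the ground truth and the second is the test result, both coded 1/0 and paired row by row; counts are returned for the four cells; the function name and dictionary keys are hypothetical. Dataplot's own commands may report proportions rather than counts.

```python
def diagnostic_summary(truth, detected):
    """Summarize paired 1/0 data: truth is variable one, detected is variable two."""
    pairs = list(zip(truth, detected))
    tp = sum(1 for t, d in pairs if t == 1 and d == 1)  # true positives
    tn = sum(1 for t, d in pairs if t == 0 and d == 0)  # true negatives
    fp = sum(1 for t, d in pairs if t == 0 and d == 1)  # false positives
    fn = sum(1 for t, d in pairs if t == 1 and d == 0)  # false negatives

    def ratio(num, den):
        # Return NaN rather than raising when a margin is empty.
        return num / den if den else float("nan")

    return {
        "true_positive": tp,
        "true_negative": tn,
        "false_positive": fp,
        "false_negative": fn,
        "sensitivity": ratio(tp, tp + fn),  # P(test positive | condition present)
        "specificity": ratio(tn, tn + fp),  # P(test negative | condition absent)
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


if __name__ == "__main__":
    y1 = [1, 1, 1, 0, 0, 0, 0, 1]  # ground truth
    y2 = [1, 1, 0, 0, 0, 1, 0, 1]  # test result
    print(diagnostic_summary(y1, y2))
```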
b) Added the following commands for binomial confidence intervals: LET A = EXACT BINOMIAL LOWER BOUND P N ALPHA LET A = EXACT BINOMIAL UPPER BOUND P N ALPHA LET ALOW AUPP = AGRESTI COULL LIMITS P N ALPHA The BINOMIAL MAXIMUM LIKELIHOOD command can generate these values for raw data. The above LET commands are useful when you only have summary data (i.e., the p and n). c) Added the following plots: POISSON PLOT Y X GEOMETRIC PLOT Y X BINOMIAL PLOT Y X NEGATIVE BINOMIAL PLOT Y X LOGARITHMIC SERIES PLOT Y X These plots are alternatives to the PROBABILITY PLOT command. ORD PLOT Y This plot can help distinguish whether a Poisson, a negative binomial, or a logarithmic series distribution provides a more appropriate distributional model for a set of discrete data. 3) Made the following updates to graphics commands. a) The HISTOGRAM command now accepts a matrix argument. b) Added the command BIVARIATE NORMAL TOLERANCE REGION PLOT Y1 Y2 X 4) Added the following statistics: LET P1 = <value> LET P2 = <value> LET A = TRIMMED STANDARD DEVIATION Y 5) Added the following command SET FATAL ERROR <IGNORE/TERMINATE/PROMPT> If an analysis or graphics command returns an error code, this command tells Dataplot how to respond: IGNORE - Dataplot will simply continue processing the next command. This was the behavior before this command was added and is the default. TERMINATE - Dataplot will print a message and terminate immediately. PROMPT - Dataplot will prompt whether you want to continue or terminate. This command was added primarily as a debugging option. If you are trying to debug a complex macro, it can be helpful to have Dataplot terminate (or prompt for termination) in order to locate where the initial error is occurring. Note that this command is not active if you are running the Graphical User Interface (GUI) version. 6) A Windows Vista installation is now available. 7) Fixed a number of miscellaneous bugs. 
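As a rough illustration of what AGRESTI COULL LIMITS P N ALPHA computes from summary data (item 2b above), here is a Python sketch of the standard Agresti-Coull interval. It assumes ALPHA is the significance level (0.05 for a 95% interval) and clips the limits to [0, 1]; both are assumptions, the function name is hypothetical, and Dataplot's implementation may differ in these details.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull_limits(p, n, alpha):
    """Return (lower, upper) Agresti-Coull limits for a binomial proportion.

    p is the observed proportion of successes, n the number of trials,
    alpha the significance level (assumption: 0.05 gives a 95% interval).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided normal critical value
    x = p * n                                    # observed number of successes
    n_adj = n + z * z                            # adjusted number of trials
    p_adj = (x + z * z / 2.0) / n_adj            # adjusted proportion
    half_width = z * sqrt(p_adj * (1.0 - p_adj) / n_adj)
    # Clipping to [0, 1] is a convenience of this sketch.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


if __name__ == "__main__":
    low, upp = agresti_coull_limits(0.5, 100, 0.05)
    print(round(low, 4), round(upp, 4))
```

Because the interval is centered on the adjusted proportion rather than on p itself, it behaves better than the plain normal approximation when p is near 0 or 1.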
----------------------------------------------------------------------- The following enhancements were made to DATAPLOT May 2006 - February 2007. ----------------------------------------------------------------------- 1) The following updates were made for maximum likelihood estimates for distributions: a) The negative binomial was updated to distinguish between two cases: 1) the case where k is assumed known (p is estimated) and 2) the case where k is assumed unknown. For case 1), confidence limits for p were added. b) Maximum likelihood estimates were added for the following discrete distributions: zeta Borel-Tanner Lagrange-Poisson lost games beta-geometric Polya-Aeppli generalized logarithmic series geeta Consul quasi binomial type I generalized lost games generalized negative binomial topp and leone c) The binomial mle was updated in the following ways: 1) For exact intervals, fixed a bug for extreme values of p and small samples. 2) By default, Dataplot switches from the exact method to the normal approximation for sample sizes greater than 30 (Agresti-Coull intervals are always generated). You can specify the threshold with the command SET BINOMIAL NORMAL APPROXIMATION THRESHOLD <value> 3) Some analysts prefer to use a continuity correction (p + 0.5)/(n + 1) You can specify whether to use the continuity correction by entering the command SET BINOMIAL CONTINUITY CORRECTION <ON/OFF> The default is OFF. 2) The following distributional updates were made. a) The YULCDF was updated to use an explicit formula (as opposed to direct summation). b) For the KS PLOT, the location and scale parameters are estimated via the probability plot. For long-tailed distributions, more accurate estimates may be obtained by applying a biweight fit of the probability plot. 
To specify this option, enter the command SET PPCC PLOT LOCATION SCALE BIWEIGHT To restore the use of the regular least squares estimates of location and scale, enter SET PPCC PLOT LOCATION SCALE DEFAULT c) Added the following new continuous distributions. 1) Asymmetric Log-Laplace ALDCDF(X,ALPHA,BETA) - cdf function ALDPDF(X,ALPHA,BETA) - pdf function ALDPPF(P,ALPHA,BETA) - ppf function 2) Log-Beta LBECDF(X,ALPHA,BETA,C,D) - cdf function LBEPDF(X,ALPHA,BETA,C,D) - pdf function LBEPPF(P,ALPHA,BETA,C,D) - ppf function 3) Topp and Leone TOPCDF(X,BETA) - cdf function TOPPDF(X,BETA) - pdf function TOPPPF(P,BETA) - ppf function 4) Generalized Topp and Leone GTLCDF(X,ALPHA,BETA) - cdf function GTLPDF(X,ALPHA,BETA) - pdf function GTLPPF(P,ALPHA,BETA) - ppf function 5) Reflected Generalized Topp and Leone RGTCDF(X,ALPHA,BETA) - cdf function RGTPDF(X,ALPHA,BETA) - pdf function RGTPPF(P,ALPHA,BETA) - ppf function 6) Wakeby: WAKCDF(X,BETA,GAMMA,DELTA) - cdf function WAKPPF(P,BETA,GAMMA,DELTA) - ppf function d) Added the following new discrete distributions. 
1) Beta-Geometric (Waring) BGECDF(X,ALPHA,BETA) - cdf function BGEPDF(X,ALPHA,BETA) - pdf function BGEPPF(P,ALPHA,BETA) - ppf function 2) Beta-Negative Binomial (generalized Waring) BNBCDF(X,ALPHA,BETA,K) - cdf function BNBPDF(X,ALPHA,BETA,K) - pdf function BNBPPF(P,ALPHA,BETA,K) - ppf function 3) Zeta ZETCDF(X,ALPHA) - cdf function ZETPDF(X,ALPHA) - pdf function ZETPPF(P,ALPHA) - ppf function 4) Zipf ZIPCDF(X,ALPHA,N) - cdf function ZIPPDF(X,ALPHA,N) - pdf function ZIPPPF(P,ALPHA,N) - ppf function 5) Borel-Tanner BTACDF(X,LAMBDA,N) - cdf function BTAPDF(X,LAMBDA,N) - pdf function BTAPPF(P,LAMBDA,N) - ppf function 6) Lagrange-Poisson LPOCDF(X,LAMBDA,THETA) - cdf function LPOPDF(X,LAMBDA,THETA) - pdf function LPOPPF(P,LAMBDA,THETA) - ppf function 7) Leads in Coin Tossing (Discrete Arcsine) LCTCDF(X,N) - cdf function LCTPDF(X,N) - pdf function LCTPPF(P,N) - ppf function 8) Classical Matching MATCDF(X,K) - cdf function MATPDF(X,K) - pdf function MATPPF(P,K) - ppf function 9) Polya-Aeppli PAPCDF(X,THETA,P) - cdf function PAPPDF(X,THETA,P) - pdf function PAPPPF(X,THETA,P) - ppf function 10) Generalized Logarithmic Series GLSCDF(X,THETA,BETA) - cdf function GLSPDF(X,THETA,BETA) - pdf function GLSPPF(P,THETA,BETA) - ppf function 11) Geeta GETCDF(X,THETA,BETA) - cdf function GETPDF(X,THETA,BETA) - pdf function GETPPF(P,THETA,BETA) - ppf function This distribution can also be parameterized with MU and BETA. 
12) Quasi Binomial Type 1 QBICDF(X,P,PHI) - cdf function QBIPDF(X,P,PHI) - pdf function QBIPPF(X,P,PHI) - ppf function 13) Generalized Negative Binomial GNBCDF(X,THETA,BETA,M) - cdf function GNBPDF(X,THETA,BETA,M) - pdf function GNBPPF(P,THETA,BETA,M) - ppf function 14) Truncated Generalized Negative Binomial GNTCDF(X,THETA,BETA,M,N) - cdf function GNTPDF(X,THETA,BETA,M,N) - pdf function GNTPPF(P,THETA,BETA,M,N) - ppf function 15) Discrete Weibull DIWCDF(X,Q,BETA) - cdf function DIWPDF(X,Q,BETA) - pdf function DIWPPF(P,Q,BETA) - ppf function DIWHAZ(X,Q,BETA) - hazard function 16) Consul (a generalized geometric) CONCDF(X,THETA,M) - cdf function CONPDF(X,THETA,M) - pdf function CONPPF(P,THETA,M) - ppf function 17) Lost Games LOSCDF(X,P,R) - cdf function LOSPDF(X,P,R) - pdf function LOSPPF(X,P,R) - ppf function 18) Generalized Lost Games GLGCDF(X,P,J,A) - cdf function GLGPDF(X,P,J,A) - pdf function GLGPPF(X,P,J,A) - ppf function 19) Katz KATCDF(X,ALPHA,BETA) - cdf function KATPDF(X,ALPHA,BETA) - pdf function KATPPF(P,ALPHA,BETA) - ppf function e) The Waring routines (WARCDF, WARPDF, WARPPF) were re-written to take advantage of their relationship to the beta-geometric (the Waring is simply a different parameterization of the beta-geometric). This makes the Waring routines more computationally efficient and more accurate. 3) Added the following LET sub-commands. a) Added the harmonic number and generalized harmonic number functions: LET A = HARMNUMB(N) LET A = HARMNUMB(N,M) b) For certain types of plots, it can be useful to add a small bit of random noise to a variable to avoid overplotting. This is commonly referred to as jittering. To simplify this, the following command was added: LET DELTA = <value> LET Y = JITTER X DELTA The value of DELTA is used to control the magnitude of the jittering. That is, the value of x(i) will be changed to a value x(i) + noise where noise is in the range (-DELTA/2,DELTA/2). 4) Made the following updates to the CONSENSUS MEANS command. 
a) If a within-lab standard deviation is zero (i.e., the lab has only a single unique measurement value), that lab will be omitted from the analysis (it will be included in the initial summary table). Previously, Dataplot treated this as an error and would not run the consensus means analysis. b) Added the Fairweather method. There are 3 separate methods for generating 95% confidence intervals for this method (the original method proposed by Fairweather, an improvement suggested by Cox, and a method developed by Rukhin). The output for this method is only printed if the minimum number of observations for a lab is greater than 5. c) Added the Bayesian Consensus Procedure (BCP) method of Hagwood and Guthrie. This is a refinement of the BOB method. For this method, the consensus mean and the standard deviation of the consensus mean are asymptotically equivalent to the posterior mean and standard deviation of a fully Bayesian method. d) Dataplot currently supports 12 methods. Most users will only be interested in a subset of these methods. You can now selectively turn individual methods on or off (all methods are on by default) with the commands: SET MANDEL PAULE SET MODIFIED MANDEL PAULE SET VANGEL RUHKIN SET BOB SET SCHILLER EBERHARDT SET MEAN OF MEANS SET GRAND MEAN SET GRAYBILL DEAL SET GENERALIZED CONFIDENCE INTERVAL SET DERSIMONIAN LAIRD SET FAIRWEATHER SET BAYESIAN CONSENSUS PROCEDURE 5) The following updates and enhancements were made to the graphics commands. a) Added the command: SET 4-PLOT DISTRIBUTION The 4-plot by default consists of a run sequence plot, a lag plot, a histogram, and a normal probability plot. The above command allows us to replace the normal probability plot with an exponential probability plot. This is useful when checking the assumptions for a Homogeneous Poisson Process (HPP) where we assume the interarrival times follow an exponential distribution. 
b) Added the command: REPAIR PLOT Y X CENSOR This is used to plot repair data where we may have multiple systems and each system may have a single censoring time (i.e., the time between the last repair and the end of the test). Enter HELP REPAIR PLOT for details. c) Added the command: MEAN REPAIR FUNCTION PLOT Y X CENSOR d) Added the command TRILINEAR PLOT Y1 Y2 Y3 This is used for plots where the rows of Y1, Y2, and Y3 are mixtures (i.e., each row sums to 1, or to 100 if you are using percentage units). 6) Updated the RELIABILITY TREND TEST in the following ways. a) Fixed a bug in the reverse arrangements test. b) Modified the output format for better clarity. c) Added support for multiple systems. For multiple systems, the tests will be applied to each individual system and then composite tests will be performed. d) Added support for HTML, Latex, and RTF format. 7) The following bug fixes were made: a) The 2 variable case for the chi-square goodness of fit test for discrete distributions had a bug. This has been fixed. For older versions, a work around is SET MINSIZE = 1 LET Y3 XLOW XHIGH = COMBINE FREQUENCY TABLE Y2 X2 POISSON CHI-SQUARE GOODNESS OF FIT Y3 XLOW XHIGH b) Some bugs with LET subcommands and SUBSETTING were corrected. c) A bug involving IF statements within nested loops was corrected. d) A few other miscellaneous bug fixes were made. ----------------------------------------------------------------------- The following enhancements were made to DATAPLOT September 2005 - April 2006. ----------------------------------------------------------------------- 1) For many one-factor plots, it is useful to sort the horizontal axis based on the value of some statistic (most commonly a location statistic such as the mean, median, minimum, or maximum). 
The following command was added to help generate these sorted plots: LET XSORT INDX = SORT BY <stat> X GROUPID For example, to generate a sorted mean plot for variables Y and X, you would do something like LET X2 INDX = SORT BY MEAN Y X X1TIC MARK LABEL FORMAT VARIABLE X1TIC MARK LABEL CONTENT INDX MEAN PLOT Y X2 This can be used with the following types of plots i) <stat> PLOT Y X where <stat> is a desired statistic (e.g., MEAN or SD). ii) BOX PLOT Y X iii) PLOT Y X GROUP For details, enter HELP SORT BY STATISTIC. These plots often have alphabetic tick mark labels. The following enhancements were made to simplify the use of alphabetic tick mark labels with sorted plots. a) The TIC MARK LABEL FORMAT and TIC MARK LABEL CONTENT commands were previously augmented to allow numeric variables, group label variables, or the row label variable as the contents for the tick mark labels. Specifically, LET LAB = DATA 50 40 30 20 10 0 X1TIC MARK LABEL FORMAT VARIABLE X1TIC MARK LABEL CONTENT LAB LET IG = GROUP LABELS A B C D E X1TIC MARK LABEL FORMAT GROUP LABEL X1TIC MARK LABEL CONTENT IG X1TIC MARK LABEL FORMAT ROW LABELS This has been enhanced to allow an index variable to be specified on the above TIC MARK LABEL CONTENT commands (the index variable is typically generated by a SORT BY command). The index variable specifies the order in which the tic mark labels will be generated. So the above examples can be augmented by LET X2 INDX = SORT BY MEAN Y X LET LAB = DATA 50 40 30 20 10 0 X1TIC MARK LABEL FORMAT VARIABLE X1TIC MARK LABEL CONTENT LAB INDX LET X2 INDX = SORT BY MEAN Y X LET IG = GROUP LABELS A B C D E X1TIC MARK LABEL FORMAT GROUP LABEL X1TIC MARK LABEL CONTENT IG INDX LET X2 INDX = SORT BY MEAN Y X X1TIC MARK LABEL FORMAT ROW LABELS X1TIC MARK LABEL CONTENT INDX b) The LET ... = GROUP LABEL .... command was augmented in the following two ways. i) You can specify literal strings for group labels. For example, LET IG = GROUP LABEL BATCHSP()1 BATCHSP()2 ... 
BATCHSP()3 BATCHSP()4 The strings are separated by spaces. If you need to include a space in a particular string, use the SP() as in the above example. ii) Pre-defined strings can be used to define a group label variable. For example, LET IG = GROUP LABEL ST1 TO ST10 where ST1, ST2, ...., ST10 are previously defined strings. The TO syntax is useful in this context when the number of strings is large. Dataplot's algorithm for parsing the GROUP LABEL command is: i) Dataplot first checks the character variables file (HELP SET CONVERT CHARACTER for details). If the first name listed is found, Dataplot uses this character variable to define the group labels. ii) If a character variable is not found, Dataplot checks all the listed names to see if they are previously defined strings. If they are, then Dataplot substitutes the values of these strings. iii) If one or more of the names is not a previously defined string, then Dataplot treats all of the names as literal text strings. 2) You can now pass arguments to macros. To pass arguments to a macro, do something like CALL SAMPLE.DP arg1 arg2 arg3 Up to 10 arguments may be passed (although limits on command line lengths still apply). Arguments containing spaces or hyphens should be enclosed in quotes. The character limit for a single argument is 40 characters. In the SAMPLE.DP macro, if a $1 is encountered, it will be replaced with "arg1", if a $2 is encountered, it will be replaced with "arg2" and so on. A $0 will substitute the number of arguments given on the CALL command. This substitution will only occur if a command line is contained within a macro (i.e., if no macro is active, the "$" will not signal any substitution and it will remain in the command line as given). Dataplot currently only supports one level of argument substitution for macros. That is, the values of the macro arguments (i.e., the $1, $2, etc.) will contain the values given by the most recent CALL command that specified at least one argument. 
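The substitution rule described above ($1, $2, ... replaced by the CALL arguments, $0 replaced by the argument count) can be sketched in Python as follows. This only mimics the documented behavior and is not how Dataplot is implemented; the function name is hypothetical, and leaving a placeholder untouched when no matching argument was passed is an assumption of this sketch.

```python
import re


def substitute_macro_args(line, args, marker="$"):
    """Replace $0, $1, $2, ... in one macro line with values from a CALL command.

    args holds the arguments in the order given on the CALL command;
    marker is the substitution character ("$" by default).
    """
    pattern = re.compile(re.escape(marker) + r"(\d+)")

    def replace(match):
        k = int(match.group(1))
        if k == 0:
            return str(len(args))   # $0 gives the number of arguments
        if 1 <= k <= len(args):
            return args[k - 1]      # $k gives the k-th argument
        return match.group(0)       # assumption: leave unmatched placeholders as-is

    return pattern.sub(replace, line)


if __name__ == "__main__":
    call_args = ["Y", "X", "1"]
    print(substitute_macro_args("PLOT $1 $2 SUBSET TAG $3", call_args))
    print(substitute_macro_args("LET NARG = $0", call_args))
```

Matching the full run of digits after the marker keeps $10 from being read as $1 followed by a literal 0.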
   If you need to nest CALL commands with macro arguments, the
   recommended workaround is to have the higher level macro extract
   any macro arguments passed to it into temporary variables or
   strings before calling any other macros.  For example, suppose
   SAMPLE.DP needs to call SAMPLE2.DP with arguments.  You could do
   something like the following in SAMPLE.DP:

       . Start of SAMPLE.DP macro
       let string zzzzs1 = $1
       let string zzzzs2 = $2
       let string zzzzs3 = $3
       ...
       call sample2.dp newarg1 newarg2

   The default character for argument substitution is the "$".  To
   use a different character, enter the command

       MACRO SUBSTITUTION CHARACTER <char>

3) The following enhancements were made to the CAPTURE command (the
   CAPTURE command re-directs alphanumeric output to a file rather
   than displaying it on the screen).

   a) Sometimes it may be useful to have the output sent to both the
      screen and to a file.  You can do this by entering the command

          CAPTURE SCREEN ON

      To restore CAPTURE output only being sent to the CAPTURE file,
      enter the command

          CAPTURE SCREEN OFF

   b) Sometimes it may be useful to selectively send output to the
      CAPTURE file.  You can do this with the following commands:

          CAPTURE SUSPEND
          CAPTURE RESUME

      where SUSPEND specifies that output will be sent to the screen
      rather than the CAPTURE file (note that the CAPTURE file
      remains open) and RESUME will send the output to the currently
      open CAPTURE file.

      You can enter as many CAPTURE SUSPEND/CAPTURE RESUME sequences
      as you like between a CAPTURE/END OF CAPTURE session.

      Note that OFF is a synonym for SUSPEND and ON is a synonym for
      RESUME.

4) Made the following probability distribution updates:

   a) Added confidence intervals for the maximum likelihood
      estimates for the geometric distribution.

   b) Added confidence intervals for the maximum likelihood
      estimates for the Poisson distribution.

   c) Added support for the following new probability distributions:

      1) Added the type 2 generalized logistic distribution.  Enter
         HELP GL2PDF for details.
      2) Added the type 3 generalized logistic distribution.  Enter
         HELP GL3PDF for details.

      3) Added the type 4 generalized logistic distribution.  Enter
         HELP GL4PDF for details.

      4) Added the Hosking parameterization of the generalized
         logistic distribution.  Enter HELP GL5PDF for details.

      5) Added the generalized Tukey-Lambda distribution.  Enter
         HELP GLDPDF for details.

      6) Added the beta-normal distribution.  Enter HELP BNOPDF for
         details.

      7) Added the asymmetric log double exponential (Laplace)
         distribution.  Enter HELP ALDPDF for details.

5) Added or modified the following analysis commands.

   a) The Durbin test for identical effects in a two-way table for
      balanced incomplete block designs is supported with the
      command

          DURBIN TEST Y BLOCK TREATMENT

      Enter HELP DURBIN TEST for details.

   b) The TOLERANCE LIMITS command generates both normal tolerance
      limits and non-parametric tolerance limits.  You can now
      specify only one of these with the commands

          NORMAL TOLERANCE LIMITS
          NONPARAMETRIC TOLERANCE LIMITS

   c) The GRUBBS TEST for outlier detection was previously augmented
      to generate three distinct tests:

        i) a test for both the minimum and maximum points as
           outliers.
       ii) a test for the minimum point as an outlier.
      iii) a test for the maximum point as an outlier.

      This has now been modified into three distinct commands:

          GRUBBS TEST Y
          GRUBBS MINIMUM TEST Y
          GRUBBS MAXIMUM TEST Y

      This was done so that the internally saved parameters (e.g.,
      STATVAL, STATCDF, etc.) will now be correct for the
      appropriate test.

   d) The CONSENSUS MEANS command was modified in a number of ways.
      Specifically,

      1) The output format was modified to make it more consistent
         and to provide better clarity.  In particular, a clearer
         distinction is made between standard uncertainty (the
         standard error of the consensus mean), expanded uncertainty
         (2*standard error) and expanded uncertainty based on a
         normal or t percent point value.

      2) Modified the summary tables.
         There are now 4 summary tables generated:

           i) A summary table of the original data.
          ii) A summary table of the 95% confidence limits generated
              by each method.
         iii) A summary table of the standard uncertainties
              generated by each method (i.e., the standard error of
              the consensus mean estimate).
          iv) A summary table of the expanded uncertainties
              generated by each method (i.e., 2 times the standard
              error of the consensus mean estimate).

      3) Added the following new methods:

           i) The Graybill-Deal method now generates confidence
              limits using a method proposed by Andrew Rukhin.  It
              also generates 4 distinct estimates of the variance of
              the consensus mean (the Sinha method, the naive
              method, and 2 methods proposed by Nien-Fan Zhang).
              The commonly used naive method is known to seriously
              underestimate the variance for small sample sizes.
          ii) Added the generalized confidence interval method
              proposed by Hari Iyer and Jack Wang.
         iii) Added the DerSimonian-Laird method.

      4) Previous versions of Dataplot allowed you to create the
         CONSENSUS MEANS output in HTML format (CAPTURE HTML
         FILE.HTM) or Latex format (CAPTURE LATEX file.tex).  This
         was extended to include Rich Text Format (RTF).  The RTF
         option is used for creating output that can be read into
         Microsoft Word (RTF is a protocol Microsoft created for
         transporting word processing files between different word
         processing programs).  For example

             CAPTURE RTF FILE.RTF
             CONSENSUS MEAN Y X
             END OF CAPTURE

         You can then import FILE.RTF into Word.  Note that although
         RTF is supposed to be a portable format, our experience is
         that non-Word word processors do a poor job of importing
         the Dataplot RTF files (tables tend to be problematic for
         non-Word software and Dataplot creates most of its RTF
         output as tables).

6) The following updates were made to graphics output devices.

   a) The GD library, used to generate JPEG and PNG format graphs,
      was updated from version 1.84 to 2.033.
      The primary consequence of this is that we can now generate
      GIF format files as well.  To generate GIF files, enter

          SET IPL1NA PLOT.GIF
          DEVICE 2 GD GIF

   b) Dataplot can now generate graphs in Latex format.  The primary
      motivation for using this format is to generate publication
      quality graphs.  There are some unique features to this device
      driver that are described in detail in the HELP LATEX command.

7) The following statistic command was added.

       LET A = RATIO Y1 Y2

   This statistic is the sum of Y1 divided by the sum of Y2.

   The following additional commands are supported:

       TABULATE RATIO Y1 Y2 X
       CROSS TABULATE RATIO Y1 Y2 X1 X2
       RATIO PLOT Y1 Y2 X
       RATIO CROSS TABULATE PLOT Y1 Y2 X1 X2
       BOOTSTRAP RATIO PLOT Y1 Y2
       JACKNIFE RATIO PLOT Y1 Y2

8) The following special function library functions were added:

       I0INT  - integral of the modified Bessel function of the
                first kind and order 0
       J0INT  - integral of the Bessel function of the first kind
                and order 0
       K0INT  - integral of the modified Bessel function of the
                third kind and order 0
       Y0INT  - integral of the Bessel function of the second kind
                and order 0
       I0ML0  - difference of the modified Bessel function of the
                first kind of order 0 and the modified Struve
                function of order 0
       I1ML1  - difference of the modified Bessel function of the
                first kind of order 1 and the modified Struve
                function of order 1
       AIRINT - integral of the Airy function Ai
       BIRINT - integral of the Airy function Bi
       AIRYGI - modified Airy function Gi
       AIRYHI - modified Airy function Hi
       ATNINT - integral of the inverse-tangent function

9) Added the following LET subcommands:

   a) LET Y2 = REPLACE GROUPID GROUP2 Y1

      This command does the following:

      1) It matches the values in GROUP2 against GROUPID and returns
         the indices of the matching rows for the GROUPID array.
      2) The indices are used to access the corresponding value in
         the Y1 array.
      3) The corresponding row of Y2 is replaced with the Y1 value.
      The abbreviated syntax

          LET Y2 = REPLACE GROUPID GROUP

      simply assigns a value of 1 in the corresponding row of Y2.

      Enter HELP REPLACE for details.

   b) LET Y2 X2 = MATRIX BIN M

      This command is used to generate a frequency table for the
      elements in a matrix.  This can be used to generate a
      histogram of the elements in a matrix.  For example,

          LET Y2 X2 = MATRIX BIN M
          HISTOGRAM Y2 X2

      Enter HELP MATRIX BIN for details.

   c) LET M = MATRIX TRUNCATION M IVALUE
      LET M = MATRIX LOWER TRUNCATION M IVALUE

      Set all values in the matrix M that are less than IVALUE to
      IVALUE.

      This command can be used in conjunction with the MATRIX
      SUBTRACT command to remove background values from a matrix.
      For example, if the background value is 5, do something like

          LET IBACK = 5
          LET IZERO = 0
          LET M = MATRIX SUBTRACT M IBACK
          LET M = MATRIX TRUNCATION M IZERO

      Likewise, you can use the following command to perform an
      upper truncation:

          LET M = MATRIX UPPER TRUNCATION M IVALUE

      That is, any values in M greater than IVALUE are set to
      IVALUE.

10) The SET HISTOGRAM CLASS WIDTH command was previously implemented
    to specify different default class width algorithms for
    histograms.  This command was extended to apply to the following
    additional commands:

        LET Y2 X2 = BINNED Y
        LET Y2 X2 = MATRIX BIN Y
        NORMAL MIXTURE MAXIMUM LIKELIHOOD Y
        CHI-SQUARE GOODNESS OF FIT Y
        2 SAMPLE CHI-SQUARE GOODNESS OF FIT Y

11) Added the following command

        PROCESS ID

    This command will print the process id and save this process id
    in the internal parameter PID.

12) Made the following bug fixes.

    a) Previously, if all elements of a response variable were
       equal, the HISTOGRAM command would print an error message and
       not generate the histogram.  Dataplot will now print a
       warning message, but will generate a histogram with one
       non-zero class (it will generate one class above and one
       class below with zero count as well).
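The background-removal recipe given for the MATRIX TRUNCATION command in item 9) c) above (subtract the background value, then raise any negative results to zero) can be sketched as follows.  This is an illustrative Python sketch, not Dataplot code.  The function names are hypothetical, and a matrix is represented as a plain list of rows.

```python
def matrix_subtract(m, value):
    """Subtract a scalar from every element of the matrix."""
    return [[x - value for x in row] for row in m]


def lower_truncation(m, limit):
    """Set every element that is less than limit to limit."""
    return [[max(x, limit) for x in row] for row in m]


def upper_truncation(m, limit):
    """Set every element that is greater than limit to limit."""
    return [[min(x, limit) for x in row] for row in m]


if __name__ == "__main__":
    m = [[4, 7], [5, 12]]
    # Background value of 5: subtract it, then clip negatives to 0.
    cleaned = lower_truncation(matrix_subtract(m, 5), 0)
    print(cleaned)  # [[0, 2], [0, 7]]
```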
    b) In the TABULATE command, if all elements in the response
       variable are identical, Dataplot now prints a warning message
       rather than an error message and performs the tabulation
       anyway.

    c) Corrected a bug in Friedman's test.  The previous version is
       correct if the original data is the rank within a block.  The
       corrected version does not require that the data already be
       ranked.

    d) The WILK SHAPIRO command was not returning the p-value in the
       saved parameter PVALUE correctly.  This was corrected.

    e) For the command

           LET Z2 = BIVARIATE INTERPOLATION Z Y X Y2 X2

       the Y and X arguments were in the wrong order (i.e., the
       command was interpreting Y X as X Y).  This was corrected.

    f) Fixed bugs in the

           LET X = CHARACTER CODE IX1
           LET X = ALPHABETIC CHARACTER CODE IX1

       commands.

    g) The command

           LET Y2 XLOW XUPP = COMBINE FREQUENCY TABLE Y X

       is used to combine low frequency bins.  The original
       implementation simply worked from left to right to combine
       the bins.  Since low frequency bins typically occur in the
       left and right tails, the algorithm was modified to move from
       the left tail to the center and then from the right tail to
       the center.

    h) Fixed a bug where the ORIENTATION command could cause
       Dataplot to hang on subsequent plots if no DEVICE 2 command
       was defined and a software font was used to draw text.

    i) Dataplot creates and uses a number of temporary files in the
       current directory.  If you have multiple sessions running
       from the current directory, this can create a problem for
       these temporary files.  In most cases, a conflict does not
       occur because Dataplot will open the file, read or write to
       the file, and then close the file immediately.  However, a
       few files, such as the plot files dppl1f.dat and dppl2f.dat,
       typically remain open.

       The effect of different Dataplot sessions trying to access
       these files is system dependent.

       1. On Unix and Windows 98/NT4 platforms, the file will
          contain whatever was most recently written to it.

       2.
          On Windows 2000/XP platforms, the Dataplot session that
          opens the file first has a "lock" on the file.  This
          causes any subsequent Dataplot session that tries to
          access the file to hang.

          This is particularly a problem with the GUI version on
          Windows 2000/XP.  Specifically, if the Dataplot GUI does
          not shut down cleanly, the underlying Dataplot executable
          does not get killed.  This then causes any future attempt
          to open the GUI to hang since the "dead" Dataplot
          executable has a lock on the file.  You have to use
          "Ctrl-Alt-Del" to bring up the Task Manager, select
          "Processes", and then manually kill any "DPLAHEY.EXE"
          processes in order to clear the dead process.  In
          particular, if you close the GUI by clicking the "x" in
          the upper right hand corner (rather than clicking the EXIT
          menu), this does not kill the underlying DPLAHEY.EXE
          process.

          As a partial solution to this problem, Dataplot should now
          trap this condition.  It will print a message indicating
          how to clear the "dead" DPLAHEY.EXE process.  In addition,
          it will do one of two things in the current Dataplot
          process:

          a. It will attach the process id to the temporary file
             name and then re-open the file.

          b. It will simply ignore the file (so if dppl2f.dat is
             locked, Dataplot will not write the current plot to
             dppl2f.dat in the current Dataplot session).

          You can specify which option Dataplot will use by entering
          one of the following commands in your startup file
          (c:\Program Files\NIST\DATAPLOT\DPLOGF.TEX):

              SET TEMPORARY FILE PID
              SET TEMPORARY FILE IGNORE

          The default is PID.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT June - August 2005.
-----------------------------------------------------------------------

1) The following matrix commands were added.

   a. The sum of all elements in a matrix can be computed with the
      following command

          LET A = MATRIX SUM M

   b.
      Previous versions of Dataplot allowed you to compute various
      column or row statistics (HELP MATRIX COLUMN STATISTIC or HELP
      MATRIX ROW STATISTIC for details).  This capability has been
      extended to the case of computing the statistics for the
      entire matrix with the command

          LET A = MATRIX GRAND <stat> M

      where <stat> denotes the desired statistic (the list of
      supported statistics is the same as for the MATRIX COLUMN
      STATISTIC and MATRIX ROW STATISTIC commands).

   c. Previous versions of Dataplot allowed you to compute various
      column or row statistics (HELP MATRIX COLUMN STATISTIC or HELP
      MATRIX ROW STATISTIC for details).  This capability has been
      extended to the case where the matrix is divided into equal
      partitions with the command

          LET MOUT = MATRIX PARTITION <stat> M NROW NCOL

      with M, NROW, and NCOL denoting the input matrix, the number
      of rows in each sub-matrix, and the number of columns in each
      sub-matrix, respectively.  Note that this command returns a
      matrix (MOUT) of values.

      That is, the original matrix is divided into sub-matrices
      containing NROW rows and NCOL columns each.  The partition
      starts at row 1 and column 1.  The number of rows in MOUT is
      determined by dividing the number of rows in M by NROW.
      Likewise, the number of columns is determined by dividing the
      number of columns in M by NCOL.  If this division does not
      result in an integer value (e.g., 23 columns in M and NCOL = 5
      results in 3 columns left over), then the last column, or row,
      of MOUT will be based on whatever columns, or rows, are left
      over.

      In addition, the MATRIX PARTITION command has been extended to
      accommodate unequal partitions where the partitions need not
      be contiguous.  The syntax in this case is

          LET MOUT = MATRIX PARTITION <stat> M TAGROW TAGCOL

      with M denoting the input matrix.  In this case, TAGROW and
      TAGCOL are vectors with TAGROW having the same number of rows
      as M and TAGCOL having the same number of columns as M.  The
      elements of TAGROW and TAGCOL identify which partition each
      element of M belongs to.
      The output matrix will be dimensioned based on the number of
      distinct values in TAGROW and TAGCOL.

2) The following commands were added to compute probability weighted
   moments and L-moments.

       LET P = PROBABILITY WEIGHTED MOMENTS Y
       LET L = L MOMENTS Y

3) The following distributional updates were made.

   a. Made the following enhancements to the generalized Pareto
      maximum likelihood command.

      1. L-moment and elemental percentile estimates are now
         included.  The L-moment estimators are a refinement of
         probability weighted moments.  The elemental percentile
         method is described in Castillo, Hadi, Balakrishnan, and
         Sarabia, "Extreme Value and Related Models with
         Applications in Engineering and Science", Wiley, 2005.  One
         advantage of the elemental percentile approach is that it
         does not have the restricted domain for the shape parameter
         that the moment and maximum likelihood estimators have.

      2. The elemental percentile estimate is now used as the
         starting value for the maximum likelihood.  This seems to
         improve the convergence of the ML method.

      3. The methods used (moments, L-moments, elemental
         percentiles, and maximum likelihood) do not estimate a
         location parameter.  By default, these methods will now use
         the minimum data value (minus an epsilon fudge factor) as
         the estimate of location.  This value is subtracted from
         the data before the estimation procedures are applied.

         If you would like to provide your own location estimate,
         enter the command

             LET THRESHOL = <value>

         Any data values less than the value specified for THRESHOL
         will be omitted from the estimation.  Note that the
         generalized Pareto is often used in the context of modeling
         the distribution of "points above a threshold", so
         specifying a threshold greater than some of the data points
         is fairly common.

      4. The maximum likelihood estimates now include the normal
         approximation confidence intervals for the scale and shape
         parameters and, optionally, for select percentiles of the
         data.
         To specify percentile estimates, enter the command

             SET MAXIMUM LIKELIHOOD PERCENTILES <varname>

         where <varname> specifies the name of a variable containing
         the desired percentiles.  You can specify DEFAULT to use a
         default set of values.

         Be aware that for the generalized Pareto maximum likelihood
         estimation, a relatively large sample size may be required
         for the asymptotic normal approximations to become
         reasonably accurate.  Some studies have indicated sample
         sizes of at least 500 may be required.

   b. Added support for the maximum likelihood estimation for the
      inverted Weibull distribution:

          INVERTED WEIBULL MLE Y
          INVERTED WEIBULL MLE Y X

      The first syntax supports the full sample case.  It will
      return confidence intervals for the shape and scale parameters
      for various values of alpha (based on the normal
      approximations) and will return confidence intervals for
      selected percentiles if you have entered a

          SET MAXIMUM LIKELIHOOD PERCENTILES DEFAULT

      command.

      The second syntax supports the censored case.  This case
      currently only returns point estimates.

   c. The BINOMIAL MLE now returns improved confidence intervals.

   d. We have modified the output from a number of the maximum
      likelihood commands to make the output more consistent.

4) Made a number of bug fixes.  In particular

   a. Fixed a bug where the following form of the DERIVATIVE command
      wasn't being recognized:

          LET FUNCTION D = DERIVATIVE F WRT X

      This syntax should now work.

   b. Fixed the DIFFERENCE OF MEANS CONFIDENCE INTERVALS command (in
      adding support for the HTML/LATEX output, we had shut off the
      standard ASCII output).  Fixed the HTML output for this
      command.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT January - May 2005.
-----------------------------------------------------------------------

1) Distributional Modeling Updates

   a. Dataplot provides extensive distributional modeling
      capabilities via probability plots and PPCC/KS plots.
      One limitation of these methods is that they do not provide
      estimates for the uncertainty of the parameter estimates and
      for the distribution quantiles.

      The BOOTSTRAP ... PLOT command was enhanced to support
      distributional modeling for a number of distributions.  This
      can be used to obtain confidence intervals for the
      distribution parameters, for selected percentiles of the
      distribution, and for the value of the PPCC (or K-S
      statistic).

      For details, enter

          HELP DISTRIBUTIONAL BOOTSTRAP

   b. For the case of one shape parameter, the PPCC plot was
      enhanced to support a group option (where group means multiple
      batches of data as opposed to binned data).  In this case, a
      separate curve is drawn for each batch of the data.  This can
      be used to check for a common shape parameter across multiple
      batches of data.

      For details, enter

          HELP PPCC PLOT

   c. The PPCC PLOT and PROBABILITY PLOT commands support binned
      data.  Previously, the binning consisted of two variables: the
      first contained the bin frequencies and the second contained
      the mid-point of the bins.  This form assumes the bins are of
      equal width.

      Some binned data may contain bins of unequal width.  The most
      common reason for this is to combine bins in the tails which
      have low frequencies.  The PPCC PLOT and PROBABILITY PLOT
      commands were updated to handle this case.  In this case, the
      syntax is

          PPCC PLOT Y XLOW XHIGH
          PROBABILITY PLOT Y XLOW XHIGH

      with Y, XLOW, and XHIGH denoting the frequency variable, the
      lower class boundary, and the upper class boundary,
      respectively.

      For details, enter

          HELP PPCC PLOT
          HELP PROBABILITY PLOT

   d. The following enhancements were made to the maximum likelihood
      estimation.

      1. Added confidence intervals for the location and scale
         parameters for the double exponential case (DOUBLE
         EXPONENTIAL MAXIMUM LIKELIHOOD Y).

      2. Added a weighted order statistics method to the Cauchy
         maximum likelihood estimation (CAUCHY MLE Y).
         This method was added because it is the method recommended
         for the Cauchy Anderson-Darling test (see D'Agostino and
         Stephens, "Goodness-Of-Fit Techniques", Marcel Dekker,
         1986, p. 164).

      3. Added support for the maximum case of the 2-parameter
         extreme value type 2 (Frechet) distribution.  This includes
         confidence intervals for the estimated parameters and for
         select percentiles (see SET MAXIMUM LIKELIHOOD
         PERCENTILES).

   e. The Anderson-Darling test now supports the extreme value type
      2 (Frechet) for the maximum case and the Cauchy distribution.

   f. Added support for the minimum case for the generalized extreme
      value distribution.  Added the GEVHAZ and GEVCHAZ functions to
      compute the hazard and cumulative hazard functions for the
      generalized extreme value distribution.

   g. A number of distributions (Weibull, Gumbel, Frechet, and
      generalized extreme value) support both a minimum and a
      maximum case.  The command

          SET MINMAX <1/2>

      is used to specify which case (1 = minimum, 2 = maximum).  If
      no MINMAX command is entered, previous versions used the value
      1 as the default (this was chosen since the minimum case is
      what is typically used for the Weibull distribution).
      However, for the other distributions, the maximum case is
      generally the one most used.  For this reason, we added the
      value 0 to indicate the default where the default is now
      specific to each distribution.  For the Weibull, the default
      is the minimum and for the Gumbel, Frechet, and generalized
      extreme value the default is the maximum.

2) Interlaboratory Analysis Updates

   Dataplot added the following commands to perform an
   interlaboratory analysis as documented in "Standard Practice for
   Conducting an Interlaboratory Study to Determine the Precision of
   a Test Method", ASTM International, 100 Barr Harbor Drive, PO BOX
   C700, West Conshohocken, PA 19428-2959, USA.  This document is in
   support of ASTM Standard E 691 - 99.
   The specific commands added are:

       LET A = REPEATABILITY STANDARD DEVIATION Y LABID
       LET A = REPRODUCABILITY STANDARD DEVIATION Y LABID
       LET H = H CONSISTENCY STATISTIC Y LABID
       LET K = K CONSISTENCY STATISTIC Y LABID
       LET H TAG = H CONSISTENCY STATISTIC Y LABID MATID
       LET K TAG = K CONSISTENCY STATISTIC Y LABID MATID
       E691 INTERLAB Y LABID MATID

   The E691 INTERLAB command generates four tables documented in the
   above document.  The other commands are useful in generating the
   plots described in this standard.

   In addition, a number of built-in macros were added to generate
   the various graphs demonstrated in the standard.

   For more information, enter

       HELP E691 INTERLAB

3) The following command can be useful in converting data in a
   two-way table to a format required by certain Dataplot commands

       LET Y MATID LABID = REPLICATED STACK X1 ... XK LAB

   The resulting output has the form

       X1(1)   1   LAB(1)
        ...
       X1(n)   1   LAB(n)
       X2(1)   2   LAB(1)
        ...
       X2(n)   2   LAB(n)
        ...
       Xk(1)   k   LAB(1)
        ...
       Xk(n)   k   LAB(n)

   This is a variation of the STACK command.  The distinction is
   that the last variable entered is interpreted as a labid variable
   that is replicated for each of the response variables.

   For details, enter

       HELP REPLICATED STACK

4) Extreme Value Analysis

   a. Enhancements were made to the CME and DEHAAN commands (these
      estimate the parameters for a generalized Pareto
      distribution).

   b. Added the following command

          PEAKS OVER THRESHOLD PLOT Y

      For details, enter HELP PEAKS OVER THRESHOLD PLOT.

5) Platform Specific Issues

   a) We have separated the Windows installation files into two
      distinct cases:

        i) Windows 2000/XP platforms
       ii) Windows 95/98/NT4/ME platforms

      This was required for compiler compatibility reasons.  The
      Lahey LF90 and Compaq Visual Fortran compilers were starting
      to show some problems under Windows XP (specifically with
      Service Pack 2).  For Windows 2000/XP, we have upgraded to the
      Intel 8.1 Fortran compiler.  However, this compiler does not
      support Windows 98 and earlier platforms.
      So the Windows 95/98/NT4/ME version is still built using the
      Lahey (for the GUI) and Compaq compilers.

   b) We have updated the Mac OSX installation.  There is now a
      single file that you download that includes the executable,
      the auxiliary files, the source, the needed Tcl/Tk files, and
      the g77 compiler.  This simplifies the installation (e.g., you
      do not have to install Tcl/Tk yourself).

6) We have started overhauling some of the menus for the graphical
   interface (GUI).  This will not be radically different, just an
   effort to provide better organization and clarity to the menus.
   This updating will occur over several releases.

   The initial update has re-arranged the top level menus.  We have
   added a "Getting Started" menu to help new users.  The
   Reliability and Extreme Values menus have been reorganized.

7) Dataplot uses the "." for the decimal point when reading data.
   Some countries use the "," for this purpose.  We have added the
   command

       SET DECIMAL POINT <char>

   with <char> denoting the character to be used as the decimal
   point.

   Note that the use of this is currently fairly limited.  It is
   used in free-format reads only.  It is provided to allow
   international users the ability to read their data files without
   editing them.  Note that it does not apply if you use the SET
   READ FORMAT command to define a format for the data.  It is also
   not used for writing data nor for the output from Dataplot
   commands.

8) Fixed a number of bugs.

   a. Fixed the COLUMN LIMITS where the specified limits are arrays
      (as opposed to single scalar values) to work in the case where
      columns are of unequal length.

   b. Internally, Dataplot treats strings and functions
      interchangeably.  The one distinction is that strings preserve
      case.  However, when strings are operating as functions, we
      want them to be converted to upper case.  Dataplot was updated
      so that when a string is used as a function, it is converted
      to upper case.
      This also required some updates in the "^" and "&" string
      operators to handle case conversions appropriately.

   c. Fixed a bug in the Wilcoxon signed rank test when it was used
      for a 1-sample test.

   d. For the generalized Pareto percent point function, the scale
      parameter was ignored.  This was corrected.

   e. Fixed a bug in the HFLPPF library function.

   f. The GRUBBS TEST checks for both the maximum and minimum values
      as outliers (relative to the normal distribution).  This is
      actually two tests: one for the minimum value and one for the
      maximum value.  When testing for both, the value of alpha
      needs to be divided by 2.

      The fix was to have the Grubbs test generate output for 3
      tests:

      1) Test both the minimum and the maximum value (with the value
         of alpha adjusted appropriately).
      2) Test the minimum value only.
      3) Test the maximum value only.

      To suppress the one-sided tests, enter the command

          SET GRUBBS ONE SIDED OFF

   g. Fixed a bug in the discrete uniform random number generator.
      The algorithm was generating random numbers on the interval
      [1,N].  This was corrected to generate random numbers on the
      interval [0,N].

   h. If the PRINTING switch was set to OFF, the YATES command was
      not writing information to files "dpst1f.dat" and
      "dpst2f.dat".  This was corrected so that these files are
      written regardless of the setting of the PRINTING switch.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT June - December 2004.
-----------------------------------------------------------------------

1) The following updates were made for probability distributions.

   A. The following enhancements were made to maximum likelihood
      estimation.

      1. The maximum likelihood output was rewritten for the normal,
         lognormal, exponential, Weibull, gamma, beta, Gumbel, and
         Pareto distributions.  Support was added for the following:

         a. Improved confidence intervals for the distributional
            parameters.

         b.
            Support for censored data was added for the normal,
            lognormal, exponential, Weibull, and gamma
            distributions.

         c. Confidence intervals for selected percentiles were added
            for the normal, lognormal, exponential, Weibull, gamma,
            beta, and Gumbel distributions.

      2. Added support for the Rayleigh, Maxwell, asymmetric
         Laplace, generalized Pareto, and normal mixture
         distributions:

             RAYLEIGH MAXIMUM LIKELIHOOD Y
             MAXWELL MAXIMUM LIKELIHOOD Y
             ASYMMETRIC LAPLACE MAXIMUM LIKELIHOOD Y
             GENERALIZED PARETO MAXIMUM LIKELIHOOD Y

             LET NCOMP = <value>
             NORMAL MIXTURE MAXIMUM LIKELIHOOD Y

         The NCOMP parameter is used to specify how many normal
         distributions to mix (it defaults to 2 if a value is not
         specified for NCOMP).

      The online help for the maximum likelihood was also rewritten.
      Enter HELP MAXIMUM LIKELIHOOD for details.

   B. Support was added for the following new distributions.

      Skew-Laplace (Skew Double Exponential) distribution:

          LET A = SDECDF(X,LAMBDA) - cdf of skew-Laplace distribution
          LET A = SDEPDF(X,LAMBDA) - pdf of skew-Laplace distribution
          LET A = SDEPPF(X,LAMBDA) - ppf of skew-Laplace distribution

      Asymmetric Laplace (Asymmetric Double Exponential)
      distribution:

          LET A = ADECDF(X,LAMBDA) - cdf of asymmetric Laplace
                                     distribution
          LET A = ADEPDF(X,LAMBDA) - pdf of asymmetric Laplace
                                     distribution
          LET A = ADEPPF(X,LAMBDA) - ppf of asymmetric Laplace
                                     distribution

      Maxwell-Boltzmann distribution:

          LET A = MAXCDF(X,SIGMA) - cdf of Maxwell-Boltzmann
          LET A = MAXPDF(X,SIGMA) - pdf of Maxwell-Boltzmann
          LET A = MAXPPF(X,SIGMA) - ppf of Maxwell-Boltzmann

      Rayleigh distribution:

          LET A = RAYCDF(X) - cdf of Rayleigh
          LET A = RAYPDF(X) - pdf of Rayleigh
          LET A = RAYPPF(X) - ppf of Rayleigh

      Generalized Inverse Gaussian distribution:

          LET A = GIGCDF(X,CHI,LAMBDA,THETA) - cdf of generalized
                                inverse gaussian distribution
          LET A = GIGPDF(X,CHI,LAMBDA,THETA) - pdf of generalized
                                inverse gaussian distribution
          LET A = GIGPPF(X,CHI,LAMBDA,THETA) - ppf of generalized
                                inverse gaussian distribution
      Generalized Asymmetric Laplace distribution:

          LET A = GALCDF(X,KAPPA,TAU) - cdf of generalized
                                asymmetric Laplace distribution
          LET A = GALPDF(X,KAPPA,TAU) - pdf of generalized
                                asymmetric Laplace distribution
          LET A = GALPPF(X,KAPPA,TAU) - ppf of generalized
                                asymmetric Laplace distribution

      Bessel I Function distribution:

          LET A = BEICDF(X,S1SQ,S2SQ,NU) - cdf of Bessel I function
                                distribution
          LET A = BEIPDF(X,S1SQ,S2SQ,NU) - pdf of Bessel I function
                                distribution
          LET A = BEIPPF(X,S1SQ,S2SQ,NU) - ppf of Bessel I function
                                distribution

      McLeish (related to Bessel K function) distribution:

          LET A = MCLCDF(X,ALPHA) - cdf of McLeish distribution
          LET A = MCLPDF(X,ALPHA) - pdf of McLeish distribution
          LET A = MCLPPF(X,ALPHA) - ppf of McLeish distribution

      Generalized McLeish (related to Bessel K function)
      distribution:

          LET A = GMCCDF(X,ALPHA,A) - cdf of generalized McLeish
                                distribution
          LET A = GMCPDF(X,ALPHA,A) - pdf of generalized McLeish
                                distribution
          LET A = GMCPPF(X,ALPHA,A) - ppf of generalized McLeish
                                distribution

   C. The following random number generators, plots, and commands
      were added:

          LET LAMBDA = <value>
          LET Y = SKEW LAPLACE RANDOM NUMBERS FOR I = 1 1 N
          SKEW LAPLACE PROBABILITY PLOT Y
          SKEW LAPLACE KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
          SKEW LAPLACE CHI-SQUARE GOODNESS OF FIT Y
          SKEW LAPLACE PPCC PLOT Y
          SKEW LAPLACE KS PLOT Y

          LET LAMBDA = <value>
          LET Y = ASYMMETRIC LAPLACE RANDOM NUMBERS FOR I = 1 1 N
          ASYMMETRIC LAPLACE PROBABILITY PLOT Y
          ASYMMETRIC LAPLACE KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
          ASYMMETRIC LAPLACE CHI-SQUARE GOODNESS OF FIT Y
          ASYMMETRIC LAPLACE PPCC PLOT Y
          ASYMMETRIC LAPLACE KS PLOT Y

          LET Y = MAXWELL RANDOM NUMBERS FOR I = 1 1 N
          MAXWELL PROBABILITY PLOT Y
          MAXWELL KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
          MAXWELL CHI-SQUARE GOODNESS OF FIT Y

          LET Y = RAYLEIGH RANDOM NUMBERS FOR I = 1 1 N
          RAYLEIGH PROBABILITY PLOT Y
          RAYLEIGH KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
          RAYLEIGH CHI-SQUARE GOODNESS OF FIT Y

          LET CHI = <value>
          LET LAMBDA = <value>
          LET THETA = <value>
          LET Y = GENERALIZED INVERSE GAUSSIAN RANDOM NUMBERS ...
        FOR I = 1 1 N
    GENERALIZED INVERSE GAUSSIAN PROBABILITY PLOT Y
    GENERALIZED INVERSE GAUSSIAN KOLMOGOROV SMIRNOV ...
        GOODNESS OF FIT Y
    GENERALIZED INVERSE GAUSSIAN CHI-SQUARE ...
        GOODNESS OF FIT Y

    LET KAPPA =
    LET TAU =
    LET Y = GENERALIZED ASYMMETRIC LAPLACE RANDOM NUMBERS ...
        FOR I = 1 1 N
    GENERALIZED ASYMMETRIC LAPLACE PROBABILITY PLOT Y
    GENERALIZED ASYMMETRIC LAPLACE KOLMOGOROV SMIRNOV ...
        GOODNESS OF FIT Y
    GENERALIZED ASYMMETRIC LAPLACE CHI-SQUARE ...
        GOODNESS OF FIT Y

    LET S1SQ =
    LET S2SQ =
    LET NU =
    LET Y = BESSEL I FUNCTION RANDOM NUMBERS FOR I = 1 1 N
    BESSEL I FUNCTION PROBABILITY PLOT Y
    BESSEL I FUNCTION KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
    BESSEL I FUNCTION CHI-SQUARE GOODNESS OF FIT Y

    LET ALPHA =
    LET Y = MCLEISH RANDOM NUMBERS FOR I = 1 1 N
    MCLEISH PROBABILITY PLOT Y
    MCLEISH KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
    MCLEISH CHI-SQUARE GOODNESS OF FIT Y
    MCLEISH PPCC PLOT Y
    MCLEISH KS PLOT Y

    LET ALPHA =
    LET A =
    LET Y = GENERALIZED MCLEISH RANDOM NUMBERS FOR I = 1 1 N
    GENERALIZED MCLEISH PROBABILITY PLOT Y
    GENERALIZED MCLEISH KOLMOGOROV SMIRNOV GOODNESS OF FIT Y
    GENERALIZED MCLEISH CHI-SQUARE GOODNESS OF FIT Y
    GENERALIZED MCLEISH PPCC PLOT Y
    GENERALIZED MCLEISH KS PLOT Y

D. Dataplot uses the following definition for the generalized Pareto probability density function:

    f(x,gamma) = (1+gamma*x)**(-(1/gamma)-1)

However, many sources (e.g., Johnson, Kotz, and Balakrishnan) define the generalized Pareto as:

    f(x,gamma) = (1-gamma*x)**((1/gamma)-1)

That is, the sign of gamma is reversed. The following command was added:

    SET GENERALIZED PARETO DEFINITION

A value of JOHNSON or KOTZ for this command selects the second definition given. Any other value selects the first (default) definition.

E.
For the Pareto and Pareto type 2 distributions, what is typically referred to as the location parameter (the A parameter) is not a location parameter in the technical sense that the relation f(x;gamma,loc) = f((x-loc);gamma,0) does not hold (it is a location parameter in the sense that it defines a lower bound for the Pareto, but not the Pareto type 2, distribution). For this reason, we modified the Dataplot definition to treat A as a second shape parameter. For example, the Pareto PDF function is PARPDF(x,gamma,a,loc,scale) The A, LOC, and SCALE parameters are optional (A will default to 1 if not given). F. The following enhancements were made to the probability plot and ppcc/ks plots. Note that both the probability plot and the ppcc plot ultimately depend on computing the percent point function for the specified distribution. If the percent point function is fast to compute (e.g., if it exists as a simple, closed formula), then these plots can be generated rapidly even if the number of data points is large. On the other hand, some percent point functions can require a good deal of computation. For example, some distributions compute the cumulative distribution function via numerical integration and then compute the percent point function by inverting the cumulative distribution function. In these cases, the ppcc/ks plots can take too long to generate to be practical (this tends to be less of an issue with probability plots). 1. The following commands can be used to control how many points are used to generate probability and ppcc/ks plots, respectively: SET PROBABILITY PLOT DATA POINTS SET PPCC PLOT DATA POINTS The algorithm is to compute equally spaced percentiles of the full data set and then use these percentiles in generating the probability and ppcc/ks plot. Using this command involves a trade-off between speed and accuracy. 
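The thinning step described above can be illustrated with a short Python sketch. This is not Dataplot's implementation: the helper name `thin_to_percentiles`, the probabilities (i - 0.5)/k, and the linear interpolation between order statistics are all assumptions, since the text does not say which percentile definition Dataplot uses.

```python
def thin_to_percentiles(data, k):
    """Reduce data to k equally spaced percentiles (hypothetical helper).

    Assumptions: probabilities are (i - 0.5)/k for i = 1..k and
    percentiles are linearly interpolated between order statistics.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    if k >= n:
        return xs  # fewer points than requested: keep the full sample
    thinned = []
    for i in range(1, k + 1):
        p = (i - 0.5) / k          # assumed equally spaced probability
        pos = p * (n - 1)          # fractional index into the sorted data
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        thinned.append(xs[lo] + frac * (xs[hi] - xs[lo]))
    return thinned
```

A probability plot built from `thin_to_percentiles(y, 50)` then needs 50 percent point function evaluations instead of one per observation, which is the speed side of the trade-off; the accuracy cost is that detail between the retained percentiles is lost.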
For distributions with simple, closed formulas or fast approximations for the percent point function, there is little reason not to use the full data set. However, for many distributions, the ppcc plot or ks plot can become impractical as the number of data points increases. The minimum number of points is 20. The number of points is typically set between 50 and 100. You may want to use fewer than 50 points for a few distributions with particularly expensive percent point functions. For distributions with only moderately expensive percent point functions, you may want to go as high as 100 or 200.

2. For the ppcc (or ks) plot, each point on the plot represents one underlying probability plot (which in turn requires n computations of the percent point function, where n is the sample size). For distributions with one shape parameter, Dataplot typically uses 50 points (i.e., there are 50 underlying probability plots computed). For two shape parameters, Dataplot typically uses between 20 and 50 values for each shape parameter. It decreases the number of values used when the percent point function is expensive to compute.

The following command allows you to explicitly specify how many probability plots are generated by the ppcc plot:

    SET PPCC PLOT AXIS POINTS

followed by the number of values to use for the first and second shape parameters, respectively. Specifying the second value is optional. Set these values to 0 in order to revert to the Dataplot default.

There are actually two reasons for using this command. If the percent point function is fast to compute (e.g., the Weibull distribution), you may want to increase the number of points in order to generate a finer grid. On the other hand, if the percent point function is expensive to compute, you may want to decrease the number of points to speed up the generation of the plot.

3. If the ppcc (or ks) plot has two shape parameters, then the default graphical format is to plot the ppcc (or ks) value on the y-axis.
Each curve on the plot represents one value of one shape parameter while the x-axis coordinate represents the value of the other shape parameter. To reverse the roles of the shape parameters, enter the command

    SET PPCC PLOT AXIS ORDER REVERSE

To restore the default, enter

    SET PPCC PLOT AXIS ORDER DEFAULT

4. The PPCC PLOT will write the following to the file dpst2f.dat (in the current directory):

    PPCC     LOCATION    SCALE       SHAPE1      SHAPE2
    VALUE    PARAMETER   PARAMETER   PARAMETER   PARAMETER

This can be useful for plotting how the estimates of location and scale change as the shape parameter changes. In some cases, a less optimal value of the shape parameters may be preferred if it generates more realistic estimates for location and scale.

5. The PROBABILITY PLOT and PPCC PLOT were updated to support multiply censored data. The syntax is

    CENSORED PROBABILITY PLOT Y X
    CENSORED PPCC PLOT Y X

The X variable identifies which points represent failure times and which represent censoring times. Specifically, X = 1 implies a failure time and X = 0 represents a censoring time. The word CENSORED is required to distinguish this syntax from the syntax for binned data. Censored probability plots and censored ppcc plots do not apply to binned data.

Dataplot supports two algorithms for determining plot coordinates for a censored probability plot.

i. The uniform order statistic medians are generated based on the full sample size. However, only values that represent a failure time are actually plotted.

ii. Instead of uniform order statistic medians, the plotting positions for the failure times are computed using the Kaplan-Meier product limit estimate:

    U(i) = ((n+0.7)/(n+0.4))* PRODUCT[q=1 to i][(n-q+0.7)/(n-q+1.7)]

with n denoting the full sample size and q denoting failure times only. The theoretical quantile is then the percent point function of U(i).

The censored ppcc plot is then based on the correlation coefficient of the censored probability plot.
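As an illustration of the Kaplan-Meier formula in algorithm ii, the following Python sketch evaluates U(i) for a small censored sample. It is a sketch under stated assumptions rather than Dataplot's code: the function name `km_plotting_positions` is hypothetical, and q is taken to be the rank of an observation in the full sorted sample, with only the ranks of failure times contributing factors to the product.

```python
def km_plotting_positions(times, failed):
    """Evaluate U(i) from the text for each failure time.

    times  - observed times (failures and censoring times)
    failed - 1 for a failure time, 0 for a censoring time
    Assumption: q is the rank in the full sorted sample, and only
    ranks belonging to failure times enter the product.
    """
    n = len(times)
    if n != len(failed):
        raise ValueError("times and failed must have the same length")
    order = sorted(range(n), key=lambda j: times[j])
    scale = (n + 0.7) / (n + 0.4)
    product = 1.0
    positions = []
    for rank, j in enumerate(order, start=1):
        if failed[j] == 1:
            product *= (n - rank + 0.7) / (n - rank + 1.7)
            positions.append((times[j], scale * product))
    return positions


# Four units: the second one is censored, so it gets no plotting position.
for t, u in km_plotting_positions([10, 20, 30, 40], [1, 0, 1, 1]):
    print(t, round(u, 4))
```

Every factor in the product is less than 1, so U(i) as written decreases with i. The text does not say whether U(i) or 1 - U(i) is passed to the percent point function, so the sketch returns the value exactly as the formula gives it.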
To specify which censoring algorithm to use, enter the commands

    SET CENSORED PROBABILITY PLOT
    SET CENSORED PPCC PLOT

The default is to use the uniform order statistic medians algorithm.

G. The following enhancements were made to the Kolmogorov-Smirnov goodness of fit command and the KS PLOT.

1. The KS PLOT for the binned case (KS PLOT Y X) now automatically plots the chi-square goodness of fit statistic rather than the Kolmogorov-Smirnov goodness of fit statistic. This is done since the chi-square goodness of fit is explicitly based on binned data. Note that bins with fewer than 5 observations are automatically combined so that the minimum bin size is at least 5.

2. The KS PLOT will write the following to the file dpst2f.dat (in the current directory):

    PPCC     LOCATION    SCALE       SHAPE1      SHAPE2
    VALUE    PARAMETER   PARAMETER   PARAMETER   PARAMETER

This can be useful for plotting how the estimates of location and scale change as the shape parameter changes. In some cases, a less optimal value of the shape parameters may be preferred if it generates more realistic estimates for location and scale.

2) The following graphics commands were added.

a. Univariate average shifted histograms can be generated with the command:

    ASH HISTOGRAM Y

3) The following analysis commands were added.

a. Cochran's test can be performed with the command

    COCHRAN TEST Y X

where Y is a response variable and X is a group identifier variable. Cochran's test is an alternative to the Kruskal-Wallis test when the response variable is dichotomous (i.e., only 2 possible values).

b. The Kruskal-Wallis test was enhanced to write the pairwise multiple comparisons to the file dpst1f.dat.

c. Van Der Waerden's test can be performed with the command

    VAN DER WAERDEN TEST Y X

where Y is a response variable and X is a group identifier variable. Van Der Waerden's test is an alternative to KRUSKAL WALLIS that is based on normal scores of the ranks.

4) The following statistics and LET subcommands were added.

a.
Kendall's tau can be computed with the command

    LET A = KENDELL TAU Y1 Y2

b. For the chi-square goodness of fit, it is generally advisable to combine bins with small counts (typically, 5 is recommended as a minimum bin size). To convert equal width bins to variable width bins with a minimum bin count, enter the commands

    LET MINSIZE =
    LET Y2 XLOW XUPPER = Y X

c. The commands

    LET Y2 X2 = ASH BINNED Y
    LET Y2 X2 = COUNTS ASH BINNED Y

generate frequency tables based on the average shifted histogram (see ASH HISTOGRAM above). The first syntax returns the relative frequency while the second syntax returns a count.

5) The following enhancements were made to the READ command.

a. In previous versions of Dataplot, if your data set contained rows with an unequal number of columns, Dataplot would only read the number of variables corresponding to the row with the minimum number of columns. If you would like Dataplot to pad missing columns with a missing value, enter the command

    SET READ PAD MISSING COLUMNS ON

For example, if you enter the command

    READ FILE.DAT X1 X2 X3 X4 X5

then rows with fewer than five columns will have the missing columns set to a missing value. To set the numeric value that represents a missing value, enter

    SET READ MISSING VALUE

followed by the desired numeric value. To reset the default behavior, enter the command

    SET READ PAD MISSING COLUMNS OFF

In some cases, missing columns would be indicative of an error in the data file.

b. The SUBSET/EXCEPT/FOR clause on a READ command was ambiguous. The ambiguity arises from the fact that it is not clear whether the SUBSET/EXCEPT/FOR clause refers to the lines in the data file being read or to the output variables that are created by the READ command. We address this with the following command:

    SET READ SUBSET

In this command, PACK means the SUBSET/EXCEPT/FOR clause does not apply while DISPERSE means that it does.
The first setting applies to the input file while the second setting applies to the created data variables. This is demonstrated with the following example (note that P-D means the data file is set to PACK and the output variable is set to DISPERSE). The first column is the data in the file while the remaining columns show what the resulting data variable should look like.

    READ FILE.DAT X FOR I = 1 2 10

       X      P-D      P-P      D-P      D-D
    ===========================================
       1       1        1        1        1
       2       0        2        3        0
       3       2        3        5        3
       4       0        4        7        0
       5       3        5        9        5
       6       0        6        -        0
       7       4        7        -        7
       8       0        8        -        0
       9       5        9        -        9
      10       -       10        -        -

The default setting is PACK-DISPERSE (this is the default because this is the behavior of previous versions of Dataplot).

6) Miscellaneous Updates

a. Added the command

    SET POSTSCRIPT DEFAULT COLOR

Postscript devices can be either black and white or color. Dataplot assumes black and white by default. After the DEVICE <2/3> POSTSCRIPT command, you can enter

    DEVICE <2/3> COLOR ON

Although this works fine for DEVICE 2, it presents complications for DEVICE 3 (this is the device used by the PP command to print the current graph to a Postscript printer). Dataplot opens/closes this device as needed without the user entering any commands. It can be difficult to determine when to insert a DEVICE 3 COLOR ON command. If you enter

    SET POSTSCRIPT DEFAULT COLOR ON

then Dataplot will assume Postscript devices are color (this applies to both DEVICE 2 and DEVICE 3, although it is primarily motivated for DEVICE 3 output).

b. The default algorithm for class width in Dataplot is to use 0.3*s where s is the sample standard deviation. A number of different algorithms have been proposed to obtain "optimal" class widths. The command

    SET HISTOGRAM CLASS WIDTH

can be used to specify the default class width that Dataplot will use for the HISTOGRAM and ASH HISTOGRAM commands. Additional choices may be added in future releases.
The current choices are:

    DEFAULT          - use 0.3*s
    SD               - use 0.3*s
    NORMAL           - use 2.5*s/n**(1/3)
    NORMAL CORRECTED - start with 2.5*s/n**(1/3). If the skewness is
                       between 0 and 3, multiply this by the correction
                       factor:
                          1/(1 - 0.006*skew + 0.27*skew**2 - 0.0069*skew**3)
                       If the kurtosis - 3 is between 0 and 6, multiply
                       by the correction factor:
                          1 - 0.2*(1 - EXP(-0.7*(kurt - 3)))
    IQ               - use 2.603*IQ/N**(1/3) where IQ is the
                       interquartile range

The NORMAL width is an optimal choice (in the sense of minimizing the integrated mean square error of the histogram) if the data is in fact normal. The NORMAL CORRECTED choice provides correction factors for moderate skewness and kurtosis. The IQ choice replaces s with a robust estimate of scale (the interquartile range) and should provide a reasonable bin width for a wide range of underlying distributions.

Since the "optimal" choice of bin width is dependent on the underlying distribution of the data, it is difficult to provide a default bin width that will work well in all cases (we are typically using the histogram to help determine what that underlying distribution actually is).

An explicit CLASS WIDTH command will override the default class width algorithm.

c. For the chi-square goodness of fit test, it is usually recommended that classes with fewer than 5 observations be combined in order to obtain a reasonably accurate approximation. Given data that is binned into equal size bins, you can automatically combine bins with small frequencies with the commands

    LET MINSIZE =
    LET Y3 XLOW XHIGH = COMBINE FREQUENCY TABLE Y2 X2

The variables XLOW and XHIGH will contain the lower and upper boundary values for the classes (since bins will no longer be of equal length), respectively. The value for MINSIZE defines the minimum frequency for a class (it defaults to 5).
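The idea of combining sparse classes can be sketched in Python. The text does not describe the rule COMBINE FREQUENCY TABLE actually follows, so the greedy left-to-right merge below is an assumption, as are the function name `combine_frequency_table` and the choice to pass the class boundaries explicitly.

```python
def combine_frequency_table(counts, lows, highs, minsize=5):
    """Merge adjacent classes until each holds at least minsize counts.

    Assumed rule (not taken from Dataplot): sweep left to right,
    accumulating classes until the running count reaches minsize;
    any remainder at the right end is folded into the last class.
    Returns (y3, xlow, xhigh).
    """
    if not (len(counts) == len(lows) == len(highs)):
        raise ValueError("counts, lows and highs must have equal length")
    merged = []                      # each entry is [count, low, high]
    acc, acc_low, last_high = 0, None, None
    for c, lo, hi in zip(counts, lows, highs):
        if acc_low is None:
            acc_low = lo             # start a new combined class
        acc += c
        last_high = hi
        if acc >= minsize:
            merged.append([acc, acc_low, hi])
            acc, acc_low = 0, None
    if acc_low is not None:          # leftover classes at the right end
        if merged:
            merged[-1][0] += acc
            merged[-1][2] = last_high
        else:
            merged.append([acc, acc_low, last_high])
    y3 = [m[0] for m in merged]
    xlow = [m[1] for m in merged]
    xhigh = [m[2] for m in merged]
    return y3, xlow, xhigh
```

For example, counts of 1, 2, 3, 8, 4, 1, 1 on unit-width classes from 0 to 7 become three classes with counts 6, 8, 6 and boundaries [0,3], [3,4], [4,7]. The total count is preserved, which is what the chi-square statistic requires.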
You can then generate a chi-square goodness of fit test with the command

    CHISQUARE GOODNESS OF FIT Y3 XLOW XHIGH

A typical sequence of commands for generating a chi-square goodness of fit test for a discrete distribution, starting from raw data, is

    LET AMIN = MINIMUM Y
    LET AMAX = MAXIMUM Y
    CLASS LOWER AMIN
    CLASS UPPER AMAX
    CLASS WIDTH 1
    LET Y2 X2 = BINNED Y
    LET MINSIZE = 5
    LET Y3 XLOW XHIGH = COMBINE FREQUENCY TABLE Y2 X2
    CHISQUARE GOODNESS OF FIT Y3 XLOW XHIGH

d. The CORRELATION MATRIX and COVARIANCE MATRIX commands compute the correlation and covariance matrices, respectively, of the columns of a matrix. If you would like these to be generated from the rows of the matrix, you can enter the commands

    SET CORRELATION MATRIX DIRECTION ROW
    SET COVARIANCE MATRIX DIRECTION ROW

To reset to the columns, enter

    SET CORRELATION MATRIX DIRECTION COLUMN
    SET COVARIANCE MATRIX DIRECTION COLUMN

7) Bug Fixes:

a. There was a bug reading numbers of the form

    -.23

In this case, the minus sign was being lost. You can work around this by entering the number as

    -0.23

This bug is fixed in the current version. NOTE: This bug was introduced in the 1/2004 version.

b. There was a bug reading rows containing a single character. This has been fixed. If you encounter this bug, you can work around it by inserting a leading space in the data file. NOTE: This bug was introduced in the 1/2004 version.

c. The SET commands that accepted file names as arguments did not support quoting. Enclosing the file name in quotes is required when the file name contains spaces or hyphens. This has been corrected.

d. There was a bug in the SUMMARY command where in some cases it did not extract the correct data. This has been fixed.

e. There was a bug in the KAPLAN MEIER PLOT command that caused the censoring variable to not be recognized. This has been corrected.

-------------------------------------------------------------------------
The following enhancements were made to DATAPLOT February - May 2004.
-------------------------------------------------------------------------

1) The following updates were made for probability distributions.

a. Support was added for the following new distributions.

Log-skew-normal distribution:

    LET A = LSNCDF(X,LAMBDA,SD) - cdf of log-skew-normal distribution
    LET A = LSNPDF(X,LAMBDA,SD) - pdf of log-skew-normal distribution
    LET A = LSNPPF(P,LAMBDA,SD) - ppf of log-skew-normal distribution

Log-skew-t distribution:

    LET A = LSTCDF(X,NU,LAMBDA,SD) - cdf of log-skew-t distribution
    LET A = LSTPDF(X,NU,LAMBDA,SD) - pdf of log-skew-t distribution
    LET A = LSTPPF(P,NU,LAMBDA,SD) - ppf of log-skew-t distribution

G-and-H distribution:

    LET A = GHCDF(X,G,H) - cdf of g-and-h distribution
    LET A = GHPDF(X,G,H) - pdf of g-and-h distribution

Note that the ppf function was added in a previous update.

Hermite distribution:

    LET A = HERCDF(X,A,B) - cdf of Hermite distribution
    LET A = HERPDF(X,A,B) - pdf of Hermite distribution
    LET A = HERPPF(P,A,B) - ppf of Hermite distribution

Yule distribution:

    LET A = YULCDF(X,P) - cdf of Yule distribution
    LET A = YULPDF(X,P) - pdf of Yule distribution
    LET A = YULPPF(P,P) - ppf of Yule distribution

b. The following pdf functions were added (these distributions previously supported the cdf and ppf functions).

    LET A = NCTPDF(X,NU,LAMBDA)     - pdf of non-central t
    LET A = DNTPDF(X,NU,L1,L2)      - pdf of doubly non-central t
    LET A = NCCPDF(X,NU,LAMBDA)     - pdf of non-central chi-square
    LET A = NCFPDF(X,NU1,NU2,L1)    - pdf of non-central F
    LET A = DNFPDF(X,NU1,NU2,L1,L2) - pdf of doubly non-central F
    LET A = NCBPDF(X,A,B,LAMBDA)    - pdf of non-central Beta

These pdf functions are computed by taking the numerical derivative of the corresponding cdf function. You may at times get warning messages that the derivative has not converged with sufficient accuracy (this occurs most frequently with the non-central Beta distribution).

c. The following enhancements were made to maximum likelihood estimation.

1.
The binomial case now generates lower and upper confidence limits based on the Agresti and Coull approximation.

2. The lognormal case now generates confidence limits for the shape and scale parameters.

3. Support was added for the following distributions:

    LOGARITHMIC SERIES MAXIMUM LIKELIHOOD Y
    GEOMETRIC MAXIMUM LIKELIHOOD Y
    BETA BINOMIAL MAXIMUM LIKELIHOOD Y
    NEGATIVE BINOMIAL MAXIMUM LIKELIHOOD Y
    HYPERGEOMETRIC MAXIMUM LIKELIHOOD Y
    HERMITE MAXIMUM LIKELIHOOD Y
    YULE MAXIMUM LIKELIHOOD Y
    FATIGUE LIFE MAXIMUM LIKELIHOOD Y
    GEOMETRIC EXTREME EXPONENTIAL MAXIMUM LIKELIHOOD Y
    FOLDED NORMAL MAXIMUM LIKELIHOOD Y
    CAUCHY MAXIMUM LIKELIHOOD Y

4. For the Johnson SU/SB distribution, a percentile estimator is now available (a method of moments estimator was previously available):

    JOHNSON PERCENTILE Y

Note that this estimator will automatically determine whether an SB or SU estimator is appropriate. Also, you can define a constant Z used by this estimator by entering the command (before the JOHNSON PERCENTILE command):

    LET Z =

This value is typically set between 0.5 and 1 with a default value of 0.54. As the sample size gets larger, values of Z closer to 1 become appropriate (e.g., for a sample of size 1,000, a value of 0.8 works well).

5. Support for Latex and HTML output was added to most supported distributions.

d.
The following random number generators were added:

    LET NU =
    LET LAMBDA =
    LET Y = NONCENTRAL T RANDOM NUMBERS FOR I = 1 1 N

    LET NU =
    LET LAMBDA1 =
    LET LAMBDA2 =
    LET Y = DOUBLY NONCENTRAL T RANDOM NUMBERS FOR I = 1 1 N

    LET NU =
    LET LAMBDA =
    LET Y = NONCENTRAL BETA RANDOM NUMBERS FOR I = 1 1 N

    LET GAMMA =
    LET Y = GENERALIZED LOGISTIC RANDOM NUMBERS FOR I = 1 1 N

    LET GAMMA =
    LET Y = GENERALIZED HALF-LOGISTIC RANDOM NUMBERS FOR I = 1 1 N

    LET ALPHA =
    LET BETA =
    LET Y = HERMITE RANDOM NUMBERS FOR I = 1 1 N

    LET P =
    LET Y = YULE RANDOM NUMBERS FOR I = 1 1 N

    LET A =
    LET C =
    LET Y = WARING RANDOM NUMBERS FOR I = 1 1 N

    LET A =
    LET B =
    LET C =
    LET Y = GENERALIZED WARING RANDOM NUMBERS FOR I = 1 1 N

The t, F, and chi-square random number generators were updated to accept non-integer values for the degrees of freedom parameters.

e. The following additions were made to the probability plot, Kolmogorov-Smirnov goodness of fit, chi-square goodness of fit, and ppcc plot commands:

    LET LAMBDA =
    LET SD =
    LOG SKEW NORMAL PROBABILITY PLOT Y
    LOG SKEW NORMAL KOLMOGOROV-SMIRNOV GOODNESS OF FIT Y
    LOG SKEW NORMAL CHI-SQUARE GOODNESS OF FIT Y
    LOG SKEW NORMAL PPCC PLOT Y

    LET LAMBDA =
    LET SD =
    LET NU =
    LOG SKEW T PROBABILITY PLOT Y
    LOG SKEW T KOLMOGOROV-SMIRNOV GOODNESS OF FIT Y
    LOG SKEW T CHI-SQUARE GOODNESS OF FIT Y

    LET G =
    LET H =
    G AND H PROBABILITY PLOT Y
    G AND H KOLMOGOROV-SMIRNOV GOODNESS OF FIT Y
    G AND H CHI-SQUARE GOODNESS OF FIT Y
    G AND H PPCC PLOT Y

    LET ALPHA =
    LET BETA =
    HERMITE PROBABILITY PLOT Y
    HERMITE CHI-SQUARE GOODNESS OF FIT Y
    HERMITE PPCC PLOT Y

    LET P =
    YULE PROBABILITY PLOT Y
    YULE CHI-SQUARE GOODNESS OF FIT Y
    YULE PPCC PLOT Y

f.
The Anderson-Darling test was updated to support the generalized Pareto distribution:

    ANDERSON-DARLING GENERALIZED PARETO TEST Y

The maximum likelihood estimation for the generalized Pareto is still undergoing algorithmic development, so you should specify the shape and scale parameters for the generalized Pareto (before invoking the Anderson-Darling test) as follows:

    LET GAMMA =
    LET A =

g. An optional definition was added for the geometric distribution. The default definition for the geometric distribution is the number of failures before the first success is obtained in a sequence of Bernoulli trials. The alternate definition is the number of trials up to and including the first success in a sequence of Bernoulli trials. This definition simply shifts the geometric distribution to start at X = 1 rather than X = 0.

To specify the alternate definition, enter the command

    SET GEOMETRIC DEFINITION DLMF

To restore the default definition, enter the command

    SET GEOMETRIC DEFINITION JOHNSON AND KOTZ

h. The negative binomial was updated to support non-integer arguments for the number of failures shape parameter (i.e., k).

i. A number of bug fixes and algorithmic improvements were made for the ppcc plots with two shape parameters and the random number generation for a few distributions.

2. The following enhancements were made to the PPCC PLOT and PROBABILITY PLOT commands.

a. For some long tailed distributions, there can be large variability in the tails. This can distort the estimates of location, PPA0, and scale, PPA1, of the line fitted to the probability plot. To address this, Dataplot now also returns PPA0BW and PPA1BW. These are the estimates obtained by performing two iterations of biweight weighting of the residuals. In most cases, the use of PPA0 and PPA1 is preferred. However, if the probability plot indicates the presence of extreme outliers in the tails, PPA0BW and PPA1BW may provide better estimates for the location and scale parameters.

b.
The following command was added as a variant of the ppcc plot:

    KS PLOT Y

prefixed by the name of any of the distributions supported by the PPCC PLOT command. This plot uses a similar concept to the ppcc plot. However, it uses the value of the Kolmogorov-Smirnov goodness of fit statistic rather than the correlation coefficient of the probability plot as the measure of distributional fit. In this case, the goal is to minimize the Kolmogorov-Smirnov goodness of fit statistic.

Although we are still developing experience with this plot, a few preliminary recommendations are:

1. For most continuous distributions with one shape parameter, the PPCC PLOT and KS PLOT generate similar estimates for the shape parameter.

2. The KS PLOT seems to perform better for at least some distributions with two shape parameters.

3. The KS PLOT generates a smoother plot for discrete distributions.

For additional information, enter

    HELP KS PLOT

c. For the PPCC PLOT and KS PLOT, the following command allows you to specify the desired format for the plot when there are two shape parameters:

    SET PPCC FORMAT

For the default setting, TRACE, these plots are generated as a multi-trace 2D plot. That is, the Y axis will represent the correlation (or value of the Kolmogorov-Smirnov statistic), the X axis will represent the value of the second shape parameter, and each trace will represent one of the values for the first shape parameter. If this value is set to 3D, the plot is represented as a 3D surface plot.

3. Sometimes data may only be available in the form of a frequency table. However, some Dataplot commands may expect the data in a "raw" format. The following command was added to convert frequency data to raw data:

    LET Y = FREQUENCY TO RAW X FREQ

For example,

    X   FREQ
    --------
    0    3
    1    2
    2    4

would be converted to

    0
    0
    0
    1
    1
    2
    2
    2
    2

-------------------------------------------------------------------------
The following enhancements were made to DATAPLOT June 2003-January 2004.
-------------------------------------------------------------------------

1) The following enhancements were made to the Dataplot I/O capabilities.

a) Previously, the Dataplot READ command was updated to handle the syntax

    READ FILE.DAT

In this case, Dataplot simply assigns the names X1, X2, and so on to the variables. Many packages accept data files where the first line contains the variable names. To support this in Dataplot, do the following:

    SET READ VARIABLE LABEL ON
    READ FILE.DAT

In this case, Dataplot will interpret the first line read as the variable names in the file.

b) Dataplot has previously not supported reading character variables in data files (with the one exception of READ ROW LABELS). If encountered, Dataplot would generate an error message and not read the data file correctly. To address this, we have added the command

    SET CONVERT CHARACTER

Setting this to ERROR will continue the current Dataplot action of reporting an error. This is recommended for the case when a file is supposed to contain only numeric data and the presence of character data is in fact indicative of an error in the data file.

Setting this to IGNORE will instruct Dataplot to simply ignore any fields containing character data.

Setting this to ON will read character fields and write them to the file "dpzchf.dat".

There are some restrictions on when Dataplot will try to read character data:

1) This only applies to the variable read case. That is, READ PARAMETER and READ MATRIX will ignore character fields or treat them as an error.

2) Dataplot will only try to read character data from a file. When reading from the keyboard (i.e., when READ is specified with no file name), character data will be ignored when a SET CONVERT CHARACTER ON is specified.

3) This capability is not supported for the SERIAL READ case.

4) The SET READ FORMAT command does not accept the "A" format specification for reading character fields.
Some of these restrictions may be addressed in subsequent releases of Dataplot. Enter HELP CONVERT CHARACTER for details.

c) The COLUMN LIMITS command has been updated to accept variable arguments. For example,

    COLUMN LIMITS LOWER UPPER

with LOWER and UPPER denoting variables (as opposed to parameters) each with N elements. Dataplot will parse the data file assuming that field one of the data is in columns LOWER(1) to UPPER(1), field two of the data is in columns LOWER(2) to UPPER(2), and so on. Note that only one numeric or character variable will be read in each field.

Many programs, Excel for example, will write data to ASCII files with the data values either left or right justified to a given column. If the ASCII file is written so that the decimal point is in a fixed column, then using the SET READ FORMAT command is typically recommended rather than COLUMN LIMITS with variable arguments. If the data file contains columns of equal length, then using this form of the COLUMN LIMITS command is not necessary. However, there are two cases where it is useful:

1) If you only want to read selected fields in the data file, then this form of the COLUMN LIMITS command easily allows you to do this.

2) If the data columns are of unequal length, as ASCII files created from Excel often are, then this form of the COLUMN LIMITS command allows these data files to be read correctly.

If a given field is empty, Dataplot interprets it as a missing value. By default, Dataplot will set the missing value to 0. If you would like to specify a value other than zero, then enter the command

    SET READ MISSING VALUE

followed by the desired value. Enter HELP COLUMN LIMITS for details.

d) If Excel writes a comma delimited ASCII file (.CSV), then missing values are denoted with ",,". In order to interpret these files correctly, you can enter the command

    SET READ DELIMITER

followed by the desired delimiter. The default delimiter is a comma.
If Dataplot encounters the delimiter before any valid data has been found, it interprets this as a missing value. Missing values are set to 0 unless a SET READ MISSING VALUE command has been entered (see above).

We have added a section in the online help files that provides general guidance on reading ASCII data files in Dataplot. This consolidates information documented under a number of different commands. For details, enter

    HELP ASCII FILES

2) The SET CONVERT CHARACTER ON command allows you to read character variables. We have added the following commands that operate on these character variables.

a) Many character variables are in fact group-id variables. In order to allow you to use these group-id variables in a numeric context, the following two commands were added:

    LET Y = CHARACTER CODE IX
    LET Y = ALPHABETIC CHARACTER CODE IX

with IX denoting the name of a character variable that has been read into Dataplot and Y denoting the name of a numeric variable that will be created by this command.

Both of these commands identify the unique rows in the character variable (Dataplot checks for exact matches; it does not try to guess if a typo has occurred, etc.). If there are K unique rows, Dataplot will generate coded values as the integer values from 1 to K. The distinction is that CHARACTER CODE will perform the coding in the order that the unique rows are encountered in the file while ALPHABETIC CHARACTER CODE will sort the unique character rows and code based on the alphabetic order.

b) Character variables are frequently used as group-id variables (e.g., Male and Female to identify sex). The following command creates a group-id variable from a character variable:

    LET IG = GROUP LABELS MONTH

with MONTH denoting the name of a character variable. The name IG will be used to denote a group-id variable. The number of rows in IG will be equal to the number of unique rows in MONTH.
Up to 5 group-id variables can be created and the maximum number of rows for a group-id variable is the maximum number of rows for a numeric variable divided by 100.

c) You can create a row label variable with the READ ROW LABEL command. Alternatively, you can now enter the command

    LET ROWLABEL = MONTH

with MONTH denoting the name of a character variable. Note that the variable name on the left hand side of the "=" must be ROWLABEL for this command to work.

d) The TIC MARK LABEL FORMAT and TIC MARK LABEL CONTENT commands have been updated to support the following:

    TIC MARK LABEL FORMAT GROUP LABEL
    TIC MARK LABEL CONTENT IG

    TIC MARK LABEL FORMAT ROW LABEL

    TIC MARK LABEL FORMAT VARIABLE
    TIC MARK LABEL CONTENT YVAR

Setting the tic mark label format to GROUP LABEL instructs Dataplot to use a group label variable for the contents of the tic mark label. The TIC MARK LABEL CONTENT command is then used to specify the name of the group label variable to use.

Setting the tic mark label format to VARIABLE is similar to the GROUP LABEL case. However, in this case a numeric variable is specified rather than a group label variable. This allows you to place your own numeric tic mark labels. For example, you can use this to generate a "reverse" axis.

Setting the tic mark label format to ROW LABEL allows you to use the row labels as the content for the tic mark labels. For example, this can be useful for labeling a bar chart.
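The difference between the two coding commands in item 2a can be illustrated with a short Python sketch. The function names are hypothetical stand-ins for CHARACTER CODE and ALPHABETIC CHARACTER CODE, and the alphabetic ordering here is Python's default string ordering, which is an assumption and may not match Dataplot's collating sequence in every case.

```python
def character_code(labels):
    """Code labels 1..K in order of first appearance (sketch)."""
    codes = {}
    coded = []
    for label in labels:
        if label not in codes:
            codes[label] = len(codes) + 1   # next unused integer code
        coded.append(codes[label])
    return coded


def alphabetic_character_code(labels):
    """Code labels 1..K by sorted order of the unique labels (sketch)."""
    ranked = sorted(set(labels))            # exact matches only, no fuzzy matching
    codes = {label: k for k, label in enumerate(ranked, start=1)}
    return [codes[label] for label in labels]


months = ["MAR", "JAN", "MAR", "FEB"]
print(character_code(months))             # [1, 2, 1, 3]
print(alphabetic_character_code(months))  # [3, 2, 3, 1]
```

Both functions return integers from 1 to K for K unique labels; only the assignment of codes to labels differs, which matters when the codes are later used to order groups on a plot axis.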
3) Support for the following univariate distributions was added:

    LET A = TRACDF(X,A,B,C,D) - cdf of trapezoid distribution
    LET A = TRAPDF(X,A,B,C,D) - pdf of trapezoid distribution
    LET A = TRAPPF(P,A,B,C,D) - ppf of trapezoid distribution
    LET A = GTRCDF(X,A,B,C,D,NU1,NU3,ALPHA) - cdf of generalized trapezoid distribution
    LET A = GTRPDF(X,A,B,C,D,NU1,NU3,ALPHA) - pdf of generalized trapezoid distribution
    LET A = GTRPPF(P,A,B,C,D,NU1,NU3,ALPHA) - ppf of generalized trapezoid distribution
    LET A = FTCDF(X,NU) - cdf of folded t distribution
    LET A = FTPDF(X,NU) - pdf of folded t distribution
    LET A = FTPPF(P,NU) - ppf of folded t distribution
    LET A = SNCDF(X,ALPHA) - cdf of skew normal distribution
    LET A = SNPDF(X,ALPHA) - pdf of skew normal distribution
    LET A = SNPPF(P,ALPHA) - ppf of skew normal distribution
    LET A = STCDF(X,NU,ALPHA) - cdf of skew t distribution
    LET A = STPDF(X,NU,ALPHA) - pdf of skew t distribution
    LET A = STPPF(P,NU,ALPHA) - ppf of skew t distribution
    LET A = SLACDF(X) - cdf of slash distribution
    LET A = SLAPPF(P) - ppf of slash distribution
    LET A = IBCDF(X,ALPHA,BETA) - cdf of inverted beta distribution
    LET A = IBPPF(P,ALPHA,BETA) - ppf of inverted beta distribution
    LET A = GHCDF(X,G,H) - cdf of g-and-h distribution
    LET A = GHPPF(P,G,H) - ppf of g-and-h distribution
    LET A = MAKCDF(X,XI,L,T) - cdf of Gompertz-Makeham distribution
    LET A = MAKPDF(X,XI,L,T) - pdf of Gompertz-Makeham distribution
    LET A = MAKPPF(P,XI,L,T) - ppf of Gompertz-Makeham distribution
    LET A = ZIPPDF(X,ALPHA) - pdf of Zipf distribution

Note that the IBPDF and SLAPDF functions were implemented previously. The GHPDF function is still under development.
You can generate random numbers for these distributions with the commands

    LET A = <value>
    LET B = <value>
    LET C = <value>
    LET D = <value>
    LET Y = TRAPEZOID RANDOM NUMBERS FOR I = 1 1 N

    LET A = <value>
    LET B = <value>
    LET C = <value>
    LET D = <value>
    LET NU1 = <value>
    LET NU3 = <value>
    LET ALPHA = <value>
    LET Y = GENERALIZED TRAPEZOID RANDOM NUMBERS FOR I = 1 1 N

    LET NU = <value>
    LET Y = FOLDED T RANDOM NUMBERS FOR I = 1 1 N

    LET ALPHA = <value>
    LET Y = SKEWED NORMAL RANDOM NUMBERS FOR I = 1 1 N

    LET NU = <value>
    LET ALPHA = <value>
    LET Y = SKEWED T RANDOM NUMBERS FOR I = 1 1 N

    LET G = <value>
    LET H = <value>
    LET Y = G AND H RANDOM NUMBERS FOR I = 1 1 N

    LET XI = <value>
    LET LAMBDA = <value>
    LET THETA = <value>
    LET Y = GOMPERTZ-MAKEHAM RANDOM NUMBERS FOR I = 1 1 N

    LET ALPHA = <value>
    LET Y = ZIPF RANDOM NUMBERS FOR I = 1 1 N

Random numbers for the slash and inverted beta distributions were added previously.

You can generate the following probability plots and goodness of fit tests

    LET A = <value>
    LET B = <value>
    LET C = <value>
    LET D = <value>
    TRAPEZOID PROBABILITY PLOT Y
    TRAPEZOID KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    TRAPEZOID CHI-SQUARE GOODNESS OF FIT TEST Y

    LET A = <value>
    LET B = <value>
    LET C = <value>
    LET D = <value>
    LET NU1 = <value>
    LET NU3 = <value>
    LET ALPHA = <value>
    GENERALIZED TRAPEZOID PROBABILITY PLOT Y
    GENERALIZED TRAPEZOID KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    GENERALIZED TRAPEZOID CHI-SQUARE GOODNESS OF FIT TEST Y

    LET NU = <value>
    FOLDED T PROBABILITY PLOT Y
    FOLDED T KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    FOLDED T CHI-SQUARE GOODNESS OF FIT TEST Y
    FOLDED T PPCC PLOT Y

    LET NU = <value>
    LET LAMBDA = <value>
    SKEW T PROBABILITY PLOT Y
    SKEW T KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    SKEW T CHI-SQUARE GOODNESS OF FIT TEST Y
    SKEW T PPCC PLOT Y

    LET LAMBDA = <value>
    SKEW NORMAL PROBABILITY PLOT Y
    SKEW NORMAL KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    SKEW NORMAL CHI-SQUARE GOODNESS OF FIT TEST Y
    SKEW NORMAL PPCC PLOT Y

    LET G = <value>
    LET H = <value>
    G AND H PROBABILITY PLOT Y
    G AND H KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    G AND H CHI-SQUARE GOODNESS OF FIT TEST Y
    G AND H PPCC PLOT Y

    LET XI = <value>
    LET LAMBDA = <value>
    LET THETA = <value>
    GOMPERTZ-MAKEHAM PROBABILITY PLOT Y
    GOMPERTZ-MAKEHAM KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y
    GOMPERTZ-MAKEHAM CHI-SQUARE GOODNESS OF FIT TEST Y

c) Added the following commands

    JOHNSON SU MOMENTS Y
    JOHNSON SB MOMENTS Y

to compute method of moment estimates for the Johnson SU and Johnson SB distributions.

d) The GUMBEL MAXIMUM LIKELIHOOD command was extended to support both the minimum and maximum cases (the previous version was restricted to the maximum case). Before the GUMBEL MAXIMUM LIKELIHOOD command, enter the command

    SET MINMAX 1

to specify the minimum case and

    SET MINMAX 2

to specify the maximum case.

e) Enter the following command to generate Dirichlet random numbers:

    LET M = DIRICHLET RANDOM NUMBERS ALPHA N

with ALPHA denoting a vector containing the shape parameters of the Dirichlet distribution and N denoting a scalar that specifies the number of rows to generate. M will be a matrix with N rows and k columns (where k is the number of elements in the ALPHA vector).

You can also compute the Dirichlet probability density or the log of the Dirichlet probability density with the commands

    LET M = DIRICHLET PDF X ALPHA
    LET M = DIRICHLET LOG PDF X ALPHA

f) Enter the following command to generate correlated uniform random numbers:

    LET U = MULTIVARIATE UNIFORM RANDOM NUMBERS SIGMA N

with SIGMA denoting the variance-covariance matrix of a multivariate normal distribution and N denoting the number of rows to generate.

g) The Anderson-Darling goodness of fit test was enhanced to include the following distributions:

    ANDERSON-DARLING LOGISTIC TEST Y
    ANDERSON-DARLING DOUBLE EXPONENTIAL TEST Y
    ANDERSON-DARLING UNIFORM TEST Y

The uniform case is for the uniform distribution on the (0,1) interval. This can also be used for fully specified distributions (i.e., the shape, location, and scale parameters are not estimated from the data). Simply calculate the appropriate CDF function with the specified shape, location, and scale parameters (this converts the data to the (0,1) interval) and apply the test for a uniform distribution.
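The recipe in g) for a fully specified distribution (transform the data with the specified CDF, then test the result for uniformity) can be sketched as follows. This Python is an illustration of the idea rather than Dataplot's implementation: it uses the textbook Anderson-Darling statistic A^2 = -n - (1/n) SUM (2i-1)[ln u(i) + ln(1 - u(n+1-i))] for the uniform (0,1) case, a normal distribution with known location and scale is assumed as the example, and the function names are hypothetical.

```python
import math


def normal_cdf(x, loc, scale):
    """CDF of a normal distribution with fully specified location and scale."""
    return 0.5 * (1.0 + math.erf((x - loc) / (scale * math.sqrt(2.0))))


def anderson_darling_uniform(u):
    """Anderson-Darling statistic for testing u against uniform (0,1).

    Every value must lie strictly between 0 and 1, otherwise log() fails.
    """
    u = sorted(u)
    n = len(u)
    total = 0.0
    for i, ui in enumerate(u, start=1):
        # u[n - i] is the (n + 1 - i)-th order statistic in 1-based terms
        total += (2 * i - 1) * (math.log(ui) + math.log(1.0 - u[n - i]))
    return -n - total / n


# Data hypothesized to come from a normal distribution with loc=10, scale=2.
y = [7.1, 8.4, 9.0, 9.6, 10.1, 10.5, 11.2, 11.9, 12.8, 13.5]
u = [normal_cdf(v, 10.0, 2.0) for v in y]  # maps the data to (0,1)
a2 = anderson_darling_uniform(u)
print(a2)
```

Large values of the statistic indicate departure from the hypothesized distribution; the critical values to compare against are not shown here.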
h) The following maximum likelihood estimation commands were added:

    LOGISTIC MAXIMUM LIKELIHOOD Y
    UNIFORM MAXIMUM LIKELIHOOD Y
    BETA MAXIMUM LIKELIHOOD Y

The BETA and UNIFORM cases generate both method of moments and maximum likelihood estimates. The beta case estimates the lower and upper limits of the data from the minimum and maximum data values, respectively, and then computes the maximum likelihood estimates for the alpha and beta shape parameters.

i) Support was added for the following random number generators:

    1) FIBONACCI CONGRUENTIAL - a mixture of the Fibonacci generator with a congruential generator
    2) MERSENNE TWISTER - Fortran 90 implementation of the Mersenne twister generator (may not be valid on platforms that are compiled with Fortran 77 compilers)

Enter HELP RANDOM NUMBER GENERATOR for details.

j) Fixed the inverse gaussian and reciprocal inverse gaussian probability functions. The MU parameter was treated as a location parameter in the original implementation. However, it is really a shape parameter. So IGPDF and RIGPDF can now be called via

    IGPDF(X,GAMMA,MU,LOC,SCALE)
    RIGPDF(X,GAMMA,MU,LOC,SCALE)

The MU parameter is treated as an optional parameter (LOC and SCALE are also optional). MU is set to 1 if it is omitted.

The MU parameter can also be specified for random numbers and probability plots. If the MU parameter is not set, it will automatically be set to 1 (no error message is printed). The PPCC plot for these two distributions is now generated for both the gamma and mu parameters (i.e., a 3D plot is generated). If you want the PPCC plot assuming MU = 1 for the inverse gaussian case, you can use the WALD PPCC PLOT command (the Wald distribution is a special case of the inverse gaussian where MU is set to 1).
4) Added the following analysis commands:

a) Support for linear and quadratic calibration is available via the following commands:

    LINEAR CALIBRATION Y X Y0
    QUADRATIC CALIBRATION Y X Y0

The LINEAR CALIBRATION command performs a linear calibration analysis using eight different methods. The QUADRATIC CALIBRATION command performs a quadratic calibration analysis using three different methods. Enter HELP CALIBRATION for details.

b) The Friedman test for two-way analysis of variance on ranks is supported with the command

    FRIEDMAN TEST Y BLOCK TREATMENT

Enter HELP FRIEDMAN TEST for details.

c) The frequency and cumulative sum tests for randomness are supported with the commands

    FREQUENCY TEST Y

    LET M = <value>
    FREQUENCY WITHIN A BLOCK TEST Y

    CUMULATIVE SUM TEST Y

These tests are used for sequences of 0's and 1's (Dataplot just checks for two distinct values; the higher value is set to 1 and the lower value is set to 0). To test a uniform random number generator, do something like the following:

    LET N = 1
    LET P = 0.5
    LET Y = BINOMIAL RANDOM NUMBERS FOR I = 1 1 10000
    FREQUENCY TEST Y

For details, enter

    HELP FREQUENCY TEST
    HELP CUMULATIVE SUM TEST

5) The following enhancements were made to the BOOTSTRAP PLOT command

a) Extended the grouped case to handle two groups (previously one group was supported).

b) For the grouped case (either one or two groups), the following information is written to file:

    DPST1F.DAT - the full set of bootstrap estimates for the statistic (group-id in column 1, bootstrap statistic in column 2)
    DPST2F.DAT - the group-id and the corresponding mean, standard deviation, and the 0.025, 0.975, 0.05, 0.95, 0.005, and 0.995 quantiles

c) Added the following form of the command

    BCA BOOTSTRAP PLOT Y

This generates BCa bootstrap confidence intervals as defined by Efron. At the expense of additional computation, it generates bootstrap confidence intervals that are second order accurate (the percentile bootstrap confidence intervals are first order accurate).
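For reference, the percentile bootstrap interval mentioned above (the first order accurate method that BCa refines) can be sketched in a few lines. This Python is only an illustration of the general technique, not Dataplot's code; the function name, the number of resamples, and the simple index-based quantile rule are assumptions made for the example. The BCa correction itself is not shown.

```python
import random
import statistics


def percentile_bootstrap(y, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Return a (lower, upper) percentile bootstrap interval for stat(y)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        resample = [y[rng.randrange(n)] for _ in range(n)]  # sample with replacement
        estimates.append(stat(resample))
    estimates.sort()
    # Simple quantile rule: pick the order statistics nearest the tail cutoffs.
    lower = estimates[int((alpha / 2.0) * (n_boot - 1))]
    upper = estimates[int((1.0 - alpha / 2.0) * (n_boot - 1))]
    return lower, upper


y = [float(v) for v in range(1, 21)]
lower, upper = percentile_bootstrap(y)
print(lower, upper)
```

The sorted list of estimates plays the same role as the bootstrap statistics written to DPST1F.DAT, from which the tail quantiles are read off.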
Enter HELP BOOTSTRAP PLOT for further information.

6) The CAPTURE HTML (for generating Dataplot output in HTML format) capability has been extended to additional analysis commands. In addition, Dataplot output can now be generated in Latex format with the command

    CAPTURE LATEX file.tex

with "file.tex" denoting the name of the file where the Latex output is generated. An END OF CAPTURE terminates the generation of Latex output.

The CAPTURE HTML and CAPTURE LATEX commands now generate formatted output for the following commands:

    SUMMARY
    TABULATE
    CROSS TABULATE
    CONSENSUS MEAN
    CONSENSUS MEAN PLOT
    LINEAR CALIBRATION
    QUADRATIC CALIBRATION
    YATES ANALYSIS
    FIT
    ANOVA
    FRIEDMAN TEST
    WILK SHAPIRO
    ANDERSON DARLING
    KOLMOGOROV-SMIRNOV GOODNESS OF FIT
    CHI-SQUARE GOODNESS OF FIT
    EXPONENTIAL MAXIMUM LIKELIHOOD
    GUMBEL MAXIMUM LIKELIHOOD
    WEIBULL MAXIMUM LIKELIHOOD
    LOGISTIC MAXIMUM LIKELIHOOD
    PARETO MAXIMUM LIKELIHOOD
    UNIFORM MAXIMUM LIKELIHOOD
    BETA MAXIMUM LIKELIHOOD
    CONFIDENCE LIMITS
    DIFFERENCE OF MEANS CONFIDENCE LIMITS
    BIWEIGHT LOCATION CONFIDENCE LIMITS
    TRIMMED MEAN CONFIDENCE LIMITS
    MEDIAN/QUANTILE CONFIDENCE LIMITS
    T TEST
    F TEST
    CHI-SQUARE TEST
    GRUBB TEST
    LEVENE TEST
    FREQUENCY TEST
    FREQUENCY WITHIN A BLOCK TEST
    CUSUM TEST

In addition, WRITE HTML and WRITE LATEX commands have been added to allow the generation of one-way tables.

We plan to implement this capability for most of the analysis commands over the course of the next year or so. In addition, we are investigating a similar capability for Rich Text Format (RTF), which would allow importation into Word and other word processing programs.

Output from unsupported commands is enclosed in "<PRE>" and "</PRE>" tags for HTML and within the "\begin{verbatim}" environment for Latex.

Enter

    HELP HTML
    HELP LATEX

for details.

7) Dataplot has previously supported a LET ... = DERIVATIVE ... command that generates analytic derivatives. However, this was supported for a rather limited set of functions (enter HELP DERIVATIVE for details). We have added the commands

    LET A = NUMERICAL DERIVATIVE F WRT X FOR X = X0
    LET Y = NUMERICAL DERIVATIVE F WRT X

to compute derivatives numerically. The distinction in the above syntax is that the first command computes a single derivative while the second computes the derivative for a vector of values (define X to contain the points at which you want the derivative computed).

For details, enter

    HELP NUMERICAL DERIVATIVE

8) Fixed the following bugs:

a) Fixed the READ and WRITE commands to handle hyphens inside of quoted file names correctly (only applies if SET FILE NAME QUOTE ON has been entered).

b) The substitution character, "^", was modified to treat anything other than a letter, a number, or an underscore as a terminator for the Dataplot name. Note that although you can use some special characters in Dataplot names, this is strongly discouraged.

c) Fixed a bug where the file name restriction of 80 characters was actually a restriction on the entire command line. The file name may now be up to 80 characters and the full command line may be more than 80 characters.

d) Fixed a bug with the CAPTURE FLUSH command.

e) If an improper format is given on the SET WRITE FORMAT command, Dataplot will now return an error message rather than crashing.

f) Fixed a bug in the generation of non-central chi-square, non-central F, and doubly non-central F random numbers.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT April-May 2003.
-----------------------------------------------------------------------

1) Added the following plot commands

    PARALLEL COORDINATES PLOT Y1 ... YK

The parallel coordinates plot is a technique for plotting multivariate data. Enter HELP PARALLEL COORDINATES PLOT for details.

2) Added support for the following statistics:

    LET A = SN SCALE Y1
    LET A = QN SCALE Y1
    LET A = DIFFERENCE OF SN Y1 Y2
    LET A = DIFFERENCE OF QN Y1 Y2
    LET P1 = 10
    LET P2 = 10

Enter HELP for the given statistic for details (e.g., HELP DIFFERENCE OF SN).

In addition, these statistics are supported for the following plots and commands

    STATISTIC PLOT Y1 Y2 X
    CROSS TABULATE STATISTIC PLOT Y1 Y2 X1 X2
    BOOTSTRAP PLOT Y1 Y2 X1 X2
    JACKNIFE PLOT Y1 Y2 X1 X2
    TABULATE Y1 Y2 X
    CROSS TABULATE Y1 Y2 X1 X2
    LET Z = CROSS TABULATE Y1 Y2 X1 X2

The DIFFERENCE OF COUNTS statistic is not supported for these plots and commands (since it will simply be zero for all cases).

The SN SCALE and QN SCALE statistics are also supported for the following additional commands

    DEX PLOT Y X1 ... XK
    BLOCK PLOT Y X1 ... XK
    INFLUENCE CURVE Y
    INTERACTION PLOT Y X1 X2
    LET Y = MATRIX COLUMN M
    LET Y = MATRIX ROW M

3) The following probability distribution commands were added:

a) The following commands for multivariate random numbers were added:

    LET W = WISHART RANDOM NUMBERS MU SIGMA N
    LET U = INDEPENDENT UNIFORM RANDOM NUMBERS LOWL UPPL NP
    LET M = MULTIVARIATE T RANDOM NUMBERS MU SIGMA NU N
    LET M = MULTINOMIAL RANDOM NUMBERS P N NEVENTS

For details, enter

    HELP WISHART RANDOM NUMBERS
    HELP INDEPENDENT UNIFORM RANDOM NUMBERS
    HELP MULTIVARIATE T RANDOM NUMBERS
    HELP MULTINOMIAL RANDOM NUMBERS

b) The following multivariate cumulative distribution and probability density/mass function commands were added:

    LET M = MULTIVARIATE NORMAL CDF SIGMA UPPL
    LET M = MULTIVARIATE NORMAL CDF SIGMA LOWL UPPL
    LET M = MULTIVARIATE T CDF SIGMA UPPL
    LET M = MULTIVARIATE T CDF SIGMA LOWL UPPL
    LET M = MULTINOMIAL PDF X P

These compute the cdf for multivariate normal and multivariate t distributions and the pdf for the multinomial distribution.
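As a point of reference for the MULTINOMIAL PDF command, the multinomial probability mass function for counts x1, ..., xk with category probabilities p1, ..., pk is n!/(x1! ... xk!) times the product of pi**xi, where n is the sum of the counts. The Python below is a minimal sketch of that formula only; the function name is hypothetical, and how Dataplot arranges X and P internally is not shown here.

```python
import math


def multinomial_pmf(x, p):
    """Probability of observing counts x given category probabilities p."""
    if len(x) != len(p):
        raise ValueError("x and p must have the same length")
    n = sum(x)
    coef = math.factorial(n)
    for xi in x:
        coef //= math.factorial(xi)  # stays an exact integer at every step
    prob = float(coef)
    for xi, pi in zip(x, p):
        prob *= pi ** xi
    return prob


two_coins = multinomial_pmf([1, 1], [0.5, 0.5])
three_cats = multinomial_pmf([2, 1, 1], [0.5, 0.25, 0.25])
print(two_coins, three_cats)
```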
For details, enter

    HELP MULTIVARIATE NORMAL CDF
    HELP MULTIVARIATE T CDF
    HELP MULTINOMIAL PDF

c) Support for the following univariate distributions was added:

    LET A = LANCDF(X) - cdf of Landau distribution
    LET A = LANPDF(X) - pdf of Landau distribution
    LET A = LANPPF(P) - ppf of Landau distribution
    LET A = LANDIF(X) - derivative of Landau pdf
    LET A = LANXM1(X) - first moment function of Landau distribution
    LET A = LANXM2(X) - second moment function of Landau distribution
    LET A = ERRCDF(X,ALPHA) - cdf of error distribution
    LET A = ERRPDF(X,ALPHA) - pdf of error distribution
    LET A = ERRPPF(P,ALPHA) - ppf of error distribution
    LET A = SLAPDF(X) - pdf of slash distribution
    LET A = IBPDF(X,ALPHA) - pdf of inverted beta distribution

The cdf and ppf functions for the slash and inverted beta distributions are still being developed.

You can generate random numbers for these distributions with the commands

    LET Y = LANDAU RANDOM NUMBERS FOR I = 1 1 N

    LET Y = SLASH RANDOM NUMBERS FOR I = 1 1 N

    LET ALPHA = <value>
    LET Y = ERROR RANDOM NUMBERS FOR I = 1 1 N

    LET ALPHA = <value>
    LET Y = INVERTED BETA RANDOM NUMBERS FOR I = 1 1 N

The error distribution is also referred to as the Subbotin, exponential power, or general error distribution. There are several different parameterizations of this distribution. Dataplot uses the parameterization of Tadikamalla in "Random Sampling From the Exponential Power Distribution", Journal of the American Statistical Association, September, 1980. Enter HELP ERRPDF for details.

d) Support was added for the following random number generators:

    1) GENZ - Alan Genz generator
    2) LUXURY - based on the Marsaglia and Zaman borrow-and-carry generator. Uses a code written by F. James and incorporating improvements by M. Luscher.

Enter HELP RANDOM NUMBER GENERATOR for details.

4) Added the following command:

    LET Y2 X2 = STACK Y1 Y2 ... YK

This command appends the variables Y1, Y2, ..., YK into the single variable Y2.
In addition, X2 contains a group identifier variable (values corresponding to Y1 are set to 1, values corresponding to Y2 are set to 2, and so on).

Many Dataplot commands (e.g., BOX PLOT, MEAN PLOT, ANOVA) require that data be in the two-variable format (i.e., a response variable and a group identifier variable). However, many data files will simply have each response variable in a separate column. The STACK command provides a convenient way to generate the data in the form needed by many Dataplot commands.

-----------------------------------------------------------------------
The following enhancements were made to DATAPLOT January-March 2003.
-----------------------------------------------------------------------

1) The Windows 95/98/ME/NT/2000/XP installation now uses InstallShield. This should simplify the installation of Dataplot on Windows platforms.

2) A few tweaks were made to the Postscript device.

a) Previously, Dataplot started a new page when the device was initialized. It also started a new page when the first plot was generated. This was to ensure that a fresh page was started if you were generating diagrammatic graphics before the first plot. However, it caused a blank page to be printed for most applications. Dataplot now automatically keeps track so that the first plot will not generate the unneeded page erase.

b) Previously, the LANDSCAPE WORDPERFECT orientation (this results in a landscape orientation on a portrait page) was supported for encapsulated Postscript, but not for regular Postscript. This orientation is now supported for regular Postscript.

c) Dataplot allows you to switch between the various orientations (LANDSCAPE, PORTRAIT, LANDSCAPE WORDPERFECT, SQUARE) when using Postscript. For this reason, it sets the bounding box for an 11x11 inch page. The following command

    SET POSTSCRIPT BOUNDING BOX <FLOAT/FIXED>

can be used to modify this behavior. If the value is FLOAT (the default), the bounding box is set for an 11x11 inch page.
If the value is set to FIXED, the bounding box will be set according to whatever the current orientation is when the device is initialized. However, you should not change the orientation if FIXED is used.

If you are simply using the Postscript output for printing, then you do not need to worry about this command. However, it may occasionally be useful if you are importing the Postscript output into an external program.

3) Postscript was added to the list of devices supported by the CAPTURE HTML command (see 3) for the August-December 2002 updates).

If a DEVICE 2 CLOSE command is encountered when CAPTURE HTML is on and the device is set to postscript, Dataplot will first use Ghostscript to convert the Postscript output to JPEG. The JPEG file will have the same file name as the original postscript file, but its extension will be changed to "jpg" (e.g., the default name "dppl1f.dat" results in a JPEG file called "dppl1f.jpg"). Dataplot will add an " For example, on my Windows system, I use

    SET GHOSTSCRIPT PATH F:\GS\GS704\GS\BIN\

We suggest that you add this command to your Dataplot startup file "dplogf.tex".

b. We suggest using either the ORIENTATION PORTRAIT or the ORIENTATION LANDSCAPE WORDPERFECT command to set the orientation. Plots with a landscape orientation are rotated in the Dataplot Postscript output (in order to make full use of the page). Currently, Ghostscript does not support a command line switch to rotate the graph. This means that landscape plots will be rotated vertically on the web page (you can use an external program, GIMP for example, to rotate the JPEG files if you like).

4) Dataplot uses a vector graphics model. However, when you want to incorporate Dataplot graphics into other applications, it is often preferable to work with bitmapped graphics.
Dataplot now supports the command:

    SET POSTSCRIPT CONVERT <format>

where <format> is one of the following:

    JPEG - for jpeg
    PDF  - for Portable Document Format (PDF)
    TIFF - for Tiff
    PBM  - PBM Portable Bit Map Format (supports black and white)
    PGM  - PBM Portable Grey Map Format (supports grey scale)
    PPM  - Portable Pixmap Format (supports color)
    PNM  - PBM Portable Anymap Format (operates on PBM, PGM, or PPM formats)

If <format> is set to one of the choices above, a DEVICE 2 CLOSE command is encountered, and the device is set to postscript, Dataplot first uses Ghostscript to convert the Postscript output to the requested format. The converted file will have the same file name as the original postscript file, but its extension will be changed to "jpg", "pdf", "tif", "pbm", "pgm", "ppm", or "pnm" depending on the value of <format>. For example, if <format> is "PDF", the default name "dppl1f.dat" results in a PDF file called "dppl1f.pdf".

As noted above in 3), this option assumes Ghostscript is installed on your local system. You can use the SET GHOSTSCRIPT PATH command described above to set the path for Ghostscript. Also, as noted in 3), we suggest using either the ORIENTATION PORTRAIT or the ORIENTATION LANDSCAPE WORDPERFECT command to set the orientation.

A few additional points:

a. The original postscript file is not deleted. An additional plot file, with a different extension, is created.

b. The bit map formats are generally most useful when there is one image per file. You can do something like the following:

    SET POSTSCRIPT CONVERT JPEG
    SET IPL1NA plot1.ps
    DEVICE 2 POSTSCRIPT
    ... generate plot 1 ...
    DEVICE 2 CLOSE
    SET IPL1NA plot2.ps
    DEVICE 2 POSTSCRIPT
    ... generate plot 2 ...
    DEVICE 2 CLOSE

This will result in the files plot1.ps, plot1.jpg, plot2.ps, and plot2.jpg.

The PDF files may be an exception to this. Depending on how you want to use the generated plots, you can either create all the plots in a single PDF file or put each plot in a separate PDF file (using the above logic).

c.
If the CAPTURE HTML switch is on, PDF files are incorporated into the generated HTML file. For PDF files, no file conversion is performed. Instead, a link to the PDF file is added to the HTML page. The advantage of the PDF format over JPEG is that it is typically of higher quality than the JPEG file. The disadvantage is that you have to link to another page to view it.

5) The CAPTURE HTML command can be used to save Dataplot numeric and graphics output in an HTML page. By default, Dataplot generates fairly minimal "header" and "footer" HTML code (basically, it sets a white background and not much else). If your basic purpose is to simply create a web viewable page, then this is sufficient.

However, many sites have specific style guidelines for web pages. These can typically be incorporated into the "header" and "footer" of the HTML page. In order to provide additional flexibility in the appearance of the web pages created using CAPTURE HTML, Dataplot now supports the following two commands:

    SET HTML HEADER FILE <file>
    SET HTML FOOTER FILE <file>

If these commands are given, Dataplot will add the contents of the header file to the beginning and the contents of the footer file to the end of the generated HTML file.

The Dataplot HELP directory contains the files "sed_header.htm" and "sed_footer.htm". These can be used as examples for developing your own templates (these implement some NIST specific information, so they are not intended to be used directly by non-NIST users).

Note that Dataplot does no error checking on these files. We recommend that you view a page containing the intended header and footer to detect problems with your HTML code. Dataplot will only read 240 characters per line in these files.

6) One current limitation in Dataplot has been that reading data from ASCII files was limited to a maximum of 132 columns. The only way around this was to use the SET READ FORMAT command. However, this did not work if the data did not have a consistent format. The default limit was raised to 255 columns.
To read even longer data lines, use the command MAXIMUM RECORD LENGTH. Enter HELP MAXIMUM RECORD LENGTH for details.

7) The following commands were added:

    TRIMMED MEAN CONFIDENCE LIMITS Y
    MEDIAN CONFIDENCE LIMITS Y

These provide confidence intervals for robust estimates of location. Enter

    HELP TRIMMED MEAN CONFIDENCE LIMITS
    HELP MEDIAN CONFIDENCE LIMITS

for details.

8) The following plot commands were added:

    VIOLIN PLOT Y X
    SHIFT PLOT Y X

The VIOLIN PLOT is a mix of a box plot and a kernel density plot. The shift plot is a variation of quantile-quantile or Tukey mean-difference plots. Enter HELP VIOLIN PLOT and HELP SHIFT PLOT for details.

9) The Hotelling control chart capability was upgraded in the following way:

a) A distinction is now made between phase I and phase II plots. The previous implementation was effectively a phase I plot.

b) Support was added for the individual observations case.

Enter HELP HOTELLING CONTROL CHART for details.

10) The Ljung-Box test for randomness was added. This test is based on the autocorrelation plot and is commonly used in the context of ARIMA modeling. Enter HELP LJUNG BOX TEST for details.

11) The following miscellaneous changes were made:

a) A correction was made in the computation of the Harrell-Davis quantile estimate. Enter HELP QUANTILE for details.

b) The SEARCH command now returns the line number on which the first match is found in the internal parameter LINENUMB. This can occasionally be useful when writing macros.

c) If no variable name is given on the READ command, Dataplot will now try to automatically determine the variables. There are two cases:

i) If the command SKIP AUTOMATIC was previously entered, Dataplot will skip all lines until a line starting with "----" is encountered. It will then back up one line and read the variable list from that line. This case is primarily used when reading data files that come with the Dataplot distribution (i.e., the files in the Dataplot "DATA" sub-directory).
Most, though not all, of these files follow that convention.

ii) If a SKIP AUTOMATIC command has not been entered, Dataplot will read the first line of the file and determine the number of columns of data. It will then automatically name the variables X1 X2 ... XK (where K is the number of variables).

Note that any SKIP, COLUMN LIMITS, or ROW LIMITS commands will be honored when reading the first line to determine the number of variables.

This capability only applies when reading variables (i.e., it is not supported for the READ PARAMETER, READ STRING, or READ MATRIX cases). Also, it only applies when reading from a file, not when reading from the terminal.

d) Some bugs were fixed.

12) Added support for the following statistics:

    LET A = DIFFERENCE OF MEANS Y1 Y2
    LET A = DIFFERENCE OF MIDMEANS Y1 Y2
    LET A = DIFFERENCE OF MEDIANS Y1 Y2
    LET A = DIFFERENCE OF MIDRANGE Y1 Y2
    LET A = DIFFERENCE OF TRIMMED MEANS Y1 Y2
    LET A = DIFFERENCE OF WINSORIZED MEANS Y1 Y2
    LET A = DIFFERENCE OF GEOMETRIC MEANS Y1 Y2
    LET A = DIFFERENCE OF HARMONIC MEANS Y1 Y2
    LET A = DIFFERENCE OF HODGES-LEHMAN Y1 Y2
    LET A = DIFFERENCE OF BIWEIGHT LOCATIONS Y1 Y2
    LET A = DIFFERENCE OF STANDARD DEVIATIONS Y1 Y2
    LET A = DIFFERENCE OF VARIANCES Y1 Y2
    LET A = DIFFERENCE OF AAD Y1 Y2
    LET A = DIFFERENCE OF MAD Y1 Y2
    LET A = DIFFERENCE OF INTERQUARTILE RANGE Y1 Y2
    LET A = DIFFERENCE OF WINSORIZED SD Y1 Y2
    LET A = DIFFERENCE OF WINSORIZED VARIANCE Y1 Y2
    LET A = DIFFERENCE OF BIWEIGHT MIDVARIANCE Y1 Y2
    LET A = DIFFERENCE OF BIWEIGHT SCALE Y1 Y2
    LET A = DIFFERENCE OF PERCENTAGE BEND MIDVARIANCE Y1 Y2
    LET A = DIFFERENCE OF GEOMETRIC SD Y1 Y2
    LET A = DIFFERENCE OF RANGE Y1 Y2
    LET A = DIFFERENCE OF SKEWNESS Y1 Y2
    LET A = DIFFERENCE OF KURTOSIS Y1 Y2
    LET A = DIFFERENCE OF RELATIVE SD Y1 Y2
    LET A = DIFFERENCE OF COEFFICIENT OF VARIATION Y1 Y2
    LET A = DIFFERENCE OF SD OF MEAN Y1 Y2
    LET A = DIFFERENCE OF RELATIVE VARIANCE Y1 Y2
    LET A = DIFFERENCE OF VARIANCE OF MEAN Y1 Y2
    LET A = DIFFERENCE OF QUANTILE Y1 Y2
    LET A = DIFFERENCE OF MINIMUM Y1 Y2
    LET A = DIFFERENCE OF MAXIMUM Y1 Y2
    LET A = DIFFERENCE OF EXTREME Y1 Y2
    LET A = DIFFERENCE OF SUM Y1 Y2
    LET A = DIFFERENCE OF COUNTS Y1 Y2

Enter HELP for the given statistic for details (e.g., HELP DIFFERENCE OF MEANS).

In addition, these statistics are supported for the following plots and commands

    STATISTIC PLOT Y1 Y2 X
    CROSS TABULATE STATISTIC PLOT Y1 Y2 X1 X2
    BOOTSTRAP PLOT Y1 Y2 X1 X2
    JACKNIFE PLOT Y1 Y2 X1 X2
    TABULATE Y1 Y2 X
    CROSS TABULATE Y1 Y2 X1 X2
    LET Z = CROSS TABULATE Y1 Y2 X1 X2

The DIFFERENCE OF COUNTS statistic is not supported for these plots and commands (since it will simply be zero for all cases).

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT August-December 2002.
----------------------------------------------------------------------

1) Added the following command:

    AUTO TEXT <ON/OFF>

Entering AUTO TEXT ON will prepend a TEXT to all subsequent lines until an AUTO TEXT OFF command is encountered. This command is used in generating word slides. Enter HELP AUTO TEXT for details.

2) The list of supported statistics has been expanded for the following commands:

    BLOCK PLOT
    DEX PLOT
    TABULATE
    CROSS TABULATE
    MATRIX ROW STATISTIC
    MATRIX COLUMN STATISTIC
    CROSS TABULATE (LET)

Enter the corresponding HELP command for a complete list of supported statistics.

3) The CAPTURE command added the following option:

    CAPTURE HTML <file-name>

This writes the output from the CAPTURE command in HTML format. Note that most commands simply use a <PRE> ... </PRE> syntax. Currently, the exceptions are the TABULATE and CROSS TABULATE commands, which write the output using HTML table syntax.

This can be used in conjunction with the WEB command.
For example,

    SKIP 25
    READ RIPKEN.DAT Y X1 X2
    ECHO ON
    CAPTURE HTML C:\TABLE.HTM
    TABULATE MEAN Y X1
    CROSS TABULATE MEAN Y X1 X2
    END OF CAPTURE
    WEB file://C:\TABLE.HTM

In addition, if DEVICE 2 is set to PNG, JPEG, or SVG, Dataplot will incorporate the graphics into the web page using the IMG tag. For example,

    device 1 x11
    .
    skip 25
    read berger1.dat y x
    .
    line blank solid
    character x blank
    echo on
    capture html fit.htm
    set ipl1na data.png
    device 2 gd png
    title original data
    plot y x
    device 2 close
    fit y x
    set ipl1na pred.png
    device 2 gd png
    title predicted line
    plot y pred vs x
    device 2 close
    end of capture
    .
    web file:///home/heckert/dataplot/solaris/fit.htm

4) The maximum number of lines in a loop was raised from 500 to 1,000.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT April-July 2002.
----------------------------------------------------------------------

1) Added support for the following probability distribution functions.

a) Two-Sided Power

    TSPCDF(X,THETA,N)
    TSPPDF(X,THETA,N)
    TSPPPF(P,THETA,N)

    LET THETA = <value>
    LET N = <value>
    LET Y = TWO-SIDED POWER RANDOM NUMBERS FOR I = 1 1 100

    LET THETA = <value>
    LET N = <value>
    TWO-SIDED POWER PROBABILITY PLOT Y

    TWO-SIDED POWER PPCC PLOT Y

    LET THETA = <value>
    LET N = <value>
    CHI-SQUARE TWO-SIDED POWER GOODNESS OF FIT TEST Y

    LET THETA = <value>
    LET N = <value>
    KOLMOGOROV-SMIRNOV TWO-SIDED POWER GOODNESS OF FIT TEST Y

    LET A = <lower limit>
    LET B = <upper limit>
    TWO-SIDED POWER MAXIMUM LIKELIHOOD Y

Note: The MLE estimator assumes that the values of the lower and upper limits (which default to 0 and 1) are known and fixed. It returns estimates for THETA and N.
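For orientation, the three two-sided power functions can be sketched using the standard form of the distribution on the (0,1) interval, where THETA is the mode and N is the shape (N = 1 gives the uniform distribution and N = 2 the triangular distribution). This is an assumption about the parameterization, made for illustration only; consult the Dataplot help for TSPPDF for the definition Dataplot actually uses. The Python function names are hypothetical.

```python
def tsp_pdf(x, theta, n):
    """Standard two-sided power density on (0,1) with mode theta, shape n."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if x <= 0.0 or x >= 1.0:
        return 0.0
    if x <= theta:
        return n * (x / theta) ** (n - 1)
    return n * ((1.0 - x) / (1.0 - theta)) ** (n - 1)


def tsp_cdf(x, theta, n):
    """Cumulative distribution function matching tsp_pdf."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x <= theta:
        return theta * (x / theta) ** n
    return 1.0 - (1.0 - theta) * ((1.0 - x) / (1.0 - theta)) ** n


def tsp_ppf(p, theta, n):
    """Percent point function (inverse of tsp_cdf) for 0 <= p <= 1."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if p <= theta:
        return theta * (p / theta) ** (1.0 / n)
    return 1.0 - (1.0 - theta) * ((1.0 - p) / (1.0 - theta)) ** (1.0 / n)


theta, n = 0.3, 2.5
round_trip = tsp_ppf(tsp_cdf(0.7, theta, n), theta, n)
print(tsp_cdf(theta, theta, n), round_trip)
```

Under this form the cdf evaluated at the mode equals THETA, which is a quick sanity check on any implementation.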
    b) Bi-Weibull

          BWECDF(X,SCALE1,GAMMA1,LOC2,SCALE2,GAMMA2)
          BWEPDF(X,SCALE1,GAMMA1,LOC2,SCALE2,GAMMA2)
          BWEPPF(P,SCALE1,GAMMA1,LOC2,SCALE2,GAMMA2)
          BWEHAZ(X,SCALE1,GAMMA1,LOC2,SCALE2,GAMMA2)
          BWECHAZ(X,SCALE1,GAMMA1,LOC2,SCALE2,GAMMA2)

          LET SCALE1 = <value>
          LET GAMMA1 = <value>
          LET LOC2 = <value>
          LET SCALE2 = <value>
          LET GAMMA2 = <value>
          LET Y = BIWEIBULL RANDOM NUMBERS FOR I = 1 1 100

          LET SCALE1 = <value>
          LET GAMMA1 = <value>
          LET LOC2 = <value>
          LET SCALE2 = <value>
          LET GAMMA2 = <value>
          BIWEIBULL PROBABILITY PLOT Y

          LET SCALE1 = <value>
          LET GAMMA1 = <value>
          LET LOC2 = <value>
          LET SCALE2 = <value>
          LET GAMMA2 = <value>
          CHI-SQUARE BIWEIBULL GOODNESS OF FIT TEST Y

          LET SCALE1 = <value>
          LET GAMMA1 = <value>
          LET LOC2 = <value>
          LET SCALE2 = <value>
          LET GAMMA2 = <value>
          KOLMOGOROV-SMIRNOV BIWEIBULL GOODNESS OF FIT TEST Y

    c) Multivariate normal distribution

          LET MU = DATA <list of p means>
          READ MATRIX SIGMA
          <pxp set of values>
          END OF DATA
          LET N = <value>
          LET M = MULTIVARIATE NORMAL RANDOM NUMBERS MU SIGMA N

       Note that M will be an NxP matrix. N is the number of rows generated for each component and there are P components to the multivariate normal. SIGMA is the pxp variance-covariance matrix of the multivariate normal. SIGMA will be checked to ensure that it is a positive definite matrix. MU is a vector specifying the means of the p components.

       This command utilizes a code written by Charlie Reeves when he was a member of the NIST Statistical Engineering Division.

    d) Multinomial distribution

          LET P = DATA <list of probabilities that sum to 1>
          LET NEVENTS = <value>
          LET NCAT = SIZE P
          LET N = <value>
          LET M = MULTINOMIAL RANDOM NUMBERS P NEVENTS NCAT N

       Note that M will be a matrix with N rows and NCAT columns. N is the number of rows generated and there are NCAT categories, one for each probability in P.
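
       For the multivariate normal generator in item c) above, the following Python sketch shows one common way to produce an NxP matrix from MU and SIGMA while checking that SIGMA is positive definite: factor SIGMA by Cholesky decomposition and transform independent standard normal values. This is only an illustration; the news file does not say which algorithm the Dataplot code uses, and the names cholesky and mvn_random are hypothetical.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular factor L with L * L' = sigma.

    Raises ValueError if sigma is not positive definite.  Only the lower
    triangle of sigma is read, so symmetry is assumed rather than checked.
    """
    p = len(sigma)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = sigma[i][i] - s
                if d <= 0.0:
                    raise ValueError("sigma is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return low


def mvn_random(mu, sigma, n, rng=None):
    """Return n rows, each a draw from the P-component normal N(mu, sigma)."""
    rng = rng or random.Random()
    low = cholesky(sigma)
    p = len(mu)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        rows.append([mu[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                     for i in range(p)])
    return rows
```

       For example, mvn_random([0.0, 10.0], [[4.0, 2.0], [2.0, 3.0]], 100) returns 100 rows of 2 values each, mirroring the NxP layout described for M.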
    e) Logarithmic series distribution

       Added random number generation for this distribution. For example,

          LET THETA = 0.7
          LET Y = LOGARITHMIC SERIES RANDOM NUMBERS FOR I = 1 1 500

       The cdf, pdf, and ppf functions are already available for this distribution.

 2) Made the following updates to the FIT command:

    a) Added the command:

          SET FIT ADDITIVE CONSTANT <ON/OFF>

       If OFF, then Dataplot does not include a constant term in a multi-linear fit (i.e., FIT Y X1 X2 ...). The default is to include the additive constant.

    b) If Dataplot detects a singularity in a multi-linear fit, it now prints an error message. Previously, it simply set all the parameter estimates to 0 and terminated the fit. In addition, Dataplot explicitly checks for two types of singularities: a column that contains all the same values (this essentially adds an additional constant term) and two columns that are equal.

    c) Added the command:

          LET M = CREATE MATRIX X1 ... XK

       where X1 ... XK designates a list of previously defined variables.

       This command has a similar function to the MATRIX DEFINITION command. However, the MATRIX DEFINITION command creates matrices from variables that are contiguous (the order of variables is determined by the order in which they were created in Dataplot). The CREATE MATRIX command does not have this restriction. The variables need not be contiguous.

       This command is useful for creating a design matrix in regression problems that can be used as input for some of the new commands that follow.

    d) Added the command:

          LET C = CATCHER MATRIX X

       This computes the catcher matrix, X*(X'X)**(-1). This matrix is used in the computation of certain regression diagnostics (e.g., Variance Inflation Factors, Partial Regression Plots). This command greatly simplifies the writing of macros to generate these regression diagnostics (and allows larger design matrices to be used). Enter HELP CATCHER MATRIX for details.

    e) Added the command:

          LET XTXINV = XTXINV MATRIX X

       This computes the matrix (X'X)**(-1).
This matrix is used in the computation of certain regression diagnostics (e.g., DFBETA statistic) and in computing certain confidence and prediction intervals for multi-linear fits. This command simplifies the writing of macros to generate these regression diagnostics and intervals (and allows larger design matrices to be used). Enter HELP XTXINV MATRIX for details.

    f) Added the command:

          LET C = CONDITION INDICES X

       where X is the design matrix for a multi-linear fit (note that you need to create the independent variables, including a column containing all 1's, as a matrix). The condition indices provide a measure of collinearity in the design matrix. Enter HELP CONDITION INDICES for details.

    g) Added the command:

          LET VIF = VARIANCE INFLATION FACTORS X

       where X is the design matrix for a multi-linear fit (note that you need to create the independent variables, including a column containing all 1's, as a matrix). The variance inflation factors provide a measure of collinearity in the design matrix. Enter HELP VARIANCE INFLATION FACTORS for details.

    h) Added the following plot commands:

          PARTIAL REGRESSION PLOT Y X1 ... XK XI
          PARTIAL RESIDUAL PLOT Y X1 ... XK XI
          PARTIAL LEVERAGE PLOT Y X1 ... XK XI
          CCPR PLOT Y X1 ... XK XI

          MATRIX PARTIAL REGRESSION PLOT Y X1 ... XK
          MATRIX PARTIAL RESIDUAL PLOT Y X1 ... XK
          MATRIX PARTIAL LEVERAGE PLOT Y X1 ... XK
          MATRIX CCPR PLOT Y X1 ... XK XI

       These generate partial regression plots, partial residual plots, partial leverage plots, and component and component-plus-residual (CCPR) plots for a multi-linear fit. These plots are typically used to assess the effect of a variable on the fit given the effect of other variables already included in the fit.

       There are 2 forms for the command. In the first form, a single plot is generated. In this case, the last variable listed is the "primary" variable. That is, this is the variable we are considering adding/deleting from the fit. Note that this variable should already be listed.
That is, a fit of Y versus X1 to XK is performed (including XI), then the plot assesses the effect of XI on the fit.

       In the second form, a multiplot is generated where each of the independent variables is used as the primary variable.

       Enter

          HELP PARTIAL REGRESSION PLOT
          HELP PARTIAL RESIDUAL PLOT
          HELP PARTIAL LEVERAGE PLOT
          HELP CCPR PLOT

       for details.

    i) For multi-linear fits, the output for DPST2F.DAT was enhanced to include Bonferroni and Hotelling joint confidence limits, respectively, for the predicted values. By default, a 95% interval is generated. To use a different alpha value, enter the following command before the fit:

          LET ALPHA = 0.90

       In addition, the output for DPST1F.DAT now includes the t critical value and lower and upper joint Bonferroni confidence limits for the parameters. The format 5E15.7 is used in writing these values.

       In addition, for multi-linear fits, the regression ANOVA table is written to the file DPST5F.DAT. The values for R**2, adjusted R**2, and the Press P statistic are also printed to this file. These three statistics are saved as the internal parameters RSQUARE, ADJRSQUA, and PRESSP, respectively.

    j) One weakness in the Dataplot multi-linear fit routine has been the lack of any "forward selection/backward selection/best subsets" capabilities. The command

          BEST CP Y X1 ... XK

       was added to identify the best candidate models using the Mallows' CP criterion. Enter HELP BEST CP for details.

    k) Added the command:

          BOOTSTRAP FIT Y X1 .... XK

       This performs a bootstrap linear/multilinear fit. Bootstrap linear fits are an alternative to weighting and transformation when the assumptions for multilinear fitting are not satisfied (namely, that the errors from the fit are independent and have a common distribution, typically assumed to be normal, with common location and scale). Enter HELP BOOTSTRAP FIT for details.

 3) Added support for alternative random number generators.
    Note that the default generator (i.e., the one that has been in Dataplot for many years) is based on a Fibonacci sequence as defined by Marsaglia. Note that this is equivalent to the generator UNI of Jim Blue, David Kahaner, and George Marsaglia that is in the CMLIB library.

    Support is now provided for a linear congruential generator written by Fullerton (CMLIB routine RUNIF) and a multiplicative congruential generator (ACM algorithm 599).

    In addition, 2 generators based on the generalized feedback shift register (GFSR) methods are supported. The first is based on the original algorithm of Lewis and Payne (Journal of the ACM, Volume 20, pp. 456-468). The second is an alternative implementation given by Fushimi and Tezuka (Journal of the ACM, Volume 26, pp. 516-523). Both are based on codes given by Monahan (2000) in "Numerical Methods of Statistics".

    Support is also provided for the Applied Statistics algorithm 183. AS183 is based on the fractional part of the sum of 3 multiplicative congruential generators. It requires that 3 integers be specified initially. Dataplot uses the multiplicative congruential generator (which does depend on the SEED command) to randomly generate these 3 integers.

    These 6 generators are used to generate uniform random numbers. Random numbers for other distributions are then derived from these uniform random numbers.

    To specify the uniform random number generator, use the command

       SET RANDOM NUMBER GENERATOR FIBONACCI
       SET RANDOM NUMBER GENERATOR LINEAR CONGRUENTIAL
       SET RANDOM NUMBER GENERATOR MULTIPLICATIVE CONGRUENTIAL
       SET RANDOM NUMBER GENERATOR GFSR
       SET RANDOM NUMBER GENERATOR FUSHIMI
       SET RANDOM NUMBER GENERATOR AS183

    Note that you can use the SEED command to change the random numbers generated as well. The SEED does not apply to the 2 GFSR generators (these each have their own initialization routines).

 4) Added support for the following special functions.

    a) Fermi-Dirac function

          FERMDIRA(X,ORDER)

       where ORDER is the order of the function.
ORDER can be -0.5, 0.5, 1.5, or 2.5 (Dataplot uses an epsilon of 0.1; any order not within epsilon of one of the above values results in an error). Enter HELP FERMDIRA for details.

 5) Added support for the following statistics:

       LET A = WINSORIZED VARIANCE Y
       LET A = WINSORIZED SD Y
       LET A = WINSORIZED COVARIANCE Y X
       LET A = WINSORIZED CORRELATION Y X
       LET A = BIWEIGHT MIDVARIANCE Y X
       LET A = BIWEIGHT MIDCOVARIANCE Y X
       LET A = BIWEIGHT MIDCORRELATION Y X
       LET A = PERCENTAGE BEND MIDVARIANCE Y
       LET A = PERCENTAGE BEND CORRELATION Y1 Y2
       LET A = HODGES LEHMAN Y
       LET A = TRIMMED MEAN STANDARD ERROR
       LET A = <XQ> QUANTILE Y
       LET A = <XQ> QUANTILE STANDARD ERROR Y

    Enter

       HELP WINSORIZED VARIANCE
       HELP WINSORIZED SD
       HELP WINSORIZED COVARIANCE
       HELP WINSORIZED CORRELATION
       HELP BIWEIGHT MIDVARIANCE
       HELP BIWEIGHT MIDCOVARIANCE
       HELP BIWEIGHT MIDCORRELATION
       HELP PERCENTAGE BEND MIDVARIANCE
       HELP PERCENTAGE BEND CORRELATION
       HELP HODGES LEHMAN
       HELP TRIMMED MEAN STANDARD ERROR
       HELP QUANTILE
       HELP QUANTILE STANDARD ERROR

    for details.

 6) Added the following plot:

       <stat> INFLUENCE CURVE Y XSEQ

    where <stat> is one of the built-in supported statistics, Y is a response variable, and XSEQ is a sequence of x values.

    The plot is generated by looping through the values in XSEQ. For a given value of XSEQ, the value of <stat> is computed for that value of XSEQ along with the values in Y. The vertical axis of the plot contains the computed statistic while the horizontal axis contains the value of XSEQ.

    This plot is of interest in the field of robust statistics. For details, enter HELP INFLUENCE CURVE.

 7) For the ANOVA command, the residual standard deviations for various models are written to the file DPST3F.DAT (these are the same values that appear in the fitted output). This allows these values to be read back in as a variable, which is occasionally useful in writing macros that involve an ANOVA step.
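
    The looping procedure described for the INFLUENCE CURVE plot in item 6) above can be sketched in Python as follows. This is an illustration of the description only, not Dataplot's code; the function name influence_curve is hypothetical, and the statistic is passed in as an ordinary Python function.

```python
import statistics


def influence_curve(stat, y, xseq):
    """Return (x, stat(y plus the single extra point x)) for each x in xseq.

    The first element of each pair is the horizontal plot coordinate and
    the second is the vertical plot coordinate.
    """
    return [(x, stat(list(y) + [x])) for x in xseq]


y = [1.0, 2.0, 3.0, 4.0, 5.0]
xseq = [-100.0, 0.0, 100.0]
mean_curve = influence_curve(statistics.mean, y, xseq)
median_curve = influence_curve(statistics.median, y, xseq)
```

    In this small example the mean is pulled without bound by the added point, while the median stays between 2.5 and 3.5, which is the kind of contrast the plot is meant to show.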
 8) The PROBE command now recognizes the following:

       PROBE IDMAN(1)
       PROBE IDMAN(2)
       PROBE IDMAN(3)

    This identifies the current manufacturer for devices 1, 2, and 3 respectively. In addition, the value of PROBEVAL is set if the returned manufacturer is one of the following:

       X11          = 1
       QWIN         = 2
       REGI         = 3
       TEKT         = 4
       OPGL         = 5
       QUAR or MACI = 6
       POST or PS   = 7
       HP or HPGL   = 8
       GENE         = 9
       GD           = 10
       QUIC         = 11
       CALC         = 12
       ZETA         = 13
       GKS          = 14
       LAHE         = 15
       PRIN         = 16
       LATE         = 17
       SVG          = 18
       DISC         = 19

    In addition, the device model can be extracted via the commands

       PROBE IDMOD(1)
       PROBE IDMOD(2)
       PROBE IDMOD(3)
       PROBE IDMO2(1)
       PROBE IDMO2(2)
       PROBE IDMO2(3)
       PROBE IDMO3(1)
       PROBE IDMO3(2)
       PROBE IDMO3(3)

    The following PROBE commands were added to return the operating system and compiler, respectively.

       PROBE IOPSY1
       PROBE ICOMPI

    For IOPSY1, the value of PROBEVAL is also set:

       UNIX  = 1 (Unix)
       PC-D  = 2 (Windows)
       VMS   = 3 (VAX/VMS)
       other = 0

    For ICOMPI, the value of PROBEVAL is also set:

       f77   = 1 (the Unix Fortran compiler)
       MS-F  = 2 (the Microsoft, now Compaq, Fortran compiler)
       LAHE  = 3 (the Lahey Fortran compiler)
       other = 0

    In general, if the PROBE command returns a string value of ON, OPENED, or YES, it sets the value of the PROBEVAL parameter to 1. Similarly, if the PROBE command returns a string value of OFF, CLOSED, or NO, it sets the value of the PROBEVAL parameter to 0.

    The above uses of PROBE are primarily of value in writing general purpose macros, in particular macros that are intended to be used by others.

 9) The following command was added:

       CAPTURE FLUSH

    The purpose of this command is to allow Dataplot text output to be written to the graphics output file. This can be useful when you are writing a macro and you want the analytic output (for example, the output from a fit) to be included with the graphics output.

    The following shows a sample of how this command is used:

       device 1 x11
       device 2 postscript
       .
       title automatic
       skip 25
       read gear.dat y x
       .
       mean plot y x
       .
       move 5 95
       margin 5
       capture junk.dat
       tabulate mean y x
       capture flush
       end of capture
       .
       device 2 close
       system lpr dppl1f.dat

    The initial CAPTURE command directs text output to the file "junk.dat". When the CAPTURE FLUSH command is encountered, the capture file is closed, an ERASE command is generated for the graphics devices, the contents of the capture file are printed on the graphics devices using the TEXT command (i.e., each line of the file generates a distinct TEXT command), and then the capture file is re-opened (it will start at the beginning).

    Since the lines are generated with the TEXT command, the appearance of the text can be controlled with the various TEXT attribute commands. Also, it is recommended that CRLF be set to ON (the default), a MOVE command be given to set the position for the first line of the text, and a MARGIN command be entered to set the beginning x-coordinate for the line.

    Some output may be too long to display on one page. You can control the number of lines printed per page with the following command:

       SET CAPTURE LINES <value1> ... <value5>

    Up to 5 values may be entered. The first value is for the first page of output, the second value is for the second page of output, and so on. If the output runs to more than 5 pages, then the page limits start over (i.e., page 6 uses the value for page 1, page 7 uses the value for page 2, and so on). The default is 25 lines for all pages.

    If the MULTIPLOT switch is ON, the initial page erase is suppressed. The following example shows how this feature can be used:

       .
       device 1 x11
       device 2 ps
       device 1 font simplex
       .
       title automatic
       skip 25
       read gear.dat y x
       .
       multiplot 2 2
       multiplot corner coordinates 0 0 100 100
       multiplot scale factor 2
       .
       mean plot y x
       sd plot y x
       .
       move 5 98
       margin 5
       plot
       capture junk.dat
       tabulate mean y x
       capture flush
       end of capture
       move 5 98
       plot
       capture junk.dat
       tabulate sd y x
       capture flush
       end of capture
       .
       end of multiplot
       .
    Note that the null PLOT command is used to move to the next plot area without actually generating a plot. This example draws a mean and standard deviation plot on the first row and then supplements that with the numeric values generated using the TABULATE command on the second row.

    The following two commands are also available.

       SET CAPTURE NUMBER <ON/OFF>
       SET CAPTURE BOX <ON/OFF>

    If SET CAPTURE NUMBER ON is entered, the output lines are numbered. This is primarily a convenience function to help determine what values to enter for the SET CAPTURE LINES command in order to generate breaks at the appropriate spots.

    If SET CAPTURE BOX ON is entered, a box will be drawn for each page of the output. Use the BOX 1 CORNER COORDINATES command, before the CAPTURE FLUSH, to specify the coordinates of the box. Use the various BOX attribute commands to set the properties of the box.

10) The following enhancements were made to the IF command:

    a) You can now test for strings with the IF command. That is,

          LET STRING S = TEST
          IF S = TEST
             PRINT S
          END OF IF

          LET STRING S = TEST
          IF S <> "NOT TEST"
             PRINT S
          END OF IF

       Note that "=" and "<>" are the only comparisons allowed (i.e., no "<" or ">"). The argument on the left of the "=" must be the name of a previously defined string. The argument to the right of the "=" is a literal string. The string can be enclosed in double quotes, ", if it contains spaces. If there are no double quotes, the string is assumed to end once the first space is encountered.

    b) Support was added for ELSE and ELSE IF clauses. For example,

          IF A = 2
             PRINT "A = 2"
          ELSE
             PRINT "A NOT EQUAL 2"
          END OF IF

       or

          IF A = 2
             PRINT "A = 2"
          ELSE IF A = 1
             PRINT "A = 1"
          ELSE
             PRINT "A NOT EQUAL 2 AND A NOT EQUAL 1"
          END OF IF

    c) A bug was fixed for the IF ... NOT EXIST and IF ... EXIST cases. Also, these now test whether the name exists as a parameter, string, variable, or matrix (previously, it only checked if it was a parameter).
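
    The quoting rule for the right-hand side of a string IF test (item 10a above) can be sketched in Python as follows. This only illustrates the rule as stated; the function name if_literal is hypothetical, and the handling of a missing closing quote is an assumption, since the text does not cover that case.

```python
def if_literal(rhs):
    """Extract the literal string to the right of "=" or "<>".

    A quoted literal runs to the closing double quote and may contain
    spaces; an unquoted literal ends at the first space.
    """
    rhs = rhs.lstrip()
    if rhs.startswith('"'):
        end = rhs.find('"', 1)
        # Assumption: an unterminated quote takes the rest of the line.
        return rhs[1:end] if end != -1 else rhs[1:]
    return rhs.split(" ", 1)[0]


def string_if(strings, name, op, rhs):
    """Evaluate IF <name> = <literal> or IF <name> <> <literal>."""
    if op not in ("=", "<>"):
        raise ValueError('only "=" and "<>" are allowed for strings')
    equal = strings[name] == if_literal(rhs)
    return equal if op == "=" else not equal
```

    For example, with S defined as TEST, string_if({"S": "TEST"}, "S", "<>", '"NOT TEST"') is true, matching the second example in item 10a.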
11) One problem with reading files in Dataplot has been the inability to handle directory and file names with embedded spaces. The command

       SET FILE NAME QUOTE <ON/OFF>

    was added to address this problem. If ON is specified, then the file name may be enclosed in double quotes ("). All text, including spaces, up to the matching closing double quote is considered part of the file name (no provision is made for file names containing a double quote character). If OFF is specified, this feature is disabled. The default is OFF to accommodate quoted strings on the WRITE command that might contain a "." (which is what Dataplot uses to identify a file name). For example,

       WRITE "Example of writing a string."

    The following will work as intended:

       SET FILE NAME QUOTE ON
       WRITE "C:\ My Data\STRING.OUT" "String to STRING.OUT"

12) Modified the output for the SIGN TEST, SIGNED RANK TEST, and the RANK SUM test to have better clarity.

13) Added the following to the BOOTSTRAP PLOT command:

       BOOTSTRAP CORRELATION PLOT Y X
       BOOTSTRAP RANK COVARIANCE PLOT Y X
       BOOTSTRAP RANK CORRELATION PLOT Y X
       BOOTSTRAP COVARIANCE PLOT Y X
       BOOTSTRAP LINEAR CALIBRATION PLOT Y X
       BOOTSTRAP QUADRATIC CALIBRATION PLOT Y X

14) Fixed several bugs.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT November 2001-March 2002.
----------------------------------------------------------------------

 1) Added the following probability distributions.
    a) Geometric Extreme Exponential

          GEECDF(X,GAMMA)
          GEEPDF(X,GAMMA)
          GEEPPF(X,GAMMA)
          GEEHAZ(X,GAMMA)
          GEECHAZ(X,GAMMA)

          LET GAMMA = <value>
          LET Y = GEOMETRIC EXTREME EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 100

          LET GAMMA = <value>
          GEOMETRIC EXTREME EXPONENTIAL PROBABILITY PLOT Y
          GEOMETRIC EXTREME EXPONENTIAL PPCC PLOT Y

          LET GAMMA = <value>
          CHI-SQUARE GEOMETRIC EXTREME EXPONENTIAL GOODNESS OF FIT TEST Y

          LET GAMMA = <value>
          KOLMOGOROV-SMIRNOV GEOMETRIC EXTREME EXPONENTIAL GOODNESS OF FIT TEST Y

    b) Johnson SB

          JSBCDF(X,ALPHA1,ALPHA2)
          JSBPDF(X,ALPHA1,ALPHA2)
          JSBPPF(X,ALPHA1,ALPHA2)

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          LET Y = JOHNSON SB RANDOM NUMBERS FOR I = 1 1 100

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          JOHNSON SB PROBABILITY PLOT Y
          JOHNSON SB PPCC PLOT Y

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          CHI-SQUARE JOHNSON SB GOODNESS OF FIT TEST Y

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          KOLMOGOROV-SMIRNOV JOHNSON SB GOODNESS OF FIT TEST Y

    c) Johnson SU

          JSUCDF(X,ALPHA1,ALPHA2)
          JSUPDF(X,ALPHA1,ALPHA2)
          JSUPPF(X,ALPHA1,ALPHA2)

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          LET Y = JOHNSON SU RANDOM NUMBERS FOR I = 1 1 100

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          JOHNSON SU PROBABILITY PLOT Y
          JOHNSON SU PPCC PLOT Y

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          CHI-SQUARE JOHNSON SU GOODNESS OF FIT TEST Y

          LET ALPHA1 = <value>
          LET ALPHA2 = <value>
          KOLMOGOROV-SMIRNOV JOHNSON SU GOODNESS OF FIT TEST Y

    d) Generalized Tukey-Lambda

       Note: still being tested/developed. In particular, negative values of shape parameter are not working.
          GLDCDF(X,LAMBDA3,LAMBDA4)
          GLDPDF(X,LAMBDA3,LAMBDA4)
          GLDPPF(X,LAMBDA3,LAMBDA4)

          LET LAMBDA3 = <value>
          LET LAMBDA4 = <value>
          LET Y = GENERALIZED TUKEY LAMBDA RANDOM NUMBERS FOR I = 1 1 100

          LET LAMBDA3 = <value>
          LET LAMBDA4 = <value>
          GENERALIZED TUKEY LAMBDA PROBABILITY PLOT Y
          GENERALIZED TUKEY LAMBDA PPCC PLOT Y

          LET LAMBDA3 = <value>
          LET LAMBDA4 = <value>
          CHI-SQUARE GENERALIZED TUKEY LAMBDA GOODNESS OF FIT TEST Y

          LET LAMBDA3 = <value>
          LET LAMBDA4 = <value>
          KOLMOGOROV-SMIRNOV GENERALIZED TUKEY LAMBDA GOODNESS OF FIT TEST Y

 2) Added support for the following new statistics.

    a) LET A = BIWEIGHT LOCATION Y
    b) LET A = BIWEIGHT SCALE Y

    For more information, enter the following commands:

       HELP BIWEIGHT LOCATION
       HELP BIWEIGHT SCALE

 3) Added support for a biweight based confidence interval:

       BIWEIGHT CONFIDENCE INTERVAL Y

    For more information, enter the following command:

       HELP BIWEIGHT CONFIDENCE INTERVAL

 4) Added the following command:

       SET BOX PLOT WIDTH <VARIABLE/FIXED>

    This specifies whether box plots are drawn with fixed width or variable width boxes. In variable width box plots, the width of each box is proportional to its group sample size. That is, the largest width is used for the box plot with the largest sample size. For each of the remaining box plots, the width is scaled by the ratio of its sample size to the maximum sample size.

    The default is variable width. This is recommended in most cases as it conveys additional information regarding the relative sample sizes. However, there are cases where it is desirable to turn this feature off (e.g., when multiple BOX PLOT commands are used to overlay box plots on the same page).

 5) Added the following commands:

       SET 4PLOT MULTIPLOT <ON/OFF>
       SET 6PLOT MULTIPLOT <ON/OFF>

    Setting these switches ON specifies that the multiplot corner coordinates will be used to size the 4-PLOT and 6-PLOT, respectively. The default is OFF (i.e., the plot sizes are hard-coded to a default value).
If set to ON, then you can use the MULTIPLOT CORNER COORDINATES to size the graphs.

 6) ROBUSTNESS PLOT was added as a synonym for BLOCK PLOT.

 7) Support was added for the Scalable Vector Graphics (SVG) graphics output. SVG is an XML based vector graphics format that is expected to become increasingly popular for web based applications. SVG format files can also be imported into several popular graphics editing programs.

    For more information, enter

       HELP SVG

 8) The VERSION command was re-activated.

 9) Fixed several bugs.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT May-October 2001.
----------------------------------------------------------------------

 1) Added support for kernel density plots. Enter HELP KERNEL DENSITY PLOT for details.

 2) Added the following command:

       CONSENSUS MEAN PLOT

    This plot summarizes the results of a consensus means analysis. Enter HELP CONSENSUS MEANS PLOT for details.

 3) Added the following probability distributions.
    a) Inverted Weibull

          IWECDF(X,GAMMA)
          IWEPDF(X,GAMMA)
          IWEPPF(X,GAMMA)
          IWEHAZ(X,GAMMA)
          IWECHAZ(X,GAMMA)

          LET GAMMA = <value>
          LET Y = INVERTED WEIBULL RANDOM NUMBERS FOR I = 1 1 100

          LET GAMMA = <value>
          INVERTED WEIBULL PROBABILITY PLOT Y
          INVERTED WEIBULL PPCC PLOT Y

          LET GAMMA = <value>
          CHI-SQUARE INVERTED WEIBULL GOODNESS OF FIT TEST Y

          LET GAMMA = <value>
          KOLMOGOROV-SMIRNOV INVERTED WEIBULL GOODNESS OF FIT TEST Y

    b) Log Double Exponential

          LDECDF(X,ALPHA)
          LDEPDF(X,ALPHA)
          LDEPPF(X,ALPHA)

          LET ALPHA = <value>
          LET Y = LOG DOUBLE EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 100

          LET ALPHA = <value>
          LOG DOUBLE EXPONENTIAL PROBABILITY PLOT Y
          LOG DOUBLE EXPONENTIAL PPCC PLOT Y

          LET ALPHA = <value>
          CHI-SQUARE LOG DOUBLE EXPONENTIAL GOODNESS OF FIT TEST Y

          LET ALPHA = <value>
          KOLMOGOROV-SMIRNOV LOG DOUBLE EXPONENTIAL GOODNESS OF FIT TEST Y

 4) Added support for random number generation for the following distributions:

       LET Y = COSINE RANDOM NUMBERS FOR I = 1 1 100
       LET Y = ANGLIT RANDOM NUMBERS FOR I = 1 1 100
       LET Y = HYPERBOLIC SECANT RANDOM NUMBERS FOR I = 1 1 100
       LET Y = ARCSIN RANDOM NUMBERS FOR I = 1 1 100
       LET Y = HALF-LOGISTIC RANDOM NUMBERS FOR I = 1 1 100

       LET GAMMA = <value>
       LET Y = DOUBLE WEIBULL RANDOM NUMBERS FOR I = 1 1 100

       LET GAMMA = <value>
       LET Y = DOUBLE GAMMA RANDOM NUMBERS FOR I = 1 1 100

       LET GAMMA = <value>
       LET Y = INVERTED GAMMA RANDOM NUMBERS FOR I = 1 1 100

       LET GAMMA = <value>
       LET Y = LOG GAMMA RANDOM NUMBERS FOR I = 1 1 100

       LET GAMMA = <value>
       LET Y = GENERALIZED EXTREME VALUE RANDOM NUMBERS FOR I = 1 1 100

       LET DELTA = <value>
       LET Y = LOG LOGISTIC RANDOM NUMBERS FOR I = 1 1 100

       LET BETA = <value>
       LET Y = BRADFORD RANDOM NUMBERS FOR I = 1 1 100

       LET B = <value>
       LET Y = RECIPROCAL RANDOM NUMBERS FOR I = 1 1 100

       LET C = <value>
       LET B = <value>
       LET Y = GOMPERTZ RANDOM NUMBERS FOR I = 1 1 100

       LET P = <value>
       LET Y = POWER NORMAL RANDOM NUMBERS FOR I = 1 1 100

       LET P = <value>
       LET SD = <value>
       LET Y = POWER LOGNORMAL RANDOM NUMBERS FOR I = 1 1 100

       LET ALPHA = <value>
       LET BETA = <value>
       LET Y = POWER EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 100

       LET ALPHA = <value>
       LET BETA = <value>
       LET Y = ALPHA RANDOM NUMBERS FOR I = 1 1 100

       LET GAMMA = <value>
       LET THETA = <value>
       LET Y = EXPONENTIATED WEIBULL RANDOM NUMBERS FOR I = 1 1 100

 5) Extended the ppcc plot to handle distributions with 2 shape parameters. Specifically,

       BETA PPCC PLOT
       GOMPERTZ PPCC PLOT
       ALPHA PPCC PLOT
       EXPONENTIAL POWER PPCC PLOT
       EXPONENTIATED WEIBULL PPCC PLOT

    This generates a 3-d plot of ppcc value over the range of values taken by the 2 shape parameters. Support for several additional 2-shape parameter distributions is still being tested. Enter HELP PPCC PLOT for details.

 6) Made some updates to the STANDARDIZE command.

    a) LET Y2 = USCORE Y X1 X2

       This syntax generates a u-score (i.e., subtract the minimum and divide by the range). This effectively translates the variable to a uniform (0,1) scale (much as the z-score translates to a standard normal scale).

    b) LET Y2 = SCALE STANDARDIZE Y X1 X2

       This divides by the scale statistic, but does not subtract the location statistic first.

    c) Support was added for additional location and scale statistics. Enter HELP STANDARDIZE for details.

 7) Added the command

       LET Y2 = CROSS TABULATE <stat> Y X1 X2

    where <stat> is one of approximately 25 statistics. This command is related to, but different than, the analysis command CROSS TABULATE. This command stores the value of the cross tabulated statistic in each row of Y2 (where Y2 is the same length as the original array Y). The purpose of this form of the cross tabulation is to allow the cross tabulated values to be used in subsequent computations (e.g., to compute statistics not supported directly by Dataplot).

    For more information, enter the following command:

       HELP CROSS TABULATE (LET)

    In this case, you need to specify the "(LET)" in order to avoid ambiguity with other CROSS TABULATE commands.

 8) Added support for the following new statistics.
    a) LET A = INTERQUARTILE RANGE Y

    For more information, enter the following command:

       HELP INTERQUARTILE RANGE

 9) Added the following commands:

       LET A = COMMON DIGITS Y
       LET A = NUMBER OF COMMON DIGITS Y

    These commands return the common digits, and the number of common digits, of a vector of numbers. For example, given the numbers 3.214, 3.216, 3.217, and 3.219, the common digits are 3.21 and the number of common digits is 2. The common digits are tested to the RIGHT of the decimal point only (although Dataplot does include the portion to the left of the decimal point when returning the value of the common digits). If the numbers do not match in their integer portion, Dataplot does not return any common digits.

    This is a convenience command that was added to simplify some macros we were writing.

10) Added the following command:

       LET Y = MATCH X VAL
       LET Z2 = MATCH X VAL Z

    This command matches each value in VAL against X. For the first syntax, it returns the index of the X array where the match was found. A match is the value in X that is closest to the given value in absolute difference (i.e., an exact match is not required, so a match will always be returned). For the second syntax, the index is used to extract the value in Z corresponding to the matched index. This second syntax in fact implements the most common use of this command (i.e., the index is usually not of interest in itself; rather, it is used to extract appropriate values from another variable).

11) A few bug fixes were made. In particular,

    a) The ANDERSON DARLING WEIBULL TEST was modified slightly. You no longer get an error message if the GAMMA parameter is not specified. This GAMMA was not actually being used. The command now does the following:

       i)  If no GAMMA (shape parameter) or BETA (scale parameter) has been predefined, maximum likelihood estimates are computed automatically.

       ii) If GAMMA and BETA are pre-defined, then the test is based on these values.
This allows you to test the goodness of fit for parameter values obtained by methods other than maximum likelihood.

    b) Made a few fixes in the SINGLE SAMPLE ACCEPTANCE PLAN command. Specifically, it now requires P1 < P2. In addition, a maximum number of iterations has been added to detect convergence problems (although this is usually caused by P1 > P2). Also modified the documentation for this command to provide more realistic examples.

    c) Fixed some bugs in the GD device driver (JPEG and PNG support).

    d) The COLUMN LIMITS command now works with READ STRING (when the string is read from a file).

    e) The output for a number of confirmatory tests was modified for clarity. Note that the underlying computations were not modified, just the presentation of the output.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT February-April 2001.
----------------------------------------------------------------------

 1) The online help files have been substantially updated. Specifically, the additions over the last three years are now (mostly) incorporated into the help files and the web documentation.

 2) Added support for generating JPEG and PNG image formats. Enter HELP GD for details. These device drivers are dependent on several external libraries, so support may not be available on all platforms.

 3) Added the following command:

       CHARACTER AUTOMATIC SIGN <varname>

    This is similar to the CHARACTER AUTOMATIC command. However, it makes the character "+", "-", or "0" depending on the sign of the value in <varname>. This is sometimes useful when writing macros for design of experiment applications.

 4) PROBE is used to determine the current value of internal Dataplot variables. Added the following values that can now be accessed with PROBE.
       FX1MIN
       FX1MAX
       FY1MIN
       FY1MAX
       GX1MIN
       GX1MAX
       GY1MIN
       GY1MAX
       DX1MIN
       DX1MAX
       DY1MIN
       DY1MAX

    The FX1MIN, FX1MAX, FY1MIN, FY1MAX define the current axis limits, DX1MIN, DX1MAX, DY1MIN, DY1MAX define the current data limits, and GX1MIN, GX1MAX, GY1MIN, GY1MAX are the current "fixed" limits (i.e., limits set by the LIMITS command).

    The most common use is to PROBE the values for FX1MIN, FX1MAX, FY1MIN, and FY1MAX to determine the current axis limits. This can sometimes be useful when writing complex macros. For example,

       PLOT SIN(X) FOR X = 0 0.1 6
       PROBE FX1MIN
       LET XAXISMIN = PROBEVAL
       PROBE FX1MAX
       LET XAXISMAX = PROBEVAL
       PROBE FY1MIN
       LET YAXISMIN = PROBEVAL
       PROBE FY1MAX
       LET YAXISMAX = PROBEVAL

 5) Added the following command:

       LET Y2 = STANDARDIZE Y
       LET Y2 = STANDARDIZE Y X1
       LET Y2 = STANDARDIZE Y X1 X2

    This command standardizes a variable, Y, based on either no groups, one group, or two groups. You can standardize for both mean and standard deviation or just by the mean. By standardize, we mean subtract the mean and divide by the standard deviation. Alternative measures for location and scale are allowed.

    For details, enter HELP STANDARDIZE

 6) By default, the size of characters in subscripts or superscripts is set to 1/2 the current character size. You can set the scale factor using the following commands:

       SET SUPERSCRIPT VERTICAL SCALE <value>
       SET SUPERSCRIPT HORIZONTAL SCALE <value>

    These set the height and width of the character respectively.

 7) The CAPABILITY command was significantly enhanced. Enter HELP CAPABILITY for details.

 8) Support was added for orthogonal distance regression. Enter HELP ORTHOGONAL DISTANCE FIT for details.

 9) Support was added for consensus means using Mandel-Paule, modified Mandel-Paule, Vangel-Rukhin (maximum likelihood), Schiller-Eberhardt, and bounds on bias (BOB) methods. Enter HELP CONSENSUS MEANS for details.

10) Some bugs were fixed.
In particular, diagrammatic graphics drawn in data units rather than screen units (e.g., DRAWDATA, MOVEDATA) were not drawn correctly for log scales. This has been fixed. An error message is printed if a WEIBULL or NORMAL axis scale is detected.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT January 2000.
----------------------------------------------------------------------

1) Added the following commands.

a) LEGEND <numb> UNITS <DATA/SCREEN>

This command allows legend coordinates to be interpreted in either the screen 0 to 100 units (SCREEN, the default) or in units of the plot (DATA).

b) ...LABEL OFFSET <value>
   ...LABEL JUSTIFICATION <value>

These commands set the horizontal offset (in Dataplot 0 to 100 screen units) and the justification of the axis labels. The LABEL DISPLACEMENT command sets the vertical offset. These commands were motivated by some of the new multiplots discussed below. However, they can be used at any time (although usage should be rare).

c) You can use CR() in text strings to start a new line. Up to 10 lines may be entered, although more than 3 lines is rare. Each of the lines uses the same plot attributes (e.g., all left justified or all center justified). This applies to both hardware and software fonts and is used for all types of text. The most common usages are to create multiline titles and legends and to use multiple lines with alphabetic tic mark labels.

d) By default, the Dataplot HISTOGRAM and FREQUENCY POLYGONS range from -6 to +6 standard deviations from the mean. Although in most cases this is more than adequate, Dataplot did not warn you if points were found outside this range. Dataplot now flags the number of points outside this range (separate messages for points below and points above). No message is printed if all points are within the range. The CLASS LOWER and CLASS UPPER commands can be used if you need to widen the range.
e) Dataplot now supports row labels and variable labels.

Row labels are strings of up to 32 characters that are used to identify a row of the data. To define the row label, do something like the following:

    SKIP 25
    COLUMN LIMITS 1 19
    READ ROW LABELS AUTO79.DAT
    COLUMN LIMITS 20 132
    READ AUTO79.DAT Y1 TO Y12

The COLUMN LIMITS are almost always used when reading the row labels. Typically, you read a file once for the numeric data and then a second time for the row labels. Currently, the use of row labels is only supported with the CHARACTER command (see below). However, we anticipate additional usage of this feature in future updates.

A long label (up to 52 characters) can be associated with a variable name (which is currently limited to 8 characters). Variable labels are specified with the following command (note that the variable name must already be defined).

    VARIABLE LABEL <var name> <var label>

The label may contain spaces. Variable labels are currently supported in three ways:

i) Some of the new multi-plotting commands (discussed below) automatically make use of variable labels.

ii) You can use the "^" to substitute a variable label for a variable name in text strings. For example,

    LET Y = NORMAL RAND NUMBERS FOR I = 1 1 100
    VARIABLE LABEL Y NORMAL RANDOM NUMBERS
    Y1LABEL ^Y
    PLOT Y

Previously, Dataplot only supported substitutions for parameters and strings. Now, if a variable name is found, it checks to see if a label has been defined. If yes, the label is substituted for the variable name. If not, the variable name is left as is (with the "^" removed).

iii) The X1LABEL AUTOMATIC and Y1LABEL AUTOMATIC commands will now substitute the variable label for the variable name on the x and y axes respectively.
f) The following special options were added for the CHARACTER command:

    ROWID    - uses the row number as the plot character
    ROWLABEL - uses the row label as the plot character
    XVALUE   - uses the x-coordinate of the point as the plot character
    YVALUE   - uses the y-coordinate of the point as the plot character
    XYVALUE  - uses (x-coor,y-coor) as the plot character
    TVALUE   - uses the tag value as the plot character (Dataplot assigns a curve-id, the tag, to each point)
    ZVALUE   - this is a special form that is specific to certain commands. For a few commands (currently the DEX CONTOUR PLOT and the CROSS TABULATE PLOT, but we expect a few additional plots to support this form in future releases), Dataplot writes a numeric value into an internal array. The value in this array is used as the plot symbol. Using this with unsupported plot types may have unpredictable results (it will depend on what is stored in the internal array). This option is typically set automatically by Dataplot in the background, so currently users should not set this directly.

The ROWID and ROWLABEL options are typically only used for the PLOT command (i.e., not for HISTOGRAM, etc.). These options keep track of any subsetting (i.e., SUBSET/FOR/EXCEPT clauses on the plot command) when identifying the point. However, the results may be unpredictable for graphics other than the PLOT command.

The most common use of this command is to identify specific points on the plot (typically with the ROWLABEL option). A typical sequence would be

    CHARACTER X
    PLOT Y X
    PRE-ERASE OFF
    LIMITS FREEZE
    CHARACTER ROWLABEL
    PLOT Y X SUBSET Y > 90

g) The STATISTIC PLOT command now supports the CORRELATION, RANK CORRELATION, COVARIANCE, and RANK COVARIANCE cases.

h) The command

    SET PARAMETER EXPANSION <NUMERIC/EXPONENTIAL>

was added. This command applies when substituting the value of a parameter using "^". Normally, this was intended for putting numeric values in text labels. In this case, it is desirable to limit the number of digits.
However, when used with the FIT command (parameters you want to remain constant rather than be fitted are often entered this way), you may need to specify high precision. If NUMERIC (the default) is specified, the current algorithm for parameter substitution is used. If EXPONENTIAL is specified, the parameter is entered using scientific notation. For example,

    (0.123456789012*10**(2))

i) The command

    SET SORT DIRECTION <ASCENDING/DESCENDING>

was added. This command specifies whether the sorts performed by SORT and SORTC are ascending or descending sorts (the default is ascending).

2) The following new plots were added.

a) INTERACTION PLOT Y X1 ... XK
   <stat> INTERACTION PLOT Y X1 ... XK

These plot Y versus X1*X2* ... *XK and are primarily intended for DEX applications. Specifically, this command serves as the building block for the DEX INTERACTION PLOT discussed below. It is actually the DEX INTERACTION PLOT that is typically generated by the user.

This command supports the same set of statistics as the STATISTIC PLOT command. The case of most interest for the DEX plots is 2 X variables, but these plots will in fact handle an arbitrary number up to 25.

b) CROSS TABULATE <stat> PLOT Y X1 X2
   CROSS TABULATE <stat> PLOT Y1 Y2 X1 X2
   CROSS TABULATE PLOT X1 X2
   CROSS TABULATE PLOT <stat> X1 X2

This command performs a cross-tabulation on X1 and X2. It computes the statistic given by <stat> for the response values (Y) in each cell of the cross tabulation. The list of supported statistics is the same as for the STATISTIC PLOT command. Most of the supported statistics expect a single response variable. A few expect two (e.g., LINEAR CORRELATION). The COUNT (or NUMBER) statistic expects no response variables.

The output of this command plots the computed statistic on the Y axis. The X axis coordinate is determined from the two group variables in the following way:

i) The levels of the first group variable (X1 in the above examples) are plotted at 1, 2, 3, etc.
ii) For each level of the group 1 variable, the levels of the group 2 variable are scaled +/- 0.2 around the level of the group 1 variable. For example, if X1 has 2 levels (at 1 and 2) and X2 has 3 levels (1, 2, and 3), then the following x-coordinates are used:

    X1   X2   X-COOR
    ============================
     1    1    0.8
     1    2    1.0
     1    3    1.2
     2    1    1.8
     2    2    2.0
     2    3    2.2

The syntax CROSS TABULATE X1 X2 is a special case. It plots the value of X1 on the X axis and the value of X2 on the Y axis. The plot character is then set to the count for that cell (this is done automatically and you do not need to set the plot character). This form of the plot has application in the design of experiments.

Note that this command is an extension of the STATISTIC PLOT command. However, instead of one group variable, there are two group variables.

The command

    SET CROSS TABULATE PLOT DIMENSION <1/2>

can be used to specify an alternative format for this plot. If "1", then the format of the plot is as described above. If "2", then the format is similar to the CROSS TABULATE X1 X2 format. That is,

    SET CROSS TABULATE PLOT DIMENSION 2
    CROSS TABULATE MEAN PLOT Y X1 X2

will print the value of the mean of Y at the value of X1 on the X axis and the value of X2 on the Y axis. Essentially, this is the tabled values in graphic format. You can use this format to generate plots where you want to print a numeric value at (X,Y), that is, some value other than X or Y. You can define a response variable Z with the desired values to print and then use the CROSS TABULATE MEAN PLOT (if there is only one value, the mean is equal to that value).

c) DEX CONTOUR PLOT Y X1 X2 YCONT

This plots a dex contour plot for the case when X1 and X2 have 2 levels (represented by the values -1 and 1). In addition, one or more center points (X1 and X2 both 0) may be present. Any points where X1 and X2 are not equal to -1, 1, or 0 are ignored. The array YCONT contains the contour levels.
The appearance of the plot is controlled by the settings of the LINE and CHARACTER commands. Specifically,

    trace 1  = label for center point and the points at (-1,-1), (-1,1), (1,1), (1,-1). The character setting should be ZVAL and line should be blank.
    trace 2  = center point. If no center point was specified, this point is not generated (and the CHAR and LINE settings need to be adjusted accordingly).
    trace 3  = line connecting (-1,-1), (1,-1), (1,1), (-1,1)
    trace 4+ = the contour lines start with trace 4. There is one trace for each value of YCONT.

This command implements the algorithm previously available in the built-in DEXCONT.DP macro as a Dataplot command. As an example of this command, you can enter

    SKIP 25
    READ BOXYIELD.DAT Y X1 X2
    LET YCONT = SEQUENCE 50 2 70
    CHARACTER ZVAL CIRCLE CIRCLE
    CHARACTER FILL OFF ON ON
    LINE BLANK BLANK BLANK
    DEX CONTOUR PLOT Y X1 X2 YCONT

d) YATES CUBE PLOT Y X1 X2 X3

This plots a Yates cube plot for the case when X1, X2, and X3 are factor variables with exactly two levels. It plots the value of the response variable, Y, at each vertex. This plot is used in 2**(3) factorial and fractional factorial designs.

3) Dataplot now supports sub-regions on plots. Sub-regions are motivated by the desire to denote "engineering limits" on a plot. That is, a rectangle, denoting an acceptance region in both the X and Y directions, is drawn on the plot and then the plots are overlaid on top of this. Although the subregion capability was motivated by the need to denote engineering limits, subregions can in fact be used for whatever purpose you want.

The SUBREGION commands are:

    SUBREGION <ON/OFF> <ON/OFF> <ON/OFF> ....
    SUBREGION XLIMITS <lower value> <upper value>
    SUBREGION <id> XLIMITS <lower value> <upper value>
    SUBREGION YLIMITS <lower value> <upper value>
    SUBREGION <id> YLIMITS <lower value> <upper value>

Up to 10 subregions may be defined. In most applications, only a single subregion is plotted.
The SUBREGION <ON/OFF> switch determines whether or not the given subregion is plotted. The SUBREGION XLIMITS/YLIMITS commands specify the lower and upper bounds of the rectangle. If no <id> is specified, the limits are set for the first subregion. If <id> is specified, it should be between 1 and 10.

You do not need to adjust the settings for the CHARACTER, LINE, BAR, and SPIKE when using subregions. Dataplot automatically shifts these in the background.

The attributes of the SUBREGION are defined by:

    REGION FILL <ON/OFF>
    REGION COLOR <COLOR>
    REGION BORDER LINE <linetype>
    REGION BORDER COLOR <color>

The REGION FILL and REGION COLOR determine the attributes of the interior of the rectangle. The two most common choices are to leave it blank or to fill it with some type of light gray scale color. The attributes of the box border are set with the REGION BORDER LINE and REGION BORDER COLOR commands. The standard line types (BLANK, SOLID, DASH, DOTTED, etc.) are supported. Although only one setting was given above, if you have defined multiple subregions, then you should define multiple settings in the above commands.

A typical sequence of commands would be

    SUBREGION ON
    SUBREGION XLIMITS 0.35 0.42
    SUBREGION YLIMITS 2000 3000
    REGION FILL ON
    REGION BORDER LINE DASH
    REGION COLOR G90
    PLOT ....
    SUBREGION OFF

Some points to note about subregions are:

a) The subregions are plotted before any of the plot curves. The significance of this is that a solid filled subregion will be drawn and then the regular plot points are drawn on top. The effect of this can be hardware dependent. On X11 and Postscript devices, a solid character can be seen on top of a light gray scale box (if the gray scale gets too dark, the plot points are no longer distinguishable). However, on some hardware devices, you may not be able to see points plotted on top of a solid fill region. In this case, plot the border of the subregion and leave the interior blank.
It is this order of plotting that distinguishes the subregion from simply using a BOX <id> command to plot rectangular regions on the screen.

b) Although most commonly used with the PLOT command, subregions can in fact be used with any Dataplot graphics command.

c) Currently, only rectangular subregions are supported. We expect that to be generalized to polygonal regions in the future.

4) Dataplot now saves the following internal parameters after all plots (not just those generated with PLOT):

    PLOTCORR - correlation of the X and Y coordinates on the plot
    PLOTCOR1 - correlation of the X and Y coordinates on the plot with a tag value of 1. This can be useful for plots that generate reference lines (which you do not want included in the correlation computation)
    PLOTYMAX - maximum Y coordinate
    YMAXINDE - index of the maximum Y coordinate
    PLOTYMIN - minimum Y coordinate
    YMININDE - index of the minimum Y coordinate
    PLOTXMAX - maximum X coordinate
    XMAXINDE - index of the maximum X coordinate
    PLOTXMIN - minimum X coordinate
    XMININDE - index of the minimum X coordinate
    NACCEPT  - number of plot points inside the first subregion (0 if no subregions defined)
    NREJECT  - number of plot points outside the first subregion (0 if no subregions defined)
    NTOTAL   - number of plot points (NACCEPT + NREJECT) (0 if no subregions defined)

5) The following multiplots were added:

    SCATTER PLOT MATRIX Y1 Y2 ... YK
    FACTOR PLOT Y X1 ... XK
    CONDITIONAL PLOT Y X TAG

a) SCATTER PLOT MATRIX Y1 ... YK

This generates all the pairwise scatter plots of Y1 ... YK on a single page.

b) FACTOR PLOT Y X1 ... XK

This generates the plots Y VS X1, Y VS X2, .... , Y VS XK on a single page.

c) CONDITIONAL PLOT Y X TAG

This generates PLOT Y VERSUS X for each unique value in TAG on a single page.

There are a lot of variations possible with these types of plots. For example, the basic concept is not limited to scatter plots.
For example, you can generate all the pairwise bihistograms instead of the pairwise scatter plots. There are many options in terms of labeling, what plot goes on the diagonal, and so on. There are various SET commands that control the appearance and nature of these plots. Enter

    HELP SCATTER PLOT MATRIX
    HELP CONDITIONAL PLOT
    HELP FACTOR PLOT

for a complete description of what is available.

Two variations of the SCATTER PLOT MATRIX are important enough to be given special names:

    DEX INTERACTION PLOT
    YOUDEN MATRIX PLOT

These are described under HELP SCATTER PLOT MATRIX.

6) Fixed the following bugs.

a) The MULTIPLOT SCALE FACTOR did not work correctly with the software fonts.

b) Entering "character blank", i.e., the blank is in lower case, plotted BLAN as the plot character when DEVICE 1 FONT SIMPLEX was used.

c) Using SP() with a software font did not work.

d) The BOX SHADOW OFF command was fixed to set the shadow height and width to zero rather than to the default.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT January - July 1999.
----------------------------------------------------------------------

1) Modified the IF command so that if there is an error (e.g., one of the parameters is not defined), the IF status is set to FALSE rather than being undefined.

2) Added the following time series commands.

a) Added the command

    LET PERIOD = <value>
    LET START = <value>
    SEASONAL SUBSERIES PLOT Y

A seasonal subseries plot is used to determine if there is significant seasonality in a time series. Instead of a straight time order plot, it splits the plot into the corresponding seasons (or periods). For example, for monthly data, all the January values are plotted, then all the February values, and so on. Reference lines are drawn at the seasonal means.
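The grouping behind a seasonal subseries plot can be illustrated outside Dataplot. The following Python sketch is not Dataplot's implementation; the function name seasonal_subseries and the sample data are hypothetical, and it assumes the first observation falls in season 1 (the role of the START parameter is not described here, so it is not modeled).

```python
from statistics import mean


def seasonal_subseries(y, period):
    """Split a series into one sub-series per season.

    Assumption: y[0] belongs to season 1, y[1] to season 2, and so on,
    wrapping around every `period` observations.
    """
    groups = [[] for _ in range(period)]
    for i, value in enumerate(y):
        groups[i % period].append(value)
    return groups


# Hypothetical quarterly data covering three years.
y = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
groups = seasonal_subseries(y, period=4)

# The seasonal means are where the reference lines would be drawn.
seasonal_means = [mean(g) for g in groups]
for season, (g, m) in enumerate(zip(groups, seasonal_means), start=1):
    print(f"season {season}: values={g} mean={m}")
```

Seasonal means that differ markedly from one another, as they do in this made-up series, are the kind of pattern the plot is meant to reveal.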
b) Added the command

    LET PERIOD = <value>
    LET STLWIDTH = <value>
    LET STLSDEG = <0/1>
    LET STLTDEG = <0/1>
    LET STLROBST = <0/1>
    SEASONAL LOWESS Y (or SEASONAL LOESS Y)
    READ DPST1F.DAT SEAS TREND

The SEASONAL LOWESS command decomposes a time series into trend, seasonal, and residual components using techniques based on locally weighted least squares. That is,

    X(t) = TREND(t) + SEAS(t) + RES(t)

The seasonal and trend components are written to the file DPST1F.DAT (dpst1f.dat on Unix systems) and can be read back into Dataplot for further plotting and analysis. The internal variable RES contains the residual component and the internal variable PRED contains the trend plus the seasonality component.

The SEASONAL LOWESS command accepts a number of options which can be defined by the LET commands above. The most important is the PERIOD parameter, which identifies the number of seasons (e.g., 12 for monthly data). The STLWIDTH parameter identifies the number of data points to use in the LOWESS steps and defaults to N/10. It is similar to specifying the LOWESS FRACTION for standard LOWESS smoothing. The more points used, the more smoothing that occurs. The STLSDEG and STLTDEG parameters identify the polynomial degree used in the lowess for the seasonal and trend components respectively. By default, the seasonal lowess performs some robustness iterations. Enter LET STLROBST = 1 to suppress this.

This technique is described in Cleveland, Cleveland, McRae, and Terpenning, "STL: A Seasonal-Trend Decomposition Procedure Based on Loess", Statistics Research Report, AT&T Bell Laboratories.

c) Added an ARIMA modeling capability. The command is:

    ARMA Y AR DIFF MA SAR SDIFF SMA SPERIOD

where

    Y       = the response variable
    AR      = the order of auto-regressive terms
    DIFF    = number of differences to apply. DIFF is typically 0, 1, or 2. Differencing is one technique for removing trend.
    MA      = order of the moving average terms
    SAR     = order of seasonal auto-regressive terms.
    SDIFF   = number of seasonal differences to apply. It is typically 0, 1, or 2.
    SMA     = order of seasonal moving average terms.
    SPERIOD = period for seasonal terms. It defaults to 12 (if a seasonal component is included).

If there is no seasonal component, the last 4 terms may be omitted.

To minimize the amount of screen output while still keeping the maximum amount of information, Dataplot writes most of the output to files. Specifically,

    dpst1f.dat - the parameters and the standard deviations of the parameters from the ARMA fit. The order is:
                 1) Autoregressive terms
                 2) Seasonal autoregressive terms
                 3) Mean term
                 4) Moving average terms
                 5) Seasonal moving average terms
    dpst2f.dat - this file contains:
                 1) Row number
                 2) Original series (i.e., Y)
                 3) Predicted values
                 4) Standard deviation of predicted values
                 5) Residuals
                 6) Standardized residuals
    dpst3f.dat - Intermediate output from iterations before convergence. This is generally useful if the ARMA fit does not converge.
    dpst4f.dat - The parameter variance-covariance matrix.
    dpst5f.dat - The forecast values for (N/10)+1 observations ahead. Specifically,
                 1) The forecasted values
                 2) The standard deviation of the forecasted values.
                 3) The lower 95% confidence band for the forecast.
                 4) The upper 95% confidence band for the forecast.

Dataplot allows you to define the starting values by defining the variable ARPAR. The order of the parameters is as given for the file dpst1f.dat above. By default, all parameters are set to 1 except for the mean term which is set to 0.

In addition, you can define the variable ARFIXED to fix certain parameters to their start values. That is, you define ARPAR to specify the start values. If the corresponding element of ARFIXED is zero, the parameter is estimated as usual. If ARFIXED is one, then the parameter is fixed at the start value. The most common use of this is to set certain parameters to zero.
For example, if you fit an AR(2) model and you want the AR(1) term to be zero, you could enter the following:

    LET ARPAR = DATA 0 1
    LET ARFIXED = DATA 1 0

Dataplot uses the STARPAC library (developed by Janet Rogers and Peter Tryon of NIST) to compute the ARIMA estimates.

ARIMA modeling is covered in many time series texts. It is beyond the scope of this news file to discuss ARIMA modeling. However, to use ARIMA models, it is generally recommended that the series be at least 50 observations long. In addition, if the series is dominated by the trend and seasonal factors, an explicit trend, seasonal, and random component decomposition method, such as the seasonal lowess described above, is generally preferred to an explicit ARIMA model.

3) Added support for location and scale parameters for an additional 15 distributions. Entering the command

    LIST DISTRIBU.

will list the distributions table. This table shows which distributions support location and scale parameters.

4) Added the following statistics:

Added the CNPK capability index statistic:

    LET LSL = <value>
    LET USL = <value>
    LET A = CNPK Y

This statistic is now also supported for the following plots:

    LET LSL = <value>
    LET USL = <value>
    CNPK PLOT Y X
    DEX CNPK PLOT Y

The LSL and USL specify the lower specification and upper specification engineering limits. The CNPK is a variant of the CPK capability index used for non-normal data and is defined as:

    CNPK = MIN(A,B)

where

    A = (USL-MEDIAN)/(P(0.995)-MEDIAN)
    B = (MEDIAN-LSL)/(MEDIAN-P(0.005))

P(0.995) and P(0.005) are the 99.5 and 0.5 percentiles of the data respectively.

Added the geometric mean and standard deviation and the harmonic mean statistics.
    LET A = GEOMETRIC MEAN Y
    LET A = GEOMETRIC STANDARD DEVIATION Y
    LET A = HARMONIC MEAN Y

These statistics are now also supported for the following plots:

    GEOMETRIC MEAN PLOT Y X
    GEOMETRIC STANDARD DEVIATION PLOT Y X
    HARMONIC MEAN PLOT Y X
    BOOTSTRAP GEOMETRIC MEAN PLOT Y X
    BOOTSTRAP GEOMETRIC STANDARD DEVIATION PLOT Y X
    BOOTSTRAP HARMONIC MEAN PLOT Y X
    JACKNIFE GEOMETRIC MEAN PLOT Y X
    JACKNIFE GEOMETRIC STANDARD DEVIATION PLOT Y X
    JACKNIFE HARMONIC MEAN PLOT Y X

The geometric mean is defined as:

    XGM = (PRODUCT(Xi))**(1/N)

The geometric standard deviation (SD means standard deviation of) is defined as:

    XSD = EXP(SD(LOG(Xi)))

The harmonic mean is defined as:

    XHM = N/SUM(1/Xi)

5) Added the Wilks-Shapiro test for normality. The following commands are equivalent.

    WILKS SHAPIRO NORMALITY TEST Y
    WILKS SHAPIRO TEST Y
    WILKS SHAPIRO Y

There must be at least 3 values in Y. The computed significance level is not necessarily valid for N >= 5,000. This command uses algorithm R94 from the Applied Statistics Journal.

6) Added the studentized range CDF and PPF functions.

    LET A = SRACDF(X,V,R)
    LET A = SRAPPF(P,V,R)

where V is the degrees of freedom and R is the number of samples. X must be positive, V must be >= 1, and R must be >= 2. For most applications, R = V + 1. The PPF function is only supported for values in the range 0.90 to 0.99.

The studentized range is defined as:

    Q = Range/(Standard deviation)

The studentized range is used in constructing confidence intervals and significance levels for tests for multiple comparison in analysis of variance problems.

7) Updated the Weibull maximum likelihood estimates to support censored data (both type 1 and type 2 and multiple). It also now generates confidence intervals for the estimate (for various significance levels). The command

    SET CENSORING TYPE <NONE/1/2/MULTIPLE>

defines the censoring type.

The EXPONENTIAL MLE output was modified to be more readable and consistent with the Weibull output.
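The three definitions given under item 4 above (geometric mean, geometric standard deviation, and harmonic mean) can be checked numerically. The following Python sketch is an illustration only, not Dataplot code; the function names are hypothetical, and it assumes that SD denotes the sample (N-1) standard deviation and that all data values are positive.

```python
import math
import statistics


def geometric_mean(x):
    # XGM = (PRODUCT(Xi))**(1/N), computed through logs to avoid overflow.
    return math.exp(sum(math.log(v) for v in x) / len(x))


def geometric_sd(x):
    # XSD = EXP(SD(LOG(Xi))); assumes SD is the sample (N-1) standard deviation.
    return math.exp(statistics.stdev([math.log(v) for v in x]))


def harmonic_mean(x):
    # XHM = N/SUM(1/Xi)
    return len(x) / sum(1.0 / v for v in x)


data = [1.0, 2.0, 4.0]
print("geometric mean:", geometric_mean(data))
print("geometric SD:  ", geometric_sd(data))
print("harmonic mean: ", harmonic_mean(data))
```

For the data 1, 2, 4 the product is 8, so the geometric mean is the cube root of 8, or 2, and the harmonic mean is 3/(1 + 0.5 + 0.25) = 12/7.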
8) Added the following quality control commands.

a) Added the following command to generate binomial based single sample acceptance plans:

    SINGLE SAMPLE ACCEPTANCE PLAN P1 P2 ALPHA BETA

where

    P1    = Acceptable Quality Level
    P2    = Lot Tolerance Percent Defective
    ALPHA = Probability of a Type I error
    BETA  = Probability of a Type II error

b) Added a command to generate the average run length for the cumulative sum (cusum) control chart. The average run length is the average number of observations that are entered before the system is declared out of control.

    LET S0 = <value>
    LET K = <value>
    LET H = <value>

These commands set parameters required by the cusum ARL calculation. Specifically,

    S0 = start-up value for the cumulative sum. This is usually zero. However, it can be set to a positive initial value for a fast initial response (FIR) cusum chart.
    H  = defines the value which signals that the cusum is "out of control". A value of 5 is a common choice.
    K  = the value of k is set to one half of the smallest shift in location (in standard deviation units) that you want to detect. A common choice is a 1-sigma shift, that is k = 0.5.

    LET Y = ONE-SIDED CUSUM ARL DELTA
    LET Y = CUSUM ARL DELTA

where DELTA defines the difference between the target value of the process and the true value of the process. This is a variable that is usually defined to be a sequence of values. For example,

    LET DELTA = SEQUENCE 0 .01 0.5

That is, this command returns the average run length for a series of values that define the difference between the target value and the true value of the process.

A typical sequence of commands would be

    LET K = 0.5
    LET H = 5
    LET S0 = 0
    LET DELTA = SEQUENCE 0 .01 1.0
    LET Y = CUSUM ARL DELTA
    PLOT Y DELTA

This command was implemented using Applied Statistics algorithm 258. If unreasonable values are specified for the parameters, this algorithm can generate unreasonable results.
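To make the four inputs of the single sample acceptance plan in item 8a concrete, the following Python sketch searches for the smallest sample size n and acceptance number c (accept the lot if at most c defectives are found) that meet both risk constraints under a binomial model. This is a brute-force illustration under stated assumptions, not the algorithm Dataplot uses; the function names and the max_n cutoff are hypothetical.

```python
from math import comb


def binom_cdf(c, n, p):
    # P(X <= c) for X ~ Binomial(n, p).
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def single_sample_plan(p1, p2, alpha, beta, max_n=2000):
    """Return the smallest (n, c) such that
         P(accept | p1) >= 1 - alpha   (producer's risk)
         P(accept | p2) <= beta        (consumer's risk)
    or None if no plan is found with n <= max_n (an assumed cutoff)."""
    if not p1 < p2:
        raise ValueError("the plan requires P1 < P2")
    for n in range(1, max_n + 1):
        for c in range(n + 1):
            if binom_cdf(c, n, p2) > beta:
                break  # a larger c only increases the consumer's risk
            if binom_cdf(c, n, p1) >= 1 - alpha:
                return n, c
    return None


plan = single_sample_plan(p1=0.01, p2=0.05, alpha=0.05, beta=0.10)
print("sample size and acceptance number:", plan)
```

The guard on P1 < P2 mirrors the requirement noted for this command elsewhere in this news file: with P1 >= P2 no plan can satisfy both constraints, and the search would simply run until the cutoff.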
9) Added the following commands:

    ANOP LIMITS <low> <high>
    PROPORTION CONFIDENCE LIMITS Y
    DIFFERENCE OF PROPORTION CONFIDENCE LIMITS Y1 Y2

to generate a confidence interval for proportions and the difference of two proportions respectively. The ANOP LIMITS command is used to define the lower and upper bounds that define a success. The confidence intervals are based on the direct binomial computations, not the normal approximation, so they are not limited by small N.

10) Added the command

    WEB HANDBOOK <keyword>

This command accesses the NIST/SEMATECH Engineering Statistics Handbook. A beta version of the Handbook will be released May, 1999 (http://www.itl.nist.gov/div898/handbook/). The <keyword> is matched against a file of keywords to go to the appropriate location in the handbook. This command is used primarily by the Dataplot GUI, but it can also be entered by the end-user. If you want to see a list of the supported keywords, enter

    LIST HANDBK.TEX

The handbook provides tutorial information on many common engineering statistical capabilities. This complements the WEB HELP command, which accesses the on-line Dataplot Reference Manual. The on-line Reference Manual is primarily concerned with how you implement a statistical technique while the Handbook provides more of a statistical tutorial.

If your site has downloaded the Handbook, enter a command like the following:

    SET HANDBOOK URL http://ketone.cam.nist.gov/cf/handbook/

to define the home directory for the handbook.

The web commands SET BROWSER and SET NETSCAPE OLD apply to the WEB HANDBOOK as well. SET BROWSER defines the browser and SET NETSCAPE OLD allows you to use a currently open browser for the WEB HANDBOOK command. These commands are discussed in more detail later in this news file.

11) Added the following non-parametric tests.

a) The following are non-parametric alternatives to the 2-sample t test (i.e., test the hypothesis U1 = U2 where U1 and U2 are the population means for 2 samples).
    SIGN TEST Y1 Y2
    SIGN TEST Y1 Y2 D0
    SIGN TEST Y1 MU
    SIGNED RANK TEST Y1 Y2
    SIGNED RANK TEST Y1 Y2 D0
    SIGNED RANK TEST Y1 MU
    RANK SUM TEST Y1 Y2
    RANK SUM TEST Y1 Y2 D0

where Y1 and Y2 are the response variables and D0 and MU are parameters. Specify D0 to test U1 - U2 = D0. The 2-sample test can also be used for the 1-sample test U1 = MU. The SIGN TEST and SIGNED RANK TEST commands only apply to paired samples. The RANK SUM TEST command does not require equal sample sizes.

b) The following performs the Kruskal-Wallis non-parametric one-way ANOVA.

    KRUSKAL WALLIS Y X

where Y is the response variable and X is the factor variable.

12) Added the following plot commands:

a) TUKEY MEAN-DIFFERENCE PLOT Y1 Y2

A Tukey mean-difference plot is an enhancement of the quantile-quantile (q-q) plot. It converts the interpretation of the q-q plot from the differences around a diagonal line to the differences around a horizontal line. If T(i) and D(i) are the vertical and horizontal coordinates for the q-q plot, the Tukey mean-difference plot is (T(i) - D(i)) versus (T(i) + D(i))/2. A horizontal reference line is drawn at zero.

b) SPREAD LOCATION PLOT Y TAG

The spread-location (s-l) plot is a robust alternative to the homoscedasticity plot. Given a response variable Y and a group-id variable X, the homoscedasticity plot is the group standard deviations versus the group means. This is a graphical measure of constant spread across groups. The s-l plot has the square roots of the absolute value of the Y(i) minus their group medians on the vertical axis and the group medians on the horizontal axis. A reference line connects the group medians.

When setting the LINE and CHARACTER commands, the reference line is the first trace and the data starts with trace 2 (each group is identified as a unique trace).
That is, to draw the data points as circles and the reference line as a solid line, do something like the following

    CHARACTER CIRCLE ALL
    CHARACTER BLANK
    LINE BLANK ALL
    LINE SOLID
    SPREAD LOCATION PLOT Y X

c) RF SPREAD PLOT

The residuals-fitted (r-f) spread plot is a graphical measure of the goodness of fit. That is, this command is preceded by some type of fit. It plots percent point (or quantile) plots of the fitted values minus their mean and the residuals arranged side by side with a common vertical scale. The vertical spread of the residuals compared to the vertical spread of the fitted values gives an indication of how much of the variation is explained by the fit.

13) Added the following special functions:

a) LET A = ABRAM0(X,ORD)
   This computes the Abramowitz function for order ORD. Currently, ORD can be an integer from 0 to 100.

b) LET A = CLAUSN(X)
   This computes the Clausen integral.

c) LET A = DEBEYE(X,ORD)
   This computes the Debye function of order ORD. ORD can be 1, 2, 3, or 4.

d) LET A = EXP3(X)
   This computes the cubic exponential integral.

e) LET A = GOODST(X)
   This computes the Goodwin and Stanton integral.

f) LET A = LOBACH(X)
   This computes Lobachevski's integral.

g) LET A = SYNCH1(X)
   LET A = SYNCH2(X)
   These compute the synchrotron radiation functions.

h) LET A = STROM(X)
   This computes Stromgren's integral.

i) LET A = TRAN(X,ORD)
   This computes the transport integrals of order ORD. ORD can be 2, 3, 4, 5, 6, 7, 8, or 9.

These special functions are computed using ACM algorithm 757. Formulas for these functions are given in:

    Allan MacLeod, "ACM Transactions on Mathematical Software", Vol. 22, No. 3, September 1996, pp. 288-301.

14) Fixed a bug in the CD command for Unix platforms. The CD command allows you to set the default directory. A few other miscellaneous bugs have also been fixed.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT September - Dec 1998.
----------------------------------------------------------------------

1) Added the following MATRIX commands:

    LET MEAN = MATRIX GROUP MEANS M TAG
    LET SD = MATRIX GROUP SD M TAG
    LET SPOOL = POOLED VARIANCE-COVARIANCE MATRIX M TAG

The MATRIX GROUP MEANS and MATRIX GROUP SD commands compute the group means and standard deviations of a matrix. The POOLED VARIANCE-COVARIANCE MATRIX command computes a pooled variance-covariance matrix.

These commands all operate on a matrix (M) and a group id variable (TAG). The TAG variable has the same number of rows as the matrix M. The values of TAG are typically integers and they identify the group to which the corresponding row of the matrix belongs.

The MATRIX GROUP MEANS/SD commands return a matrix with the same number of columns as the original matrix M and with the number of rows equal to the number of groups identified by the TAG variable. That is, MEAN(2,3) is the mean of the third variable of the second group.

The pooled variance-covariance matrix is:

    SPOOL = (1/SUM(N(i)-1)) * SUM((N(i)-1)*C(i))

where N(i) is the number of elements in group i and C(i) is the variance-covariance matrix of the rows belonging to group i.

An earlier implementation of this command works with 2 matrices (and no group id variable). This version of the command still works. That is, if the second argument to the POOLED VARIANCE-COVARIANCE MATRIX command is a matrix, it is assumed that there are 2 groups and the data for each group is stored in a separate matrix. If the second argument is a variable, it is assumed that it is a group id variable and the data for all groups are stored in a single matrix. For the 2 group case, either syntax will work. For more than 2 groups, only the new syntax will work.

2) The following control chart enhancements were added:

a) HOTELLING CONTROL CHART Y1 Y2 ... YK GROUP

This command implements a Hotelling multivariate control chart.
Given p response variables, the Hotelling control chart computes and plots the following for each group:

    T-square = n*(xbar - u0)'SINV(xbar - u0)

where n is the size of the group, xbar is a vector of the p sample means for the subgroup, and u0 is a vector of the p sample means for the entire data set. That is, a 1-sample Hotelling test is computed to test whether the means for a given group are equal to the overall sample means.

An upper control limit (there is no lower control limit) is drawn at the appropriate F statistic for the Hotelling test. The value of alpha for the F test is chosen so that alpha/(2*p) = 0.00135. This corresponds to the 3-sigma value for a univariate chart. You can specify your own control limit, set by whatever criterion you deem appropriate, by entering the command:

    LET USL = <value>

You can control the appearance of this chart by setting the lines and character switches. The traces are:

    Trace 1 = T-square values
    Trace 2 = Zero reference line
    Trace 3 = Dataplot calculated control limit
    Trace 4 = User specified upper control limit

For example, to draw the T-square values as a solid line and an X, no zero reference line, the Dataplot control limit as a dotted line, and no user specified control limit, enter the commands:

    LINE SOLID BLANK DOTTED BLANK
    CHARACTER X BLANK BLANK BLANK

b) CUSUM CONTROL CHART Y X

This command implements a mean cumulative sum control chart. There are numerous variations on how cusum control charts are implemented. Dataplot follows the methods discussed by Thomas Ryan in "Statistical Methods for Quality Improvement". Dataplot does the following:

i) Positive and negative sums are computed as follows:

    SUMH(i) = MAX[0,(z(i) - k) + SUMH(i-1)]
    SUML(i) = MAX[0,(-z(i) - k) + SUML(i-1)]

SUMH and SUML have initial values of 0. z(i) is the z-score of the ith group (that is, the sub-group mean minus the overall mean, divided by the standard deviation of xbar). Dataplot plots the negative of SUML.
This is to avoid overlap when plotting SUMH and SUML. SUMH is plotted on the positive scale vertically and SUML is plotted on the negative scale vertically. The value of k is set to one half of the smallest shift in location (in standard deviation units) that you want to detect. Dataplot by default selects a 1-sigma shift, that is, k = 0.5. To override this, enter the command

    LET K = <value>

ii) By default, Dataplot sets the control limit at a value of 5. That is, if one of the sums exceeds 5, the process is deemed out of control. To override the default value, enter the command

    LET H = <value>

The value for H is typically between 4 and 5.

3) The following command was added:

    TOLERANCE LIMITS Y

This computes univariate two-sided tolerance limits for the normal case and for the distribution free case. Tolerance limits are a generalization of confidence limits for the mean. However, instead of a confidence limit for a single value, it provides confidence limits for the interval that contains a given percentage of the data (this is called the coverage). That is, for 90% coverage, we are finding a confidence interval that contains 90% of the data.

4) Bug fixes:

a) The PP command was fixed for the LAHEY and Microsoft PC versions of Dataplot.

b) Fixed the RESET VARIABLES command so that it would not delete parameters, functions, and strings. Note that RESET DATA still deletes them.

5) Added the percentile statistic:

    LET A = <value> PERCENTILE Y

where <value> is a number between 0 and 100.

This statistic is now also supported for the following plots:

    LET P100 = <value>
    PERCENTILE PLOT Y X
    BOOTSTRAP PERCENTILE PLOT Y
    JACKNIFE PERCENTILE PLOT Y
    PERCENTILE BLOCK PLOT Y
    DEX PERCENTILE PLOT Y

The LET P100 = <value> command defines the percentile you want to compute for all of these plots.

Fixed a small bug in the ...DECILE command.
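
As an illustration of the cusum recursion described in item 2b above, here is a minimal sketch in Python (used only for illustration; it is not Dataplot syntax and not Dataplot's implementation). The function name cusum_sums and the input z-scores are hypothetical; the defaults k = 0.5 and h = 5 are the ones stated in the text.

```python
def cusum_sums(z, k=0.5, h=5.0):
    """Sketch of the SUMH/SUML recursion for a list of group z-scores.

    Returns (sumh, negated_suml, flags). The negated SUML mirrors the
    text's statement that the negative of SUML is plotted.
    """
    sumh, suml = [], []
    prev_h = prev_l = 0.0  # SUMH and SUML have initial values of 0
    for zi in z:
        prev_h = max(0.0, (zi - k) + prev_h)
        prev_l = max(0.0, (-zi - k) + prev_l)
        sumh.append(prev_h)
        suml.append(prev_l)
    # A group is flagged when either sum exceeds the control limit h.
    flags = [a > h or b > h for a, b in zip(sumh, suml)]
    return sumh, [-v for v in suml], flags


if __name__ == "__main__":
    sumh, neg_suml, flags = cusum_sums([3.0, 3.0, 3.0])
    print(sumh, neg_suml, flags)
```

With three consecutive z-scores of 3, the positive sum grows by 2.5 per group, so only the third group exceeds the default limit of 5.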
6) Added the CPM and CC capability index statistics:

    LET LSL = <value>
    LET USL = <value>
    LET TARGET = <value>
    LET A = CPM Y
    LET A = CC Y

These statistics are now also supported for the following plots:

    LET LSL = <value>
    LET USL = <value>
    LET TARGET = <value>
    CPM PLOT Y X
    DEX CPM PLOT Y
    CC PLOT Y X
    DEX CC PLOT Y

The LSL, USL, and TARGET specify the lower specification, upper specification, and target engineering limits. The CPM is a variant of the CP and CPK capability indices and is defined as:

    CPM = (USL-LSL)/(6*SQRT(S**2+(XBAR-TARGET)**2))

where XBAR and S are the sample mean and standard deviation. For this index, the larger the better. The CC statistic is defined as:

    CC = MAX((TARGET-XBAR)/(TARGET-LSL),(XBAR-TARGET)/(USL-TARGET))

For this index, the smaller the better.

7) Added the following commands:

    <dist> CHI-SQUARE GOODNESS OF FIT TEST Y
    <dist> CHI-SQUARE GOODNESS OF FIT TEST Y X
    <dist> CHI-SQUARE GOODNESS OF FIT TEST Y X1 X2
    <dist> KOLMOGOROV-SMIRNOV GOODNESS OF FIT TEST Y

These commands test whether or not a data set comes from a specified distribution. All distributions for which Dataplot can generate a cdf function are supported (there are 70+ such distributions in Dataplot). The names are identical to the names used for the PROBABILITY PLOT command.

A couple of notes on these commands:

a) The KOLMOGOROV-SMIRNOV test is not supported for discrete distributions.

b) The CHI-SQUARE test works with either binned or unbinned data. Dataplot supports 2 types of pre-binned data. If your data has equal sized bins, then the X variable contains the mid-point of each bin. If your bins may be of different sizes, then the X1 variable is the lower limit of each class and X2 is the upper limit of each class. Unequal bins usually result from combining classes with low expected frequency. For unbinned data, Dataplot uses the same rules for binning as it does for the HISTOGRAM command. That is, the class width is 0.3*S where S is the standard deviation of Y.
The upper and lower limits are the mean plus or minus 6 times the standard deviation. The BINNED command generates counts while the RELATIVE BINNED generates relative frequency. As with the histogram, you can override these defaults with the following commands:

    CLASS WIDTH <value>
    CLASS LOWER <value>
    CLASS UPPER <value>

c) You need to specify shape parameters for distributions that require it. For example,

    LET GAMMA = 2
    GAMMA CHI-SQUARE GOODNESS OF FIT Y

The parameter names are equivalent to the names used for the PROBABILITY PLOT command. Location and scale parameters can be specified generically for the CHI-SQUARE and KOLMOGOROV-SMIRNOV tests respectively by entering:

    LET CHSLOC = <value>
    LET CHSSCALE = <value>
    LET KSLOC = <value>
    LET KSSCALE = <value>

These are optional.

8) Added the following commands:

    2-SAMPLE CHI-SQUARE TEST Y1 Y2
    2-SAMPLE KOLMOGOROV-SMIRNOV TEST Y1 Y2

These 2 commands test whether 2 data samples come from a common (unspecified) distribution. Y1 and Y2 do not need to be the same size.

9) Updated the TABULATE and CROSS-TABULATE commands. The computed group id's and the value of the statistic are written to the file DPST1F.DAT (or dpst1f.dat on Unix). This simplifies using the results in further analysis. For example, to compute the group means and store them in a variable, do something like the following:

    TABULATE MEANS Y X
    SKIP 1
    READ DPST1F.DAT GROUPID YMEANS
    SKIP 0

The CROSS-TABULATE is similar, except there are 2 group-id variables instead of 1.

10) Added the following command:

    LET Y2 X = BINNED Y
        (or LET Y2 X = FREQUENCY TABLE Y)
    LET Y2 X = RELATIVE BINNED Y
        (or LET Y2 X = RELATIVE FREQUENCY TABLE Y)

Here, Y2 will contain the counts (or frequencies) and X will contain the bin mid-points. This command bins your data. It uses the same rules as the histogram. That is, the class width is 0.3*S where S is the standard deviation of Y. The upper and lower limits are the mean plus or minus 6 times the standard deviation.
The BINNED command generates counts while the RELATIVE BINNED generates relative frequency. As with the histogram, you can override these defaults with the following commands:

    CLASS WIDTH <value>
    CLASS LOWER <value>
    CLASS UPPER <value>

The command

    SET RELATIVE HISTOGRAM <AREA/PERCENT>

specifies whether relative binning is computed so that the area sums to 1 or so that the frequencies sum to 1. The first option, which is the default, is useful when using the relative binning as an estimate of a probability distribution. The second option is useful when you want to see what percentage of the data falls in a given class.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT June - August 1998.
----------------------------------------------------------------------

1) Added the following command:

    EMPIRICAL CDF PLOT Y

This generates an empirical CDF plot.

2) Made the following enhancements to the QWIN (the Microsoft Windows 95/NT version) device driver:

Added support for "true color". Previously, if the user had true color set for the display, the screen colors were all black (i.e., you couldn't see the output). Note that true color is something you set from the Windows 95/NT control panel, not something that Dataplot can set automatically. That is, you set true color or standard VGA mode from the control panel and then you enter the appropriate Dataplot commands to support that mode.

a) If you have your display set to true color, enter the following commands in the C:\DPLOGF.TEX file:

    SET QWIN COLOR DIRECT
    DEVICE 1 QWIN

Note that the order is significant here. The color model is set when the QWIN device is initialized, so the SET QWIN COLOR command must come before the DEVICE 1 QWIN command. Also, it is recommended that you put these commands in the DPLOGF.TEX file so that you do not get the initial blank screen where you cannot see the text that you type. The command

    SET QWIN COLOR VGA

resets the default.
b) For true color, the QWIN device driver supports the full complement of colors recognized by Dataplot (enter HELP COLORS for a description of the Dataplot color model). The default VGA mode only supports 16 colors.

c) The foreground and background colors for the text window can now be set for both standard VGA and true color modes. The following 2 commands, if used, should be entered after the SET QWIN COLOR <DIRECT/VGA> command and before the DEVICE 1 QWIN command:

    SET QWIN TEXT BACKGROUND COLOR <index>
    SET QWIN TEXT FOREGROUND COLOR <index>

where <index> is an integer identifying the desired color (HELP COLOR gives the index to color mapping in Dataplot). For VGA mode, <index> is restricted to 0 to 15. For DIRECT mode, <index> is restricted to 0 to 88. The default for both VGA and DIRECT mode is a white foreground on a black background. The colors for the graphics window are set by the normal Dataplot COLOR commands (e.g., BACKGROUND COLOR BLUE, LINE COLOR RED).

3) Added the following new matrix commands:

The following 2 commands are used to obtain row or column statistics for a matrix.

    LET Y = MATRIX ROW <STAT> M
    LET Y = MATRIX COLUMN <STAT> M

where <STAT> is one of:

    MEAN, MIDMEAN, TRIMMED MEAN, WINSORIZED MEAN, MEDIAN, SUM, PRODUCT,
    SD (or STANDARD DEVIATION), SD OF MEAN, VARIANCE, VARIANCE OF MEAN,
    RELATIVE VARIANCE, RELATIVE STANDARD DEVIATION,
    COEFFICIENT OF VARIATION, AVERAGE ABSOLUTE DEVIATION,
    MEDIAN ABSOLUTE DEVIATION, RANGE, MIDRANGE, MAXIMUM, MINIMUM, EXTREME,
    LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
    SKEWNESS, KURTOSIS, AUTOCOVARIANCE, AUTOCORRELATION.

The following command computes an overall mean for the matrix:

    LET A = MATRIX MEAN M

The following command calculates the quadratic form of a vector and a matrix. The quadratic form is:

    x'Mx

where x is a vector and M is a matrix. Quadratic forms are used frequently in multivariate statistical calculations.
    LET A = QUADRATIC FORM M X

The following command is a commonly used quadratic form:

    LET Y = DISTANCE FROM MEAN M

This command generates:

    Di = (Xi - XMEAN)'SINV(Xi-XMEAN)

where Xi is the ith row, XMEAN is a vector of the column means, and SINV is the inverse of the variance-covariance matrix. That is, Di is the distance of the ith row of the matrix from the mean. Note that in the Dataplot command, you specify the original matrix, not the variance-covariance matrix.

The following command calculates X*X' for the vector X. The result is a pxp matrix where p is the number of rows of X. This computation is used in some multivariate analyses.

    LET M = VECTOR TIMES TRANSPOSE X

The following command is used to create linear combinations:

    LET Y2 = LINEAR COMBINATION M C

If the matrix M has p columns and n rows, C should be a vector with p rows. This command calculates:

    y2 = c(1)*M1 + c(2)*M2 + c(3)*M3 + ... + c(p)*Mp

where M1, M2, ... are the columns of the matrix. The result is a vector with n rows.

The following commands are used to calculate various distance matrices:

    LET D = EUCLIDEAN ROW DISTANCE M
    LET D = EUCLIDEAN COLUMN DISTANCE M
    LET D = MAHALANOBIS ROW DISTANCE M
    LET D = MAHALANOBIS COLUMN DISTANCE M
    LET D = MINKOWSKY ROW DISTANCE M
    LET D = MINKOWSKY COLUMN DISTANCE M
    LET D = CHEBYCHEV ROW DISTANCE M
    LET D = CHEBYCHEV COLUMN DISTANCE M
    LET D = BLOCK ROW DISTANCE M
    LET D = BLOCK COLUMN DISTANCE M

It is often desirable to scale the original matrix before calculating a distance matrix. The following commands can be used to scale the original matrix:

    SET MATRIX SCALE <NONE/MEAN/SD/RANGE/ZSCORE>
    LET MSCAL = MATRIX ROW SCALE M
    LET MSCAL = MATRIX COLUMN SCALE M

The SET MATRIX SCALE command is used to define the type of scaling to perform. You can scale either across rows or down columns.
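
To make the quadratic form x'Mx and the linear combination formula described above concrete, here is a small sketch in Python (illustration only, not Dataplot syntax). The function names quadratic_form and linear_combination are hypothetical, matrices are represented as lists of rows, and nothing here is taken from Dataplot's own code.

```python
def quadratic_form(m, x):
    """Return x'Mx for a square matrix m (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))


def linear_combination(m, c):
    """Return y2 = c(1)*M1 + ... + c(p)*Mp, where Mj are the columns of m.

    m has n rows and p columns; c has p elements; the result has n elements.
    """
    return [sum(cj * v for cj, v in zip(c, row)) for row in m]


if __name__ == "__main__":
    print(quadratic_form([[2, 0], [0, 3]], [1, 2]))
    print(linear_combination([[1, 2], [3, 4], [5, 6]], [10, 1]))
```

Computing the linear combination row by row gives the same result as summing the scaled columns, since element r of the result is c(1)*M(r,1) + ... + c(p)*M(r,p).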
The following command computes the pooled sample variance-covariance matrix for two matrices:

    LET MOUT = POOLED VARIANCE-COVARIANCE MATRIX MA MB

Note that MA and MB should have the same number of columns. However, the number of rows can vary.

The following computes a 1-sample Hotelling T-square test:

    LET A = 1-SAMPLE HOTELLING T-SQUARE M Y

The 1-sample Hotelling T-square tests the following hypothesis:

    H0: U=U0

Here, U0 is a vector of population means, that is, the hypothesized means for each column of the matrix. In the above syntax, M is a matrix containing the original data and Y is a vector containing the hypothesized means. The returned parameter A contains the value of the Hotelling T-square test statistic. The critical values corresponding to alpha = .90, .95, .99, and .995 are saved in the internal parameters B90, B95, B99, and B995.

The following computes a 2-sample Hotelling T-square test:

    LET A = 2-SAMPLE HOTELLING T-SQUARE MA MB

The 2-sample Hotelling T-square tests the following hypothesis:

    H0: U1=U2

Here, U1 is a vector of population means for sample 1 and U2 is a vector of population means for sample 2. In the above syntax, MA is a matrix containing the original data for sample 1 and MB is a matrix containing the original data for sample 2. MA and MB must have the same number of columns. However, they can have a different number of rows. The returned parameter A contains the value of the Hotelling T-square test statistic. The critical values corresponding to alpha = .90, .95, .99, and .995 are saved in the internal parameters B90, B95, B99, and B995.

The following 2 commands add or delete rows of a matrix:

    LET M = MATRIX ADD ROW M Y
    LET M = MATRIX DELETE ROW M ROWID

Here, M is a matrix, Y is a variable with the number of rows equal to the number of columns in M, and ROWID is a scalar identifying the row to delete.

4) Fixed a bug in the character fill for the QWIN device driver (DEVICE 1 QWIN for Windows 95/NT).
Removed the line CHARACTER FILL COLOR from the sample DPLOGF.TEX file (this caused problems for Postscript output).

5) Added support for SP() in the LET STRING command. SP() will be converted to a single space. Previously, LET STRING packed out any spaces in the string.

6) Added the command:

    LET Y2 = EXPONENTIAL SMOOTHING Y ALPHA

This performs an exponential smoothing of Y. The formula is:

    Y2(1) = Y(1)
    Y2(I) = ALPHA*Y(I) + (1-ALPHA)*Y(I-1), I > 1

ALPHA is the smoothing parameter and should be greater than 0 and less than 1.

7) The PROBE command is used to return the values of certain internal parameters and strings. This command was updated so that the returned value is automatically saved. If the returned value is an integer or real number, then the value is stored in the internal parameter PROBEVAL. If the returned value is a string, then the value is stored in the internal string PROBESTR. PROBESTR and PROBEVAL can then be used in the same way as other parameters and strings. This feature is typically used in macros. For example, you might want to use the machine maximum value as a "missing value" indicator. A host independent way of using this value would now be:

    PROBE CPUMAX
    LET MACHMAX = PROBEVAL

You could then use the parameter MACHMAX wherever you wanted to define a missing value.

8) Multiplots create new 0 to 100 coordinate units for each subplot and character sizes are scaled according to this new subplot area. Although this is generally desirable, sometimes the resulting character sizes are too small or distorted if the rows to columns ratio is too far from 1. As a convenience, the following command was added to allow all character sizes to be scaled when multiplotting is in effect:

    MULTIPLOT SCALE FACTOR 3
    MULTIPLOT SCALE FACTOR 1 2

In the first syntax, both the height and width sizes are scaled (by 3 in this example) by the same factor.
In the second syntax, the height and width are scaled separately (the height by 1 and the width by 2 in this example). The word FACTOR is optional in the command. The scale factor is multiplied by the requested size. For example, if the title size is 2 and the scale factor is 3, then the effective size will be 6. The scale factor is ignored if multi-plotting is not in effect. This command allows character sizes to be easily adjusted for multiplots without having to enter a number of separate size commands before the multiplot (and then after the multiplot to return to normal values).

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT January - May 1998.
----------------------------------------------------------------------

1) Reliability/Extreme Value Updates

a) Added the following commands for finding maximum likelihood estimates for distribution parameters.

    WEIBULL MAXIMUM LIKELIHOOD Y
    EXPONENTIAL MAXIMUM LIKELIHOOD Y
    DOUBLE EXPONENTIAL MAXIMUM LIKELIHOOD Y
    NORMAL MAXIMUM LIKELIHOOD Y
    LOGNORMAL MAXIMUM LIKELIHOOD Y
    PARETO MAXIMUM LIKELIHOOD Y
    GAMMA MAXIMUM LIKELIHOOD Y
    INVERSE GAUSSIAN MAXIMUM LIKELIHOOD Y
    GUMBEL MAXIMUM LIKELIHOOD Y (or EV1)
    POWER MAXIMUM LIKELIHOOD Y
    BINOMIAL MAXIMUM LIKELIHOOD Y
    POISSON MAXIMUM LIKELIHOOD Y

At this time, only the parameter estimates are computed, that is, no standard errors or confidence intervals for the estimates are computed. There are various synonyms for these commands. For example,

    WEIBULL MAXIMUM LIKELIHOOD ESTIMATE Y
    WEIBULL MAXIMUM LIKELIHOOD Y
    WEIBULL MLE ESTIMATE Y
    WEIBULL MLE Y

are all equivalent. Similar synonyms apply to the other commands.

The exponential case is an exception in that it does print confidence intervals. It also supports type 1 and type 2 censored data. For example, the full sample case is:

    SET CENSORING TYPE NONE (this is the default)
    EXPONENTIAL MLE Y

Type 1 censoring is censoring at a fixed time t0.
This is handled via:

    SET CENSORING TYPE 1
    LET TEND = <censor time>
    EXPONENTIAL MLE Y

If you have data values that are censored before time t0, then create a TAG variable with 1 for failure times and 0 for censoring times. You would then enter:

    EXPONENTIAL MLE Y TAG

Type 2 censoring is censoring after R failures have been observed. This case is handled via:

    SET CENSORING TYPE 2
    EXPONENTIAL MLE Y TAG

where TAG is a variable with 1 for failure times and 0 for censoring times.

Related to this are the commands

    DEHAAN Y
    CME Y

These generate parameter estimates for the generalized Pareto distribution for extreme value applications.

b) Added the following commands:

1) LET Y = CUMULATIVE HAZARD X TAG
   LET Y = HAZARD X TAG

   where X is a list of failure times and TAG is an array that identifies the value as a failure time (TAG = 1) or a censoring time (TAG = 0).

2) LET Y = INTERARRIVAL TIMES X

   where X is a list of failure times. This is similar to the SEQUENTIAL DIFFERENCE command in that it calculates X(I)-X(I-1). However, it sorts the data first and the first interarrival time is set equal to X(1).

3) LET Y = CUMULATIVE AVERAGE X
   LET Y = CUMULATIVE MEAN X

   As the name implies, this computes the cumulative mean of a variable. One use of this is to compute cumulative mean time between failures for reliability data.

4) LET Y = REVERSE X
   LET Y = FLIP X

   This reverses the order of a variable (i.e., Y(1)=X(N), Y(2)=X(N-1), and so on). For example, if you want to sort from high to low instead of low to high, you can enter

       LET Y = SORT X
       LET Y = REVERSE Y

5) LET ALPHA = <value>
   LET BETA = <value>
   LET Y = POWER LAW RANDOM NUMBERS FOR I = 1 1 N

   This generates N failure times from a non-homogeneous Poisson process following the power law. That is,

       M(t) = alpha*t**beta      alpha, beta > 0

   where M(t) is the expected number of failures at time t.
The random failure times are generated from the formula for the interarrival times (i.e., the CDF for the waiting time for the next failure given a failure at time T):

    F_T(t) = 1 - EXP(-ALPHA*[(T+t)**BETA-T**BETA])

c) The following 2 plots were added:

    KAPLAN MEIER PLOT Y TAG
    MODIFIED KAPLAN MEIER PLOT Y TAG

Here, Y is a list of failure times and TAG identifies censored data. A value of 1 for TAG means that the corresponding Y value is a failure time and a value of 0 means that the corresponding Y value was censored. The TAG variable is optional (if omitted, no censoring is performed).

Kaplan-Meier estimates are discussed in most texts in survival or reliability analysis. The modified Kaplan-Meier is a slightly adjusted form of the estimate.

The X axis of the plot is failure time and the Y axis is an estimate of survival (or reliability). Some analysts prefer that the Y axis be the CDF estimate (i.e., 1 - Survival). Enter the command SET KAPLAN MEIER CDF to specify this (and SET KAPLAN MEIER RELIABILITY to reset it).

If you want the numeric Kaplan Meier estimates, do

    KAPLAN MEIER PLOT Y TAG
    LET RELI = YPLOT
    LET FAILTIME = XPLOT

The variables RELI and FAILTIME can be used in subsequent commands to do further analysis.

d) The following plots were added:

    EXPONENTIAL HAZARD PLOT Y TAG
    NORMAL HAZARD PLOT Y TAG
    LOGNORMAL HAZARD PLOT Y TAG
    WEIBULL HAZARD PLOT Y TAG

Hazard plots are similar to probability plots. However, they can be used with censored data and are commonly used in reliability studies.

e) Added the following command:

    DUANE PLOT Y

Given a set of failure times T, the Duane plot is Ti/i (where i is the index from 1 to N) versus Ti on a log-log scale. You do not need to specify XLOG ON or YLOG ON as Dataplot does this automatically. Dataplot also resets the original values for these switches after the Duane plot is completed. A line is fit to the plotted data.
Various parameters from the fit are saved as internal parameters (enter STATUS PARAMETERS after the DUANE PLOT to see what they are). A typical use would be:

    READ FAILURE.DAT Y
    Y1LABEL CUMULATIVE MEAN TIME BETWEEN FAILURE
    X1LABEL FAILURE TIME
    CHARACTER X BLANK
    LINE BLANK SOLID
    DUANE PLOT Y
    JUSTIFICATION CENTER
    MOVE 50 7
    TEXT SLOPE OF FITTED LINE = ^BETA
    MOVE 50 4
    TEXT INTERCEPT OF FITTED LINE = ^ALPHA

f) The following command was added:

    RELIABILITY TRENDS TEST Y

This command is used in reliability applications to determine if repair times show a significant trend. It computes the following 3 tests:

    a) Reverse Arrangement Test
    b) Military Handbook Test
    c) Laplace Test

The last 2 tests require the censoring time. This is entered (before the RELIABILITY TRENDS TEST) as:

    LET TEND = <value>

The value of TEND should be greater than the maximum value of the response variable.

Some of the Probability and Recipe updates discussed below are also relevant to reliability applications.

2) Probability Updates

a) Added optional location and scale parameters for many of the probability functions. Specifically, the following functions now support both location and scale parameters:

    CAUCDF, CAUPDF, CAUPPF, CAUSF
    DEXCDF, DEXPDF, DEXPPF, DEXSF
    DGACDF, DGAPDF, DGAPPF
    DWECDF, DWEPDF, DWEPPF
    EV1CDF, EV1PDF, EV1PPF
    EV2CDF, EV2PDF, EV2PPF
    EWECDF, EWEPDF, EWEPPF
    EXPCDF, EXPPDF, EXPPPF
    FLCDF, FLPDF, FLPPF
    GAMCDF, GAMPDF, GAMPPF
    GEVCDF, GEVPDF, GEVPPF
    GGDCDF, GGDPDF, GGDPPF
    GLOCDF, GLOPDF, GLOPPF
    HFCCDF, HFCPDF, HFCPPF
    HFNCDF, HFNPDF, HFNPPF
    IGCDF, IGPDF, IGPPF
    LGACDF, LGAPDF, LGAPPF
    LGNCDF, LGNPDF, LGNPPF
    LLGCDF, LLGPDF, LLGPPF
    LOGCDF, LOGPDF, LOGPPF
    NORCDF, NORPDF, NORPPF
    RIGCDF, RIGPDF, RIGPPF
    WEICDF, WEIPDF, WEIPPF

NOTE: The help files and Reference Manual refer to the location parameter for the 2-parameter inverse gaussian (IG), reciprocal inverse gaussian (RIG), Wald (WAL), and fatigue life (FL) distributions. This is actually the scale parameter for these distributions.
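
As a rough illustration of what optional location and scale parameters do, the sketch below (Python, illustration only) applies the usual location-scale convention to the standard exponential distribution: z = (x - loc)/scale, cdf(x) = F0(z), pdf(x) = f0(z)/scale, and ppf(p) = loc + scale*G0(p), where F0, f0, and G0 are the standard (loc = 0, scale = 1) forms. That Dataplot applies exactly this convention to every function listed above is an assumption, and the names exp_cdf, exp_pdf, and exp_ppf are hypothetical (they are not the Dataplot EXPCDF, EXPPDF, and EXPPPF functions).

```python
import math


def exp_cdf(x, loc=0.0, scale=1.0):
    """Exponential cdf with location and scale: F0((x - loc)/scale)."""
    z = (x - loc) / scale
    return 1.0 - math.exp(-z) if z > 0 else 0.0


def exp_pdf(x, loc=0.0, scale=1.0):
    """Exponential pdf with location and scale: f0((x - loc)/scale)/scale."""
    z = (x - loc) / scale
    return math.exp(-z) / scale if z >= 0 else 0.0


def exp_ppf(p, loc=0.0, scale=1.0):
    """Exponential percent point function: loc + scale*G0(p), 0 <= p < 1."""
    return loc + scale * (-math.log(1.0 - p))


if __name__ == "__main__":
    p = exp_cdf(3.0, loc=1.0, scale=2.0)
    print(p, exp_ppf(p, loc=1.0, scale=2.0))
```

The percent point function inverts the cdf, so feeding the cdf value back through exp_ppf with the same location and scale recovers the original x (up to rounding).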
The following added a location parameter only:

    HFLCDF, HFLPDF, HFLPPF
    PA2CDF, PA2PDF, PA2PPF
    PARCDF, PARPDF, PARPPF
    PEXCDF, PEXPDF, PEXPPF
    PLNCDF, PLNPDF, PLNPPF
    PNRCDF, PNRPDF, PNRPPF
    VONCDF, VONPDF, VONPPF
    WALCDF, WALPDF, WALPPF
    WCACDF, WCAPDF, WCAPPF

The following added a scale parameter only:

    GEPCDF, GEPPDF, GEPPPF
    POWCDF, POWPDF, POWPPF

The following added a lower and upper limit (which are then converted by Dataplot into location and scale parameters).

    UNICDF, UNIPDF, UNIPPF, UNISF
    BETCDF, BETPDF, BETPPF, BETSF

b) Added the following hazard and cumulative hazard functions:

NOTE: In the following, LOC and SCALE specify location and scale parameters respectively and are optional. For the uniform, the lower and upper limits are specified (and are converted by Dataplot to location and scale parameters) and are also optional. All other parameters are the standard shape parameters for the distribution.

    UNIHAZ(X,LOWER,UPPER)            - uniform hazard function
    UNICHAZ(X,LOWER,UPPER)           - uniform cumulative hazard function
    NORHAZ(X,LOC,SCALE)              - normal hazard function
    NORCHAZ(X,LOC,SCALE)             - normal cumulative hazard function
    LGNHAZ(X,SD,LOC,SCALE)           - lognormal hazard function
    LGNCHAZ(X,SD,LOC,SCALE)          - lognormal cumulative hazard function
    PNRHAZ(X,SD,P,LOC)               - power normal hazard function
    PNRCHAZ(X,SD,P,LOC)              - power normal cumulative hazard function
    PLNHAZ(X,SD,P,LOC)               - power log-normal hazard function
    PLNCHAZ(X,SD,P,LOC)              - power log-normal cumulative hazard function
    EXPHAZ(X,LOC,SCALE)              - exponential hazard function
    EXPCHAZ(X,LOC,SCALE)             - exponential cumulative hazard function
    WEIHAZ(X,GAMMA,LOC,SCALE)        - Weibull hazard function
    WEICHAZ(X,GAMMA,LOC,SCALE)       - Weibull cumulative hazard function
    EWEHAZ(X,GAMMA,THETA,LOC,SCALE)  - exponentiated Weibull hazard function
    EWECHAZ(X,GAMMA,THETA,LOC,SCALE) - exponentiated Weibull cumulative hazard function
    GAMHAZ(X,GAMMA,LOC,SCALE)        - gamma hazard function
    GAMCHAZ(X,GAMMA,LOC,SCALE)       - gamma cumulative hazard function
    IGAHAZ(X,GAMMA,LOC,SCALE)        - inverted gamma hazard function
    IGACHAZ(X,GAMMA,LOC,SCALE)       - inverted gamma cumulative hazard function
    GGDHAZ(X,GAMMA,K,LOC,SCALE)      - generalized gamma hazard function
    GGDCHAZ(X,GAMMA,K,LOC,SCALE)     - generalized gamma cumulative hazard function
    EV1HAZ(X,GAMMA,LOC,SCALE)        - Gumbel hazard function
    EV1CHAZ(X,GAMMA,LOC,SCALE)       - Gumbel cumulative hazard function
    EV2HAZ(X,GAMMA,LOC,SCALE)        - Frechet hazard function
    EV2CHAZ(X,GAMMA,LOC,SCALE)       - Frechet cumulative hazard function
    GEPHAZ(X,GAMMA,SCALE)            - generalized Pareto hazard function
    GEPCHAZ(X,GAMMA,SCALE)           - generalized Pareto cumulative hazard function
    IGHAZ(X,GAMMA,LOC,SCALE)         - inverse gaussian hazard function
    IGCHAZ(X,GAMMA,LOC,SCALE)        - inverse gaussian cumulative hazard function
    WALHAZ(X,GAMMA,LOC)              - Wald hazard function
    WALCHAZ(X,GAMMA,LOC)             - Wald cumulative hazard function
    RIGHAZ(X,GAMMA,LOC,SCALE)        - reciprocal inverse gaussian hazard function
    RIGCHAZ(X,GAMMA,LOC,SCALE)       - reciprocal inverse gaussian cumulative hazard function
    FLHAZ(X,GAMMA,LOC,SCALE)         - fatigue life hazard function
    FLCHAZ(X,GAMMA,LOC,SCALE)        - fatigue life cumulative hazard function
    PARHAZ(X,GAMMA,LOC)              - Pareto hazard function
    PARCHAZ(X,GAMMA,LOC)             - Pareto cumulative hazard function
    ALPHAZ(X,ALPHA,BETA)             - alpha hazard function
    ALPCHAZ(X,ALPHA,BETA)            - alpha cumulative hazard function
    PEXHAZ(X,ALPHA,BETA)             - exponential power hazard function
    PEXCHAZ(X,ALPHA,BETA)            - exponential power cumulative hazard function

NOTE: The hazard function is defined as:

    h(x) = pdf(x)/(1-cdf(x))

and the cumulative hazard function is defined as:

    H(x) = -log(1-cdf(x))

where pdf and cdf are the probability density and cumulative distribution functions respectively. These formulas can be used to generate hazard and cumulative hazard functions for distributions that Dataplot does not support directly.

c) Added the mixture of 2 normal probability functions.
Specifically,

    NORMXCDF(X,U1,SD1,U2,SD2,PMIX)
    NORMXPDF(X,U1,SD1,U2,SD2,PMIX)
    NORMXPPF(P,U1,SD1,U2,SD2,PMIX)

where U1 and SD1 are the mean and standard deviation of the first normal distribution, U2 and SD2 are the mean and standard deviation of the second normal distribution, and PMIX is the mixing proportion (between 0 and 1).

You can generate a probability plot as follows:

    LET U1 = <value>
    LET SD1 = <value>
    LET U2 = <value>
    LET SD2 = <value>
    LET P = <value>
    NORMAL MIXTURE PROBABILITY PLOT Y

You can generate random numbers as follows:

    LET U1 = <value>
    LET SD1 = <value>
    LET U2 = <value>
    LET SD2 = <value>
    LET P = <value>
    LET Y = NORMAL MIXTURE RANDOM NUMBERS FOR I = 1 1 1000

d) Added the inverted gamma probability functions:

    IGACDF(X,GAMMA,LOC,SCALE)
    IGAPDF(X,GAMMA,LOC,SCALE)
    IGAPPF(P,GAMMA,LOC,SCALE)

This is not really a new function. It is simply the generalized gamma function with the second shape parameter set to -1. We added it as a separate set of functions since it is a common distribution in certain applications.

Also added:

    LET GAMMA = <value>
    INVERSE GAMMA PROBABILITY PLOT
    INVERSE GAMMA PPCC PLOT

e) Added the following discrete PPCC PLOT commands:

    BINOMIAL PPCC PLOT
    NEGATIVE BINOMIAL PPCC PLOT
    LOGARITHMIC SERIES PPCC PLOT

For the binomial and negative binomial, N must be specified (and then P is computed).

f) Fixed the PROBABILITY PLOT X Y and PPCC PLOT X Y commands to handle zero count bins correctly.

3) Recipe Updates

a) Added support for multi-factor recipe fits. For example, a common model is:

    Y = A0 + A1*X1 + A2*X1**2 + A3*X2 + A4*X2**2 + A5*X1*X2

In Dataplot, the recipe analysis could be done as follows:

    READ FILE.DAT Y X1 X2 BATCH
    READ FILE2.DAT XP1 XP2
    LET X1S = X1*X1
    LET X2S = X2*X2
    LET X1X2 = X1*X2
    LET XP1S = XP1*XP1
    LET XP2S = XP2*XP2
    LET XP1P2 = XP1*XP2
    .
    RECIPE FIT FACTORS 5
    RECIPE FIT Y X1 X1S X2 X2S X1X2 BATCH XP1 XP1S XP2 XP2S XP1P2
    PRINT TOL

XP1 and XP2 are the points at which you want the tolerance values computed.
If they are omitted, then the tolerance values are computed at the unique points in the design matrix (i.e., all the unique combinations of X1 and X2). The BATCH variable is a batch identifier and is optional. X1 and X2 must have the same number of points and XP1 and XP2 should have the same number of points. However, X1 and XP1 do not need to have the same number of points (and they usually will not). The primary output from the RECIPE command is the tolerance values (by default, saved in TOL). Commands for setting the probability confidence and content are the same as for the 1-factor recipe fit. b) Recipe is generally used in the context of setting tolerance limits as defined in MIL-17 Handbook. A number of other statistical techniques are defined in this handbook. Dataplot had previously added support for the Grubbs test, Levene's test for shifts in scale, and the F test for shifts in location. The following additional tests defined in the handbook are now supported as well: ANDERSON-DARLING <DIST> TEST Y where DIST is: NORMAL, LOGNORMAL, WEIBULL, EXTREME VALUE ANDERSON-DARLING K-SAMPLE TEST Y X WEIBULL MAXIMUM LIKELIHOOD Y B BASIS <DIST> TOLERANCE LIMIT Y A BASIS <DIST> TOLERANCE LIMIT Y where DIST is: NORMAL, LOGNORMAL, WEIBULL, NON-PARAMETRIC The Anderson-Darling 1-sample test is used to determine if a data set can be assumed to come from a certain distribution. The EXTREME VALUE distribution is the type 1 extreme value distribution. The k-sample Anderson-Darling test is used to test if groups of data are the same (in the sense of coming from the same distribution with common location and scale). It is typically used to determine if data coming from different batches can be treated as if they came from the same batch. The WEIBULL MAXIMUM LIKELIHOOD command is used to generate maximum likelihood estimates of the 2-parameter Weibull distribution (the shape and scale parameters). 
The B BASIS and A BASIS commands are used to generate b basis and a basis tolerance limits for a variable for a few common distributions. See the MIL-17 Handbook for more information on these techniques. 4) Matrix Updates Modified matrix commands to make more efficient use of storage. Raised the default maximum number of rows from 1,500 to 3,000. Added a DIMENSION MATRIX COLUMNS <val> and DIMENSION MATRIX ROWS <val> command. This is used to dimension temporary matrices in the matrix routines. Note that unlike the DIMENSION command for variables, this command does not erase any previously created data. It is only used to dimension temporary matrices in the matrix code, not to store the original data. Each temporary matrix has a maximum of 920,000/3 elements. However, you cannot dimension the number of rows in a matrix to be greater than the number of rows in a variable. 5) Miscellaneous Updates a) Added the commands: LINE <SAVE/RESTORE> CHARACTER <SAVE/RESTORE> These were motivated by the graphical user interface, but they can be used directly by the user as well. b) Added the commands: SET PRINTER <id> PROBE PRINTER <id> These allow the user to specify the printer name for the PP command. It is currently supported for the Unix and Windows 95/NT versions. It would be straightforward to support on other systems as well. c) The ANOVA code was significantly rewritten. 1) The maximum number of factors was increased from 5 to 10. 2) The output was modified. Specifically, an ANOVA table was added and other output was re-arranged. 3) Some information is now written out to files DPST1F.DAT and DPST2F.DAT. This is useful if you need to use some of the ANOVA quantities in further analysis. 4) A check is now made to see if you have a balanced design (i.e., all cells have an equal number of observations). A warning message will be printed if an unbalanced case is detected. Note that the Dataplot calculations are based on the assumption of balanced data. 
However, it will still run the ANOVA for the unbalanced case (the output will not be accurate in this case). d) Added CODED as a synonym for CODE (LET Y = CODE X or LET Y = CODED X). e) Modified data reads so that non-printing characters are converted to spaces. f) The BOOTSTRAP PLOT command was augmented so that the following parameters are now automatically saved: BMEAN - mean of the plotted bootstrap values BSD - standard deviation of the plotted bootstrap values B001 - the 0.1% percentile of the plotted bootstrap values B005 - the 0.5% percentile of the plotted bootstrap values B01 - the 1.0% percentile of the plotted bootstrap values B025 - the 2.5% percentile of the plotted bootstrap values B05 - the 5.0% percentile of the plotted bootstrap values B10 - the 10% percentile of the plotted bootstrap values B20 - the 20% percentile of the plotted bootstrap values B80 - the 80% percentile of the plotted bootstrap values B90 - the 90% percentile of the plotted bootstrap values B95 - the 95% percentile of the plotted bootstrap values B975 - the 97.5% percentile of the plotted bootstrap values B99 - the 99% percentile of the plotted bootstrap values B995 - the 99.5% percentile of the plotted bootstrap values B999 - the 99.9% percentile of the plotted bootstrap values These values are typically used in setting confidence levels. Also, the BOOTSTRAP COEFFICIENT OF VARIATION PLOT and BOOTSTRAP RELATIVE VARIANCE PLOT commands were added. g) Some code not used by the user was added for the graphical front-end. h) Raised the maximum number of lines in a loop from 200 to 500. i) Fixed some minor bugs. ---------------------------------------------------------------------- The following enhancements were made to DATAPLOT October - December 1997. 
---------------------------------------------------------------------- 1) The WRITE command was updated to allow WRITE VARIABLES ALL (or WRITE ALL VARIABLES) This was added to support some updates to the frontend, but it can be used in the command line as well. Currently, a maximum of 25 variables will be printed. 2) An update was made to allow exponential notation in commands where a number or parameter is expected. For example, LET Y = DATA 1.2E-7 2.0E3 4.26E+4 The above example shows the 3 forms of the E notation that are currently recognized. Note that using "D" instead of "E" is not currently supported. E notation within expressions (e.g., transformations under LET, definition of functions, FIT expressions) is not yet supported. That is, LET Y(1) = 1.2E-3 does NOT work as of yet. The parsing of expressions under LET is handled in a different part of the code. Support may be added at a later time. 3) The command SKIP AUTOMATIC or SKIP ---- can be used to skip all lines in a data file until the first line containing a "----" string is found. It does not have to start in column 1. This was added primarily to support the data files provided with Dataplot. However, you can use this with your own data files as well. If no line with "----" is found, Dataplot rewinds the file and tries to read data starting with the first line of the file. This option only applies if the read is performed on a file. If the read is from the terminal, SKIP AUTOMATIC is equivalent to a SKIP 0. 4) The following 2 commands were added: AUTOCOMOVEMENT PLOT Y CROSS COMOVEMENT PLOT Y1 Y2 These are similar to the AUTOCORRELATION PLOT and the CROSS CORRELATION PLOT commands. However, they are based on the COMOVEMENT statistic rather than the correlation statistic. At this time, no reference lines indicating statistical significance are drawn. 
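As an illustration of the SKIP AUTOMATIC behavior described in item 3 above, the following Python sketch mimics the rule: skip to just past the first line containing "----", and rewind to the start of the file if no such line exists. This is not Dataplot code. The function names skip_automatic and read_numeric_rows are hypothetical, and the whitespace-delimited numeric parsing is an assumption made only for the example.

```python
import io


def skip_automatic(stream, marker="----"):
    """Position `stream` just past the first line containing `marker`.

    Returns True if the marker line was found. If it was not found,
    the stream is rewound to the beginning and False is returned,
    mirroring the rewind behavior described for SKIP AUTOMATIC.
    """
    while True:
        line = stream.readline()
        if line == "":            # end of file reached, marker never seen
            stream.seek(0)
            return False
        if marker in line:        # marker may start in any column
            return True


def read_numeric_rows(stream):
    """Read the remaining lines as rows of whitespace-separated numbers."""
    rows = []
    for line in stream:
        fields = line.split()
        if fields:                # ignore blank lines
            rows.append([float(field) for field in fields])
    return rows


if __name__ == "__main__":
    data_file = io.StringIO(
        "Sample data set\n"
        "   X        Y\n"
        "  ----------------\n"
        "  1.0   1.2E-7\n"
        "  2.0   4.26E+4\n"
    )
    skip_automatic(data_file)
    print(read_numeric_rows(data_file))
```

Python's float() happens to accept the same three E-notation forms shown in item 2 (1.2E-7, 2.0E3, 4.26E+4), which is why the sketch can parse such values directly.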
5) The following special function was added: LET A = PSIFN(X,K) - scaled k-th derivative of the PSI (or DIGAMMA) function Note that this computes a SCALED version of the function, specifically ((-1)**(K+1)/GAMMA(K+1))*PSI(X,K) where GAMMA is the gamma function and PSI(X,K) is the unscaled function. Also, it is the k-th derivative of PSI, not of the log gamma function. That is, K=1 computes the trigamma function, not the digamma function. 6) The DELETE command was modified so that blanked out values are reset to zero instead of machine negative infinity. 7) Added IF EXIST command. An IF NOT EXIST command was added several years ago. This command works as follows: IF A EXIST PRINT A END OF IF where A is a parameter. A will be printed if it already exists. 8) Added the command REPLOT to regenerate the most recently created plot. Although this was motivated by enhancements to the graphical user interface, it can be useful in command line mode as well. ---------------------------------------------------------------------- The following enhancements were made to DATAPLOT September 1997. ---------------------------------------------------------------------- 1) Added a SLEEP <n> command to pause for <n> seconds. This is useful for macros so plots can be displayed for a given period of time without requiring user intervention to continue (as needed by the PAUSE command). This command is platform dependent and is currently implemented for Unix and Windows 95/NT versions. Added a CD command to change the current directory. This command is platform dependent and has currently been implemented for the Windows 95/NT version. This command is particularly useful for the Windows 95/NT version since when Dataplot is executed from a screen icon, the default directory is the directory where the Dataplot executable resides. 
The SYSTEM command cannot be used to change the current directory since a "SYSTEM CD <directory>" does not persist after the SYSTEM command completes execution. 2) Added Mark Vangel's RECIPE code. RECIPE stands for "REgression Confidence Intervals on PErcentiles". It is used to calculate basis values for regression models with or without a random "batch effect". A full discussion of RECIPE is beyond the scope of this brief news item. Complete technical documentation for RECIPE is available at the following Web site: http://www.itl.nist.gov/div898/software/recipe/ This discusses RECIPE in general, not the Dataplot implementation. The basic RECIPE commands are: RECIPE FIT Y X BATCH XPRED - linear regression, polynomial models RECIPE ANOVA Y X1 ... XK BATCH - ANOVA, multilinear models The primary output from the RECIPE command is a set of tolerance values. These are saved in the internal Dataplot variable TOL by default. This variable can be plotted and manipulated like any other Dataplot variable. The RECIPE documentation (on the above web site) also discusses a program called SIMCOV. SIMCOV is used to determine whether or not the Satterthwaite approximation is adequate in determining the tolerance values. SIMCOV uses simulation to determine this. The following commands implement the SIMCOV program in Dataplot. RECIPE SIMCOV FIT Y X BATCH XPRED - linear regression, polynomial models RECIPE SIMCOV ANOVA Y X1 ... XK BATCH - ANOVA, multilinear models The following commands set switches for the RECIPE and SIMCOV analyses. 
RECIPE FIT DEGREE <N> - polynomial degree for RECIPE FIT RECIPE FACTORS <N> - number of factors for RECIPE ANOVA RECIPE OUTPUT <VAR> - name of variable to contain computed tolerance values RECIPE SATTERTHWAITE <YES/NO> - specifies whether or not Satterthwaite approximation is used RECIPE PROBABILITY CONTENT <VAL> - value for probability content RECIPE CONFIDENCE <VAL> - value for probability confidence RECIPE CORRELATION <N> - the number of correlation values at which to compute SIMCOV probabilities RECIPE SIMCOV REPLICATES <N> - the number of replications for SIMCOV RECIPE SIMPVT REPLICATES <N> - the number of replications for SIMPVT (applies when Satterthwaite approximation not used) In addition, the following commands were added to support RECIPE analyses (these techniques are recommended by MIL-HDBK-17E): GRUBB TEST Y - performs the Grubbs test for outliers LEVENE TEST Y X - performs the Levene test for homogeneous variances (similar to Bartlett's test, but more robust for non-normal distributions) F LOCATION TEST Y X - performs an F test for homogeneous locations These capabilities were originally implemented as the macros GRUBB.DP, LEVENE.DP, and FTESTLOC.DP which have been added to the Dataplot macro directory. In addition, four data sets (VANGEL31.DAT, VANGEL32.DAT, VANGEL33.DAT, and VANGEL34.DAT) that can be analyzed with RECIPE were added to the Dataplot data sets directory. Corresponding macros (VANGEL31.DP, VANGEL32.DP, VANGEL33.DP, and VANGEL34.DP) were added to the Dataplot programs directory. 
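For readers unfamiliar with the outlier test named in item 2 above, the following Python sketch computes the textbook Grubbs statistic, G = max|x - mean| / s, where s is the sample standard deviation. It is offered only as an illustration of the quantity being tested. It is not the Dataplot implementation, the function name grubbs_statistic is hypothetical, and no critical values are computed.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the textbook Grubbs outlier statistic.

    G is the largest absolute deviation from the sample mean divided
    by the sample standard deviation (n - 1 in the denominator).
    `index` identifies the observation that produced G.
    """
    if len(values) < 3:
        raise ValueError("at least three observations are needed")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("all observations are identical")
    # Locate the observation farthest from the mean.
    index, value = max(enumerate(values), key=lambda pair: abs(pair[1] - mean))
    return abs(value - mean) / sd, index


if __name__ == "__main__":
    g, suspect = grubbs_statistic([1.0, 2.0, 3.0, 4.0, 100.0])
    print(f"G = {g:.4f} for observation number {suspect + 1}")
```

The statistic is then compared against a tabulated critical value for the chosen significance level and sample size; see MIL-HDBK-17 for the values used in that context.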
3) The following control charts were added: EWMA CONTROL CHART Y - exponentially weighted moving average control chart EWMA CONTROL CHART Y X - exponentially weighted moving average control chart MOVING AVERAGE CONTROL CHART Y - moving average control chart MOVING AVERAGE CONTROL CHART Y X - moving average control chart MOVING RANGE CONTROL CHART Y - moving range control chart MOVING RANGE CONTROL CHART Y X - moving range control chart MOVING SD CONTROL CHART Y - moving standard deviation control chart MOVING SD CONTROL CHART Y X - moving standard deviation control chart These work in a similar fashion to previously available control charts. An important feature of all control charts was omitted from previous documentation (this feature has actually been available for quite some time). Dataplot allows you to specify the target and lower and upper control limits by entering the commands: LET A = TARGET = <value> - the target value LET A = USL <value> - the upper control limit LET A = LSL <value> - the lower control limit The data is drawn as trace 1, the target value and limits derived from the data are drawn as traces 2, 3, and 4, and the user specified target and control limits (if given) are drawn as traces 5, 6, and 7. You can control which of these values are actually plotted by setting the LINE and CHARACTER commands appropriately. 4) The REPEAT GRAPH, SAVE GRAPH, and LIST GRAPH commands that were previously added for X11 installations have been extended to support the Microsoft Windows 95/NT implementation. The commands work on Windows 95/NT as they do for Unix. The primary difference is that the plots are saved in Windows bitmap format. The Windows 95/NT still needs a little tidying up (the default positioning isn't ideal yet), but it is functional. 
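As a reminder of what the EWMA control chart in item 3 above plots, the following Python sketch computes the standard exponentially weighted moving average recursion z(i) = w*x(i) + (1-w)*z(i-1). This is the textbook recursion, not Dataplot's code. The function name ewma, the default weight of 0.3, and the choice of the data mean as the starting value are assumptions for illustration; the source does not state which defaults Dataplot uses.

```python
def ewma(values, weight=0.3, start=None):
    """Return the exponentially weighted moving average of `values`.

    Each smoothed point is weight * x + (1 - weight) * previous, where
    `previous` is the prior smoothed point. The recursion is seeded
    with `start`, or with the mean of the data if `start` is None.
    """
    if not values:
        return []
    if not 0 < weight <= 1:
        raise ValueError("weight must be in the interval (0, 1]")
    previous = sum(values) / len(values) if start is None else start
    smoothed = []
    for x in values:
        previous = weight * x + (1 - weight) * previous
        smoothed.append(previous)
    return smoothed


if __name__ == "__main__":
    measurements = [10.2, 9.8, 10.5, 11.9, 12.3, 12.1]
    for raw, smooth in zip(measurements, ewma(measurements, start=10.0)):
        print(f"{raw:6.2f} {smooth:8.4f}")
```

A small weight gives a smoother trace that reacts slowly to shifts, while a weight of 1 reproduces the raw data.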
5) The following special functions were added: LET A = CGAMMA(XR,XC) - real component of complex gamma LET A = CGAMMAI(XR,XC) - complex component of complex gamma LET A = CLNGAM(XR,XC) - real component of complex log gamma LET A = CLNGAMI(XR,XC) - complex component of complex log gamma LET A = CBETA(AR,AC,BR,BC) - real component of complex beta LET A = CBETAI(AR,AC,BR,BC) - complex component of complex beta LET A = CLNBETA(AR,AC,BR,BC) - real component of complex log beta LET A = CLNBETAI(AR,AC,BR,BC) - complex component of complex log beta LET A = CPSI(XR,XC) - real component of complex psi LET A = CPSII(XR,XC) - complex component of complex psi LET A = CHM(X,A,B) - confluent hypergeometric M function LET A = HYPERGEO(X,A,B,C) - hypergeometric function (for restricted values of X, convergent case x < 1) LET A = PBDV(X,A) - parabolic cylinder function (Dv) LET A = PBDV1(X,A) - derivative of parabolic cylinder function (Dv) LET A = PBVV(X,A) - parabolic cylinder function (Vv) LET A = PBVV1(X,A) - derivative of parabolic cylinder function (Vv) LET A = PBWA(X,A) - parabolic cylinder function (Wa) (only for X < 5) LET A = PBWA1(X,A) - derivative of parabolic cylinder function (Wa) (only for X < 5) LET A = BER(XR) - Real component of Kelvin Ber function LET A = BERI(XR) - Complex component of Kelvin Ber function LET A = BER1(XR) - Real component of derivative of Kelvin Ber function LET A = BERI1(XR) - Complex component of derivative of Kelvin Ber function LET A = KER(XR) - Real component of Kelvin Ker function LET A = KERI(XR) - Complex component of Kelvin Ker function LET A = KER1(XR) - Real component of derivative of Kelvin Ker function LET A = KERI1(XR) - Complex component of derivative of Kelvin Ker function LET A = ZETA(S) - Riemann zeta function - 1 (s > 1) LET A = ETA(S) - eta function - 1 (s >= 1) LET A = CATLAN(S) - Catalan beta function - 1 (s >= 1) LET A = BINOMIAL(N,M) - Binomial coefficient of N and M LET A = BINOM(N,M) - Binomial coefficient of N and M LET A = EN(N) 
- Euler number of order N LET A = EN(X,N) - Euler polynomial of order N LET A = BN(N) - Bernoulli number of order N LET A = BN(X,N) - Bernoulli polynomial of order N LET A = BERNOULLI NUMBERS FOR I = 1 1 N - Bernoulli numbers LET A = EULER NUMBERS FOR I = 1 1 N - Euler numbers ---------------------------------------------------------------------- The following enhancements were made to DATAPLOT July 1997. ---------------------------------------------------------------------- 1) Added support for printing tic mark labels in exponential format for linear scales. Enter the command ...TIC MARK LABEL FORMAT EXPONENTIAL The default is to write the number with an E15.7 format. To control the number of digits after the decimal point, enter the command ...TIC MARK LABEL DECIMAL <n> where <n> is a positive integer. For example, if <n> is 4, the number is printed with an E12.4 format. 2) The diagrammatic graphics commands that draw a figure (AND, AMPLIFIER, ARC, ARROW, BOX, CAPACITOR, CIRCLE, DIAMOND, CUBE, ELLIPSE, GROUND, HEXAGON, INDUCTOR, LATTICE, NOR, OR, OVAL, PYRAMID, POINT, RESISTOR, SEMI-CIRCLE, TRIANGLE) were updated to include a "DATA" option (similar to the DRAWDATA and MOVEDATA commands). This "DATA" option draws the figure in units of the most recent plot rather than 0 to 100 screen units. For example, ELLIPSE DATA <list of points> draws the ellipse in units of the most recent plot. Similar to the DATA option, there is a RELATIVE option in the above commands. Although this capability has actually been available in Dataplot for quite some time, it was left out of the documentation for the diagrammatic graphics commands. Relative drawing means that the first point is drawn in absolute units and all subsequent points are relative to the prior point. For example, DRAW RELATIVE 10 10 2 3 would draw a line from (10,10) to (12,13). The word "DATA" should come before the word "RELATIVE" in these commands. There are actually 4 forms to these commands. 
For example, ELLIPSE X1 Y1 X2 Y2 X3 Y3 ELLIPSE DATA X1 Y1 X2 Y2 X3 Y3 ELLIPSE RELATIVE X1 Y1 X2 Y2 X3 Y3 ELLIPSE DATA RELATIVE X1 Y1 X2 Y2 X3 Y3 The first form draws in absolute screen 0 to 100 units, the second form draws in absolute units of the most recent plot, the third form draws in relative screen 0 to 100 units, and the fourth form draws in relative units of the most recent plot. 3) POLYGON was added to the list of diagrammatic commands. This command takes the following form: POLYGON X Y <SUBSET/EXCEPT/FOR qualification> POLYGON DATA X Y <SUBSET/EXCEPT/FOR qualification> POLYGON RELATIVE X Y <SUBSET/EXCEPT/FOR qualification> POLYGON RELATIVE DATA X Y <SUBSET/EXCEPT/FOR qualification> The first form plots the polygon in 0 to 100 screen units while the second form plots the polygon in units of the most recent plot. The third and fourth forms are similar, but they use relative coordinates (the first coordinate pair is in absolute units, the remaining coordinates are relative to the previous point). Note that X and Y are arrays, not lists of points as used by the other diagrammatic graphics commands. Since these are arrays, the SUBSET, EXCEPT, and FOR qualifications can be applied to the list of points, although this is not common in the context of this command. Setting the last point to the first point (i.e., closing the polygon) is not required since Dataplot does this automatically. As with the other diagrammatic graphics commands, the attributes of the border of the polygon are set via the first setting of the LINE commands (e.g., LINE DASH, LINE COLOR BLUE, LINE THICKNESS 0.3). The attributes of the interior of the polygon are set with the various REGION attribute commands (e.g., REGION FILL ON, REGION FILL COLOR BLUE). ---------------------------------------------------------------------- The following enhancements were made to DATAPLOT January-April 1997. ---------------------------------------------------------------------- 1. 
A check is now performed to determine if DPPL2F.DAT is opened successfully upon starting Dataplot. If not, an error message is printed and Dataplot is terminated. The typical cause for this is trying to run Dataplot in a read only directory. This change provides a more graceful exit. 2. The Dataplot Reference Manual is now available on-line. The Dataplot home page can be accessed from a Web browser using the URL: http://www.itl.nist.gov/div898/software/dataplot/homepage.html The Reference Manual is under the "documentation" table entry. The following should be noted: a) In order to view these pages, you need to have a web browser available on your system. The Dataplot web pages display correctly with the Netscape, Internet Explorer, and HotJava 1.1 browsers. They do not display correctly with the HotJava 1.0, Mosaic, or character oriented browsers. We do not have access to other browsers, so we can make no specific comment on them. b) The Reference Manual is in PDF format (Portable Document Format), so it requires a PDF viewer. Typically, this is the Adobe Acrobat Reader. This reader is supported on most common platforms and can be downloaded for free. The PC installation typically takes about 10-15 minutes to download and install. For best performance, it is strongly recommended that the Adobe Acrobat reader be installed as a plug-in (this is done automatically for Netscape on the PC) rather than as a helper application. The documentation web page contains a link to the Adobe Acrobat web site for downloading the reader. In addition, several commands are now available for accessing the Web, and the Dataplot Web pages and Reference Manual in particular, from within Dataplot. The first command is: WEB WEB NIST/SIMA/HPPC/SED/DATAPLOT WEB <url address> By default, this command activates Netscape with the specified URL. If no URL is given, the NIST home page is used. Several keywords are recognized. 
For example, SED activates the NIST Statistical Engineering Division home page. The second command is: WEB HELP <string> This command is similar to the standard Dataplot HELP command. However, it accesses the on-line Reference Manual rather than the ASCII text help files. <string> will usually be a Dataplot command (e.g., WEB HELP FIT, WEB HELP PLOT). However, many special keywords are also recognized. For example, WEB HELP or WEB HELP DATAPLOT access the Dataplot home page. Enter the command: LIST REFMAN.TEX to see a list of recognized keywords (the upper case entries in columns 1-40 identify the keywords while columns 40+ identify the associated URL). The WEB and WEB HELP commands are supported for Unix platforms and for the Windows 95/NT version. A few SET commands were added to support the WEB and WEB HELP commands. a) By default, Dataplot tries to use the Netscape browser. On Unix, it tries to do this by entering the command "netscape". On Windows 95/NT, it enters "C:\Program Files\NETSCAPE\NAVIGATOR\PROGRAM\netscape.exe" If you wish to use a different browser, or if Netscape is installed in a different location, you can enter the following command: SET BROWSER <file name> where <file name> is the string that activates your preferred browser. In particular, if you prefer to use the Internet Explorer under Windows 95/NT, you can enter: SET BROWSER "C:\Program Files\Plus!\Microsoft Internet\iexplore.exe" The enclosing quotes are required because the file name contains spaces. Again, check to see if this is the proper path on your system. Alternatively, you can enter the Unix command setenv BROWSER <file name> or the Windows 95/NT command SET BROWSER=<file name> to set the browser. These are typically placed in your start-up files (.login or .cshrc for Unix, AUTOEXEC.BAT for Windows 95/NT). You can shorten the browser name if you add the correct directory to your path. b) For the WEB command, the default URL is the NIST home page. 
You can change the default with the following Dataplot command: SET URL <default URL> For the WEB HELP command, the default URL is the Dataplot home page on the public NIST web server. This can be changed (for example, if you have installed the Dataplot web pages and Reference Manual on a local site) by entering the command: SET DATAPLOT URL <location of Dataplot web pages> Alternatively, you can enter the Unix commands setenv URL <location of default URL> setenv DPURL <location of Dataplot web pages> or the Windows 95/NT commands SET URL=<location of default URL> SET DPURL=<location of Dataplot web pages> For Unix platforms, the following command was added to tell Dataplot to use a currently open NETSCAPE window (this command is not needed for the PC): SET NETSCAPE <OLD/NEW> These commands have been tested with NETSCAPE on Unix and with Netscape and the Internet Explorer on the PC. One important difference between the Unix and PC versions of these commands should be noted. Under Unix, once the WEB command is initiated, control returns to Dataplot after the browser is started. You can independently navigate in the browser and enter additional Dataplot commands. However, on the PC, control does not return to Dataplot until you exit the browser. 3. The following commands were added to allow previously viewed graphs to be saved for later recall. The primary purpose is to allow comparisons of a previous graph to a current graph. These commands are currently only supported for the X11 graphics device (available on most Unix implementations). SAVE PLOT <file> (or SAVE GRAPH, SP, SG) SAVE PLOT <file> AUTOMATIC SAVE PLOT AUTOMATIC REPEAT PLOT <file> (or REPEAT GRAPH, RP, RG, VIEW PLOT, VIEW GRAPH, VG, VP) REPEAT PLOT <+n> REPEAT PLOT <-n> LIST PLOT (or LIST GRAPH, LP, LG) CYCLE PLOT (or CYCLE GRAPH, CG, CP) PIXMAP TITLE <title> As a technical note, the plots are saved in X11 "bitmap" format. 
This is distinct from the X11 image format that is used by xwd to save a screen image. This choice was made for performance reasons (xlib provides direct routines for reading and writing bitmaps, but not for reading and writing images). The primary limitations are: i) Color is not supported for X11 bitmaps. Elements drawn in color will not be saved in the bitmap. ii) You cannot use the X11 tools xwd and xwud to view the saved plots independently of Dataplot. However, they can be viewed by any software that reads X11 bitmaps. The saved plots are essentially screen dumps. There is currently no "linking" in the sense that if a given variable is changed the saved plots are automatically updated. The SAVE GRAPH command saves the current plot in the user specified file. If no file name is specified, then the file name "pixmap.<n>", where <n> is a counter, is used. The keyword AUTOMATIC tells Dataplot to automatically save all subsequent plots. With the AUTOMATIC option, Dataplot does not save the current graph until the next plot is generated. This is done in order to correctly handle multi-plots and diagrammatic graphics. That is, the current graph is saved whenever a screen erase is performed. If a filename is provided, this will be used as the base (the ".<n>" is added). For example, SAVE PLOT HISTOGRAMS AUTOMATIC saves subsequent plots in the files HISTOGRAMS.1, HISTOGRAMS.2, and so on. Enter SAVE GRAPH AUTOMATIC OFF to terminate the automatic saving of the plots. The REPEAT PLOT command reads a saved plot and draws it in a window that is distinct from the normal Dataplot X11 graphics window. If no file is specified, or if <n> is 0 for REPEAT PLOT, the most current saved plot is drawn. A <+n> takes the Nth plot from the current list. A <-n> takes the "current - n"th plot from the current plot list. The DEVICE 1 X11 command must be entered before the REPEAT PLOT command can be used. 
The REPEAT PLOT command can redraw plots that were created in a previous Dataplot session. In fact, it will successfully redraw any file that is in the X11 bitmap format (but not in xwd format). The LIST PLOT command lists the currently saved plots (by sequence number, file name, and title). It only lists plots saved in the current session. However, this includes graphs created in a previous Dataplot session that have been redrawn with the REPEAT GRAPH command. Dataplot does not maintain a database of previously saved plots. The CYCLE PLOT command allows you to cycle through the pixmaps in the current list by clicking mouse buttons. Clicking the left mouse button moves down in the current list, clicking the right mouse button moves up in the current list, and clicking the middle mouse button returns control to Dataplot. At least one REPEAT PLOT command should be entered before using this command. The PIXMAP TITLE command allows you to specify the title for a saved plot. This title is simply for convenience in listing the saved plots. It is not saved as part of the file and the title only applies to the current Dataplot session. The default title is the file name. The pixmap title applies to the current plot when the SAVE GRAPH command is entered. It does not matter whether the PLOT or PIXMAP TITLE command is entered first. Be aware that for SAVE GRAPH AUTOMATIC the saving for a given plot is not executed until the next screen erase (typically the next plot) is encountered to allow for multi-plotting and the addition of diagrammatic graphics to a plot. The order of the commands would typically be something like: SAVE GRAPH AUTOMATIC 4-PLOT Y PIXMAP TITLE 4-PLOT PLOT Y PIXMAP TITLE PLOT Y HISTOGRAM Y PIXMAP TITLE HISTOGRAM The main point here is that the PIXMAP TITLE comes AFTER the plot command. Unlike the regular TITLE command, the PIXMAP TITLE command does not persist. 
That is, it applies only to the next saved plot and then reverts to the default of using the file name. 4. Added the following special functions: a) LAMBDA(X,V) - Lambda function (V can be integer or real) b) LAMBDAP(X,V) - derivative of Lambda function (V can be integer or real) c) H0(X) - Struve function order 0 d) H1(X) - Struve function order 1 e) HV(X,V) - Struve function order V f) L0(X) - modified Struve function order 0 g) L1(X) - modified Struve function order 1 h) LV(X,V) - modified Struve function order V i) Added LOGBETA as a synonym for LNBETA and LNGAMMA as a synonym for LOGGAMMA. 5. The following bug fixes were made: a) Fixed bug where the TEXT command automatically generated a software font (introduced by the DEVICE FONT command). b) Fixed bug in the ANOVA command. c) Fixed bug with the ERASE command on the Windows NT version. d) Fixed bug in HELP with a conflict between STATUS and STATISTIC PLOT. e) Fixed bug that occurred if a software font was used and CHARACTER BLANK was entered in lower case. f) Fixed bug where CREATE <file> went into an infinite loop if a CALL command was encountered. The CALL command will now be saved correctly in the CREATE file. Note that the commands in the CALL file are not saved in the CREATE file (they are already saved as part of the CALL macro file). ---------------------------------------------------------------------- The following enhancements were made to DATAPLOT October-November 1996. ---------------------------------------------------------------------- 1. A native mode Windows 95/NT version is now available. This version was created using the Microsoft Windows 95 compiler. The initial release supports the command line version only. We will attempt over the next several months to port the Tcl/Tk based graphical user interface to the Windows 95/NT environment. To generate graphics to the screen for this version, enter the following command: DEVICE 1 QWIN Enter the command HELP QWIN for details of using this device. 2. 
For encapsulated Postscript files, DATAPLOT based the bounding box parameters on an 11 x 11 inch page. This was done to accommodate both landscape and portrait orientation plots. Unfortunately, this did not generate satisfactory results when importing DATAPLOT graphics into WordPerfect and other text processing software. The user had to do a fair amount of manual rotation and scaling of plots. DATAPLOT now adjusts the bounding box depending on the orientation. It uses 11 x 8.5 inch for landscape orientation and 8.5 x 11 inch for portrait. However, most text processors ignore the rotation and translation that the landscape plots request. To compensate for this, the following command was added: ORIENTATION LANDSCAPE WORDPERFECT This essentially generates a landscape orientation on a portrait page. That is, the bounding box specifies an 8.5 x 6.5 inch page. This generates excellent results with WordPerfect (users should normally never need to adjust the bounding box parameters or perform manual rotation and translation in WordPerfect). This option is only recognized for encapsulated Postscript. Regular Postscript should still use ORIENTATION LANDSCAPE. 3. Fixed a few bugs: a. Macros now accept more than 1,000 lines. b. Unix executables were not finding certain auxiliary files if the file names were entered in lower case. c. NORMAL PLOT fixed. 4. The output for the YATES command was modified to be more readable and informative. ---------------------------------------------------------------------- The following enhancement was made to DATAPLOT July 1996. ---------------------------------------------------------------------- 1. The previous fix (checking the HOME environment variable for the user's root directory) was refined a bit. If HOME is defined, it looks for dplogf.tex in that directory. 
   If dplogf.tex is not found, instead of printing an error message,
   it then strips off the path name and looks for it in the current
   directory and then in the DATAPLOT directory (typically
   /usr/local/lib/dataplot).

   Note that if an error message is printed saying that this file is
   not found, DATAPLOT will still run.  This file simply lets you
   enter some DATAPLOT commands when starting DATAPLOT (i.e., for
   setting your preferred defaults).  There should not be any
   negative side effects if this file is not executed.

2. Unix versions will check for the environment variable
   DATAPLOT_WEB.  If this variable is defined, DATAPLOT assumes it is
   being run from the web (e.g., from Mosaic or Netscape).
   Currently, the only effect is that certain files that DATAPLOT
   typically creates in the current directory, such as dppl1f.dat and
   dpconf.tex, are opened in the /tmp directory.  This may or may not
   be expanded upon as we gain more experience running DATAPLOT from
   web servers.

3. We built a "double precision" version for the Sun.  That is, the
   -p8 option was used so that single precision numbers are 64-bit
   rather than 32-bit.  The only complication was in how the X11
   routines were called (these are compiled with 32-bit real
   numbers).  Changes were made to the X11 driver to allow a "compile
   flag" to be set based on which case (i.e., 32-bit or 64-bit) is
   desired.  This means that DATAPLOT can be easily built on any Unix
   system that supports the "-p8" option (or a compiler switch that
   provides a similar capability).

4. A version of DATAPLOT was built using the LAHEY compiler
   (previously, the OTG compiler was used).  This version allows
   DATAPLOT to be run on PCs without special AUTOEXEC.BAT and
   CONFIG.SYS files (and therefore no rebooting to run DATAPLOT).  A
   device driver that uses the LAHEY graphics library is also
   available.  Enter

      DEVICE 1 LAHEY
      DEVICE 1 FONT SIMPLEX     (this is described below)

5.
   The following command was added:

      DEVICE <1/2/3> FONT <font name>

   This allows the screen device to use a different font than the
   printed output.  This was specifically motivated by the LAHEY
   device driver.  This driver does a very poor job with hardware
   characters.  Using a software font avoids this problem, but often
   hardware characters are desired for the printed Postscript output
   (to take advantage of the typeset quality fonts available with
   Postscript).  Using DEVICE 1 FONT SIMPLEX allows us to get decent
   characters on the screen and still retain the ability to use the
   Postscript fonts.  Although this command was motivated by the
   LAHEY device, it is also useful for other screen devices (e.g.,
   X11 hardware fonts are a fixed size, so only 1 character size is
   available at a time, Tektronix devices are limited to 4 discrete
   sizes, etc.).

6. Previously, log scales required at least 1 full cycle (e.g., 10 to
   100).  It is now possible to get around this limitation.  For
   example, to have a log scale go from 85 to 125, do the following:

      YLOG ON
      YLIMITS 100 100
      YTIC OFFSET 15 25
      PLOT Y

   The key is that the lower and upper bound on the LIMITS command
   must be the same and at least one of the TIC OFFSETS must be
   greater than zero.  Major tics will be generated at this bound and
   also at the frame limits.  Minor tics will be plotted where
   appropriate.  Also, the TIC OFFSET is always interpreted in data
   units for this case (i.e., you cannot specify the offset in
   DATAPLOT 0 to 100 coordinates as you normally can).

7. Several bugs were fixed.

----------------------------------------------------------------------
The following enhancement was made to DATAPLOT June 1996.
----------------------------------------------------------------------

For Unix systems, DATAPLOT now checks for the HOME environment
variable.  This normally specifies the user's home directory.  If
present, DATAPLOT looks for the user's start-up file (dplogf.tex) in
the user's home directory rather than the current directory.
This means you no longer have to include the start-up file in each
directory from which you run DATAPLOT.  If HOME is not defined,
DATAPLOT looks for dplogf.tex in the current directory.  Note that if
HOME is defined and dplogf.tex is not found in the home directory,
DATAPLOT will NOT look for it in the current directory.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT MAY, 1996.
----------------------------------------------------------------------

1) Fixed a bug where the X11 driver bombed if being run remotely and
   the SET X11 PIXMAP ON command was used.

2) Fixed a bug where the 3D-PLOT was bombing when a large number of
   points were plotted.

----------------------------------------------------------------------
The following enhancements were made to DATAPLOT FEBRUARY-APRIL, 1996.
----------------------------------------------------------------------

1) The following probability functions were added:

   LET A = BBNCDF(X,ALPHA,BETA,N) - beta-binomial cumulative
                                    distribution function
   LET A = BBNPDF(X,ALPHA,BETA,N) - beta-binomial probability density
                                    function
   LET A = BBNPPF(P,ALPHA,BETA,N) - beta-binomial percent point
                                    function
   LET A = BRACDF(X,BETA)  - Bradford cumulative distribution function
   LET A = BRAPDF(X,BETA)  - Bradford probability density function
   LET A = BRAPPF(P,BETA)  - Bradford percent point function
   LET A = DGACDF(X,GAMMA) - double gamma cumulative distribution
                             function
   LET A = DGAPDF(X,GAMMA) - double gamma probability density function
   LET A = DGAPPF(P,GAMMA) - double gamma percent point function
   LET A = FCACDF(X,U,SD)  - folded Cauchy cumulative distribution
                             function
   LET A = FCAPDF(X,U,SD)  - folded Cauchy probability density
                             function
   LET A = FCAPPF(P,U,SD)  - folded Cauchy percent point function
   LET A = GEXCDF(X,LAM1,LAM2,S) - generalized exponential cumulative
                                   distribution function
   LET A = GEXPDF(X,LAM1,LAM2,S) - generalized exponential probability
                                   density function
   LET A = GEXPPF(P,LAM1,LAM2,S) - generalized exponential percent
                                   point function
   LET A = GLOCDF(X,ALPHA) - generalized logistic cumulative
                             distribution function
   LET A = GLOPDF(X,ALPHA) - generalized logistic probability density
                             function
   LET A = GLOPPF(P,ALPHA) - generalized logistic percent point
                             function
   LET A = KAPCDF(X,AK,B,T) - Mielke's beta-kappa cumulative
                              distribution function
   LET A = KAPPDF(X,AK,B,T) - Mielke's beta-kappa probability density
                              function
   LET A = KAPPPF(P,AK,B,T) - Mielke's beta-kappa percent point
                              function
   LET A = NCCPDF(X,V,DELTA) - non-central chi-square probability
                               density function
   LET A = PEXCDF(X,ALPHA,BETA) - exponential power cumulative
                                  distribution function
   LET A = PEXPDF(X,ALPHA,BETA) - exponential power probability
                                  density function
   LET A = PEXPPF(P,ALPHA,BETA) - exponential power percent point
                                  function

   The following probability plots were added:

      LET ALPHA = <value>
      LET BETA = <value>
      LET N = <value>
      BETA BINOMIAL PROBABILITY PLOT Y

      LET BETA = <value>
      BRADFORD PROBABILITY PLOT Y

      LET GAMMA = <value>
      DOUBLE GAMMA PROBABILITY PLOT Y

      LET M = <value>
      LET SD = <value>
      FOLDED CAUCHY PROBABILITY PLOT Y

      LET LAMBDA1 = <value>
      LET LAMBDA2 = <value>
      LET S = <value>
      GENERALIZED EXPONENTIAL PROBABILITY PLOT Y

      LET ALPHA = <value>
      GENERALIZED LOGISTIC PROBABILITY PLOT Y

      LET BETA = <value>
      LET THETA = <value>
      LET K = <value>
      MIELKE BETA-KAPPA PROBABILITY PLOT Y

      LET ALPHA = <value>
      LET BETA = <value>
      EXPONENTIAL POWER PROBABILITY PLOT Y

   The following probability plot correlation coefficient plots were
   added:

      BRADFORD PPCC PLOT Y
      DOUBLE GAMMA PPCC PLOT Y
      GENERALIZED LOGISTIC PPCC PLOT Y

2) The WRITE command was updated to handle a maximum of 25 variables
   (up from 10).

   Support was added for writing Fortran unformatted data files.
   This was done primarily for sites that have created "mega" size
   versions of DATAPLOT where the time entailed in reading and
   writing large data files becomes important.
   For standard size DATAPLOT (typically a maximum of 10,000 rows
   with 10 columns for 100,000 data points total), the use of the SET
   READ FORMAT and SET WRITE FORMAT commands provides adequate
   performance.  However, the unformatted read and write capability
   is available regardless of the workspace size.

   The advantage of unformatted reads and writes is that the data
   files are much smaller (typically by a factor of 10 or more) and
   reading and writing the data is significantly faster.  The
   disadvantage is that unformatted files are binary, and thus cannot
   be modified or viewed with a standard text editor.  Also, Fortran
   unformatted files are NOT transportable across different computer
   systems.  Also, unformatted Fortran files are NOT equivalent to C
   language byte stream files (these types of files are not currently
   supported in DATAPLOT).

   An unformatted write is accomplished by entering the command:

      SET WRITE FORMAT UNFORMATTED

   and then entering a standard WRITE command.  For example,

      WRITE LARGE.DAT X1 X2 X3

   There are 2 ways to create the unformatted file in Fortran.  For
   example, suppose X and Y are to be written to an unformatted file.
   The WRITE can be generated by:

      a) WRITE(IUNIT) (X(I),Y(I),I=1,N)
      b) WRITE(IUNIT) X,Y

   The distinction is that (a) stores the data as X(1), Y(1), X(2),
   Y(2), ..., X(N), Y(N) while (b) stores all of X then all of Y.
   There is no inherent advantage in either method in terms of
   performance or file size.  The SET WRITE FORMAT UNFORMATTED
   command only supports (a).

   Unformatted writing is supported only for variables or matrices
   (i.e., not for parameters or strings).

   Be aware that Fortran unformatted files are NOT transportable
   across systems.  This is due to the fact that the file contains
   various header bytes (the Fortran standard leaves implementation
   of this up to the vendor) that are not standard.  Also, the
   storage of real numbers can vary between platforms.
   This means that the SET WRITE FORMAT UNFORMATTED command can NOT
   be used to write raw binary files (as might be produced by a C
   program) and it cannot, in general, be used to write unformatted
   Fortran files that can be read on systems other than the one you
   are running DATAPLOT on.

3) The command

      SET RELATIVE HISTOGRAM <AREA/PERCENT>

   was added to specify whether relative histograms (and relative
   bi-histograms) are drawn so that the area under the histogram
   sums to 1 (AREA) or so that the heights of the histogram bars sum
   to 1 (PERCENT).  The first option, which is the default, is useful
   when using the relative histogram as an estimate of a probability
   distribution.  The second option is useful when you want to see
   what percentage of the data falls in a given class.

4) For Unix versions, the location of the DATAPLOT auxiliary files
   can be specified with the following Unix command:

      setenv DATAPLOT_FILES <directory name>

   This can be useful if you do not have super user permission to
   copy the files into the /usr/local/lib/dataplot directory and you
   do not have a cooperative system administrator.

5) The LET STRING command was modified so that the case of the text
   in the string is preserved as entered.  Note that the LET FUNCTION
   command still converts text to upper case.

   The READ STRING command was modified so that it ignores the SET
   READ FORMAT command.

6) Numerous minor bugs were fixed.

-----------------------------------------------------------------
The following enhancements were made to DATAPLOT AUGUST-OCTOBER, 1995.
-----------------------------------------------------------------

1) The Numerical Recipes routine for calculating complex roots was
   replaced with a CMLIB routine.  There is no change in the command
   syntax.

2) The Numerical Recipes routine for calculating the fast Fourier
   transform was replaced with CMLIB routines.  A couple of changes
   were made as follows:

   a) The CMLIB routine does not require zero padding so that the
      length of the variable is a power of two.
      Previously, DATAPLOT did this automatically.  It no longer
      does.  However, the CMLIB algorithm loses efficiency if the
      length is not a product of small primes.  In this case, you may
      wish to zero pad the variable yourself before calling the FFT
      command.

   b) The SET FOURIER EXPONENT <+/-> command was corrected to work as
      intended (the default implemented the + case, which was really
      the only option that worked).  In addition, this command was
      extended to apply to the FOURIER and INVERSE FOURIER commands
      as well as the FFT and INVERSE FFT commands.  Enter HELP
      FOURIER EXPONENT for more information on this command.

   c) Most FFT routines return the data in the following order:

         F(1)              = zero frequency
         F(2) ... F(N/2)   = smallest positive frequency to largest
                             positive frequency
         F(N/2+1)          = aliased point that contains the largest
                             positive and the largest negative
                             frequency
         F(N/2+2) ... F(N) = negative frequencies from largest
                             magnitude to smallest magnitude

      By default, DATAPLOT returns the data in the following order:

         F(1)              = aliased point that contains the largest
                             positive and the largest negative
                             frequency
         F(2) ... F(N/2)   = largest positive frequency to smallest
                             positive frequency
         F(N/2+1)          = zero frequency
         F(N/2+2) ... F(N) = negative frequencies from smallest
                             magnitude to largest magnitude

      The command

         SET FOURIER ORDER <STANDARD/DATAPLOT>

      was implemented to allow you to specify which order to use.
      The option STANDARD returns the first order while the option
      DATAPLOT returns the second order.
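      The two orderings in (c) differ only by a fixed permutation of
      the output positions.  The following is a small Python sketch
      (not DATAPLOT code, and not DATAPLOT's implementation) of one
      reading of that description for even N.  The helper name
      to_dataplot_order is hypothetical, and the input is assumed to
      already be in the STANDARD order.

```python
def to_dataplot_order(f):
    """Reorder FFT output from the STANDARD order to the DATAPLOT order.

    Assumption: len(f) is even and f is in the STANDARD order described
    above (zero frequency first).  This only illustrates the two
    orderings; it is not how DATAPLOT itself does the reordering.
    """
    n = len(f)
    if n % 2 != 0:
        raise ValueError("this sketch only handles even-length input")
    half = n // 2
    # Output position j takes the value stored at standard position
    # (N/2 - j) mod N, so the result starts at the aliased point and
    # steps down through the positive, zero, and negative frequencies.
    return [f[(half - j) % n] for j in range(n)]


# Label each STANDARD position with its signed frequency index, N = 8.
standard = [0, 1, 2, 3, 4, -3, -2, -1]
print(to_dataplot_order(standard))  # [4, 3, 2, 1, 0, -1, -2, -3]
```

      Under this reading the permutation is its own inverse, so
      applying the same helper a second time restores the STANDARD
      order.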
3) Support was added for hypergeometric, non-central chi-square,
   singly and doubly non-central F, half-Cauchy and folded normal
   random numbers.

   The following probability functions were added:

   LET A = ANGCDF(X) - anglit cumulative distribution function
   LET A = ANGPDF(X) - anglit density function
   LET A = ANGPPF(X) - anglit percent point function
   LET A = ARSCDF(X) - arcsin cumulative distribution function
   LET A = ARSPDF(X) - arcsin density function
   LET A = ARSPPF(X) - arcsin percent point function
   LET A = DWECDF(X,G) - double Weibull cumulative distribution
                         function
   LET A = DWEPDF(X,G) - double Weibull density function
   LET A = DWEPPF(X,G) - double Weibull percent point function
   LET A = EWECDF(X,G) - exponentiated Weibull cumulative
                         distribution function
   LET A = EWEPDF(X,G) - exponentiated Weibull density function
   LET A = EWEPPF(X,G) - exponentiated Weibull percent point function
   LET A = FNRCDF(X,U,SD) - folded normal cumulative distribution
                            function
   LET A = FNRPDF(X,U,SD) - folded normal probability density function
   LET A = FNRPPF(X,U,SD) - folded normal percent point function
   LET A = GEVCDF(X,G) - generalized extreme value cumulative
                         distribution function
   LET A = GEVPDF(X,G) - generalized extreme value density function
   LET A = GEVPPF(X,G) - generalized extreme value percent point
                         function
   LET A = GOMCDF(X,C,B) - Gompertz cumulative distribution function
   LET A = GOMPDF(X,C,B) - Gompertz probability density function
   LET A = GOMPPF(X,C,B) - Gompertz percent point function
   LET A = HFCCDF(X) - half-Cauchy cumulative distribution function
   LET A = HFCPDF(X) - half-Cauchy density function
   LET A = HFCPPF(X) - half-Cauchy percent point function
   LET A = HFLCDF(X,G) - generalized half-logistic cumulative
                         distribution function
   LET A = HFLPDF(X,G) - generalized half-logistic density function
   LET A = HFLPPF(X,G) - generalized half-logistic percent point
                         function
   LET A = HSECDF(X) - hyperbolic secant cumulative distribution
                       function
   LET A = HSEPDF(X) - hyperbolic secant density function
   LET A = HSEPPF(X) - hyperbolic secant percent point function
   LET A = LGACDF(X,G) - log-gamma cumulative distribution function
   LET A = LGAPDF(X,G) - log-gamma density function
   LET A = LGAPPF(X,G) - log-gamma percent point function
   LET A = PA2CDF(X,G) - Pareto type 2 cumulative distribution
                         function
   LET A = PA2PDF(X,G) - Pareto type 2 density function
   LET A = PA2PPF(X,G) - Pareto type 2 percent point function
   LET A = TNRCDF(X,A,B,U,SD) - truncated normal cumulative
                                distribution function
   LET A = TNRPDF(X,A,B,U,SD) - truncated normal probability density
                                function
   LET A = TNRPPF(X,A,B,U,SD) - truncated normal percent point
                                function
   LET A = TNECDF(X,X0,U,SD) - truncated exponential cumulative
                               distribution function
   LET A = TNEPDF(X,X0,U,SD) - truncated exponential probability
                               density function
   LET A = TNEPPF(X,X0,U,SD) - truncated exponential percent point
                               function
   LET A = WCACDF(X,G) - wrapped-up Cauchy cumulative distribution
                         function
   LET A = WCAPDF(X,G) - wrapped-up Cauchy density function
   LET A = WCAPPF(X,G) - wrapped-up Cauchy percent point function

   The following probability plots were added:

      ANGLIT PROBABILITY PLOT Y
      ARCSIN PROBABILITY PLOT Y
      HYPERBOLIC SECANT PROBABILITY PLOT Y
      HALF CAUCHY PROBABILITY PLOT Y

      LET M = <value>
      LET SD = <value>
      FOLDED NORMAL PROBABILITY PLOT Y

      LET A = <value>
      LET B = <value>
      LET M = <value>     (optional, defaults to 0)
      LET SD = <value>    (optional, defaults to 1)
      TRUNCATED NORMAL PROBABILITY PLOT Y

      LET X0 = <value>
      LET M = <value>     (optional, defaults to 0)
      LET SD = <value>    (optional, defaults to 1)
      TRUNCATED EXPONENTIAL PROBABILITY PLOT Y

      LET GAMMA = <value>
      DOUBLE WEIBULL PROBABILITY PLOT Y
      LOG GAMMA PROBABILITY PLOT Y
      GENERALIZED EXTREME VALUE PROBABILITY PLOT Y (or GEV PROB PLOT)
      PARETO SECOND KIND PROBABILITY PLOT Y (or PARETO TYPE 2)
      HALF LOGISTIC PROBABILITY PLOT Y (GAMMA optional for this case)

      LET GAMMA = <value>
      LET THETA = <value>
      EXPONENTIATED WEIBULL PROBABILITY PLOT Y

      LET C = <value>
      LET B = <value>
      GOMPERTZ PROBABILITY PLOT Y

      LET C = <value>
      WRAPPED CAUCHY PROBABILITY PLOT Y

   The following probability plot correlation coefficient plots were
   added:

      LOG GAMMA PPCC PLOT Y
      DOUBLE WEIBULL PPCC PLOT Y
      GENERALIZED EXTREME VALUE PPCC PLOT Y (or GEV PPCC PLOT)
      PARETO SECOND KIND PPCC PLOT Y (or PARETO TYPE 2 PPCC PLOT)
      WRAPPED CAUCHY PPCC PLOT Y
      HALF LOGISTIC PPCC PLOT Y

4) The following character option was added:

      CHARACTER PIXEL

   This option plots a single "pixel" on a given device.  In
   addition, when this option is given, the CHARACTER SIZE is
   interpreted as an integer expansion factor.  For example,
   CHARACTER SIZE 10 will plot a 10x10 pixel block.  This option has
   been implemented for the Tektronix, X11, Postscript, HP-GL, Regis,
   HP-2622, and Sun devices.  Other devices will print a message
   saying this option is unavailable (although additional devices
   will be added later).  Although this capability was added with
   some possible future enhancements in mind, it can be useful in
   some plots such as fractal plots.

-----------------------------------------------------------------
The following enhancements were made to DATAPLOT JULY, 1995.
-----------------------------------------------------------------

Support was added for various types of orthogonal polynomials.  The
following commands were added.
   LET A = LEGENDRE(X,N)
      Compute the Legendre polynomial of order n
   LET A = LEGENDRE(X,N,M)
      Compute the associated Legendre polynomial of order n and
      degree m
   LET A = NRMLEG(X,N)
      Compute the normalized Legendre polynomial of order n
   LET A = NRMLEG(X,N,M)
      Compute the associated normalized Legendre polynomial of order
      n and degree m
   LET A = LEGP(X,N)
      Compute the Legendre function of the first kind of order n
   LET A = LEGP(X,N,M)
      Compute the associated Legendre function of the first kind of
      order n and degree m
   LET A = LEGQ(X,N)
      Compute the Legendre function of the second kind of order n
   LET A = LEGQ(X,N,M)
      Compute the associated Legendre function of the second kind of
      order n and degree m
   LET A = SPHRHRMR(X,P,N,M)
      Compute the real component of the spherical harmonic function
   LET A = SPHRHRMC(X,P,N,M)
      Compute the complex component of the spherical harmonic
      function
   LET A = LAGUERRE(X,N)
      Compute the Laguerre polynomial of order n
   LET A = LAGUERRL(X,N,A)
      Compute the generalized Laguerre polynomial of order n
   LET A = NRMLAG(X,N)
      Compute the normalized Laguerre polynomial of order n
   LET A = CHEBT(X,N)
      Compute the Chebyshev T (first kind) polynomial of order n
   LET A = CHEBU(X,N)
      Compute the Chebyshev U (second kind) polynomial of order n
   LET A = JACOBIP(X,N,A,B)
      Compute the Jacobi polynomial of order n
   LET A = ULTRASPH(X,N,A)
      Compute the Ultraspherical (or Gegenbauer) polynomial of order
      n
   LET A = HERMITE(X,N)
      Compute the Hermite polynomial of order n
   LET A = LNHERMIT(X,N)
      Compute the log of the absolute value of the Hermite polynomial
      of order n
   LET A = HERMSGN(X,N)
      Compute the sign of the Hermite polynomial (1 for positive, -1
      for negative, 0 for zero)

In addition, an alpha version of a graphical user interface is
available on some Unix systems.  You can check with your local site
installer to see if it is available on your system.
If it is available, it is typically executed by entering the command:

   xdp

At NIST, the frontend has been installed on the CAML Suns and SGIs as
well as the Convex.  There are no plans to install it on the Cray.
For non-NIST sites, the following non-DATAPLOT programs must be
installed:

   1) Tcl/Tk - Tool Command Language
   2) Expect - a program for controlling the dialog among interactive
               programs.

These are both popular public domain Unix utilities that can be
installed on most common Unix platforms.

-----------------------------------------------------------------
The following enhancements were made to DATAPLOT APRIL, 1995.
-----------------------------------------------------------------

1) Support was added for reading Fortran unformatted data files.
   This was done primarily for sites that have created "mega" size
   versions of DATAPLOT where the time entailed in reading large data
   files becomes important.  For standard size DATAPLOT (typically a
   maximum of 10,000 rows with 10 columns for 100,000 data points
   total), the use of the SET READ FORMAT command provides adequate
   performance.  However, the unformatted read capability is
   available regardless of the workspace size.

   The advantage of unformatted reads is that the data files are much
   smaller (typically by a factor of 10 or more) and reading the data
   is significantly faster.  The disadvantage is that unformatted
   files are binary, and thus cannot be modified or viewed with a
   standard text editor.  Also, Fortran unformatted files are NOT
   transportable across different computer systems.

   An unformatted read is accomplished by entering the command:

      SET READ FORMAT UNFORMATTED

   and then entering a standard READ command.  For example,

      READ LARGE.DAT X1 X2 X3

   There are 2 ways to create the unformatted file in Fortran.  For
   example, suppose X and Y are to be written to an unformatted file.
   The WRITE can be generated by:

      a) WRITE(IUNIT) (X(I),Y(I),I=1,N)
      b) WRITE(IUNIT) X,Y

   The distinction is that (a) stores the data as X(1), Y(1), X(2),
   Y(2), ..., X(N), Y(N) while (b) stores all of X then all of Y.
   There is no inherent advantage in either method in terms of
   performance or file size.  The SET READ FORMAT UNFORMATTED command
   assumes (a).  To specify (b), enter the command:

      SET READ FORMAT COLUMNWISE (or UNFORMATTEDCOLUMNWISE)

   Unformatted reading is supported only for variables or matrices
   (i.e., not for parameters or strings).  Also, it only applies when
   reading from a file.  The limits for the maximum number of rows
   and columns for a matrix still apply (500 rows and 100 columns on
   most systems).  When reading a matrix, the number of columns must
   be specified via the SET UNFORMATTED COLUMNS command.  For
   example,

      SET READ FORMAT UNFORMATTED
      SET UNFORMATTED COLUMNS 25
      READ MATRIX.DAT M

   The maximum size of the file that DATAPLOT can read is equal to
   the workspace size on your implementation (100,000 or 200,000
   points on most installations).  For larger files, it will read up
   to this number of data values.  The data is assumed to be a
   rectangular grid of data written in a single chunk.  Only single
   precision real numbers are supported.

   By default, the entire file (up to the maximum number of points)
   is read.  DATAPLOT does provide 2 commands to allow some control
   of what portion of the file is read:

      SET UNFORMATTED OFFSET <value>
      SET UNFORMATTED RECORDS <value>

   The OFFSET specifies the number of data values at the beginning of
   the file to skip.  This is useful for skipping header lines
   (similar to a SKIP command for reading ASCII files) and other
   miscellaneous values.  The RECORDS value is useful for reading
   part of a larger file.

   Be aware that Fortran unformatted files are NOT transportable
   across systems.  This is due to the fact that the file contains
   various header bytes (the Fortran standard leaves implementation
   of this up to the vendor) that are not standard.
   Also, the storage of real numbers can vary between platforms.
   This means that the SET READ FORMAT UNFORMATTED command can NOT be
   used to read raw binary files (as might be produced by a C
   program) and it cannot, in general, be used to read unformatted
   Fortran files created on systems other than the one you are
   running DATAPLOT on.

2) The following mathematical library functions were added:

   LET A = HEAVE(X,C)  - Heaviside function (=1 if X>=C, 0 otherwise,
                         C is 0 if no second argument)
   LET A = CEIL(X)     - ceiling function (integer value of X rounded
                         toward positive infinity)
   LET A = FLOOR(X)    - floor function (integer value of X rounded
                         toward negative infinity)
   LET A = STEP(X)     - step function (synonym for FLOOR(X))
   LET A = GCD(X1,X2)  - greatest common divisor of X1 and X2

3) The following command was added:

   LET A = MAD Y       - median absolute deviation

   MEDIAN ABSOLUTE DEVIATION is a synonym for MAD.  Given a variable
   X with median value MED, the MAD is defined as the median of the
   absolute value of (X-MED).

   The BOOTSTRAP PLOT, JACKNIFE PLOT, STATISTIC PLOT, BLOCK PLOT, and
   DEX PLOT commands were modified to support the MAD and AAD
   statistics.

4) The PHD command was renamed DEX PHD.  In addition, some I/O was
   fixed in these routines.

5) Some bugs were fixed in the EDIT command.  A few other
   miscellaneous bugs were fixed.

7) The following functions were added to the probability library.
   LET A = ALPCDF(X,ALPHA,BETA) - alpha cumulative distribution
                                  function
   LET A = ALPPDF(X,ALPHA,BETA) - alpha density function
   LET A = ALPPPF(X,ALPHA,BETA) - alpha percent point function
   LET A = CHCDF(X,NU) - chi cumulative distribution function
   LET A = CHPDF(X,NU) - chi density function
   LET A = CHPPF(X,NU) - chi percent point function
   LET A = COSCDF(X) - cosine cumulative distribution function
   LET A = COSPDF(X) - cosine density function
   LET A = COSPPF(X) - cosine percent point function
   LET A = DLGCDF(X,THETA) - logarithmic series cumulative
                             distribution function
   LET A = DLGPDF(X,THETA) - logarithmic series density function
   LET A = DLGPPF(X,THETA) - logarithmic series percent point function
   LET A = GGDCDF(X,ALPHA,C) - generalized gamma cumulative
                               distribution function
   LET A = GGDPDF(X,ALPHA,C) - generalized gamma density function
   LET A = GGDPPF(X,ALPHA,C) - generalized gamma percent point
                               function
   LET A = LLGCDF(X,DELTA) - log-logistic cumulative distribution
                             function
   LET A = LLGPDF(X,DELTA) - log-logistic density function
   LET A = LLGPPF(X,DELTA) - log-logistic percent point function
   LET A = PLNCDF(X,P,SD) - power lognormal cumulative distribution
                            function
   LET A = PLNPDF(X,P,SD) - power lognormal density function
   LET A = PLNPPF(X,P,SD) - power lognormal percent point function
   LET A = PNRCDF(X,P,SD) - power normal cumulative distribution
                            function
   LET A = PNRPDF(X,P,SD) - power normal density function
   LET A = PNRPPF(X,P,SD) - power normal percent point function
   LET A = POWCDF(X,C) - power function cumulative distribution
                         function
   LET A = POWPDF(X,C) - power function density function
   LET A = POWPPF(X,C) - power function percent point function
   LET A = WARCDF(X,C,A) - Waring cumulative distribution function
   LET A = WARPDF(X,C,A) - Waring density function
   LET A = WARPPF(P,C,A) - Waring percent point function
   LET A = NCTPDF(X,NU,DELTA) - non-central t density function
                                (density and percent point functions
                                were added previously)
   LET A = TNRPDF(X,A,B) - truncated normal density function
   LET A = FNRPDF(X,U,SD) - folded normal density function

   The Yule distribution is a special case of the Waring
   distribution.  Set A to 1 or simply omit the A parameter.

   The generalized gamma distribution can handle negative values for
   the C parameter (although not zero).  Specifically, a value of
   C = -1 is the inverted gamma distribution.

   In addition, the log-normal cdf, pdf, and ppf functions were
   upgraded to handle the standard deviation shape parameter (LGNCDF,
   LGNPDF, LGNPPF).  This parameter defaults to 1 if not specified.

   In addition the following probability plots were added.

      COSINE PROBABILITY PLOT Y

      LET ALPHA = <value>
      LET BETA = <value>
      ALPHA PROBABILITY PLOT Y

      LET P = <value>
      LET SD = <value>    (this parameter optional, defaults to 1)
      POWER NORMAL PROBABILITY PLOT Y

      LET P = <value>
      LET SD = <value>    (this parameter optional, defaults to 1)
      POWER LOGNORMAL PROBABILITY PLOT Y

      LET SD = <value>
      LOGNORMAL PROBABILITY PLOT Y

      LET C = <value>
      POWER FUNCTION PROBABILITY PLOT Y

      LET NU = <value>
      CHI PROBABILITY PLOT Y

      LET THETA = <value>
      LOGARITHMIC SERIES PROBABILITY PLOT Y

      LET DELTA = <value>
      LOG LOGISTIC PROBABILITY PLOT Y

      LET GAMMA = <value>
      LET C = <value>
      GENERALIZED GAMMA PROBABILITY PLOT Y

      LET A = <value>     (can omit for the Yule distribution)
      LET C = <value>
      GENERALIZED GAMMA PROBABILITY PLOT Y

   In addition the following PPCC plots were added.

      LET SD = <value>    (this parameter optional, defaults to 1)
      POWER NORMAL PPCC PLOT Y

      LET SD = <value>    (this parameter optional, defaults to 1)
      POWER LOGNORMAL PPCC PLOT Y

      LET SD = <value>
      LOGNORMAL PPCC PLOT Y

      CHI PPCC PLOT Y
      VON MISES PPCC PLOT Y
      POWER FUNCTION PPCC PLOT Y
      LOG LOGISTIC PPCC PLOT Y

   In addition the following random number generator was added.

      LET C = <value>
      LET Y = POWER FUNCTION RANDOM NUMBERS FOR I = 1 1 N

-----------------------------------------------------------------
The following enhancements were made to DATAPLOT NOVEMBER, 1994.
-----------------------------------------------------------------

1) The following mathematical library functions were added:

   LET A = FRESNS(X) - Fresnel sine integral
   LET A = FRESNC(X) - Fresnel cosine integral
   LET A = FRESNF(X) - Fresnel auxiliary function f integral
   LET A = FRESNG(X) - Fresnel auxiliary function g integral
   LET A = SN(X,M)   - Jacobian elliptic sn function
   LET A = CN(X,M)   - Jacobian elliptic cn function
   LET A = DN(X,M)   - Jacobian elliptic dn function
   LET A = PEQ(XR,XI)    - the real component of the Weierstrass
                           elliptic function (equianharmonic case)
   LET A = PEQI(XR,XI)   - the complex component of the Weierstrass
                           elliptic function (equianharmonic case)
   LET A = PEQ1(XR,XI)   - the real component of the first derivative
                           of the Weierstrass elliptic function
                           (equianharmonic case)
   LET A = PEQ1I(XR,XI)  - the complex component of the first
                           derivative of the Weierstrass elliptic
                           function (equianharmonic case)
   LET A = PLEM(XR,XI)   - the real component of the Weierstrass
                           elliptic function (lemniscatic case)
   LET A = PLEMI(XR,XI)  - the complex component of the Weierstrass
                           elliptic function (lemniscatic case)
   LET A = PLEM1(XR,XI)  - the real component of the first derivative
                           of the Weierstrass elliptic function
                           (lemniscatic case)
   LET A = PLEM1I(XR,XI) - the complex component of the first
                           derivative of the Weierstrass elliptic
                           function (lemniscatic case)

------------------------------------------------------------
Changes prior to this are no longer in the news file because they are
documented in the Reference Manual and the on-line help.
------------------------------------------------------------

YOU HAVE JUST ACCESSED THE FILE DPNEWF.


NIST is an agency of the U.S. Commerce Department.

Date created: 6/5/2001
Last updated: 10/30/2013

Please email comments on this WWW page to alan.heckert@nist.gov.