1.4.4. Dataplot Commands for EDA Techniques

This page documents the Dataplot commands that can be used for the graphical and analytical techniques discussed in this chapter. This is only meant to guide you to the appropriate commands. The complete documentation for these commands is available in the Dataplot Reference Manual.

Dataplot Commands for 1-Factor ANOVA

The Dataplot command for a one way analysis of variance is

ANOVA Y X

where Y is a response variable and X is a group identifier variable.

Dataplot is currently limited to the balanced case (i.e., each level has the same number of observations) and it does not compute interaction effect estimates.
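As a rough illustration of what a one-way analysis computes in the balanced case, the following Python sketch (not Dataplot code) forms the F statistic from the between-level and within-level sums of squares. The function name and the sample data are invented for this example, and the output of the Dataplot ANOVA command is not reproduced.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a balanced one-way layout.

    `groups` is a list of lists, one inner list per level of the factor.
    """
    sizes = {len(g) for g in groups}
    if len(sizes) != 1:
        raise ValueError("balanced case only: every level needs the same number of observations")
    k = len(groups)          # number of levels
    n = sizes.pop()          # observations per level
    grand = mean(x for g in groups for x in g)
    ss_between = n * sum((mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = k * (n - 1)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three levels with three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
```

The sketch raises an error for unequal level sizes, mirroring the balanced-case restriction stated above.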

Dataplot Commands for Multi-Factor ANOVA

The Dataplot commands for generating multi-factor analysis of variance are:

where Y is the response variable and X1, X2, X3, X4, and X5 are factor variables. Dataplot allows up to 10 factor variables.

Dataplot is currently limited to the balanced case (i.e., each level has the same number of observations) and it does not compute interaction effect estimates.

Dataplot Commands for the Anderson-Darling Test

The Dataplot commands for the Anderson-Darling test are

where Y is the response variable.

Dataplot Commands for Autocorrelation

To generate the lag 1 autocorrelation value in Dataplot, enter

where Y is the response variable.

In Dataplot, the easiest way to generate the autocorrelations for lags greater than 1 is:

The AUTOCORRELATION PLOT command generates an autocorrelation plot for lags 0 to N/4. It also generates 95% and 99% confidence limits for the autocorrelations. Dataplot stores the plot coordinates in the internal variables XPLOT, YPLOT, and TAGPLOT. The 2 LET commands and the RETAIN command are used to extract the numerical values of the autocorrelations. The variable LAG identifies the lag while the corresponding row of AC contains the autocorrelation value.

Dataplot Commands for Autocorrelation Plots

The command to generate an autocorrelation plot is

AUTOCORRELATION PLOT Y

The appearance of the autocorrelation plot can be controlled by appropriate settings of the LINE, CHARACTER, and SPIKE commands. Dataplot draws the following curves on the autocorrelation plot:

The autocorrelations.
A reference line at zero.
A reference line at the upper 95% confidence limit.
A reference line at the lower 95% confidence limit.
A reference line at the upper 99% confidence limit.
A reference line at the lower 99% confidence limit.

For example, to draw the autocorrelations as spikes, the zero reference line as a solid line, the 95% lines as dashed lines, and the 99% lines as dotted lines, enter the command

By default, the confidence bands are fixed width. This is appropriate for testing for white noise (i.e., randomness). For Box-Jenkins modeling, variable-width confidence bands are more appropriate. Enter the following command for variable-width confidence bands:

SET AUTOCORRELATION BAND BOX-JENKINS

To restore fixed-width confidence bands, enter

SET AUTOCORRELATION BAND WHITE-NOISE

Dataplot Commands for the Bartlett Test

The Dataplot command for the Bartlett test is

BARTLETT TEST Y X

where Y is the response variable and X is the group id variable.

The above computes the standard form of Bartlett's test. To compute the Dixon-Massey form of Bartlett's test, the Dataplot command is one of the following (these are synonyms, not distinct commands)

Dataplot Commands for Bihistograms

The Dataplot command to generate a bihistogram is

BIHISTOGRAM Y1 Y2

As with the standard histogram, the class width, the lower class limit, and the upper class limit can be controlled with the commands

In addition, relative bihistograms, cumulative bihistograms, and relative cumulative bihistograms can be generated with the commands

Dataplot Commands for the Binomial Probability Functions

Dataplot can compute the probability functions for the binomial distribution with the following commands.

cdf	LET Y = BINCDF(X,P,N)
pdf	LET Y = BINPDF(X,P,N)
ppf	LET Y = BINPPF(F,P,N)
random numbers	LET N = value LET P = value LET Y = BINOMIAL RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET N = value LET P = value BINOMIAL PROBABILITY PLOT Y

where X can be a number, a parameter, or a variable. P and N are the shape parameters and are required. They can be a number, a parameter, or a variable. They are typically a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT BINPDF(X,0.5,100) FOR X = 0 1 100

Dataplot Commands for the Block Plot

The Dataplot command for the block plot is

where

Y is the response variable,
X1, X2, X3, etc. are the one or more nuisance (= secondary) factors, and
XP is the primary factor of interest.

The following commands typically precede the block plot.

These commands set the plot character for the primary factor. Although 1 and 2 are useful indicators, the choice of plot character is at the discretion of the user.

Dataplot Commands for the Bootstrap Plot

The Dataplot command for the bootstrap plot is

where <STAT> is one of the following:

SUM
PRODUCT
MINIMUM
MAXIMUM

STANDARD DEVIATION
VARIANCE
STANDARD DEVIATION OF MEAN
VARIANCE OF MEAN
RELATIVE STANDARD DEVIATION
RELATIVE VARIANCE
AVERAGE ABSOLUTE DEVIATION
MEDIAN ABSOLUTE DEVIATION
LOWER QUARTILE
LOWER HINGE
UPPER QUARTILE
UPPER HINGE

FIRST DECILE
SECOND DECILE
THIRD DECILE
FOURTH DECILE
FIFTH DECILE
SIXTH DECILE
SEVENTH DECILE
EIGHTH DECILE
NINTH DECILE
PERCENTILE

SKEWNESS
KURTOSIS

AUTOCORRELATION
AUTOCOVARIANCE
SINE FREQUENCY
COSINE FREQUENCY

TAGUCHI SN0
TAGUCHI SN+
TAGUCHI SN-
TAGUCHI SN00

The BOOTSTRAP PLOT command is almost always followed by a histogram or some other distributional plot.

Dataplot automatically stores the following internal parameters after a BOOTSTRAP PLOT command:

These internal parameters are useful for generating confidence intervals and can be printed (PRINT BMEAN) or used as any user-defined parameter could (e.g., LET UCL = B95).

To specify the number of bootstrap subsamples to use, enter the command

BOOTSTRAP SAMPLE <N>

where <N> is the number of samples you want. The default is 500 (it may be 100 in older implementations).

Dataplot can also generate bootstrap estimates for statistics that are not directly supported. The following example shows a bootstrap calculation for the mean of 500 normal random numbers. Although we can do this directly in Dataplot, this demonstrates the steps necessary for an unsupported statistic. The subsamples are generated with a loop. The BOOTSTRAP INDEX and BOOTSTRAP SAMPLE commands generate a single subsample which is stored in Y2. The desired statistic is then calculated for Y2 and the result stored in an array. After the loop, the array XMEAN contains the 100 mean values.
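The same loop structure can be sketched in Python for comparison. This is an illustration of the resampling idea only, not a translation of the Dataplot commands: the function name `bootstrap_statistic` and the seeds are invented for the example, and any statistic that accepts a list can be passed in place of the mean.

```python
import random
from statistics import mean


def bootstrap_statistic(y, stat, n_boot=100, seed=None):
    """Apply `stat` to `n_boot` resamples of `y` drawn with replacement."""
    rng = random.Random(seed)
    n = len(y)
    results = []
    for _ in range(n_boot):
        index = [rng.randrange(n) for _ in range(n)]  # resampled row indices
        y2 = [y[i] for i in index]                    # one bootstrap subsample
        results.append(stat(y2))                      # statistic for this subsample
    return results


# 500 normal random numbers, then 100 bootstrap means (seeds are arbitrary).
rng = random.Random(12345)
y = [rng.gauss(0.0, 1.0) for _ in range(500)]
xmean = bootstrap_statistic(y, mean, n_boot=100, seed=67890)
```

After the call, `xmean` holds the 100 mean values, one per subsample, which could then be summarized with a histogram or percentiles.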

Dataplot Command for the Box-Cox Linearity Plot

The Dataplot command to generate a Box-Cox linearity plot is

BOX-COX LINEARITY PLOT Y X

where Y is the response variable and X is the independent variable.

Dataplot Command for the Box-Cox Normality Plot

The Dataplot command to generate a Box-Cox normality plot is

BOX-COX NORMALITY PLOT Y

where Y is the response variable.

Dataplot Commands for the Boxplot

The Dataplot command to generate a boxplot is

BOX PLOT Y X

The BOX PLOT command is usually preceded by the commands

These commands set the default line and character settings for the box plot. You can use the CHARACTER and LINE commands to choose your own line and character settings if you prefer.

To show the outliers as circles, enter the command

FENCES ON

Dataplot Commands for the Cauchy Probability Functions

Dataplot can compute the probability functions for the Cauchy distribution with the following commands.

cdf	LET Y = CAUCDF(X,A,B)
pdf	LET Y = CAUPDF(X,A,B)
ppf	LET Y = CAUPPF(X,A,B)
hazard	LET Y = CAUHAZ(X,A,B)
cumulative hazard	LET Y = CAUCHAZ(X,A,B)
survival	LET Y = 1 - CAUCDF(X,A,B)
inverse survival	LET Y = CAUPPF(1-X,A,B)
random numbers	LET Y = CAUCHY RANDOM NUMBERS FOR I = 1 1 1000
probability plot	CAUCHY PROBABILITY PLOT Y

where X can be a number, a parameter, or a variable. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT CAUPDF(X) FOR X = -5 0.01 5

Dataplot Commands for the Chi-Square Probability Functions

Dataplot can compute the probability functions for the chi-square distribution with the following commands.

cdf	LET Y = CHSCDF(X,NU,A,B)
pdf	LET Y = CHSPDF(X,NU,A,B)
ppf	LET Y = CHSPPF(X,NU,A,B)
random numbers	LET NU = value LET Y = CHI-SQUARE RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET NU = value CHI-SQUARE PROBABILITY PLOT Y
ppcc plot	LET NU = value CHI-SQUARE PPCC PLOT Y

where X can be a number, a parameter, or a variable. NU is the shape parameter (number of degrees of freedom). NU can be a number, a parameter, or a variable. However, it is typically either a number or a parameter. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT CHSPDF(X,5) FOR X = 0 0.01 5

Dataplot Commands for the Chi-Square Goodness of Fit Test

The Dataplot commands for the chi-square goodness of fit test are

where <dist> is one of 60+ built-in distributions. Dataplot supports the chi-square goodness-of-fit test for all distributions that support the cumulative distribution function. To see a list of supported distributions, enter the command LIST DISTRIBUTIONS. Some specific examples are

You can specify the location and scale parameters (for any of the supported distributions) by entering

You may need to enter the values for 1 or more shape parameters for distributions that require them. For example, to specify the shape parameter gamma for the gamma distribution, enter the commands

Dataplot also allows you to control the class width, the lower limit (i.e., start of the first bin), and the upper limit (i.e., the end value for the last bin). These commands are

In most cases, the default Dataplot class intervals will be adequate.

If your data are already binned, you can enter the commands

In both commands above, Y is the frequency variable. If one X variable is given, Dataplot assumes that it is the bin midpoint and that the bins have equal width. If two X variables are given, Dataplot assumes that these are the bin end points and that the bins are not necessarily of equal width. Unequal bin widths are typically used to combine classes with small frequencies, since the chi-square approximation for the test may not be accurate if there are frequency classes with fewer than five observations.

Dataplot Command for the Chi-Square Test for the Standard Deviation

The Dataplot command for the chi-square test for the standard deviation is

CHI-SQUARE TEST Y A

where Y is the response variable and A is the value being tested.
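As a hedged illustration of the quantity behind this test, the Python sketch below computes the standard statistic T = (N - 1)(s / sigma0)^2, which is referred to a chi-square distribution with N - 1 degrees of freedom. The function name and data are hypothetical, critical values are not computed, and this is not a reproduction of the Dataplot output.

```python
from statistics import stdev


def chi_square_sd_statistic(y, sigma0):
    """Return (T, degrees of freedom) for testing that the standard deviation equals sigma0."""
    n = len(y)
    s = stdev(y)                      # sample standard deviation (divisor N - 1)
    return (n - 1) * (s / sigma0) ** 2, n - 1


# Hypothetical sample; sigma0 plays the role of the value A being tested.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_stat, dof = chi_square_sd_statistic(y, sigma0=2.0)
```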

Dataplot Command for Complex Demodulation Amplitude Plot

The Dataplot command for a complex demodulation amplitude plot is

COMPLEX DEMODULATION AMPLITUDE PLOT Y

where Y is the response variable.

Dataplot Commands for Complex Demodulation Phase Plot

The Dataplot commands for a complex demodulation phase plot are

where Y is the response variable. The DEMODULATION FREQUENCY is used to specify the desired frequency for the COMPLEX DEMODULATION PLOT. The value of the demodulation frequency is usually obtained from a spectral plot.

Dataplot Commands for Conditioning Plot

The Dataplot command to generate a conditioning plot is

CONDITION PLOT Y X COND

Y is the response variable, X is the independent variable, and COND is the conditioning variable. Dataplot expects COND to contain a discrete number of distinct values. Dataplot provides a number of commands for creating a discrete variable from a continuous variable. For example, suppose X2 is a continuous variable that we want to split into 4 regions. We could enter the following sequence of commands to create a discrete variable from X2.

The SUBSET feature can be used as above to create whatever ranges we want. A simpler, more automatic way is to use the CODE command in Dataplot. For example,

splits the data into quartiles and assigns a value of 1 to 4 to COND based on what quartile the corresponding value of X2 is in.
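The quartile coding described above can be sketched in Python as follows. This is an illustration of the idea only: the function name is invented, and the quartile convention used here (the "inclusive" method of the standard library) is an assumption that may differ from the rule Dataplot applies at the boundaries.

```python
from statistics import quantiles


def code_quartiles(x):
    """Return a list of codes 1..4 giving the quartile of each value in x."""
    q1, q2, q3 = quantiles(x, n=4, method="inclusive")

    def code(v):
        if v <= q1:
            return 1
        if v <= q2:
            return 2
        if v <= q3:
            return 3
        return 4

    return [code(v) for v in x]


# Hypothetical continuous variable to be turned into a conditioning variable.
x2 = [0.3, 7.1, 2.4, 9.8, 5.5, 1.2, 8.6, 4.0]
cond = code_quartiles(x2)
```

The resulting `cond` has the same length as `x2` and takes only the values 1 to 4, which is the form a conditioning variable is expected to have.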

The appearance of the plot can be controlled by appropriate settings of the CHARACTER and LINE commands and their various attribute setting commands.

In addition, Dataplot provides a number of SET commands to control the appearance of the conditioning plot. In Dataplot, enter HELP CONDITION PLOT for details.

Dataplot Commands for Confidence Limits and One Sample t-test

The following commands can be used in Dataplot to generate a confidence interval for the mean or to generate a one sample t-test, respectively.

where Y is the response variable and U0 is a parameter or scalar value that defines the hypothesized value.

Dataplot Commands for Contour Plots

The Dataplot command for generating a contour plot is

CONTOUR PLOT Z X Y Z0

The variables X and Y define the grid, the Z variable is the response variable, and Z0 defines the desired contour levels. Currently, Dataplot only supports contour plots over regular grids. Dataplot does provide 2D interpolation capabilities to form regular grids from irregular data. Dataplot also does not support labels for the contour lines or solid fills between contour lines.

Dataplot Commands for Control Charts

The Dataplot commands for generating control charts are

where Y is the response variable and X is the group identifier variable.

Dataplot computes the control limits. In some cases, you may have predetermined values to use as control limits (e.g., based on historical data). Dataplot allows you to specify these limits by entering the following commands before the control chart command.

The appearance of the plot can be controlled by appropriate settings of the LINE and CHARACTER commands. Specifically, there are seven settings:

the response curve
the reference line at the Dataplot-determined target value
the reference line at the Dataplot-determined upper control limit
the reference line at the Dataplot-determined lower control limit
the reference line at the user-specified target value
the reference line at the user-specified upper control limit
the reference line at the user-specified lower control limit

Dataplot Commands for DEX Contour Plots

The Dataplot command for generating a linear dex contour plot is

DEX CONTOUR PLOT Y X1 X2 Y0

The variables X1 and X2 are the two factor variables, Y is the response variable, and Y0 defines the desired contour levels.

Dataplot does not have a built-in quadratic dex contour plot. However, the macro DEXCONTQ.DP will generate a quadratic dex contour plot. Enter LIST DEXCONTQ.DP for more information.

Dataplot Commands for DEX Interaction Effects Plots

The Dataplot command to generate a dex mean interaction effects plot is

DEX MEAN INTERACTION EFFECTS PLOT Y X1 X2 X3 X4 X5

where Y is the response variable and X1, X2, X3, X4, and X5 are the factor variables. The number of factor variables can vary, but is at least two.

Dataplot supports the following additional plots for other location statistics

If you want the raw data plotted rather than a statistic, enter

DEX INTERACTION EFFECTS PLOT Y X1 X2 X3 X4 X5

The LINE and CHARACTER commands can be used to control the appearance of the plot. For example, a typical sequence of commands might be

This draws the connecting line between the levels of a factor and the overall mean reference line as solid lines. In addition, the level means are drawn with a solid fill circle.

This command is a variant of the SCATTER PLOT MATRIX command. There are a number of options to control the appearance of these plots. In Dataplot, you can enter HELP SCATTER PLOT MATRIX for details.

Dataplot Commands for DEX Mean Plots

The Dataplot command to generate a dex mean plot is

DEX MEAN PLOT Y X1 X2 X3 X4 X5

where Y is the response variable and X1, X2, X3, X4, and X5 are the factor variables. The number of factor variables can vary, but is at least one.

Dataplot supports the following additional plots for other location statistics

The LINE and CHARACTER commands can be used to control the appearance of the plot. For example, a typical sequence of commands might be

This draws the connecting line between the levels of a factor and the overall mean reference line as solid lines. In addition, the level means are drawn with a solid fill circle.

It is often desirable to provide alphabetic labels for the factors. For example, if there are 2 factors, time and temperature, the following commands could be used to define alphabetic labels:

Dataplot Commands for a DEX Scatter Plot

The Dataplot command for generating a dex scatter plot is

DEX SCATTER PLOT Y X1 X2 X3 X4 X5

where Y is the response variable and X1, X2, X3, X4, and X5 are the factor variables. The number of factor variables can vary, but is at least one.

The DEX SCATTER PLOT is typically preceded by the commands

However, you can set the plot character and line settings to whatever seems appropriate.

It is often desirable to provide alphabetic labels for the factors. For example, if there are 2 factors, time and temperature, the following commands could be used to define alphabetic labels:

Dataplot Commands for a DEX Standard Deviation Plot

The Dataplot command to generate a dex standard deviation plot is

DEX STANDARD DEVIATION PLOT Y X1 X2 X3 X4 X5

where Y is the response variable and X1, X2, X3, X4, and X5 are the factor variables. The number of factor variables can vary, but is at least one.

Dataplot supports the following additional plots for other scale statistics.

The LINE and CHARACTER commands can be used to control the appearance of the plot. For example, a typical sequence of commands might be

This draws the connecting line between the levels of a factor and the overall mean reference line as solid lines. In addition, the level means are drawn with a solid fill circle.

It is often desirable to provide alphabetic labels for the factors. For example, if there are 2 factors, time and temperature, the following commands could be used to define alphabetic labels:

Dataplot Commands for the Double Exponential Probability Functions

Dataplot can compute the probability functions for the double exponential distribution with the following commands.

cdf	LET Y = DEXCDF(X,A,B)
pdf	LET Y = DEXPDF(X,A,B)
ppf	LET Y = DEXPPF(X,A,B)
hazard	LET Y = DEXPDF(X,A,B)/(1 - DEXCDF(X,A,B))
cumulative hazard	LET Y = -LOG(1 - DEXCDF(X,A,B))
survival	LET Y = 1 - DEXCDF(X,A,B)
inverse survival	LET Y = DEXPPF(1-X,A,B)
random numbers	LET Y = DOUBLE EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 1000
probability plot	DOUBLE EXPONENTIAL PROBABILITY PLOT Y
maximum likelihood	LET MU = MEDIAN Y LET BETA = MEDIAN ABSOLUTE DEVIATION Y

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT DEXPDF(X) FOR X = -5 0.01 5
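The hazard and cumulative hazard for this distribution are built from the pdf and cdf: the hazard is the pdf divided by the survival function, and the cumulative hazard is the negative log of the survival function. The Python sketch below works through those relations for the double exponential distribution. The Python function names are invented for this example and are not Dataplot library functions; the default location 0 and scale 1 follow the convention used elsewhere on this page.

```python
import math


def dex_pdf(x, a=0.0, b=1.0):
    """Double exponential density with location a and scale b."""
    return math.exp(-abs(x - a) / b) / (2.0 * b)


def dex_cdf(x, a=0.0, b=1.0):
    """Double exponential cumulative distribution function."""
    z = (x - a) / b
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def dex_hazard(x, a=0.0, b=1.0):
    """Hazard = pdf / (1 - cdf)."""
    return dex_pdf(x, a, b) / (1.0 - dex_cdf(x, a, b))


def dex_cumulative_hazard(x, a=0.0, b=1.0):
    """Cumulative hazard = -log(1 - cdf)."""
    return -math.log(1.0 - dex_cdf(x, a, b))
```

For the standard form, the hazard is constant at 1 for x at or above the location, since the right tail is exponential.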

Dataplot Command for Confidence Interval for the Difference Between Two Proportions

The Dataplot command for a confidence interval for the difference of two proportions is

DIFFERENCE OF PROPORTIONS CONFIDENCE INTERVAL Y1 Y2

where Y1 contains the data for sample 1 and Y2 contains the data for sample 2. For large samples, Dataplot uses the binomial computation, not the normal approximation.

The following command sets the lower and upper bounds that define a success in the response variable

ANOP LIMITS <lower bound> <upper bound>

Dataplot Command for Duane Plot

The Dataplot command for a Duane plot is

DUANE PLOT Y

where Y is a response variable containing failure times.

Dataplot Command for Starting Values for Rational Function Models

Starting values for a rational function model can be obtained by fitting an exact rational function to a subset of the original data. The number of points in the subset should equal the number of parameters to be estimated in the rational function model. The EXACT RATIONAL FIT can be used to fit this subset model and thus to provide starting values for the rational function model. For example, to fit a quadratic/quadratic rational function model to data in X and Y, you might do something like the following.

The DATA command is used to define the subset variables and EXACT 2/2 FIT is used to fit the exact rational function. The "2/2" identifies the degree of the numerator as 2 and the degree of the denominator as 2. It provides values for A0, A1, A2, B1, and B2, which are used to fit the rational function model for the full data set.
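To show why five points are enough for a quadratic/quadratic model, the Python sketch below solves the exact-fit problem directly. It assumes the model is written as y = (A0 + A1*x + A2*x**2) / (1 + B1*x + B2*x**2), which is consistent with the five parameters named above but is an assumption about the parameterization; multiplying through by the denominator makes the problem linear in the parameters. The function names and test data are invented, and this is not how Dataplot is known to implement the EXACT RATIONAL FIT command.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: choose different subset points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def exact_rational_2_2(xs, ys):
    """Return [A0, A1, A2, B1, B2] passing exactly through five (x, y) points."""
    # A0 + A1*x + A2*x**2 - B1*x*y - B2*x**2*y = y for each point.
    rows = [[1.0, x, x * x, -x * y, -x * x * y] for x, y in zip(xs, ys)]
    return solve_linear(rows, list(ys))


def rational_2_2(x, a0, a1, a2, b1, b2):
    """Evaluate the quadratic/quadratic rational function."""
    return (a0 + a1 * x + a2 * x * x) / (1.0 + b1 * x + b2 * x * x)


# Hypothetical subset: five points generated from known parameters.
true_params = (1.0, 2.0, 0.5, 0.3, 0.1)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [rational_2_2(x, *true_params) for x in xs]
start = exact_rational_2_2(xs, ys)
```

With noise-free points the recovered values match the generating parameters; with real data they serve only as starting values for the fit to the full data set.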

Dataplot Commands for the Exponential Probability Functions

Dataplot can compute the probability functions for the exponential distribution with the following commands.

cdf	LET Y = EXPCDF(X,A,B)
pdf	LET Y = EXPPDF(X,A,B)
ppf	LET Y = EXPPPF(X,A,B)
hazard	LET Y = EXPHAZ(X,A,B)
cumulative hazard	LET Y = EXPCHAZ(X,A,B)
survival	LET Y = 1 - EXPCDF(X,A,B)
inverse survival	LET Y = EXPPPF(1-X,A,B)
random numbers	LET Y = EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 1000
probability plot	EXPONENTIAL PROBABILITY PLOT Y
parameter estimation	If your data are not censored, enter the commands
SET CENSORING TYPE NONE
EXPONENTIAL MLE Y
If your data have type 1 censoring at fixed time t₀, enter the commands
LET TEND = censoring time
SET CENSORING TYPE 1
EXPONENTIAL MLE Y X
If your data have type 2 censoring, enter the commands
SET CENSORING TYPE 2
EXPONENTIAL MLE Y X
Y is the response variable and X is the censoring variable, where a value of 1 indicates a failure time and a value of 0 indicates a censoring time. In addition to the point estimates, confidence intervals for the parameters are generated.

In the above, X can be a number, a parameter, or a variable. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT EXPPDF(X) FOR X = 0 0.01 4

Dataplot Command for Generalized ESD Test

The Dataplot command for the generalized ESD (Extreme Studentized Deviate) test is

where Y is the response variable and NOUTLIER specifies the upper bound on the number of outliers to test.

Dataplot Commands for the Extreme Value Type I (Gumbel) Distribution

To specify the form of the Gumbel distribution based on the smallest value, enter the command

SET MINMAX 1

To specify the form of the Gumbel distribution based on the largest value, enter the command

SET MINMAX 2

One of these commands must be entered before using the commands below.

Dataplot can compute the probability functions for the extreme value type I distribution with the following commands.

cdf	LET Y = EV1CDF(X,A,B)
pdf	LET Y = EV1PDF(X,A,B)
ppf	LET Y = EV1PPF(X,A,B)
hazard	LET Y = EV1HAZ(X,A,B)
cumulative hazard	LET Y = EV1CHAZ(X,A,B)
survival	LET Y = 1 - EV1CDF(X,A,B)
inverse survival	LET Y = EV1PPF(1-X,A,B)
random numbers	LET Y = EXTREME VALUE TYPE 1 RANDOM NUMBERS FOR I = 1 1 1000
probability plot	EXTREME VALUE TYPE 1 PROBABILITY PLOT Y
maximum likelihood	EV1 MLE Y This returns a point estimate for the full sample case. It does not provide confidence intervals for the parameters and it does not handle censored data.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

Dataplot Commands for the F Distribution Probability Functions

Dataplot can compute the probability functions for the F distribution with the following commands.

cdf	LET Y = FCDF(X,NU1,NU2,A,B)
pdf	LET Y = FPDF(X,NU1,NU2,A,B)
ppf	LET Y = FPPF(X,NU1,NU2,A,B)
random numbers	LET NU1 = value LET NU2 = value LET Y = F RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET NU1 = value LET NU2 = value F PROBABILITY PLOT Y

where X can be a number, a parameter, or a variable. NU1 and NU2 are the shape parameters (= number of degrees of freedom). NU1 and NU2 can be a number, a parameter, or a variable. However, they are typically either a number or a parameter. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT FPDF(X,10,10) FOR X = 0 0.01 5

Dataplot Command for F Test for Equality of Two Standard Deviations

The Dataplot command for the F test for the equality of two standard deviations is

F TEST Y1 Y2

where Y1 is the data for sample one and Y2 is the data for sample two.

Dataplot Commands for the Histogram

The Dataplot command to generate a histogram is

HISTOGRAM Y

where Y is the response variable. The different variants of the histogram can be generated with the commands

The class width, the start of the first class, and the end of the last class can be specified with the commands

By default, Dataplot uses a class width of 0.3*SD where SD is the standard deviation of the data. The lower class limit is the sample mean minus 6 times the sample standard deviation. Similarly, the upper class limit is the sample mean plus 6 times the sample standard deviation.
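The default rule above is easy to reproduce by hand. The Python sketch below computes the class width and the lower and upper class limits from a sample; it assumes the sample standard deviation uses the N - 1 divisor, which this page does not state, and the function name and data are invented for the example.

```python
from statistics import mean, stdev


def default_histogram_classes(y):
    """Return the default class width and class limits described above."""
    m = mean(y)
    sd = stdev(y)                     # assumed: sample SD with divisor N - 1
    return {
        "class_width": 0.3 * sd,
        "lower": m - 6.0 * sd,
        "upper": m + 6.0 * sd,
    }


# Hypothetical sample.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
limits = default_histogram_classes(y)
n_classes = (limits["upper"] - limits["lower"]) / limits["class_width"]
```

Under this rule the range of 12 standard deviations divided by a width of 0.3 standard deviations always gives 40 classes, whatever the data.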

By default, Dataplot uses the probability normalization for relative histograms (the area under the histogram is one). If you want the relative counts to sum to one instead, enter the command

SET RELATIVE HISTOGRAM PERCENT

To restore the probability normalization, enter

SET RELATIVE HISTOGRAM AREA

Dataplot Commands for a Lag Plot

The Dataplot command to generate a lag plot is

LAG PLOT Y

The appearance of the lag plot can be controlled with appropriate settings for the LINE and CHARACTER commands. Typical settings for these commands would be

To generate a linear fit of the points on the lag plot when an autoregressive fit is suggested, enter the following commands

The variables YPLOT and XPLOT are internal variables that store the coordinates of the most recent plot.

Dataplot Commands for the Fatigue Life Probability Functions

Dataplot can compute the probability functions for the fatigue life distribution with the following commands.

cdf	LET Y = FLCDF(X,GAMMA,A,B)
pdf	LET Y = FLPDF(X,GAMMA,A,B)
ppf	LET Y = FLPPF(X,GAMMA,A,B)
hazard	LET Y = FLHAZ(X,GAMMA,A,B)
cumulative hazard	LET Y = FLCHAZ(X,GAMMA,A,B)
survival	LET Y = 1 - FLCDF(X,GAMMA,A,B)
inverse survival	LET Y = FLPPF(1-X,GAMMA,A,B)
random numbers	LET GAMMA = value LET Y = FATIGUE LIFE RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET GAMMA = value FATIGUE LIFE PROBABILITY PLOT Y
ppcc plot	LET GAMMA = value FATIGUE LIFE PPCC PLOT Y

where X can be a number, a parameter, or a variable. GAMMA is the shape parameter and is required. It can be a number, a parameter, or a variable. It is typically a number or a parameter. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT FLPDF(X,2) FOR X = 0.01 0.01 10

Dataplot Command for Fitting

Dataplot can perform both linear and nonlinear fits.

For example, to generate a linear fit of Y versus X1, X2, and X3, the command is:

FIT Y X1 X2 X3

To generate quadratic and cubic fits of Y versus X, the commands are:

Nonlinear fits are generated by entering an equation. For example,

In the above equations, there are variables (X and Y), parameters (A, B, C, and D), and constants (10 and 3.14159). The FIT command estimates values for the parameters. If you have a parameter that you do not want estimated, enter it as a constant or with the "^" (e.g., FIT Y = ^C/(1+^C*A*X**B)). The "^" substitutes the value of a parameter into a command.

You can also define a function and then fit the function. For example,

Dataplot Commands for the Gamma Probability Functions

Dataplot can compute the probability functions for the gamma distribution with the following commands.

cdf	LET Y = GAMCDF(X,GAMMA,A,B)
pdf	LET Y = GAMPDF(X,GAMMA,A,B)
ppf	LET Y = GAMPPF(X,GAMMA,A,B)
hazard	LET Y = GAMHAZ(X,GAMMA,A,B)
cumulative hazard	LET Y = GAMCHAZ(X,GAMMA,A,B)
survival	LET Y = 1 - GAMCDF(X,GAMMA,A,B)
inverse survival	LET Y = GAMPPF(1-X,GAMMA,A,B)
random numbers	LET GAMMA = value LET Y = GAMMA RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET GAMMA = value GAMMA PROBABILITY PLOT Y
ppcc plot	LET GAMMA = value GAMMA PPCC PLOT Y
maximum likelihood	GAMMA MLE Y This returns a point estimate for the full-sample case. It does not provide confidence intervals for the parameters and it does not handle censored data.

where X can be a number, a parameter, or a variable. GAMMA is the shape parameter and is required. It can be a number, a parameter, or a variable. It is typically a number or a parameter. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT GAMPDF(X,2) FOR X = 0.01 0.01 10

Dataplot Command for Grubbs' Test

The Dataplot command for Grubbs' test is

GRUBBS <MINIMUM/MAXIMUM> TEST Y

where Y is the response variable. Dataplot identifies one outlier at a time. The MINIMUM or MAXIMUM keyword is optional. If omitted, the most extreme value will be checked (regardless of whether it is in the minimum or maximum direction).
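The following Python sketch illustrates the statistic that Grubbs' test is based on, G = |Y_i - mean| / s for the suspect point, with the same three choices of direction described above. It is a minimal sketch under stated assumptions: the function name and data are hypothetical, the sample standard deviation uses the N - 1 divisor, and no critical value or test decision is computed.

```python
from statistics import mean, stdev


def grubbs_statistic(y, direction=None):
    """Return (G, suspect value); direction is None, "MINIMUM", or "MAXIMUM"."""
    m = mean(y)
    s = stdev(y)
    if direction is None:
        suspect = max(y, key=lambda v: abs(v - m))   # most extreme in either direction
    elif direction == "MINIMUM":
        suspect = min(y)
    elif direction == "MAXIMUM":
        suspect = max(y)
    else:
        raise ValueError("direction must be None, 'MINIMUM', or 'MAXIMUM'")
    return abs(suspect - m) / s, suspect


# Hypothetical sample; only one point is examined per call.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
g, suspect = grubbs_statistic(y)
g_min, suspect_min = grubbs_statistic(y, "MINIMUM")
```

Because one outlier is identified at a time, checking for further outliers would mean removing the flagged point and calling the function again.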

Return to the Grubbs Test Page

Dataplot Commands for Hazard Plots

The Dataplot commands for hazard plots are

where Y is a response variable containing failure times and X is a censoring variable (0 means failure time, 1 means censoring time).

Return to the Hazard Plot Page

Dataplot Command for Kruskal-Wallis Test

The Dataplot command for a Kruskal-Wallis test is

KRUSKAL WALLIS TEST Y X where Y is the response variable and X is the group identifier variable.

Return to the Kruskal-Wallis Test Page

Dataplot Commands for the Kolmogorov Smirnov Goodness-of-Fit Test

The Dataplot command for the Kolmogorov-Smirnov goodness-of-fit test is

where <dist> is one of 60+ built-in distributions. The K-S goodness of fit test is supported for all Dataplot internal continuous distributions that support the CDF (cumulative distribution function). The command LIST DISTRIBUTIONS shows the currently supported distributions in Dataplot. Some specific examples are

You can specify the location and scale parameters by entering

You may need to enter the values for 1 or more shape parameters for distributions that require them. For example, to specify the shape parameter gamma for the gamma distribution, enter the commands

Be aware that you should not use the same data both to estimate these distributional parameters and to compute the K-S test, because the critical values of the K-S test assume that the distribution is fully specified.

The empirical cdf function can be plotted with the following command

EMPIRICAL CDF PLOT Y

Return to the Kolmogorov-Smirnov Goodness of Fit Test Page

Dataplot Commands for Least Squares Estimation of Distributional Parameters

The following example shows how to use Dataplot to obtain least squares estimates for data generated from a Weibull distribution.

The RELATIVE HISTOGRAM command generates a relative histogram. The command SET RELATIVE HISTOGRAM specifies that the relative histogram is created so that the area under the histogram is 1 (i.e., the integral is 1) rather than the sum of the bars equaling 1. This effectively makes the relative histogram an estimator of the underlying density function. Dataplot saves the coordinates of the histogram in the internal variables XPLOT and YPLOT. The SUBSET command eliminates zero frequency classes. The FIT command then performs the least squares fit.

The same general approach can be used to compute least squares estimates for any distribution for which Dataplot has a pdf function. The primary difficulty with the least squares fitting is that it can be quite sensitive to starting values. For distributions with no shape parameters, the probability plot can be used to determine starting values for the location and scale parameters. For distributions with a single shape parameter, the ppcc plot can be used to determine a starting value for the shape parameter and the probability plot used to determine starting values for the location and scale parameters.

The approach above can be used in any statistical software package that provides non-linear least squares fitting and a method for defining the probability density function (either built-in or user definable).
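As one hedged illustration of that portability, the Python sketch below follows the same steps: build a histogram scaled to unit area, drop the zero-frequency classes, and fit the Weibull density to the bin heights by least squares. The helper names, the 30-bin choice, and the crude grid search (standing in for a proper non-linear least squares routine) are assumptions made for this example, not part of Dataplot.

```python
import math
import random

def weibull_pdf(x, shape, scale):
    """Two-parameter Weibull density (location fixed at 0)."""
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def density_histogram(y, nbins):
    """Return bin midpoints and heights scaled so the total area is 1."""
    lo, hi = min(y), max(y)
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in y:
        k = min(int((v - lo) / width), nbins - 1)
        counts[k] += 1
    mids = [lo + (k + 0.5) * width for k in range(nbins)]
    heights = [c / (len(y) * width) for c in counts]
    return mids, heights

random.seed(401)
# Simulated data: Weibull with scale 1 and shape 2.
y = [random.weibullvariate(1.0, 2.0) for _ in range(5000)]

mids, heights = density_histogram(y, 30)
# Drop zero-frequency classes before fitting.
points = [(m, h) for m, h in zip(mids, heights) if h > 0]

def sse(shape, scale):
    """Sum of squared differences between bin heights and the density."""
    return sum((h - weibull_pdf(m, shape, scale)) ** 2 for m, h in points)

# Grid search over shape in [0.5, 4.0] and scale in [0.5, 2.0].
grid = [(0.5 + 0.05 * i, 0.5 + 0.05 * j) for i in range(71) for j in range(31)]
best_shape, best_scale = min(grid, key=lambda p: sse(*p))
print(f"shape ~ {best_shape:.2f}, scale ~ {best_scale:.2f}")
```

A grid search sidesteps the sensitivity to starting values mentioned above, at the cost of resolution. A real optimizer would normally be seeded from probability plot or ppcc plot estimates.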

Return to the Least Squares Estimation Page

Dataplot Command for Levene's Test

The Dataplot command for the Levene test is

LEVENE TEST Y X where Y is the response variable and X is the group id variable.

Return to the Levene Test Page

Dataplot Command for the Linear Correlation Plot

The Dataplot command to generate a linear correlation plot is

LINEAR CORRELATION PLOT Y X TAG where Y is the response variable, X is the independent variable, and TAG is the group id variable.

The appearance of the plot can be controlled with appropriate settings for the LINE and CHARACTER commands. Typical settings would be

Return to the Linear Correlation Plot Page

Dataplot Command for the Linear Intercept Plot

The Dataplot command to generate a linear intercept plot is

LINEAR INTERCEPT PLOT Y X TAG where Y is the response variable, X is the independent variable, and TAG is the group id variable.

The appearance of the plot can be controlled with appropriate settings for the LINE and CHARACTER commands. Typical settings would be

Return to the Linear Intercept Plot Page

Dataplot Command for the Linear Slope Plot

The Dataplot command to generate a linear slope plot is

LINEAR SLOPE PLOT Y X TAG where Y is the response variable, X is the independent variable, and TAG is the group id variable.

The appearance of the plot can be controlled with appropriate settings for the LINE and CHARACTER commands. Typical settings would be

Return to the Linear Slope Plot Page

Dataplot Command for the Linear Residual Standard Deviation Plot

The Dataplot command to generate a linear residual standard deviation plot is

LINEAR RESSD PLOT Y X TAG where Y is the response variable, X is the independent variable, and TAG is the group id variable.

The appearance of the plot can be controlled with appropriate settings for the LINE and CHARACTER commands. Typical settings would be

Return to the Linear Residual Standard Deviation Plot Page

Dataplot Commands for Measures of Location

Various measures of location can be computed in Dataplot as follows:

In the above, P1 and P2 are used to set the percentage of values that are trimmed or Winsorized. Use P1 to set the percentage for the lower tail and P2 the percentage for the upper tail.
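To make the role of the two percentages concrete, here is a small Python sketch (not Dataplot) of a trimmed mean and a Winsorized mean. The function names are hypothetical, and truncating `n * p / 100` to an integer count is an assumption. Dataplot's exact rounding rule may differ.

```python
import statistics

def trimmed_mean(y, p1, p2):
    """Mean after dropping the lowest p1% and the highest p2% of the data."""
    ys = sorted(y)
    n = len(ys)
    lo = int(n * p1 / 100)  # number of points trimmed from the lower tail
    hi = int(n * p2 / 100)  # number of points trimmed from the upper tail
    return statistics.mean(ys[lo:n - hi])

def winsorized_mean(y, p1, p2):
    """Mean after replacing the tail values with the nearest retained value."""
    ys = sorted(y)
    n = len(ys)
    lo = int(n * p1 / 100)
    hi = int(n * p2 / 100)
    core = ys[lo:n - hi]
    return statistics.mean([core[0]] * lo + core + [core[-1]] * hi)

data = [1, 2, 3, 4, 5, 6, 7, 8, 20, 100]
print(trimmed_mean(data, 10, 10))     # drops 1 and 100
print(winsorized_mean(data, 10, 10))  # replaces 1 with 2 and 100 with 20
```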

Return to the Measures of Location Page

Dataplot Commands for the Lognormal Probability Functions

Dataplot can compute the probability functions for the lognormal distribution with the following commands.

cdf	LET Y = LGNCDF(X,SD,A,B)
pdf	LET Y = LGNPDF(X,SD,A,B)
ppf	LET Y = LGNPPF(X,SD,A,B)
hazard	LET Y = LGNHAZ(X,SD,A,B)
cumulative hazard	LET Y = LGNCHAZ(X,SD,A,B)
survival	LET Y = 1 - LGNCDF(X,SD,A,B)
inverse survival	LET Y = LGNPPF(1-X,SD,A,B)
random numbers	LET SD = value
	LET Y = LOGNORMAL RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET SD = value
	LOGNORMAL PROBABILITY PLOT Y
ppcc plot	LET SD = value
	LOGNORMAL PPCC PLOT Y
parameter estimation	LOGNORMAL MLE Y
	This returns point estimates for the shape and scale parameters. It does not handle censored data and it does not generate confidence intervals for the parameters.

where X can be a number, a parameter, or a variable. SD is the shape parameter and is optional. It can be a number, a parameter, or a variable. It is typically a number or a parameter. A and B are the location and scale parameters and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT LGNPDF(X,5) FOR X = 0.01 0.01 5

Return to the Lognormal Distribution Page

Dataplot Commands for Maximum Likelihood Estimation for Distributions

Dataplot performs maximum likelihood estimation for a few specific distributions as documented in the table below. Unless specified otherwise, censored data are not supported and only point estimates are generated (i.e., no confidence intervals for the parameters). For censored data, create an id variable that is equal to 1 for a failure time and equal to 0 for a censoring time. Type I censoring is censoring at a fixed time t₀. Type II censoring is censoring after a pre-determined number of units have failed.

Normal NORMAL MAXIMUM LIKELIHOOD Y

Exponential EXPONENTIAL MAXIMUM LIKELIHOOD Y
Confidence intervals are generated for the parameters and both type I and type II censoring are supported.
For type I censoring, enter the following commands
SET CENSORING TYPE 1
LET TEND = censoring time
EXPONENTIAL MAXIMUM LIKELIHOOD Y ID

For type II censoring, enter the following commands
SET CENSORING TYPE 2
EXPONENTIAL MAXIMUM LIKELIHOOD Y ID

Weibull WEIBULL MAXIMUM LIKELIHOOD Y
Confidence intervals are generated for the parameters and both type I and type II censoring are supported.
For type I censoring, enter the following commands
SET CENSORING TYPE 1
LET TEND = censoring time
WEIBULL MAXIMUM LIKELIHOOD Y ID

For type II censoring, enter the following commands
SET CENSORING TYPE 2
WEIBULL MAXIMUM LIKELIHOOD Y ID

Lognormal LOGNORMAL MAXIMUM LIKELIHOOD Y

Double Exponential DOUBLE EXPONENTIAL MAXIMUM LIKELIHOOD Y

Pareto PARETO MAXIMUM LIKELIHOOD Y

Gamma GAMMA MAXIMUM LIKELIHOOD Y

Inverse Gaussian INVERSE GAUSSIAN MAXIMUM LIKELIHOOD Y

Gumbel GUMBEL MAXIMUM LIKELIHOOD Y

Binomial BINOMIAL MAXIMUM LIKELIHOOD Y

Poisson POISSON MAXIMUM LIKELIHOOD Y
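
To show how the censoring id variable enters an estimate, the Python sketch below (not Dataplot) computes the maximum likelihood estimate of the exponential mean from censored data: the total observed time divided by the number of failures. The function name is hypothetical. The id coding (1 for a failure time, 0 for a censoring time) follows the description above, and no confidence intervals are computed.

```python
def exponential_mle(times, ids):
    """MLE of the exponential mean from possibly censored data.

    ids[i] == 1 marks a failure time, ids[i] == 0 a censoring time.
    """
    failures = sum(ids)
    if failures == 0:
        raise ValueError("at least one failure time is required")
    # Censored units contribute their running time but no failure.
    return sum(times) / failures

# Five units on test; two are still running when the test stops at t0 = 50.
times = [12.0, 30.0, 45.0, 50.0, 50.0]
ids = [1, 1, 1, 0, 0]
print(exponential_mle(times, ids))
```

The same formula applies to type I and type II censoring. The two schemes differ in how the stopping point is chosen, not in the point estimate.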

Return to the Maximum Likelihood Estimation Page

Dataplot Command for the Mean Plot

The Dataplot command to generate a mean plot is

MEAN PLOT Y X where Y is a response variable and X is a group id variable.

Dataplot supports this command for a number of other common location statistics. For example, MEDIAN PLOT Y X and MID-RANGE PLOT Y X compute the median and mid-range instead of the mean for each group.

Return to the Mean Plot Page

Dataplot Commands for Normal Probability Functions

Dataplot can compute the various probability functions for the normal distribution with the following commands.

cdf	LET Y = NORCDF(X,A,B)
pdf	LET Y = NORPDF(X,A,B)
ppf	LET Y = NORPPF(X,A,B)
hazard	LET Y = NORHAZ(X,A,B)
cumulative hazard	LET Y = NORCHAZ(X,A,B)
survival	LET Y = 1 - NORCDF(X,A,B)
inverse survival	LET Y = NORPPF(1-X,A,B)
random numbers	LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 1000
probability plot	NORMAL PROBABILITY PLOT Y
parameter estimates	LET YMEAN = MEAN Y
	LET YSD = STANDARD DEVIATION Y

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT NORPDF(X) FOR X = -4 0.01 4
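
For readers working outside Dataplot, the relationships among the functions in the table can be sketched in Python with the standard library's `NormalDist`. The function names below are hypothetical. A is treated as the location (mean) and B as the scale (standard deviation), and the hazard and cumulative hazard are computed from their usual definitions, pdf/(1 - cdf) and -ln(1 - cdf).

```python
import math
from statistics import NormalDist

def normal_functions(x, a=0.0, b=1.0):
    """Return the cdf, pdf, survival, hazard, and cumulative hazard at x."""
    d = NormalDist(mu=a, sigma=b)
    cdf = d.cdf(x)
    pdf = d.pdf(x)
    survival = 1.0 - cdf
    return {
        "cdf": cdf,
        "pdf": pdf,
        "survival": survival,
        "hazard": pdf / survival,
        "cumulative hazard": -math.log(survival),
    }

def normal_ppf(p, a=0.0, b=1.0):
    """Percent point function (inverse cdf)."""
    return NormalDist(mu=a, sigma=b).inv_cdf(p)

def normal_inverse_survival(p, a=0.0, b=1.0):
    """Inverse survival function, i.e. the ppf evaluated at 1 - p."""
    return normal_ppf(1.0 - p, a, b)

print(normal_functions(0.0))
print(normal_ppf(0.975), normal_inverse_survival(0.025))
```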

Return to the Normal Distribution Page

Dataplot Commands for a Normal Probability Plot

The Dataplot command to generate a normal probability plot is

NORMAL PROBABILITY PLOT Y where Y is the response variable.

If your data are already grouped (i.e., Y contains counts for the groups identified by X), the Dataplot command is

NORMAL PROBABILITY PLOT Y X

Dataplot returns the following internal parameters when it generates a probability plot.

PPCC - the correlation coefficient of the fitted line on the probability plot. This is a measure of how well the straight line fits the probability plot.
PPA0 - the intercept term for the fitted line on the probability plot. This is an estimate of the location parameter.
PPA1 - the slope term for the fitted line on the probability plot. This is an estimate of the scale parameter.
SDPPA0 - the standard deviation of the intercept term for the fitted line on the probability plot.
SDPPA1 - the standard deviation of the slope term for the fitted line on the probability plot.
PPRESSD - the residual standard deviation of the fitted line on the probability plot. This is a measure of the adequacy of the fitted line.
PPRESDF - the residual degrees of freedom of the fitted line on the probability plot.
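
The first three of these quantities can be reproduced approximately outside Dataplot. The Python sketch below fits a straight line to the ordered data against normal quantiles and returns the correlation, intercept, and slope. The function names are hypothetical, and the use of Filliben's uniform order statistic medians as plotting positions is an assumption about what a given package uses.

```python
import math
import random
from statistics import NormalDist

def uniform_order_statistic_medians(n):
    """Filliben's approximation to the uniform order statistic medians."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

def normal_probability_plot_fit(y):
    """Return (correlation, intercept, slope) of ordered data vs. normal quantiles."""
    ys = sorted(y)
    n = len(ys)
    q = [NormalDist().inv_cdf(p) for p in uniform_order_statistic_medians(n)]
    qbar, ybar = sum(q) / n, sum(ys) / n
    sqq = sum((a - qbar) ** 2 for a in q)
    syy = sum((b - ybar) ** 2 for b in ys)
    sqy = sum((a - qbar) * (b - ybar) for a, b in zip(q, ys))
    slope = sqy / sqq                  # estimate of the scale parameter
    intercept = ybar - slope * qbar    # estimate of the location parameter
    corr = sqy / math.sqrt(sqq * syy)  # probability plot correlation coefficient
    return corr, intercept, slope

random.seed(305)
y = [random.gauss(50.0, 20.0) for _ in range(500)]
ppcc, ppa0, ppa1 = normal_probability_plot_fit(y)
print(f"correlation = {ppcc:.4f}, intercept = {ppa0:.2f}, slope = {ppa1:.2f}")
```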

Return to the Normal Probability Plot Page

Dataplot Commands for the Generation of Normal Random Numbers

The Dataplot commands to generate 1,000 normal random numbers with a location of 50 and a scale of 20 are

Programs that automatically generate random numbers are typically controlled by a seed, which is usually an integer value. The importance of the seed is that it allows the random numbers to be replicated. That is, giving the program the same seed should generate the same sequence of random numbers. If the ability to replicate the set of random numbers is not important, you can give any valid value for the seed.

In Dataplot, the seed is an odd integer with a minimum (and default) value of 305. Seeds less than 305 generate the same sequence as 305 and even numbers generate the same sequence as the preceding odd number. To change the seed value to 401 in Dataplot, enter the command:

SEED 401
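
The same location-scale idea and the role of the seed can be sketched in Python (not Dataplot). The function name `normal_sample` is hypothetical. Python's generator accepts any integer seed, so the odd-integer and minimum-value rules described above apply to Dataplot only.

```python
import random

def normal_sample(n, location, scale, seed):
    """Generate n normal random numbers as location + scale * standard normal."""
    rng = random.Random(seed)  # private generator: the seed alone determines the output
    return [location + scale * rng.gauss(0.0, 1.0) for _ in range(n)]

y1 = normal_sample(1000, 50.0, 20.0, seed=401)
y2 = normal_sample(1000, 50.0, 20.0, seed=401)
print(y1 == y2)  # True: the same seed replicates the sequence
```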

Return to the Normal Random Numbers Case Study (Background and Data) Page

Dataplot Commands for Partial Autocorrelation Plots

The command to generate a partial autocorrelation plot is

PARTIAL AUTOCORRELATION PLOT Y

The appearance of the partial autocorrelation plot can be controlled by appropriate settings of the LINE, CHARACTER, and SPIKE commands. Dataplot draws the following curves on the partial autocorrelation plot:

The partial autocorrelations.
A reference line at zero.
A reference line at the upper 95% confidence limit.
A reference line at the lower 95% confidence limit.
A reference line at the upper 99% confidence limit.
A reference line at the lower 99% confidence limit.

For example, to draw the partial autocorrelations as spikes, the zero reference line as a solid line, the 95% lines as dashed lines, and the 99% lines as dotted lines, enter the command

Return to the Partial Autocorrelation Plot Page

Dataplot Commands for the Poisson Probability Functions

Dataplot can compute the probability functions for the Poisson distribution with the following commands.

cdf	LET Y = POICDF(X,LAMBDA)
pdf	LET Y = POIPDF(X,LAMBDA)
ppf	LET Y = POIPPF(X,LAMBDA)
random numbers	LET LAMBDA = value
	LET Y = POISSON RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET LAMBDA = value
	POISSON PROBABILITY PLOT Y
ppcc plot	POISSON PPCC PLOT Y

where X can be a number, a parameter, or a variable. LAMBDA is the shape parameter and is required. It can be a number, a parameter, or a variable. It is typically a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT POIPDF(X,15) FOR X = 0 1 50

Return to the Poisson Distribution Page

Dataplot Commands for the Power Lognormal Distribution

Dataplot can compute the probability functions for the power lognormal distribution with the following commands.

cdf	LET Y = PLNCDF(X,P,SD,MU)
pdf	LET Y = PLNPDF(X,P,SD,MU)
ppf	LET Y = PLNPPF(X,P,SD,MU)
hazard	LET Y = PLNHAZ(X,P,SD,MU)
cumulative hazard	LET Y = PLNCHAZ(X,P,SD,MU)
survival	LET Y = 1 - PLNCDF(X,P,SD,MU)
inverse survival	LET Y = PLNPPF(1-X,P,SD,MU)
probability plot	LET P = value
	LET SD = value (defaults to 1)
	POWER LOGNORMAL PROBABILITY PLOT Y
ppcc plot	LET SD = value
	POWER LOGNORMAL PPCC PLOT Y

In the above, X can be a number, a parameter, or a variable. SD and MU are the scale and location parameters, respectively, and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, SD and MU can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT PLNPDF(X,5,1) FOR X = 0.01 0.01 5

Return to the Power Lognormal Distribution Page

Dataplot Commands for the Power Normal Probability Functions

Dataplot can compute the probability functions for the power normal distribution with the following commands.

cdf	LET Y = PNRCDF(X,P,SD,MU)
pdf	LET Y = PNRPDF(X,P,SD,MU)
ppf	LET Y = PNRPPF(X,P,SD,MU)
hazard	LET Y = PNRHAZ(X,P,SD,MU)
cumulative hazard	LET Y = PNRCHAZ(X,P,SD,MU)
survival	LET Y = 1 - PNRCDF(X,P,SD,MU)
inverse survival	LET Y = PNRPPF(1-X,P,SD,MU)
probability plot	LET P = value
	LET SD = value (defaults to 1)
	POWER NORMAL PROBABILITY PLOT Y
ppcc plot	POWER NORMAL PPCC PLOT Y

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT PNRPDF(X,10,1) FOR X = -5 0.01 5

Return to the Power Normal Distribution Page

Dataplot Commands for Probability Plots

The Dataplot command for a probability plot is

where <dist> is the name of the specific distribution. Dataplot currently supports probability plots for over 70 distributions. For example,

For some distributions, you may need to specify one or more shape parameters. For example, to specify the shape parameter for the gamma distribution, you might enter the following commands:

Enter the command LIST DISTRIBUTIONS to see a list of distributions for which Dataplot supports probability plots (and to see what parameters need to be specified).

Dataplot returns the following internal parameters when it generates a probability plot.

PPCC - the correlation coefficient of the fitted line on the probability plot. This is a measure of how well the straight line fits the probability plot.
PPA0 - the intercept term for the fitted line on the probability plot. This is an estimate of the location parameter.
PPA1 - the slope term for the fitted line on the probability plot. This is an estimate of the scale parameter.
SDPPA0 - the standard deviation of the intercept term for the fitted line on the probability plot.
SDPPA1 - the standard deviation of the slope term for the fitted line on the probability plot.
PPRESSD - the residual standard deviation of the fitted line on the probability plot. This is a measure of the adequacy of the fitted line.
PPRESDF - the residual degrees of freedom of the fitted line on the probability plot.

Return to the Probability Plot Page

Dataplot Commands for the PPCC Plot

The Dataplot command to generate a PPCC plot for unbinned data is:

<dist> PPCC PLOT Y where <dist> identifies the distributional family and Y is the response variable.

The Dataplot command to generate a PPCC plot for binned data is:

<dist> PPCC PLOT Y X where <dist> identifies the distributional family, Y is the counts variable, and X is the bin identifier variable.

Dataplot supports the PPCC plot for over 25 distributions. Some of the most common are WEIBULL, TUKEY LAMBDA, GAMMA, PARETO, and INVERSE GAUSSIAN. Enter the command LIST DISTRIBUTIONS for a list of supported distributions.

Dataplot allows you to specify the range of the shape parameter. Dataplot generates 50 probability plots in equally spaced intervals from the smallest value of the shape parameter to the largest value of the shape parameter. For example, to generate a Weibull PPCC plot for values of the shape parameter gamma from 2 to 4, enter the commands:

The command LIST DISTRIBUTIONS gives the name of the shape parameter for the supported distributions. The "1" and "2" suffixes imply the minimum and maximum value for the shape parameter, respectively.

Whenever Dataplot generates a PPCC plot, it saves the following internal parameters:

MAXPPCC - the maximum correlation coefficient from the PPCC plot.
SHAPE - the value of the shape parameter that generated the maximum correlation coefficient.
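
The scan over the shape parameter described above can be sketched in Python (not Dataplot) for the Weibull family. For each of 50 equally spaced shape values, the ordered data are correlated with the corresponding Weibull quantiles, and the shape giving the largest correlation is kept. The function names and the use of Filliben's uniform order statistic medians as plotting positions are assumptions made for this example.

```python
import math
import random

def uniform_order_statistic_medians(n):
    """Filliben's approximation to the uniform order statistic medians."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

def weibull_ppf(p, shape):
    """Percent point function of the standard Weibull (location 0, scale 1)."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)

def correlation(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    suv = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    suu = sum((a - ubar) ** 2 for a in u)
    svv = sum((b - vbar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def weibull_ppcc(y, shape_min, shape_max, steps=50):
    """Return (maximum correlation, shape at which it occurs)."""
    ys = sorted(y)
    probs = uniform_order_statistic_medians(len(ys))
    best_r, best_shape = -1.0, None
    for k in range(steps):
        shape = shape_min + k * (shape_max - shape_min) / (steps - 1)
        r = correlation([weibull_ppf(p, shape) for p in probs], ys)
        if r > best_r:
            best_r, best_shape = r, shape
    return best_r, best_shape

random.seed(401)
y = [random.weibullvariate(1.0, 3.0) for _ in range(1000)]  # scale 1, shape 3
max_ppcc, shape = weibull_ppcc(y, 2.0, 4.0)
print(f"maximum correlation = {max_ppcc:.4f} at shape = {shape:.3f}")
```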

Return to the PPCC Plot Page

Dataplot Command for Proportion Defective Confidence Interval

The Dataplot command for a confidence interval for the proportion defective is

PROPORTION CONFIDENCE LIMITS Y where Y is a response variable. Note that for large samples, Dataplot generates the interval based on the exact binomial probability, not the normal approximation.

The following command sets the lower and upper bounds that define a success in the response variable:

ANOP LIMITS <lower bound> <upper bound>

Return to the Proportion Defective Page

Dataplot Command for Q-Q Plot

The Dataplot command to generate a q-q plot is

QUANTILE-QUANTILE PLOT Y1 Y2

The CHARACTER and LINE commands can be used to control the appearance of the q-q plot. For example, to draw the quantile points as circles and the reference line as a solid line, enter the commands

Return to the Quantile-Quantile Plot Page

Dataplot Commands for the Generation of Random Walk Numbers

To generate a random walk with 1,000 points requires the following Dataplot commands:

LET Y = UNIFORM RANDOM NUMBERS FOR I = 1 1 1000
LET Y2 = Y - 0.5
LET RW = CUMULATIVE SUM Y2
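
For comparison, the same three steps (uniform random numbers, centering at zero, cumulative sum) can be written in Python. This is a sketch of the equivalent computation rather than Dataplot syntax, and the seed value is an arbitrary choice for reproducibility.

```python
import random
from itertools import accumulate

random.seed(401)
y = [random.random() for _ in range(1000)]  # uniform random numbers on [0, 1)
y2 = [u - 0.5 for u in y]                   # center the steps at zero
rw = list(accumulate(y2))                   # cumulative sum gives the random walk
print(rw[:5])
```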

Return to the Random Walk Case Study (Background and Data) Page

Dataplot Commands for Rank Sum Test

The Dataplot commands for a rank sum (Wilcoxon rank sum, Mann-Whitney) test are

where Y1 contains the data for sample 1, Y2 contains the data for sample 2, and A is a scalar value (either a number or a parameter). Y1 and Y2 need not have the same number of observations.

The first syntax is used to test the hypothesis that two sample means are equal. The second syntax is used to test that the difference between two means is equal to a specified constant.

Return to the Sign Test Page

Dataplot Commands for the Run Sequence Plot

The Dataplot command to generate a run sequence plot is

RUN SEQUENCE PLOT Y

Equivalently, you can enter

PLOT Y

The appearance of the plot can be controlled with appropriate settings of the LINE, CHARACTER, SPIKE, and BAR commands and their associated attribute-setting commands.

Return to the Run Sequence Plot Page

Dataplot Command for the Runs Test

The Dataplot command for a runs test is

RUNS TEST Y where Y is a response variable.

Return to the Runs Test Page

Dataplot Commands for Measures of Scale

The various scale measures can be computed in Dataplot as follows:

Return to the Measures of Scale Page

Dataplot Commands for Scatter Plots

The Dataplot command to generate a scatter plot is

PLOT Y X

The appearance of the plot can be controlled by appropriate settings of the CHARACTER and LINE commands and their various attribute-setting commands.

Return to the Scatter Plot Page

Dataplot Commands for Scatterplot Matrix

The Dataplot command to generate a scatterplot matrix is

SCATTER PLOT MATRIX X1 X2 ... XK

The appearance of the plot can be controlled by appropriate settings of the CHARACTER and LINE commands and their various attribute-setting commands.

In addition, Dataplot provides a number of SET commands to control the appearance of the scatterplot matrix. The most common commands are:

SET MATRIX PLOT LOWER DIAGONAL <ON/OFF>
This command controls whether or not the plots below the diagonal are plotted.
SET MATRIX PLOT TAG <ON/OFF>
If ON, the last variable on the SCATTER PLOT MATRIX command is not plotted directly. Instead, it is used as a group-id variable. You can use the CHARACTER and LINE commands to set the plot attributes for each group.
SET MATRIX PLOT FRAME <DEFAULT/USER/CONNECTED>
If DEFAULT, the plot frames are connected (that is, it does a FRAME CORNER COORDINATES 0 0 100 100). The axis tic marks and labels are controlled automatically. If CONNECTED, then it is similar to DEFAULT except the current value of FRAME CORNER COORDINATES is used. This is useful for putting a small gap between the plots (e.g., enter FRAME CORNER COORDINATES 3 3 97 97 before generating the scatterplot matrix). If USER, Dataplot does not connect the plot frames. The tic marks and labels are as the user set them.
SET MATRIX PLOT FIT <NONE/LOWESS/LINEAR/QUADRATIC>
This controls whether a lowess fit, a linear fit, a quadratic fit line, or no fit is superimposed on the plot points. If lowess, a rather high value of the lowess fraction is recommended (e.g., LOWESS FRACTION 0.6).

In Dataplot, enter HELP SCATTER PLOT MATRIX for additional options for this plot.

Return to the Scatterplot Matrix Page

Dataplot Commands for Seasonal Subseries Plot

The Dataplot commands to generate a seasonal subseries plot are

The value of PERIOD defines the length of the seasonal period (e.g., 12 for monthly data) and START identifies which group the series starts with (e.g., if you have monthly data that starts in March, set START to 3).
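
The grouping implied by PERIOD and START can be sketched in Python (not Dataplot). The function name and the 1-based season numbering are assumptions for illustration. Each observation is assigned to a season, and the values within a season form one subseries.

```python
from collections import defaultdict

def seasonal_subseries(y, period, start=1):
    """Group a series by season; start is the season of the first observation."""
    groups = defaultdict(list)
    for i, value in enumerate(y):
        season = (start - 1 + i) % period + 1
        groups[season].append(value)
    return dict(sorted(groups.items()))

# Two years of monthly data whose first observation falls in March.
y = list(range(1, 25))
groups = seasonal_subseries(y, period=12, start=3)
print(groups[3])  # the March subseries
print(groups[1])  # the January subseries
```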

The appearance of the plot can be controlled by appropriate settings of the CHARACTER and LINE commands and their various attribute-setting commands.

Return to the Seasonal Subseries Plot Page

Dataplot Commands for Sign Test

The Dataplot commands for a sign test are

where Y1 contains the data for sample 1, Y2 contains the data for sample 2, and A is a scalar value (either a number or a parameter). Y1 and Y2 should have the same number of observations.

The first syntax is used to test the hypothesis that the mean for one sample equals a specified constant. The second syntax is used to test the hypothesis that two sample means are equal. The third syntax is used to test that the difference between two means is equal to a specified constant.

Return to the Sign Test Page

Dataplot Commands for Signed Rank Test

The Dataplot commands for a signed rank (or Wilcoxon signed-rank) test are

where Y1 contains the data for sample 1, Y2 contains the data for sample 2, and A is a scalar value (either a number or a parameter). Y1 and Y2 should have the same number of observations.

Return to the Signed Rank Test Page

Dataplot Commands for Skewness and Kurtosis

The Dataplot commands for skewness and kurtosis are

where Y is the response variable. Dataplot can also generate plots of the skewness and kurtosis for grouped data or one-factor data with the following commands:

where Y is the response variable and X is the group id variable.

Return to the Measures of Skewness and Kurtosis Page

Dataplot Command for the Spectral Plot

The Dataplot command to generate a spectral plot is

SPECTRAL PLOT Y

Return to the Spectral Plot Page

Dataplot Command for the Standard Deviation Plot

The Dataplot command to generate a standard deviation plot is

STANDARD DEVIATION PLOT Y X where Y is a response variable and X is a group id variable.

Dataplot supports this command for a number of other common scale statistics. For example, AAD PLOT Y X and MAD PLOT Y X compute the average absolute deviation and median absolute deviation, respectively, instead of the standard deviation for each group.

Return to the Standard Deviation Plot Page

Dataplot Command for the Star Plot

The Dataplot command to generate a star plot is

STAR PLOT X1 TO XP FOR I = 10 1 10 where there are p response variables called X1, X2, ... , XP. Note that this syntax prints one star, specifically the tenth row of the X1, X2, ..., XP variables.

Typically, multiple star plots will be displayed on the same page. For example, to plot the first 25 rows on the same page, enter the following sequence of commands

Return to the Star Plot Page

Dataplot Command to Generate a Table of Summary Statistics

The Dataplot command to generate a table of summary statistics is

SUMMARY Y where Y is the response variable.

Return to the Normal Random Numbers Case Study (Quantitative Output) Page

Dataplot Commands for the t Probability Functions

Dataplot can compute the probability functions for the t distribution with the following commands.

cdf	LET Y = TCDF(X,NU,A,B)
pdf	LET Y = TPDF(X,NU,A,B)
ppf	LET Y = TPPF(X,NU,A,B)
random numbers	LET NU = value
	LET Y = T RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET NU = value
	T PROBABILITY PLOT Y
ppcc plot	LET NU = value
	T PPCC PLOT Y

In the above, X can be a number, a parameter, or a variable. NU is the shape parameter (= number of degrees of freedom). NU can be a number, a parameter, or a variable. However, it is typically either a number or a parameter. A and B are the location and scale parameters, respectively, and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT TPDF(X) FOR X = -4 0.01 4

Return to the T Distribution Page

Dataplot Command for Tietjen-Moore Test

The Dataplot command for the Tietjen-Moore test is

where Y is the response variable and NOUTLIER specifies the number of outliers to test. The MINIMUM or MAXIMUM keyword is optional. If it is omitted, outliers will be checked in both the minimum and the maximum direction.

Return to the Tietjen-Moore Page

Dataplot Command for Tolerance Intervals

The Dataplot command for tolerance intervals is

TOLERANCE Y where Y is the response variable. Both normal and nonparametric tolerance intervals are printed.

Return to the Tolerance Interval Page

Dataplot Command for Two-Sample t-Test

The Dataplot command to generate a two-sample t-test is

T TEST Y1 Y2 where Y1 contains the data for sample 1 and Y2 contains the data for sample 2. Y1 and Y2 do not need to have the same number of observations.

Return to the Two-Sample t-Test Page

Dataplot Commands for the Tukey-Lambda Probability Functions

Dataplot can compute the probability functions for the Tukey-Lambda distribution with the following commands.

cdf	LET Y = LAMCDF(X,LAMBDA,A,B)
pdf	LET Y = LAMPDF(X,LAMBDA,A,B)
ppf	LET Y = LAMPPF(X,LAMBDA,A,B)
random numbers	LET LAMBDA = value
	LET Y = TUKEY-LAMBDA RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET LAMBDA = value
	TUKEY-LAMBDA PROBABILITY PLOT Y
ppcc plot	TUKEY-LAMBDA PPCC PLOT Y

In the above, X can be a number, a parameter, or a variable. LAMBDA is the shape parameter and is required. It can be a number, a parameter, or a variable. It is typically a number or a parameter. A and B are the location and scale parameters, respectively, and they are optional (a location of 0 and scale of 1 are used if they are omitted). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT LAMPDF(X,0.14) FOR X = -5 0.01 5

Return to the Tukey-Lambda Distribution Page

Dataplot Commands for the Uniform Probability Functions

Dataplot can compute the probability functions for the uniform distribution with the following commands.

cdf	LET Y = UNICDF(X,A,B)
pdf	LET Y = UNIPDF(X,A,B)
ppf	LET Y = UNIPPF(X,A,B)
hazard	LET Y = UNIHAZ(X,A,B)
cumulative hazard	LET Y = UNICHAZ(X,A,B)
survival	LET Y = 1 - UNICDF(X,A,B)
inverse survival	LET Y = UNIPPF(1-X,A,B)
random numbers	LET Y = UNIFORM RANDOM NUMBERS FOR I = 1 1 1000
probability plot	UNIFORM PROBABILITY PLOT Y
parameter estimation	The method of moments estimators can be computed with the commands
	LET YMEAN = MEAN Y
	LET YSD = STANDARD DEVIATION Y
	LET A = YMEAN - SQRT(3)*YSD
	LET B = YMEAN + SQRT(3)*YSD
	The maximum likelihood estimators can be computed with the commands
	LET YRANGE = RANGE Y
	LET YMIDRANG = MID-RANGE Y
	LET A = YMIDRANG - 0.5*YRANGE
	LET B = YMIDRANG + 0.5*YRANGE

In the above, X can be a number, a parameter, or a variable. A and B are the lower and upper limits of the uniform distribution and they are optional (A is 0 and B is 1 if they are omitted). The location parameter is A and the scale parameter is (B - A). If given, A and B can be a number, a parameter, or a variable. However, they are typically either a number or a parameter.
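
The method of moments and maximum likelihood estimators listed in the table can be checked with a short Python sketch (not Dataplot). The function names are hypothetical, and the standard deviation is assumed to use the n - 1 divisor.

```python
import math
import statistics

def uniform_method_of_moments(y):
    """Estimate the limits as mean -/+ sqrt(3) * standard deviation."""
    ymean = statistics.mean(y)
    ysd = statistics.stdev(y)  # sample standard deviation (n - 1 divisor)
    return ymean - math.sqrt(3) * ysd, ymean + math.sqrt(3) * ysd

def uniform_maximum_likelihood(y):
    """Estimate the limits as mid-range -/+ half the range (the minimum and maximum)."""
    yrange = max(y) - min(y)
    ymidrange = (max(y) + min(y)) / 2
    return ymidrange - 0.5 * yrange, ymidrange + 0.5 * yrange

data = [2.0, 4.0, 6.0, 8.0]
print(uniform_method_of_moments(data))
print(uniform_maximum_likelihood(data))
```

The maximum likelihood limits never extend beyond the observed data, whereas the method of moments limits can.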

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT UNIPDF(X) FOR X = 0 0.1 1

Return to the Uniform Distribution Page

Dataplot Commands for the Generation of Uniform Random Numbers

The Dataplot commands to generate 1,000 uniform random numbers in the interval (-100,100) are

A similar technique can be used for any package that can generate standard uniform random numbers. Simply multiply by the scale value (equals upper limit minus lower limit) and add the location value.
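
A minimal Python sketch of that recipe follows. The function name `uniform_sample` is hypothetical, and the seed value is an arbitrary choice for reproducibility.

```python
import random

def uniform_sample(n, lower, upper, seed=None):
    """Scale standard uniform numbers to the interval from lower to upper."""
    rng = random.Random(seed)
    scale = upper - lower  # scale = upper limit minus lower limit
    return [lower + scale * rng.random() for _ in range(n)]

y = uniform_sample(1000, -100.0, 100.0, seed=401)
print(min(y), max(y))
```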

In Dataplot, the seed is an odd integer with a minimum (and default) value of 305. Seeds less than 305 generate the same sequence as 305 and even numbers generate the same sequence as the preceding odd number. To change the seed value to 401 in Dataplot, enter the command:

SEED 401

Return to the Uniform Random Numbers Case Study (Background and Data) Page

Dataplot Commands for the Weibull Probability Functions

Dataplot can compute the probability functions for the Weibull distribution with the following commands.

cdf	LET Y = WEICDF(X,GAMMA,A,B)
pdf	LET Y = WEIPDF(X,GAMMA,A,B)
ppf	LET Y = WEIPPF(X,GAMMA,A,B)
hazard	LET Y = WEIHAZ(X,GAMMA,A,B)
cumulative hazard	LET Y = WEICHAZ(X,GAMMA,A,B)
survival	LET Y = 1 - WEICDF(X,GAMMA,A,B)
inverse survival	LET Y = WEIPPF(1-X,GAMMA,A,B)
random numbers	LET GAMMA = value
	LET Y = WEIBULL RANDOM NUMBERS FOR I = 1 1 1000
probability plot	LET GAMMA = value
	WEIBULL PROBABILITY PLOT Y
ppcc plot	LET GAMMA = value
	WEIBULL PPCC PLOT Y
parameter estimation	If your data are not censored, enter the commands
	SET CENSORING TYPE NONE
	WEIBULL MLE Y
	If your data have type 1 censoring at fixed time t₀, enter the commands
	LET TEND = censoring time
	SET CENSORING TYPE 1
	WEIBULL MLE Y X
	If your data have type 2 censoring, enter the commands
	SET CENSORING TYPE 2
	WEIBULL MLE Y X
	Y is the response variable and X is the censoring variable, where a value of 1 indicates a failure time and a value of 0 indicates a censoring time. In addition to the point estimates, confidence intervals for the parameters are generated.

In the above, X can be a number, a parameter, or a variable. GAMMA is the shape parameter and is required. It can be a number, a parameter, or a variable, although it is typically a number or a parameter. A and B are the location and scale parameters, respectively, and they are optional (a location of 0 and a scale of 1 are used if they are omitted). If given, A and B can each be a number, a parameter, or a variable, although they are typically a number or a parameter.

These functions can be used in the Dataplot PLOT and FIT commands as well. For example,

PLOT WEIPDF(X,2) FOR X = 0.01 0.01 5
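For readers working outside Dataplot, the textbook formulas for the Weibull (minimum) distribution with shape gamma, location a, and scale b can be sketched in Python as follows. The function names are hypothetical, and it is an assumption that they agree with the Dataplot functions in every detail (for example, in how values at or below the location are handled).

```python
import math

def wei_cdf(x, gamma, a=0.0, b=1.0):
    """Cumulative distribution function."""
    z = (x - a) / b
    return 1.0 - math.exp(-(z ** gamma)) if z > 0 else 0.0

def wei_pdf(x, gamma, a=0.0, b=1.0):
    """Probability density function."""
    z = (x - a) / b
    if z <= 0:
        return 0.0
    return (gamma / b) * z ** (gamma - 1.0) * math.exp(-(z ** gamma))

def wei_ppf(p, gamma, a=0.0, b=1.0):
    """Percent point function (inverse cdf), for 0 <= p < 1."""
    return a + b * (-math.log1p(-p)) ** (1.0 / gamma)

def wei_haz(x, gamma, a=0.0, b=1.0):
    """Hazard function, defined here for x > a."""
    z = (x - a) / b
    return (gamma / b) * z ** (gamma - 1.0)

def wei_chaz(x, gamma, a=0.0, b=1.0):
    """Cumulative hazard function, defined here for x > a."""
    return ((x - a) / b) ** gamma

def wei_sf(x, gamma, a=0.0, b=1.0):
    """Survival function, 1 - cdf."""
    return 1.0 - wei_cdf(x, gamma, a, b)

def wei_isf(p, gamma, a=0.0, b=1.0):
    """Inverse survival function, ppf evaluated at 1 - p (0 < p <= 1)."""
    return wei_ppf(1.0 - p, gamma, a, b)
```

The survival and inverse survival functions are built from the cdf and ppf in the same way as the corresponding table rows above.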

Return to the Weibull Distribution Page

Dataplot Commands for the Weibull Plot

The Dataplot commands to generate a Weibull plot are

where Y is the response variable containing failure times and X is an optional censoring variable. In X, a value of 1 indicates that the item failed by the failure mode of interest, while a value of 0 indicates that the item failed by a failure mode that is not of interest.
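To show what a Weibull plot computes, the following Python sketch builds the plot coordinates for uncensored data only. It rests on assumptions not stated on this page: plotting positions p = (i - 0.3)/(n + 0.4), a natural-log time axis, and an ordinary least squares reference line. The function names are hypothetical, and Dataplot's own conventions may differ.

```python
import math

def weibull_plot_coords(times):
    """Return (ln t, ln(-ln(1 - p))) pairs for sorted, uncensored failure times."""
    t = sorted(times)
    n = len(t)
    coords = []
    for i, ti in enumerate(t, start=1):
        p = (i - 0.3) / (n + 0.4)   # assumed median-rank plotting position
        coords.append((math.log(ti), math.log(-math.log(1.0 - p))))
    return coords

def weibull_line_estimates(coords):
    """Least squares line through the points; returns (shape, scale)."""
    n = len(coords)
    mean_x = sum(x for x, _ in coords) / n
    mean_y = sum(y for _, y in coords) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in coords)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in coords)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For Weibull data, y = shape * ln(t) - shape * ln(scale).
    return slope, math.exp(-intercept / slope)
```

If the points fall close to a straight line, the Weibull model is plausible, and under these assumptions the slope estimates the shape parameter.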

The appearance of the plot can be controlled with appropriate settings for the LINE and CHARACTER commands. For example, to draw the raw data with the "X" character and the 2 reference lines as dashed lines, enter the commands

Dataplot saves the following internal parameters after the Weibull plot.

Return to the Weibull Plot Page

Dataplot Command for the Wilk-Shapiro Normality Test

The Dataplot command for a Wilk-Shapiro normality test is

WILK SHAPIRO TEST Y where Y is the response variable.

The significance value is valid only if there are fewer than 5,000 points.

Return to the Wilk Shapiro Page

Dataplot Commands for Yates Analysis

The Dataplot command for a Yates analysis is

YATES Y where Y is a response variable in Yates order.
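Yates order is the standard run order for a two-level full factorial design: the first factor alternates every run, the second every two runs, the third every four runs, and so on. As a sketch (the function name is hypothetical and the -1/+1 coding is the usual convention, not something stated on this page), the run order for k factors can be generated in Python to check that a response variable is arranged correctly before it is passed to YATES:

```python
def yates_order(k):
    """Return the 2**k runs of a two-level full factorial in Yates order."""
    runs = []
    for run in range(2 ** k):
        # Bit j of the run index gives the level of factor j+1,
        # so factor 1 changes fastest.
        runs.append(tuple(+1 if (run >> j) & 1 else -1 for j in range(k)))
    return runs

for levels in yates_order(3):
    print(levels)
```

The response value in row i of Y should correspond to the factor settings in row i of this list.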

Return to the Yates Analysis Page

Dataplot Commands for the Youden Plot

The Dataplot command to generate a Youden plot is

YOUDEN PLOT Y1 Y2 LAB where Y1 and Y2 are the response variables and LAB is a laboratory (or run number) identifier. The LINE and CHARACTER commands can be used to control the appearance of the Youden plot. For example, if there are 5 labs, a typical sequence would be

Return to the Youden Plot Page

Dataplot Commands for the 4-Plot

The Dataplot command to generate the 4-plot is

4-PLOT Y where Y is the response variable.

Return to the 4-Plot Page

Dataplot Commands for the 6-Plot

The Dataplot commands to generate a 6-plot are

where Y is the response variable and X is the independent variable.

Return to the 6-Plot Page