PERCENT POINT PLOT

Name:

PERCENT POINT PLOT

Type:

Graphics Command

Purpose:

Generates a percent point plot.

Description:

The percent point plot has the following axes:

Vertical axis	=	percent point;
Horizontal axis	=	percent (0 to 100).

Thus, for example, if the value of 50 is chosen on the horizontal axis, then the corresponding value on the vertical axis is the estimated 50% point (that is, the median) from the data.

The percent point plot can be generated for either raw data or for binned data.

For raw data, the percent point plot is constructed by plotting the sorted data on the vertical axis. The corresponding horizontal axis value for the i-th sorted point is 100*i/N, with i and N denoting the rank of the observation in the sorted data and the sample size, respectively. The multiplication by 100 converts the horizontal axis to a percentage value.
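The raw data construction can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Dataplot's own code; the function name `raw_percent_point_coords` is hypothetical, and the sketch assumes the i-th sorted point (i = 1, ..., N) is placed at 100*i/N percent.

```python
def raw_percent_point_coords(data):
    """Return (percent, value) pairs for an unbinned percent point plot."""
    ys = sorted(data)          # vertical axis: the sorted observations
    n = len(ys)                # sample size N
    # Assumption: the i-th sorted point (i = 1..N) sits at 100*i/N percent.
    return [(100.0 * i / n, y) for i, y in enumerate(ys, start=1)]


coords = raw_percent_point_coords([7.0, 3.0, 9.0, 1.0])
for pct, y in coords:
    print(f"{pct:6.1f}  {y}")
```

With four observations, the sorted values 1, 3, 7, and 9 are placed at 25%, 50%, 75%, and 100% on the horizontal axis.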

For binned data, the vertical axis value is the mid-point of the bin. The corresponding horizontal axis values are the cumulative sums of the frequencies of the bins divided by the sum of the frequencies for all bins. This value is multiplied by 100 to convert the horizontal axis to a percentage value.
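The binned computation described above can be sketched as follows. This is an illustrative Python sketch rather than Dataplot's implementation; the function name `binned_percent_point_coords` and the example bins are assumptions made for the illustration.

```python
from itertools import accumulate


def binned_percent_point_coords(midpoints, freqs):
    """Return (percent, bin midpoint) pairs for a binned percent point plot."""
    total = sum(freqs)                 # sum of the frequencies over all bins
    cumulative = accumulate(freqs)     # running sum of the bin frequencies
    # Horizontal value: cumulative frequency as a percentage of the total.
    return [(100.0 * c / total, m) for c, m in zip(cumulative, midpoints)]


# Hypothetical bins: midpoints 0.5, 1.5, 2.5, 3.5 with counts 2, 5, 2, 1.
binned = binned_percent_point_coords([0.5, 1.5, 2.5, 3.5], [2, 5, 2, 1])
for pct, mid in binned:
    print(f"{pct:6.1f}  {mid}")
```

Here the cumulative frequencies are 2, 7, 9, and 10 out of 10, so the four bin midpoints are plotted at 20%, 70%, 90%, and 100%.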

By default, raw data is first binned into frequency data. To suppress this binning (i.e., generate the raw data version of the plot), enter the command

SET PERCENT POINT PLOT UNBINNED

To restore the default of binning raw data, enter

SET PERCENT POINT PLOT BINNED

Typically no binning is preferred for small to moderate size data sets. Binning can be helpful for large data sets in that it reduces the number of points that are plotted.
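To illustrate why binning reduces the number of plotted points, the following Python sketch bins a large sample into equal-width classes. It is not Dataplot's binning algorithm: the function name `bin_counts`, the class width of 0.25, and the choice to drop empty bins are all assumptions made for the example.

```python
import math
import random


def bin_counts(data, lower, width):
    """Count observations in equal-width bins that start at `lower`.

    Returns (midpoints, freqs) for the non-empty bins, in increasing order.
    """
    counts = {}
    for y in data:
        k = math.floor((y - lower) / width)   # index of the bin containing y
        counts[k] = counts.get(k, 0) + 1
    keys = sorted(counts)
    mids = [lower + (k + 0.5) * width for k in keys]
    freqs = [counts[k] for k in keys]
    return mids, freqs


random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(10_000)]
mids, freqs = bin_counts(data, lower=min(data), width=0.25)
print(len(data), "raw points ->", len(mids), "binned points")
```

The unbinned plot would draw one point per observation, while the binned plot draws one point per non-empty bin.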

Syntax 1:

This syntax is used for the case where you have raw data.

Syntax 2:

This syntax is used when you have pre-computed frequencies at each data level and the bins have equal widths.

Syntax 3:

This syntax is used when you have pre-computed frequencies at each data level and the bins have unequal widths.

Syntax 4:

This syntax will generate percent point plots of each of the listed response variables on the same plot. You can specify different plot attributes for each response variable.

This syntax is only supported for raw data (i.e., no binned data).

Syntax 5:

From one to six group-id variables can be specified (most commonly there is a single group-id variable).

Note that with this syntax, the plot points corresponding to each group are drawn with different attributes (i.e., the first group uses the first setting for the CHARACTER and LINE and related attribute setting commands, the second group uses the second setting, and so on). For example, this syntax can be used to label the plot points with the group-id.

If there is more than one group-id variable, the attribute settings work from right to left. That is, if X1 has 2 levels and X2 has 2 levels, then

trace 1	=	Level 1 of X1 and Level 1 of X2
trace 2	=	Level 1 of X1 and Level 2 of X2
trace 3	=	Level 2 of X1 and Level 1 of X2
trace 4	=	Level 2 of X1 and Level 2 of X2
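The trace ordering above, in which the rightmost group-id variable changes fastest, can be enumerated with a short Python sketch. This only illustrates the ordering; the function name `trace_order` is hypothetical and the code is not part of Dataplot.

```python
from itertools import product


def trace_order(*levels_per_variable):
    """Map trace number -> tuple of levels, rightmost variable varying fastest."""
    combos = product(*(range(1, n + 1) for n in levels_per_variable))
    return {trace: combo for trace, combo in enumerate(combos, start=1)}


# Two group-id variables, X1 and X2, each with 2 levels.
order = trace_order(2, 2)
for trace, (x1_level, x2_level) in order.items():
    print(f"trace {trace} = Level {x1_level} of X1 and Level {x2_level} of X2")
```

The same function extends to more group-id variables, for example `trace_order(2, 3, 2)` for three variables with 2, 3, and 2 levels.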

Syntax 6:

Although this syntax is similar to the REPLICATION case, it is generally used in a different way. The REPLICATION case is used when we have distinct groups of data and we want to generate separate percent point plots for each group. Highlighting is used when we have a single group of data, but we want to draw some of the points with different attributes. For example, we may want to emphasize the extreme points in the plot.

Examples:

Note:

The SET HISTOGRAM CLASS WIDTH command can be used to select among several other algorithms for binning the data (enter HELP HISTOGRAM CLASS WIDTH for details). The SET HISTOGRAM OUTLIERS command also applies to the PERCENT POINT PLOT when raw data is being binned.

Note:

Percent point plots are also referred to as quantile plots in the statistical literature.

Note:

The attributes of the plot can be set by the first setting of the LINE, CHARACTER, SPIKE, and BAR commands (and their corresponding attribute setting commands). This is demonstrated in the sample programs below.

Default:

None

Synonyms:

None

Related Commands:

QUAN-QUAN PLOT	= Generates a quantile-quantile plot.
HISTOGRAM	= Generates a histogram.
PIE CHART	= Generates a pie chart.
FREQUENCY PLOT	= Generates a frequency plot.
PROBABILITY PLOT	= Generates a probability plot.
PPCC PLOT	= Generates a probability plot correlation coefficient plot.
PLOT	= Generates a data or function plot.
CLASS LOWER	= Sets the lower class minimum for histograms, frequency plots, and pie charts.
CLASS UPPER	= Sets the upper class maximum for histograms, frequency plots, and pie charts.
CLASS WIDTH	= Sets the class width for histograms, frequency plots, and pie charts.
HISTOGRAM CLASS WIDTH	= Specifies alternative default class width algorithms for histograms.

Applications:

Distributional Analysis

Reference:

Graphical Methods for Data Analysis

Implementation Date:

Program 1:

 
SKIP 25
READ SUNSPOT2.DAT Y
.
LET ALOW = MINIMUM Y
LET AHIGH = MAXIMUM Y
CLASS LOWER ALOW
CLASS UPPER AHIGH
CLASS WIDTH 1.0
CHARACTER CIRCLE
CHARACTER FILL ON
CHARACTER SIZE 1.2
X1LABEL PERCENT
Y1LABEL DATA VALUE
TITLE AUTOMATIC
.
PERCENT POINT PLOT Y

Program 2:

 
let y1 = norm rand numb for i = 1 1 100
.
title case asis
title offset 2
title automatic
label case asis
tic mark offset units screen
tic mark offset 3 3
.
char circle
char fill on
char hw 0.5 0.375
line blank
.
multiplot corner coordinates 5 5 95 95
multiplot scale factor 2
multiplot 2 2
.
set percent point plot unbinned
set histogram outliers on
set histogram empty bins off
title Unbinned Data
percent point plot y1
.
set percent point plot binned
title Data Binned by Command
percent point plot y1
.
title User Created Bins: Equi-Spaced Bins
let z2 x2 = binned y1
percent point plot z2 x2
.
let minsize = 5
let z3 xlow xhigh = combine frequency table z2 x2
title User Created Bins: Unequal-Spaced Bins
percent point plot z3 xlow xhigh
.
end of multiplot
justification center
move 50 97
text Percent Point Plots for 100 Normal Random Numbers
move 50 5
text Percentile
direction vertical
move 3 50
text Response Value

Program 3:

 
dimension 500 rows
skip 25
read iris.dat y1 y2 y3 y4
let m = create matrix y1 y2 y3 y4
.
title case asis
title offset 2
label case asis
.
char circle all
char color black
char fill on all
char hw 0.5 0.375 all
line blank all
.
y1label Response Value
x1label Percentile
title IRIS Data (all species combined)
.
set percent point plot unbinned
set histogram outliers on
set histogram empty bins off
percent point plot m
.
char color red blue cyan green
title IRIS Data (species plotted separately)
multiple percent point plot y1 to y4

Program 4:

 
skip 25
read gear.dat y x
.
title case asis
title offset 2
label case asis
tic mark offset units screen
tic mark offset 5 5
.
char circle all
char color black red blue green cyan grey brown magenta dgreen orange
char fill on all
char hw 0.5 0.375 all
line blank all
.
title Percent Point Plots for GEAR.DAT
y1label Response Value
x1label Percentile
.
set percent point plot unbinned
set histogram outliers on
set histogram empty bins off
replicated percent point plot y x