TUKEY MEAN-DIFFERENCE PLOT

Name:

TUKEY MEAN-DIFFERENCE PLOT

Type:

Graphics Command

Purpose:

Generates a Tukey mean-difference plot.

Description:

A quantile-quantile plot (or q-q plot) is a graphical data analysis technique for comparing the distributions of 2 data sets. The quantile-quantile plot is a graphical alternative for the various classical 2-sample tests (e.g., t for location, F for dispersion).

The plot consists of the quantiles of one data set plotted against the corresponding quantiles of the other data set.

The "quantiles" of a distribution are the distribution's "percent points" (e.g., .5 quantile = 50% point = median).

The advantage of the quantile-quantile plot is 2-fold:

the sample sizes do not need to be identical;
many distributional aspects can be simultaneously tested (for example, shifts in location, shifts in dispersion, changes in symmetry/skewness, and outliers).

The quantile-quantile plot has 2 components:

the quantile points themselves;
a 45 degree reference line.
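As a rough illustration of how the quantile points can be formed when the two sample sizes differ, the following Python sketch (not Dataplot code) evaluates both samples at a common set of probabilities. The function names, the choice of min(n1, n2) quantiles, the (i + 0.5)/n probabilities, and the linear interpolation rule are all assumptions made for this sketch; Dataplot's own quantile algorithm may differ.

```python
def quantile(sorted_data, p):
    """Linearly interpolated p-th quantile (0 <= p <= 1) of pre-sorted data."""
    pos = p * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac


def qq_pairs(y1, y2):
    """Return (xs, ys): quantiles of y2 (assumed horizontal axis) and
    quantiles of y1 (assumed vertical axis) at common probabilities."""
    s1, s2 = sorted(y1), sorted(y2)
    n = min(len(s1), len(s2))          # assumption: use the smaller sample size
    probs = [(i + 0.5) / n for i in range(n)]
    ys = [quantile(s1, p) for p in probs]
    xs = [quantile(s2, p) for p in probs]
    return xs, ys


# Samples of size 5 and 3 still yield matched quantile pairs.
xs, ys = qq_pairs([1, 2, 3, 4, 5], [10, 20, 30])
print(len(xs), len(ys))  # 3 3
```

Points from two samples with the same distribution would fall near the 45 degree reference line y = x.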

Given a q-q plot, assume its y coordinates are in T(i) and its x coordinates are in D(i). The Tukey mean-difference plot is then defined as:

Vertical axis = T(i) - D(i);
Horizontal axis = (T(i) + D(i))/2.

That is, it plots the difference of the quantiles against their average. The Tukey mean-difference plot also plots a horizontal reference line at zero.

The advantage of the Tukey mean-difference plot compared to the q-q plot is that it converts interpretation of the differences around a 45 degree diagonal line to interpretation of differences around a horizontal zero line. However, the Tukey mean-difference plot should only be applied if the two variables are on a common scale.
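The transformation from q-q coordinates to mean-difference coordinates can be sketched in a few lines of Python (not Dataplot code). The function name and the sign convention (y coordinate minus x coordinate) are assumptions of this sketch; t and d play the roles of T(i) and D(i).

```python
def tukey_mean_difference(t, d):
    """Map q-q coordinates (y = t, x = d) to (mean, difference) pairs."""
    means = [(ti + di) / 2 for ti, di in zip(t, d)]
    diffs = [ti - di for ti, di in zip(t, d)]   # assumed sign: y minus x
    return means, diffs


# q-q points lying on the line y = x + 2, parallel to the 45 degree line
t = [3.0, 5.0, 8.0]
d = [1.0, 3.0, 6.0]
means, diffs = tukey_mean_difference(t, d)
print(means)  # [2.0, 4.0, 7.0]
print(diffs)  # [2.0, 2.0, 2.0]
```

In this example a constant shift in location, which appears on the q-q plot as points parallel to the diagonal, becomes a horizontal band at 2, which is easier to judge against the zero reference line.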

As usual, the appearance of the 2 components is controlled by the first 2 settings of the CHARACTERS and LINES commands. It is typical for the response points to be represented as some character, say X's, with no connecting line, and the reference line as a connected line with no character. This is demonstrated in the sample programs below.

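Syntax 1:

TUKEY MEAN DIFFERENCE PLOT <y1> <y2>

where <y1> is the first response variable and <y2> is the second response variable.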

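Syntax 2:

HIGHLIGHT TUKEY MEAN DIFFERENCE PLOT <y1> <y2> <tag>

where <y1> is the first response variable, <y2> is the second response variable, and <tag> is a group-identifier variable.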

This syntax can be used to draw different plot points with different attributes. For example, it can be used to highlight groups in the data or to emphasize the extremes.

Examples:

TUKEY MEAN DIFFERENCE PLOT Y1 Y2
HIGHLIGHT TUKEY MEAN DIFFERENCE PLOT Y2 Y1 TAG

Note:

This same technique can be used for other distributions (use the appropriate PPF function).

Note:

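The number of quantiles used to generate the plot can be set with the command

SET QUANTILE QUANTILE PLOT NUMBER OF PERCENTILES <value>

where <value> specifies the desired number of quantiles. This is demonstrated in the Program 2 example below.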
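To show why a fixed number of quantiles helps with large samples, the following Python sketch (not Dataplot code) reduces two samples of different, large sizes to the same small number of mean-difference points. The function name md_points, the use of the standard library's statistics.quantiles with the "inclusive" method, and the sample sizes are assumptions chosen for illustration; Dataplot's quantile rule may differ.

```python
import random
import statistics


def md_points(y1, y2, n_quantiles):
    """Mean-difference points from a fixed number of quantiles per sample."""
    # statistics.quantiles(data, n=k) returns k - 1 cut points,
    # so n = n_quantiles + 1 yields exactly n_quantiles values.
    q1 = statistics.quantiles(y1, n=n_quantiles + 1, method="inclusive")
    q2 = statistics.quantiles(y2, n=n_quantiles + 1, method="inclusive")
    means = [(a + b) / 2 for a, b in zip(q1, q2)]
    diffs = [a - b for a, b in zip(q1, q2)]
    return means, diffs


rng = random.Random(0)                       # fixed seed for repeatability
y1 = [rng.gauss(0, 1) for _ in range(20000)]
y2 = [rng.gauss(0, 1) for _ in range(5000)]
means, diffs = md_points(y1, y2, 100)
print(len(means), len(diffs))  # 100 100
```

Only the requested number of points is plotted regardless of the sample sizes, which keeps the plot readable and quick to draw.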

Default:

None

Synonyms:

TUKEY M-D PLOT is a synonym for TUKEY MEAN DIFFERENCE PLOT.

Related Commands:

CHARACTERS	= Sets the type for plot characters.
LINES	= Sets the type for plot lines.
QUANTILE-QUANTILE PLOT	= Generates a q-q plot.
BOX PLOT	= Generates a box plot.
BIHISTOGRAM	= Generates a bihistogram.
PLOT	= Generates a data or function plot.
PROBABILITY PLOT	= Generates a probability plot.
T-TEST	= Carries out a 2-sample t test.
F-TEST	= Carries out a 2-sample F test.

Reference:

Cleveland (1993), "Visualizing Data", Hobart Press.

Chambers, Cleveland, Kleiner, and Tukey (1983), "Graphical Methods of Data Analysis", Wadsworth, pp. 48-57.

Applications:

Exploratory Data Analysis

Implementation Date:

2000/1

Program 1:

 
SKIP 25
READ AUTO83B.DAT Y1 Y2
.
DELETE Y2 SUBSET Y2 < 0
LINE BLANK SOLID
CHARACTER CIRCLE BLANK
CHARACTER FILL ON OFF
TIC OFFSET UNITS DATA
YTIC OFFSET 0 2
TITLE AUTOMATIC
LABEL CASE ASIS
Y1LABEL Difference of Percentiles
X1LABEL Average of Percentiles
TUKEY MEAN DIFFERENCE PLOT Y1 Y2

Program 2:

 
LET Y1 = NORMAL RANDOM NUMBER FOR I = 1 1 1000000
LET Y2 = DOUBLE EXPONENTIAL RANDOM NUMBER FOR I = 1 1 1000000
.
LINE BLANK SOLID
CHARACTER CIRCLE BLANK
CHARACTER FILL ON OFF
CHARACTER HW 0.5 0.375
TITLE AUTOMATIC
TITLE OFFSET 2
LABEL CASE ASIS
Y1LABEL Difference of Percentiles
X1LABEL Average of Percentiles
.
SET QUANTILE QUANTILE PLOT NUMBER OF PERCENTILES 1000
TUKEY MEAN DIFFERENCE PLOT Y1 Y2

Program 3:

 
SKIP 25
READ AUTO83B.DAT Y1 Y2
DELETE Y2 SUBSET Y2 < 0
.
LINE BLANK BLANK SOLID
CHARACTER CIRCLE CIRCLE BLANK
CHARACTER FILL ON ON OFF
CHARACTER HW 0.5 0.375 ALL
CHARACTER COLOR BLACK RED
TITLE AUTOMATIC
TITLE OFFSET 2
TIC MARK OFFSET UNITS SCREEN
YTIC MARK OFFSET 5 5
.
LET N2 = SIZE Y2
LET TAG = 1 FOR I = 1 1 N2
LET TAG = 2 SUBSET Y2 > 32
.
HIGHLIGHT TUKEY MEAN DIFFERENCE PLOT Y2 Y1 TAG