TUKEY MEAN-DIFFERENCE PLOT

Name: TUKEY MEAN-DIFFERENCE PLOT
A quantile-quantile plot (or q-q plot) is a graphical data analysis technique for comparing the distributions of 2 data sets. The quantile-quantile plot is a graphical alternative for the various classical 2-sample tests (e.g., t for location, F for dispersion). The plot consists of the following:

    Vertical axis   = estimated quantiles from data set 1;
    Horizontal axis = estimated quantiles from data set 2.

The "quantiles" of a distribution are the distribution's "percent points" (e.g., .5 quantile = 50% point = median). The advantage of the quantile-quantile plot is 2-fold:
The quantile-quantile plot has 2 components:
Given a q-q plot whose y coordinates are in T(i) and whose x coordinates are in D(i), the Tukey mean-difference plot is defined as:

    Vertical axis   = T(i) - D(i);
    Horizontal axis = (T(i) + D(i))/2.

The Tukey mean-difference plot also plots a horizontal reference line at zero. That is, it plots the difference of the quantiles against their average. The advantage of the Tukey mean-difference plot compared to the q-q plot is that it converts interpretation of the differences around a 45 degree diagonal line to interpretation of differences around a horizontal zero line. However, the Tukey mean-difference plot should only be applied if the two variables are on a common scale.

As usual, the appearance of the 2 components is controlled by the first 2 settings of the CHARACTERS and LINES commands. It is typical for the response points to be represented as some character, say X's, with no connecting line, and the reference line as a connected line with no character. This is demonstrated in the sample program below.
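The definition above can be illustrated outside of Dataplot with a minimal Python sketch. It assumes two samples of equal size, so that the sorted values can stand in for the estimated quantiles; Dataplot's own quantile estimation may differ, and the function name `tukey_mean_difference` is hypothetical.

```python
def tukey_mean_difference(y1, y2):
    """Return (means, diffs) for two equal-size samples.

    Assumption: with equal sample sizes, the sorted values are used
    directly as the quantile estimates T(i) and D(i).
    """
    if len(y1) != len(y2):
        raise ValueError("this sketch assumes samples of equal size")
    t = sorted(y1)  # T(i): y coordinates of the q-q plot
    d = sorted(y2)  # D(i): x coordinates of the q-q plot
    means = [(ti + di) / 2 for ti, di in zip(t, d)]  # horizontal axis
    diffs = [ti - di for ti, di in zip(t, d)]        # vertical axis
    return means, diffs


# Two samples that differ only by a shift of 1.
y1 = [5.0, 3.0, 4.0, 6.0]
y2 = [2.0, 4.0, 3.0, 5.0]
means, diffs = tukey_mean_difference(y1, y2)
```

For these two samples every difference equals 1, so the points lie on a horizontal line one unit above the zero reference line. On a q-q plot the same shift would appear as points parallel to, but offset from, the 45 degree diagonal.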
    TUKEY MEAN DIFFERENCE PLOT <y1> <y2>
                        <SUBSET/EXCEPT/FOR qualification>

where <y1> is the first response variable; <y2> is the second response variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional.
    HIGHLIGHT TUKEY MEAN DIFFERENCE PLOT <y1> <y2> <tag>
                        <SUBSET/EXCEPT/FOR qualification>

where <y1> is the first response variable; <y2> is the second response variable; <tag> is the group-id variable that defines the highlighting; and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax can be used to plot different plot points with different attributes. For example, it can be used to highlight groups in the data or to emphasize the extremes.
    TUKEY MEAN DIFFERENCE PLOT RUN1 RUN2
    TUKEY MEAN DIFFERENCE PLOT BATCH1 BATCH2
    TUKEY MEAN DIFFERENCE PLOT Y1 Y2 SUBSET AUTO 4
    TUKEY MEAN DIFFERENCE PLOT Y1 Y2 SUBSET STATE 25
    LET X = SEQUENCE .01 .01 .99
    LET Y2 = NORPPF(X)
    TUKEY MEAN DIFFERENCE PLOT Y1 Y2

This same technique can be used for other distributions (use the appropriate PPF function).
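The same idea can be sketched in Python for readers who want to see the arithmetic: the quantiles of a data sample are compared with standard normal percent points at probabilities .01, .02, ..., .99. The helper names and the linear-interpolation quantile rule are assumptions made for this illustration; they are not a description of how Dataplot estimates quantiles.

```python
from statistics import NormalDist


def empirical_quantile(sorted_data, p):
    """Linearly interpolated sample quantile (one common convention)."""
    pos = p * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac


def mean_difference_vs_normal(y1, probs):
    """Return (means, diffs) comparing sample quantiles to N(0, 1) quantiles."""
    data = sorted(y1)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    means, diffs = [], []
    for p in probs:
        t = empirical_quantile(data, p)  # quantile of the data
        d = std_normal.inv_cdf(p)        # normal percent point function
        means.append((t + d) / 2)
        diffs.append(t - d)
    return means, diffs


# Probabilities .01, .02, ..., .99, as in the SEQUENCE command above.
probs = [i / 100 for i in range(1, 100)]
means, diffs = mean_difference_vs_normal([-1.0, 0.0, 1.0], probs)
```

If the data are close to standard normal, the differences scatter around the zero reference line. For another distribution, replace `NormalDist().inv_cdf` with that distribution's percent point function.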
    SET QUANTILE QUANTILE PLOT NUMBER OF PERCENTILES <value>

where <value> specifies the desired number of quantiles. This is demonstrated in the Program 2 example below.
Chambers, Cleveland, Kleiner, and Tukey (1983), "Graphical Methods of Data Analysis", Wadsworth, pp. 48-57.
Program 1:

    SKIP 25
    READ AUTO83B.DAT Y1 Y2
    .
    DELETE Y2 SUBSET Y2 < 0
    LINE BLANK SOLID
    CHARACTER CIRCLE BLANK
    CHARACTER FILL ON OFF
    TIC OFFSET UNITS DATA
    YTIC OFFSET 0 2
    TITLE AUTOMATIC
    LABEL CASE ASIS
    Y1LABEL Difference of Percentiles
    X1LABEL Average of Percentiles
    TUKEY MEAN DIFFERENCE PLOT Y1 Y2

Program 2:

    LET Y1 = NORMAL RANDOM NUMBER FOR I = 1 1 1000000
    LET Y2 = DOUBLE EXPONENTIAL RANDOM NUMBER FOR I = 1 1 1000000
    .
    LINE BLANK SOLID
    CHARACTER CIRCLE BLANK
    CHARACTER FILL ON OFF
    CHARACTER HW 0.5 0.375
    TITLE AUTOMATIC
    TITLE OFFSET 2
    LABEL CASE ASIS
    Y1LABEL Normal Random Numbers
    X1LABEL Double Exponential Random Numbers
    .
    SET QUANTILE QUANTILE PLOT NUMBER OF PERCENTILES 1000
    TUKEY MEAN DIFFERENCE PLOT Y1 Y2

Program 3:

    SKIP 25
    READ AUTO83B.DAT Y1 Y2
    DELETE Y2 SUBSET Y2 < 0
    .
    LINE BLANK BLANK SOLID
    CHARACTER CIRCLE CIRCLE BLANK
    CHARACTER FILL ON ON OFF
    CHARACTER HW 0.5 0.375 ALL
    CHARACTER COLOR BLACK RED
    TITLE AUTOMATIC
    TITLE OFFSET 2
    TIC MARK OFFSET UNITS SCREEN
    YTIC MARK OFFSET 5 5
    .
    LET N2 = SIZE Y2
    LET TAG = 1 FOR I = 1 1 N2
    LET TAG = 2 SUBSET Y2 > 32
    .
    HIGHLIGHT TUKEY MEAN DIFFERENCE PLOT Y2 Y1 TAG
Date created: 06/05/2001
Last updated: 12/04/2023
Please email comments on this WWW page to alan.heckert@nist.gov.