Name: FLUCTUATION PLOT
For the fluctuation plot, a grid is created so that each combination of categories has a fixed position on the grid. At each grid position, two rectangles are drawn. The first is drawn in a background color at full size (i.e., the size corresponding to the maximum count). The second is drawn in a foreground color with a height proportional to the count for that particular combination of categories. The background rectangle gives a sense of scale. If you do not want the background rectangle, set its color equal to the background color of the plot. Some analysts find the fluctuation plot easier to interpret than the mosaic plot.
Although the mosaic and fluctuation plots were developed to visualize counts for categorical data, Dataplot can also generate the fluctuation plot for various statistics. For example, you could use it to display mean values for several factor variables. In particular, we have found it useful for displaying binomial probabilities.
When the plot displays the value of a statistic, the minimum value of the statistic over all combinations of categories is drawn with zero height and the maximum value over all combinations of categories is drawn at full height. Intermediate values are scaled between the minimum and maximum values.
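The scaling rule for the statistic case can be illustrated with a short Python sketch. This is not Dataplot code and not Dataplot's implementation; it assumes the scaling between the minimum and maximum is linear, and the function name `scaled_heights` and the handling of the all-equal case are hypothetical choices made for illustration.

```python
def scaled_heights(values):
    """Map each cell statistic to a relative bar height in [0, 1].

    Assumes linear scaling: the smallest value gets height 0 and
    the largest gets height 1 (full height).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption for this sketch only: with no spread among the
        # cells, draw every bar at full height.
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical cell statistics (e.g., cell means).
cell_stats = [2.0, 3.0, 4.0, 6.0]
heights = scaled_heights(cell_stats)
print(heights)
```

Under this rule a bar of zero height marks the smallest observed statistic, not a statistic equal to zero.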
<SUBSET/EXCEPT/FOR qualification> where <stat> is one of the following statistics:
MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN, GEOMETRIC MEAN, HARMONIC MEAN, HODGES LEHMAN, BIWEIGHT LOCATION, LP LOCATION, SUM, PRODUCT, STANDARD DEVIATION, STANDARD DEVIATION OF MEAN, VARIANCE, VARIANCE OF THE MEAN, TRIMMED MEAN STANDARD ERROR, AVERAGE ABSOLUTE DEVIATION (or AAD), MEDIAN ABSOLUTE DEVIATION (or MAD), IQ RANGE, BIWEIGHT MIDVARIANCE, BIWEIGHT SCALE, PERCENTAGE BEND MIDVARIANCE, WINSORIZED VARIANCE, WINSORIZED STANDARD DEVIATION, VARIANCE OF LP LOCATION, SD OF LP LOCATION, RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE (or COEFFICIENT OF VARIATION), RANGE, MIDRANGE, MAXIMUM, MINIMUM, EXTREME, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE, <FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/ NINTH/TENTH> DECILE, PERCENTILE, QUANTILE, QUANTILE STANDARD ERROR, SKEWNESS, KURTOSIS, NORMAL PPCC, AUTOCORRELATION, AUTOCOVARIANCE, SIN FREQUENCY, SIN AMPLITUDE, CP, CPK, CNPK, CPM, CC, EXPECTED LOSS, PERCENT DEFECTIVE, TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL), TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2); <x1> ... <xk> is a list of one to six categorical variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated) and you are computing a statistic that requires a single response variable.
<SUBSET/EXCEPT/FOR qualification> where <stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION, CORRELATION, RANK CORRELATION, COVARIANCE, RANK COVARIANCE, WINSORIZED COVARIANCE, BIWEIGHT MIDCOVARIANCE, BIWEIGHT MIDCORRELATION, PERCENTAGE BEND CORRELATION, ODDS RATIO, ODDS RATIO STANDARD ERROR, LOG ODDS RATIO, LOG ODDS RATIO STANDARD ERROR, FALSE POSITIVE, FALSE NEGATIVE, TRUE POSITIVE, TRUE NEGATIVE, TEST SENSITIVITY, TEST SPECIFICITY, POSITIVE PREDICTIVE VALUE, NEGATIVE PREDICTIVE VALUE, RELATIVE RISK, RATIO; <y2> is the second response variable; <x1> ... <xk> is a list of one to six categorical variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated) and you are computing a statistic that requires two response variables.
<SUBSET/EXCEPT/FOR qualification> where <stat> is one of the following statistics:
GEOMETRIC MEAN, HARMONIC MEAN, HODGES LEHMAN, MIDRANGE, BIWEIGHT LOCATION, SUM, STANDARD DEVIATION, STANDARD DEVIATION OF MEAN, VARIANCE, VARIANCE OF THE MEAN, TRIMMED MEAN STANDARD ERROR, AVERAGE ABSOLUTE DEVIATION (or AAD), MEDIAN ABSOLUTE DEVIATION (or MAD), IQ RANGE, BIWEIGHT MIDVARIANCE, BIWEIGHT SCALE, PERCENTAGE BEND MIDVARIANCE, WINSORIZED VARIANCE, WINSORIZED STANDARD DEVIATION, RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE, COEFFICIENT OF VARIATION, RANGE, MAXIMUM, MINIMUM, EXTREME, QUANTILE, SKEWNESS, KURTOSIS; <y2> is the second response variable; <x1> ... <xk> is a list of one to six categorical variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated) and you are computing the difference between two response variables for the specified statistic. The variables can be either independent (i.e., not paired) or dependent (i.e., paired), but the response variables must have the same number of elements.
where <m> is a matrix containing a two-way table; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the case where the data have already been cross-tabulated into a two-way table. Although this is typically used for the COUNTS case, the table can in fact contain values for any statistic that has been previously cross-tabulated (including statistics not listed in Syntax 1 - Syntax 3 above).
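To make concrete what an already cross-tabulated two-way table of counts looks like, the following Python sketch builds one from raw category pairs. It is an illustration only, not Dataplot code: the function name `two_way_table`, the sample data, and the choice of which variable indexes the rows are all assumptions.

```python
from collections import Counter


def two_way_table(rows, cols):
    """Cross-tabulate two categorical variables into a table of counts.

    Returns the sorted row levels, the sorted column levels, and a
    list of lists where table[i][j] is the count for
    (row_levels[i], col_levels[j]).
    """
    counts = Counter(zip(rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Hypothetical raw observations (one entry per subject).
eye = ["Blue", "Blue", "Brown", "Green", "Brown", "Blue"]
hair = ["Blond", "Blond", "Black", "Red", "Black", "Black"]

row_levels, col_levels, table = two_way_table(eye, hair)
for level, counts_row in zip(row_levels, table):
    print(level, counts_row)
```

A table of this shape, typed in row by row, is the kind of input the matrix form of the command expects; combinations that never occur appear as zero counts.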
<SUBSET/EXCEPT/FOR qualification> where <y> is a response variable; <x1> ... <xk> is a list of one to six categorical variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the case when you want to compute binomial probabilities from raw data. In this case, the response variable should be set to 1 to indicate "success" and to 0 to indicate "failure".
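As a rough illustration of the binomial probability case, the Python sketch below computes the proportion of successes in each combination of categories from a 0/1 response. It is not Dataplot code; the function name `binomial_probabilities` and the sample data are hypothetical, and it assumes the probability for a cell is simply successes divided by trials in that cell.

```python
from collections import defaultdict


def binomial_probabilities(y, *factors):
    """Return {cell: proportion of successes} for a 0/1 response.

    `y` holds 1 for "success" and 0 for "failure"; each entry of
    `factors` is a categorical variable of the same length as `y`.
    A cell is the tuple of category values for one observation.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [successes, trials]
    for row in zip(y, *factors):
        outcome, cell = row[0], row[1:]
        totals[cell][0] += outcome
        totals[cell][1] += 1
    return {cell: s / n for cell, (s, n) in totals.items()}


# Hypothetical raw data: response plus two categorical variables.
y = [1, 0, 1, 1, 0, 1]
x1 = [1, 1, 1, 2, 2, 2]
x2 = [1, 1, 2, 1, 1, 1]

probs = binomial_probabilities(y, x1, x2)
for cell in sorted(probs):
    print(cell, probs[cell])
```

Cells with few observations yield probabilities that rest on very little data, which is worth keeping in mind when reading the plot.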
FLUCTUATION BINOMIAL PROBABILITY PLOT Y X1 X2
FLUCTUATION PLOT M
When there are two or more categorical variables, the division is first performed vertically, then horizontally. This vertical/horizontal subdivision is repeated until all the categorical variables are accommodated.
SET FLUCTUATION PLOT CEILING <value>
Values less than the floor value are set to the floor value and values greater than the ceiling value are set to the ceiling value. The default is to use the minimum and maximum values of the computed statistic. For the COUNT case, the floor value is set to 0. For the BINOMIAL PROBABILITY case, the floor and ceiling values are set to 0 and 1, respectively. After the fluctuation plot is generated, Dataplot saves the internal parameters STATMINI and STATMAXI, which contain the minimum and maximum values, respectively, of the computed statistic.
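The effect of a floor and ceiling can be sketched in Python as follows. This is an illustration under stated assumptions, not Dataplot's implementation: it assumes values are first clamped to the floor and ceiling and then scaled linearly between those two bounds, and the function name `clamp_and_scale` is hypothetical.

```python
def clamp_and_scale(values, floor=None, ceiling=None):
    """Clamp values to [floor, ceiling], then scale to heights in [0, 1].

    When a bound is omitted, the minimum or maximum of `values` is
    used, mirroring the documented default.
    """
    lo = min(values) if floor is None else floor
    hi = max(values) if ceiling is None else ceiling
    if hi <= lo:
        raise ValueError("ceiling must be greater than floor")
    clamped = [min(max(v, lo), hi) for v in values]
    return [(v - lo) / (hi - lo) for v in clamped]


# Hypothetical cell statistics with one value below the floor and
# one above the ceiling.
heights = clamp_and_scale([-5.0, 10.0, 25.0, 60.0], floor=0.0, ceiling=50.0)
print(heights)
```

Fixing the bounds in this way keeps bar heights comparable across several plots, since each plot no longer rescales to its own minimum and maximum.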
To reset fixed width bars, enter the command
This option does not apply to the case where the statistic being computed is the frequency counts (COUNT). In this case, the height of the bars already indicates the frequency counts.
Friendly (2000), "Visualizing Categorical Data", SAS Institute Inc., p. 90.
Program 1:
. Example from page 61 of Friendly
. Data denotes counts.
read matrix m
5 29 14 16
15 54 14 10
20 84 17 94
68 119 26 7
end of data
.
label case asis
tic mark label case asis
title case asis
title offset 2
.
x3label
title Fluctuation Plot
y1label Eye Color
x1label Hair Color
tic offset units data
xlimits 1 4
major xtic mark number 4
minor xtic mark number 0
xtic mark offset 1 1
x1tic mark label format alpha
x1tic mark label content Black Brown Red Blond
ylimits 1 4
major ytic mark number 4
minor ytic mark number 0
ytic mark offset 1 1
y1tic mark label format alpha
y1tic mark label content Green Hazel Blue Brown
y1tic mark label justification right
.
line color g75 black
region fill color g75 black
region border color g75 black
.
fluctuation plot m
Program 2:
skip 25
read alarm.dat inst src expalarm obsalarm
let n = size expalarm
let correct = 0 for i = 1 1 n
let correct = 1 subset expalarm = 0 subset obsalarm = 0
let correct = 1 subset expalarm = 1 subset obsalarm = 1
.
label case asis
tic mark label case asis
title case asis
title offset 2
.
x3label
title Fluctuation Plot of Binomial Probability for Correct Alarm
y1label Instrument
x1label Source
tic offset units data
xlimits 1 6
major xtic mark number 6
minor xtic mark number 0
xtic mark offset 1 1
ylimits 1 15
major ytic mark number 15
minor ytic mark number 0
ytic mark offset 1 1
.
line color g75 black
region fill color g75 black
region border color g75 black
.
set fluctuation plot width proportional
fluctuation binomial probability plot correct inst src
Program 3:
skip 25
read ripken.dat y x1 to x4
.
label case asis
tic mark label case asis
title case asis
.
x3label
title Fluctuation Plot for Cal Ripken Mean Batting Average
let string v1 = Low
let string v2 = Middle
let string v3 = Left:sp()High
let string v4 = Low
let string v5 = Middle
let string v6 = Right:sp()High
let igy = group label v1 to v6
let string h1 = Inside
let string h2 = Middlecr()Fastball
let string h3 = Outside
let string h4 = Inside
let string h5 = Middlecr()Curveball
let string h6 = Right
let igx = group label h1 to h6
.
tic offset units data
xlimits 1 6
major xtic mark number 6
minor xtic mark number 0
xtic mark offset 1 1
x1tic mark label format group label
x1tic mark label content igx
ylimits 1 6
major ytic mark number 6
minor ytic mark number 0
ytic mark offset 1 1
y1tic mark label format group label
y1tic mark label content igy
y1tic mark label justification right
.
line color g75 black
region fill color g75 black
region border color g75 black
.
fluctuation mean plot y x2 x1 x4 x3
.
move 50 92
just center
text (Minimum BA: ^statmini, Maximum BA: ^statmaxi)
Date created: 1/6/2009