
# HODGES LEHMANN

Name:
HODGES LEHMANN (LET)
Type:
Let Subcommand
Purpose:
Compute the Hodges-Lehmann location estimate for a variable.
Description:
The mean is the sum of the observations divided by the number of observations. The mean can be heavily influenced by extreme values in the tails of a variable.

Mosteller and Tukey (see Reference section below) define two types of robustness:

1. Resistance means that changing a small part of the data, even by a large amount, does not cause a large change in the estimate.

2. Robustness of efficiency means that the statistic has high efficiency in a variety of situations rather than in any one situation. Efficiency means that the estimate is close to the optimal estimate given that we know what distribution the data come from. A useful measure of efficiency is:

Efficiency = (lowest variance feasible) / (actual variance)

For location estimators, the mean is the optimal estimator for Gaussian data. However, it is not resistant and it does not have robustness of efficiency.
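The efficiency ratio above can be approximated by simulation. The following is a minimal sketch in Python, not part of Dataplot: the function name `simulated_efficiency`, the sample size, the number of replications, and the seed are all illustrative assumptions. It takes the lowest feasible variance for unit-variance Gaussian data to be 1/n and compares that with the observed variance of the mean and of the median.

```python
import random
from statistics import mean, median, pvariance


def simulated_efficiency(estimator, n=50, reps=2000, seed=12345):
    """Estimate (lowest feasible variance) / (actual variance) for N(0, 1) data.

    Assumption: for a Gaussian sample of size n with unit variance, the
    lowest feasible variance of an unbiased location estimate is 1 / n.
    """
    rng = random.Random(seed)
    # Apply the estimator to many independent Gaussian samples.
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    actual_variance = pvariance(estimates)
    return (1.0 / n) / actual_variance


# Both estimators see the same simulated samples because the seed is shared.
eff_mean = simulated_efficiency(mean)
eff_median = simulated_efficiency(median)
print(f"mean:   {eff_mean:.2f}")
print(f"median: {eff_median:.2f}")
```

For Gaussian data the mean should come out close to 1 and the median noticeably lower (its large-sample efficiency is 2/pi, roughly 0.64). Repeating the experiment with a heavy-tailed generator in place of `rng.gauss` would show how the ranking can change, which is what robustness of efficiency is concerned with.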

The Hodges-Lehmann location estimate is based on ranks. This makes it more resistant, as defined above, than the mean. This estimator also has high efficiency for symmetric distributions. It may be less successful with some skewed distributions.

Specifically, the Hodges-Lehmann estimate for location is defined as

$$\hat{\mu} = \mbox{median} \frac{X_i + X_j} {2} \hspace{0.5in} 1 \le i \le j \le n$$
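The definition can be applied directly to a small sample. The following Python sketch is an illustration only, not Dataplot's implementation; the function name `hodges_lehmann` and the two five-point data sets are made up for the example. It forms every pairwise average with i <= j (including each point paired with itself) and takes the median of those averages.

```python
from itertools import combinations_with_replacement
from statistics import mean, median


def hodges_lehmann(x):
    """Median of all pairwise averages (X_i + X_j) / 2 with i <= j."""
    if not x:
        raise ValueError("x must contain at least one observation")
    # combinations_with_replacement yields each pair of positions with
    # i <= j exactly once, so there are n * (n + 1) / 2 averages.
    averages = [(a + b) / 2 for a, b in combinations_with_replacement(x, 2)]
    return median(averages)


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 500.0]  # one observation moved far out

print("clean:       ", mean(clean), hodges_lehmann(clean))
print("contaminated:", mean(contaminated), hodges_lehmann(contaminated))
```

With these data, moving a single observation from 5 to 500 shifts the mean from 3.0 to 102.0, while the Hodges-Lehmann estimate stays at 3.0. This is the resistance property described above.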

Dataplot uses ACM Algorithm 616 (HLQEST, written by John Monahan) to compute the estimate. This is a fast, exact algorithm. One modification is that for n <= 25 Dataplot computes the estimate directly from the definition.
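To see why a fast algorithm is useful for larger samples, note that computing the estimate directly from the definition requires the median of every pairwise average with $i \le j$. The number of such averages is

$$N = \frac{n(n+1)}{2}$$

so n = 25 gives N = 325 averages, n = 100 gives N = 5050, and n = 1000 gives N = 500500. These sample sizes are chosen only for illustration.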

Syntax:
LET <par> = HODGES LEHMANN <y1>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the response variable;
<par> is a parameter where the computed Hodges-Lehmann location estimate is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
LET A = HODGES LEHMANN Y1
LET A = HODGES LEHMANN Y1 SUBSET TAG > 2
Note:
Dataplot statistics can be used in a number of commands. For details, enter

Default:
None
Synonyms:
None
Related Commands:
MEAN = Compute the mean.
MEDIAN = Compute the median.
TRIMMED MEAN = Compute the trimmed mean.
WINSORIZED MEAN = Compute the Winsorized mean.
RANK CORRELATION = Compute the rank correlation between two variables.
STATISTICS PLOT = Generate a statistic versus subset plot.
BOOTSTRAP PLOT = Generate a bootstrap plot.
TABULATE = Perform a tabulation for a specified statistic.
Reference:
John Monahan (1984), "Algorithm 616: Fast Computation of the Hodges-Lehmann Location Estimator," ACM Transactions on Mathematical Software, Vol. 10, No. 3, pp. 265-270.

Rand Wilcox (1997), "Introduction to Robust Estimation and Hypothesis Testing," Academic Press.

Frederick Mosteller and John Tukey (1977), "Data Analysis and Regression: A Second Course in Statistics," Addison-Wesley.

Applications:
Robust Data Analysis
Implementation Date:
2002/07
Program 1:

LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 100
LET Y2 = LOGISTIC RANDOM NUMBERS FOR I = 1 1 100
LET Y3 = CAUCHY RANDOM NUMBERS FOR I = 1 1 100
LET Y4 = DOUBLE EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 100
LET A1 = HODGES LEHMANN Y1
LET A2 = HODGES LEHMANN Y2
LET A3 = HODGES LEHMANN Y3
LET A4 = HODGES LEHMANN Y4

Program 2:

MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
MULTIPLOT SCALE FACTOR 2
X1LABEL DISPLACEMENT 12
.
LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 200
LET Y2 = CAUCHY RANDOM NUMBERS FOR I = 1 1 200
.
BOOTSTRAP SAMPLES 500
BOOTSTRAP HODGES LEHMANN PLOT Y1
X1LABEL B025 = ^B025, B975=^B975
HISTOGRAM YPLOT
X1LABEL
.
BOOTSTRAP HODGES LEHMANN PLOT Y2
X1LABEL B025 = ^B025, B975=^B975
HISTOGRAM YPLOT
.
END OF MULTIPLOT
.
JUSTIFICATION CENTER
MOVE 50 46
TEXT HODGES LEHMANN BOOTSTRAP: CAUCHY
MOVE 50 96
TEXT HODGES LEHMANN BOOTSTRAP: NORMAL


NIST is an agency of the U.S. Commerce Department.

Date created: 07/22/2002
Last updated: 11/16/2015