CME

Name:
CME
Type:
Analysis Command
Purpose:
Estimate the parameters of a generalized Pareto distribution using the conditional mean exceedance (CME) method.
Description:
The generalized Pareto distribution (GPD) is an asymptotic distribution developed by using the fact that exceedances of a sufficiently high threshold are rare events to which the Poisson distribution applies.

The cumulative distribution function of the generalized Pareto distribution is

$$G(y) = 1 - \left[1 + \frac{cy}{a}\right]^{-1/c}, \qquad a > 0, \quad 1 + \frac{cy}{a} > 0$$

Here, c is the shape parameter and a is the scale parameter.

This equation can be used to represent the conditional cumulative distribution of the excess Y = X - u of the variate X over the threshold u, given X > u for u sufficiently large.

The cases c > 0, c = 0, and c < 0 correspond respectively to the extreme value type II (Frechet), extreme value type I (Gumbel), and reverse Weibull domains of attraction.

Given the mean E(Y) and standard deviation $$s_Y$$ of the variate Y, the parameters can be written as

$$a = 0.5 \, E(Y) \left\{1 + \left[E(Y)/s_Y\right]^2\right\}$$

$$c = 0.5 \left\{1 - \left[E(Y)/s_Y\right]^2\right\}$$
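As a minimal sketch, the two moment relations above can be evaluated directly. The function name `gpd_moment_estimates` is hypothetical and is not part of Dataplot; the inputs are the mean and standard deviation of the excesses Y = X - u.

```python
def gpd_moment_estimates(mean_excess, sd_excess):
    """Moment estimates (a, c) of the GPD scale and shape.

    mean_excess and sd_excess are the mean and standard deviation
    of the excesses Y = X - u (illustrative helper, not a Dataplot call).
    """
    ratio_sq = (mean_excess / sd_excess) ** 2
    a = 0.5 * mean_excess * (1.0 + ratio_sq)  # scale
    c = 0.5 * (1.0 - ratio_sq)                # shape
    return a, c


# Excess statistics taken from the sample output on this page:
# mean above threshold minus threshold, and the reported standard deviation.
a, c = gpd_moment_estimates(56.76623 - 43.36000, 10.39647)
```

For the sample output on this page these moment relations give c of about -0.33 and a of about 17.85. They are not expected to match the CME estimates in that output, which come from a least squares fit of the CME points.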

The CME, or mean residual life (MRL), is the expectation of the amount by which a value exceeds a threshold u, conditional on that threshold being attained.

If the exceedance data are fitted by the GPD model and c < 1, u > 0, and (a + u*c) > 0, then a plot of CME versus u should follow a line with intercept a/(1-c) and slope c/(1-c). The linearity of the CME plot can thus be used as an indicator of the appropriateness of the GPD model and both c and a can be estimated.
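The linear form can be checked from the distribution function given earlier. The following is a sketch of that derivation, assuming Y follows the GPD with scale a and shape c < 1, and writing v for a threshold applied to Y (the symbols v, e(v), s, and b are introduced here for illustration only):

$$e(v) = E(Y - v \mid Y > v) = \frac{\int_v^{\infty} [1 + (cy/a)]^{-1/c} \, dy}{[1 + (cv/a)]^{-1/c}} = \frac{a + cv}{1 - c} = \frac{a}{1-c} + \frac{c}{1-c} \, v$$

For c < 0 the upper limit of the integral is -a/c rather than infinity, and the result is unchanged. If s and b denote the fitted slope and intercept, inverting the two relations gives

$$c = \frac{s}{1 + s}, \qquad a = \frac{b}{1 + s}$$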

Note that for the case where c < 0, then $$\gamma$$ = -1/c is the estimate of the shape parameter for the reverse Weibull (SET MINMAX 2 case in Dataplot) distribution.

The CME command performs a least squares fit of the CME versus u data points. It does this as follows:

1. All points above the user specified threshold are saved and sorted.

2. Loop through the sorted points from smallest to largest.

3. For a given point in the loop, set the threshold u equal to that point and compute the CME: the sum of (x - u) over the points x greater than u, divided by the number of such points.

4. Fit a least squares line to the resulting (u, CME) points. The estimates of c and a follow from the fitted slope and intercept.
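The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not Dataplot's implementation: the function names are hypothetical, the largest point is skipped because nothing exceeds it, the fit is unweighted, and u is measured from the user-specified threshold so that the intercept refers to the scale at that threshold.

```python
def cme_points(data, threshold):
    """Return (u, CME) pairs, using each sorted exceedance as a threshold u."""
    pts = sorted(x for x in data if x > threshold)
    pairs = []
    for u in pts:
        above = [x for x in pts if x > u]
        if not above:
            break  # largest point: no values exceed it
        pairs.append((u, sum(x - u for x in above) / len(above)))
    return pairs


def fit_line(pairs):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(pairs)
    mean_u = sum(u for u, _ in pairs) / n
    mean_e = sum(e for _, e in pairs) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in pairs)
    sxy = sum((u - mean_u) * (e - mean_e) for u, e in pairs)
    slope = sxy / sxx
    return slope, mean_e - slope * mean_u


def cme_estimates(data, threshold):
    """Estimate (a, c) from slope = c/(1-c) and intercept = a/(1-c)."""
    # Assumption: u is shifted so it is measured from the base threshold.
    pairs = [(u - threshold, e) for u, e in cme_points(data, threshold)]
    slope, intercept = fit_line(pairs)
    c = slope / (1.0 + slope)
    a = intercept / (1.0 + slope)
    return a, c
```

For evenly spaced data such as the integers 1 to 100 with threshold 0, the CME points fall exactly on a line with slope -0.5, so the sketch returns c = -1, the uniform-distribution case of the GPD.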

The NISTIR 5531 document (see the References section below) gives the formula for the standard deviation of c.

Syntax:
CME MLE <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
CME MLE Y
CME MLE Y SUBSET TAG > 0
Note:
The user specified threshold is determined by entering the following command before the CME command:

LET THRESHOL = <value>

If no threshold is specified, then the minimum data value is used as the threshold.
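The threshold logic can be sketched as below. The helper name `exceedances` is hypothetical. The strict inequality is an assumption, chosen because the sample program on this page sets the threshold to the 900th of 977 sorted values and the output reports 77 observations above it.

```python
def exceedances(data, threshold=None):
    """Return the sorted values strictly above the threshold.

    If no threshold is given, the minimum data value is used,
    mirroring the default described above.
    """
    if threshold is None:
        threshold = min(data)
    return sorted(x for x in data if x > threshold)


# Analogue of LET THRESHOL = Y2(900) for a 977-point sample
# (Dataplot indices are 1-based, Python indices are 0-based).
y = list(range(1, 978))          # placeholder data, not the sample file
threshold = sorted(y)[900 - 1]
above = exceedances(y, threshold)
```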

Note:
The following internal parameters will be saved.

 GAMMA   = shape parameter for generalized Pareto distribution
 A       = scale parameter for generalized Pareto distribution
 SDGAMMA = standard deviation of GAMMA

If the absolute value of GAMMA is within a user-specified tolerance of zero, then the following are also saved.

 LOC   = location parameter for Gumbel distribution.
 SCALE = scale parameter for Gumbel distribution.

To specify this tolerance, enter the command

SET PEAKS OVER THRESHOLD TOLERANCE <value>

The default tolerance is 0.05.

If GAMMA is less than zero with an absolute value greater than the above tolerance, then the following are also saved.

 GAMMA2 = shape parameter for reverse Weibull distribution.
 LOC    = location parameter for reverse Weibull distribution.
 SCALE  = scale parameter for reverse Weibull distribution.

These estimates for the reverse Weibull and Gumbel distributions are based on moment estimators. The formulas are given on page 3 of NIST Building Science Series 174 (see the Reference section below). Currently, no estimates for the Frechet case (GAMMA > 0) are saved.
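The tolerance rule in this note can be sketched as a small decision function. The name `classify_tail` is hypothetical, and treating a value exactly equal to the tolerance as the Gumbel case is an assumption. Only the shape conversion -1/GAMMA is shown; the location and scale estimates use the moment formulas from NIST Building Science Series 174 and are not reproduced here.

```python
def classify_tail(gamma, tolerance=0.05):
    """Return (case, gamma2) for a fitted GPD shape parameter.

    gamma2 is the reverse Weibull shape -1/gamma, or None when
    the reverse Weibull case does not apply.
    """
    if abs(gamma) <= tolerance:
        return "Gumbel", None
    if gamma < 0:
        return "reverse Weibull", -1.0 / gamma
    return "Frechet", None  # no additional estimates are saved in this case


case, gamma2 = classify_tail(-0.20209)
```

With the rounded shape estimate -0.20209 from the sample output on this page, the sketch selects the reverse Weibull case with a shape of about 4.948.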

Note:
The May, 2005 version added support for generating the output in Latex or HTML. Enter

HELP CAPTURE HTML
HELP CAPTURE LATEX

for details. The ASCII output was also modified somewhat. This was a cosmetic change to make the output clearer.

Note:
The PEAKS OVER THRESHOLD PLOT was added in the 5/2005 version. This plot shows how the estimate of the shape parameter changes as the threshold changes.
Default:
None.
Synonyms:
None
Related Commands:
 DEHAAN                    = Compute the Dehaan estimates for the generalized Pareto distribution.
 CME PLOT                  = Compute a CME plot.
 GEPPDF                    = Compute the probability density function for the generalized Pareto distribution.
 PEAKS OVER THRESHOLD PLOT = Generate a peaks over threshold plot.
Reference:
Johnson, Kotz, and Balakrishnan (1994), "Continuous Univariate Distributions: Volume I," 2nd. ed., John Wiley and Sons.

Heckert, Simiu, and Whalen (1998), "Estimates of Hurricane Wind Speeds by the 'Peaks Over Threshold' Approach," Journal of Structural Engineering.

Simiu and Heckert (1996), "Extreme Wind Distribution Tails: A 'Peaks Over Threshold' Approach," Journal of Structural Engineering.

Lechner, Simiu, and Heckert (1993), "Assessment of 'peak over threshold' Methods for Estimating Extreme Value Distribution Tails," Structural Safety.

Simiu, Heckert, and Whalen (1996), "Estimates of Hurricane Wind Speeds by the 'Peaks Over Threshold' Method," NIST Technical Note 1416.

Gross, Simiu, Heckert, and Lechner (1995), "Extreme Wind Estimates by the Conditional Mean Exceedance Procedure," NISTIR 5531.

Simiu and Heckert (1995), "Extreme Wind Distribution Tails: A 'Peaks Over Threshold' Approach," NIST Building Science Series 174.

Applications:
Extreme Value Analysis
Implementation Date:
1998/5
2005/5: Updated the output.
2005/5: Added support for HTML and Latex output.
2005/5: Added support for the standard deviation of c.
Program:

SKIP 25
SET WRITE DECIMALS 5
LET Y2 = SORT Y
LET THRESHOL = Y2(900)
CME MLE Y

The following output is generated.
            Generalized Pareto Parameter Estimation (CME)
(Maximum Case)

Summary Statistics (Full Data Set):
Number of Observations:                    977
Sample Mean:                               7.81898
Sample Standard Deviation:                 17.76409
Sample Minimum:                            0.00000
Sample Maximum:                            90.04000

Summary Statistics for
Observations Above Threshold:
Threshold:                                 43.36000
Number of Observations Above Threshold:    77
Sample Mean:                               56.76623
Sample Standard Deviation:                 10.39647

CME Parameter Estimates:
Location Parameter:                        43.36000
Scale Parameter:                           16.24223
Shape Parameter (Gamma):                   -0.20209
Standard Deviation of Gamma:               0.05361
Log-likelihood:                            -0.8068094E+02
AIC:                                        0.1673619E+03
AICc:                                       0.1676907E+03
BIC:                                        0.1743933E+03

For negative Gamma, the generalized Pareto
is equivalent to a reverse Weibull
(SET MINMAX MAX) with:
Shape Parameter (Gamma):                   4.94840
Location Parameter:                        101.72727
Scale Parameter:                           48.99755
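The information criteria in the output above can be checked by hand. The sketch below assumes the usual definitions of AIC, AICc, and BIC, with n = 77 observations above the threshold and k = 3 fitted parameters. The value k = 3 is inferred from the printed numbers, not stated in the output, and the function name is hypothetical.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, AICc, BIC) using the standard definitions."""
    aic = 2.0 * n_params - 2.0 * loglik
    # Small-sample correction term for AICc.
    aicc = aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    bic = n_params * math.log(n_obs) - 2.0 * loglik
    return aic, aicc, bic


# Log-likelihood and exceedance count from the output above; k = 3 is assumed.
aic, aicc, bic = information_criteria(-80.68094, 3, 77)
```

Under these assumptions the three values agree with the printed AIC, AICc, and BIC to the precision shown.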



Date created: 06/05/2001
Last updated: 10/13/2015