Dataplot Vol 1 Auxiliary Chapter

CME

Name:
    CME
Type:
    Analysis Command
Purpose:
    Estimate the parameters of a generalized Pareto distribution using the conditional mean exceedance (CME) method.
Description:
    The generalized Pareto distribution (GPD) is an asymptotic distribution developed by using the fact that exceedances of a sufficiently high threshold are rare events to which the Poisson distribution applies.

    The cumulative distribution function of the generalized Pareto distribution is

      G(y) = 1 - [1 + (c*y/a)]**(-1/c)    a > 0, (1 + (c*y/a)) > 0

    Here, c is the shape parameter and a is the scale parameter.
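
    As an illustration only, the cumulative distribution function G(y) = 1 - [1 + (c*y/a)]**(-1/c) can be sketched in Python as follows. The function name gpd_cdf is hypothetical and is not part of Dataplot. Two conventions in the sketch are assumptions rather than statements from this page: the c = 0 case is handled with the exponential limit of the formula, and values of y outside the support return 0 or 1.

```python
import math


def gpd_cdf(y, c, a):
    """Generalized Pareto CDF with shape c and scale a (illustrative sketch)."""
    if a <= 0:
        raise ValueError("scale parameter a must be positive")
    if y <= 0:
        # Assumption: the excess Y = X - u is non-negative, so G(y) = 0 here.
        return 0.0
    if c == 0:
        # Limit of the formula as c -> 0: exponential distribution with mean a.
        return 1.0 - math.exp(-y / a)
    base = 1.0 + c * y / a
    if base <= 0:
        # Only reachable for c < 0: y is at or beyond the upper endpoint -a/c.
        return 1.0
    return 1.0 - base ** (-1.0 / c)
```

    For c < 0 the distribution has a finite upper endpoint at y = -a/c, which is why the sketch returns 1 once 1 + (c*y/a) is no longer positive.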

    This equation can be used to represent the conditional cumulative distribution of the excess Y = X - u of the variate X over the threshold u, given X > u for u sufficiently large.

    The cases c > 0, c = 0, and c < 0 correspond respectively to the extreme value type II (Frechet), extreme value type I (Gumbel), and reverse Weibull domains of attraction.

    Given the mean E(Y) and standard deviation sY of the variate Y, the parameters a and c are

      a = 0.5*E(Y)*{1 + [E(Y)/sY]**2}
      c = 0.5*{1 - [E(Y)/sY]**2}
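
    The two moment relations can be evaluated directly from a sample of excesses. The sketch below is illustrative; the function name gpd_moment_estimates is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since this page does not say which form of the standard deviation is intended.

```python
import statistics


def gpd_moment_estimates(excesses):
    """Return (c, a) from the mean and standard deviation of the excesses."""
    mean_y = statistics.mean(excesses)
    # Assumption: sample standard deviation with the n - 1 divisor.
    sd_y = statistics.stdev(excesses)
    ratio_sq = (mean_y / sd_y) ** 2
    a = 0.5 * mean_y * (1.0 + ratio_sq)
    c = 0.5 * (1.0 - ratio_sq)
    return c, a
```

    Note that the two relations imply a/(1 - c) = E(Y), which is a quick consistency check on any pair of estimates.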

    The CME, or mean residual life (MRL), is the expectation of the amount by which a value exceeds a threshold u, conditional on that threshold being attained.

    If the exceedance data are fitted by the GPD model and c < 1, u > 0, and (a + u*c) > 0, then a plot of CME versus u should follow a line with intercept a/(1-c) and slope c/(1-c). The linearity of the CME plot can thus be used as an indicator of the appropriateness of the GPD model and both c and a can be estimated.
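
    Inverting the intercept and slope relations gives c = slope/(1 + slope) and a = intercept/(1 + slope). The following sketch shows the conversion in both directions; the function names are hypothetical and are not Dataplot commands.

```python
def cme_line(c, a):
    """Intercept and slope of the CME-versus-u line for GPD parameters c, a."""
    if c >= 1.0:
        raise ValueError("the CME line requires c < 1")
    return a / (1.0 - c), c / (1.0 - c)


def gpd_from_cme_line(intercept, slope):
    """Recover (c, a) from the intercept and slope of the CME line."""
    # slope = c/(1 - c) is always greater than -1 when c < 1.
    if slope <= -1.0:
        raise ValueError("slope must be greater than -1")
    c = slope / (1.0 + slope)
    a = intercept / (1.0 + slope)
    return c, a
```

    A negative slope therefore corresponds to c < 0, and a positive slope to 0 < c < 1.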

    Note that for the case where c < 0, then gamma = -1/c is the estimate of the shape parameter for the reverse Weibull (SET MINMAX 2 case in Dataplot) distribution.
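
    As a small worked check of gamma = -1/c, the sketch below applies the conversion to the shape estimate printed in the sample output of the Program section (-0.2020854), which reproduces the reported reverse Weibull shape of about 4.9484. The function name is hypothetical.

```python
def reverse_weibull_shape(c):
    """Reverse Weibull shape parameter gamma = -1/c, defined for c < 0."""
    if c >= 0:
        raise ValueError("the reverse Weibull case requires c < 0")
    return -1.0 / c


# Shape estimate taken from the sample output in the Program section.
gamma = reverse_weibull_shape(-0.2020854)
print(round(gamma, 4))
```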

    The CME command performs a least squares fit of the CME versus u data points. It does this as follows:

    1. All points above the user specified threshold are saved and sorted.

    2. Loop through the sorted points from smallest to largest.

    3. For a given point in the loop, set the threshold u equal to that point. Then compute the CME. The CME is the sum of the excesses (point minus threshold) over all points greater than the threshold, divided by the number of points greater than the threshold.
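
    The steps above can be sketched in Python as follows. This is not the Dataplot source code: the function names are hypothetical, the largest point is skipped because no values lie above it, and the unweighted least squares fit is an assumption about details this page does not spell out.

```python
def cme_points(data, threshold):
    """Return (u, CME) pairs for each sorted point above the threshold."""
    # Step 1: keep and sort the points above the user-specified threshold.
    tail = sorted(x for x in data if x > threshold)
    points = []
    # Step 2: loop through the sorted points from smallest to largest.
    for u in tail:
        # Step 3: average excess over u of the points strictly greater than u.
        above = [x for x in tail if x > u]
        if not above:
            continue  # largest value: nothing exceeds it, so no CME
        points.append((u, sum(x - u for x in above) / len(above)))
    return points


def least_squares_line(points):
    """Ordinary least squares fit of CME on u; returns (intercept, slope)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two (u, CME) points")
    mean_u = sum(u for u, _ in points) / n
    mean_m = sum(m for _, m in points) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in points)
    if sxx == 0:
        raise ValueError("all thresholds are identical")
    sxy = sum((u - mean_u) * (m - mean_m) for u, m in points)
    slope = sxy / sxx
    return mean_m - slope * mean_u, slope
```

    For the toy data 1, 2, 3, 4, 5 with a threshold of 0, the CME points fall exactly on a line with slope -0.5, which corresponds to c = slope/(1 + slope) = -1.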

    The NISTIR 5531 document (see the Reference section below) gives the formula for the standard deviation of c.

Syntax:
    CME <y> <SUBSET/EXCEPT/FOR qualification>
    where <y> is the response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    CME Y
    CME Y SUBSET TAG > 0
Note:
    The user specified threshold is determined by entering the following command before the CME command:

      LET THRESHOL = <value>

    If no threshold is specified, then the minimum data value is used as the threshold.

Note:
    The following internal parameters will be saved.

      GAMMA = shape parameter for generalized Pareto distribution
      A = scale parameter for generalized Pareto distribution
      SDGAMMA = standard deviation of GAMMA

    If the absolute value of GAMMA is within a user-specified tolerance of zero, then the following are also saved.

      LOC = location parameter for Gumbel distribution.
      SCALE = scale parameter for Gumbel distribution.

    To specify this tolerance, enter the command

      SET PEAKS OVER THRESHOLD TOLERANCE <value>

    The default tolerance is 0.05.

    If GAMMA is less than zero with an absolute value greater than the above tolerance, then the following are also saved.

      GAMMA2 = shape parameter for reverse Weibull distribution.
      LOC = location parameter for reverse Weibull distribution.
      SCALE = scale parameter for reverse Weibull distribution.

    These estimates for the reverse Weibull and Gumbel distributions are based on moment estimators. The formulas are given on page 3 of NIST Building Science Series 174 (see the Reference section below). Currently, no estimates for the Frechet case (GAMMA > 0) are saved.
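
    The rules in this note for which internal parameters are saved can be summarized in a short sketch. The function name saved_parameters is hypothetical, and treating a value exactly equal to the tolerance as "within" it is an assumption; the default of 0.05 is the one stated above.

```python
def saved_parameters(gamma, tolerance=0.05):
    """List the internal parameter names saved for a given GAMMA estimate."""
    saved = ["GAMMA", "A", "SDGAMMA"]  # always saved
    if abs(gamma) <= tolerance:
        # Near zero: Gumbel location and scale are saved as well.
        saved += ["LOC", "SCALE"]
    elif gamma < 0:
        # Clearly negative: reverse Weibull shape, location, and scale.
        saved += ["GAMMA2", "LOC", "SCALE"]
    # Clearly positive (Frechet case): nothing further is saved.
    return saved
```

    With the sample output in the Program section (GAMMA = -0.2020854), the sketch reports the reverse Weibull parameters GAMMA2, LOC, and SCALE, in agreement with the last lines of that output.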

Note:
    The May, 2005 version added support for generating the output in LaTeX or HTML. Enter

      HELP CAPTURE HTML
      HELP CAPTURE LATEX

    for details. The ASCII output was also modified somewhat. This was a cosmetic change to make the output clearer.

Note:
    The PEAKS OVER THRESHOLD PLOT was added in the 5/2005 version. This plot shows how the estimate of the shape parameter changes as the threshold changes.
Default:
    None.
Synonyms:
    The following are synonyms for CME Y:

      CME GENERALIZED PARETO ESTIMATE Y
      CME GENERALIZED PARETO Y
      CME ESTIMATE Y
Related Commands:
    DEHAAN = Compute the Dehaan estimates for the generalized Pareto distribution.
    CME PLOT = Compute a CME plot.
    GEPPDF = Compute the probability density function for the generalized Pareto distribution.
    PEAKS OVER THRESHOLD PLOT = Generate a peaks over threshold plot.
Reference:
    "Continuous Univariate Distributions: Volume I", 2nd. ed., Johnson, Kotz, and Balakrishnan, John Wiley and Sons, 1994.

    "Estimates of Hurricane Wind Speeds by the "Peaks Over Threshold" Approach", Alan Heckert, Emil Simiu, and Tim Whalen, Journal of Structural Engineering, April, 1998.

    "Extreme Wind Distribution Tails: A "Peaks Over Threshold" Approach", Simiu and Heckert, Journal of Structural Engineering, May, 1996.

    "Assessment of 'peak over threshold' Methods for Estimating Extreme Value Distribution Tails", J. A. Lechner, E. Simiu, N. A. Heckert, Structural Safety, 1993.

    "Estimates of Hurricane Wind Speeds by the 'Peaks Over Threshold' Method", E. Simiu, N. A. Heckert, T. Whalen, NIST Technical Note 1416, February, 1996.

    "Extreme Wind Estimates by the Conditional Mean Exceedance Procedure", J. L. Gross, E. Simiu, N. A. Heckert, J. A. Lechner, NISTIR 5531, April, 1995.

    "Extreme Wind Distribution Tails: A 'Peaks Over Threshold' Approach", E. Simiu, N. A. Heckert, NIST Building Science Series 174, March, 1995.

Applications:
    Extreme Value Analysis
Implementation Date:
    1998/5
    2005/5: Updated the output.
    2005/5: Added support for HTML and Latex output.
    2005/5: Added support for the standard deviation of c.
Program:
     
    SKIP 25
    READ MPOST550.DAT Y
    LET Y2 = SORT Y
    LET THRESHOL = Y2(900)
    CME Y
        
    The following output is generated.
           *************
           **  CME Y  **
           *************
      
      
               CME ESTIMATION FOR THE GENERALIZED PARETO DISTRIBUTION
      
     NUMBER OF OBSERVATIONS                     =      977
     THRESHOLD                                  =    43.36000
     NUMBER OF OBSERVATIONS ABOVE THE THRESHOLD =       77
      
     ESTIMATE OF SHAPE PARAMETER GAMMA          =  -0.2020854
     STANDARD DEVIATION OF GAMMA                =   0.5361708E-01
     SCALE PARAMETER A                          =    16.24223
      
      
     FOR NEGATIVE GAMMA, THE GENERALIZED PARETO IS EQUIVALENT TO
     A REVERSE WEIBULL (SET MINMAX MAX) WITH:
     SHAPE PARAMETER GAMMA                    =    4.948402
     LOCATION PARAMETER                       =    101.7272
     SCALE PARAMETER                          =    48.99753
      
      
     GAMMA, SDGAMMA, AND A WILL BE SAVED AS INTERNAL PARAMETERS.
     THE REVERSE WEIBULL PARAMETERS WILL BE SAVED AS
     THE INTERNAL PARAMETERS GAMMA2, LOC, AND SCALE,  RESPECTIVELY.
        

Date created: 6/5/2001
Last updated: 5/16/2005
Please email comments on this WWW page to alan.heckert@nist.gov.