Extreme Wind Speeds: Overview

Introduction	Extreme value analysis is concerned with statistical inference on extreme values and is of interest in a wide range of fields. Two of the main areas of focus are environmental extremes (e.g., river flow, wind speeds, temperature and rainfall) and engineering (e.g., structural reliability and strength of materials).

Extreme Wind Speeds: Types of Extreme Value Data
There are two primary methods used in practice for obtaining extreme wind data from a series of wind measurements (these methods are commonly used for other types of extreme data as well): the epochal method and the peaks over threshold method. We give a brief discussion of each.
Epochal Method	In the epochal method, we take the most extreme value within each period (epoch) of a specified length. For example, if we collect wind speeds daily, we might take the maximum value for each month. So if we had 60 months of data, our extreme value series would be the 60 monthly maximum wind speeds. Note that these are not necessarily the 60 most extreme points in the daily data. A month that has several high wind speed events still contributes only a single value. Likewise, a month in which all wind speeds are relatively small still contributes its maximum value.
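The epochal method can be sketched in a few lines of Python. This is a minimal illustration rather than the software distributed at this site: the function name `monthly_maxima` and the daily wind speeds are hypothetical, and the data are synthetic values chosen only to show the mechanics.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_maxima(dates, speeds):
    """Return {(year, month): maximum speed} for a series of daily observations."""
    maxima = defaultdict(lambda: float("-inf"))
    for day, speed in zip(dates, speeds):
        key = (day.year, day.month)
        maxima[key] = max(maxima[key], speed)
    return dict(sorted(maxima.items()))


# Hypothetical daily wind speeds for January and February 2001 (59 days).
start = date(2001, 1, 1)
days = [start + timedelta(days=i) for i in range(59)]
speeds = [10.0 + (i * 7) % 23 for i in range(59)]

# One value per month, no matter how many high wind speed days the month contains.
epochal_series = monthly_maxima(days, speeds)
for (year, month), speed in epochal_series.items():
    print(year, month, speed)
```

The length of the resulting series equals the number of months in the record, which is the defining property of the epochal approach.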
Peaks Over Threshold Method	In contrast, for the peaks over threshold method we define a single threshold value. Then any values over that threshold are included in our extreme value series. So for our example above of daily wind speeds collected over 60 months, any of the daily wind speeds above the specified threshold will be included in the extreme value series. Unlike the epochal method, the number of extreme values will not be fixed. In our example, we could have months with no extreme values and we could have other months with multiple extreme values. The number of extreme values will depend on the specific threshold chosen.
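The peaks over threshold selection can be sketched similarly. The function name `peaks_over_threshold`, the wind speeds, and the threshold values below are hypothetical and serve only to show how the size of the extreme value series depends on the chosen threshold.

```python
def peaks_over_threshold(speeds, threshold):
    """Return (index, speed) pairs for every observation strictly above the threshold."""
    return [(i, s) for i, s in enumerate(speeds) if s > threshold]


# Hypothetical daily wind speeds.
speeds = [12.0, 31.5, 8.2, 40.1, 29.9, 35.0, 15.3, 30.0]

# A higher threshold keeps fewer points; a lower threshold keeps more.
high = peaks_over_threshold(speeds, threshold=30.0)
low = peaks_over_threshold(speeds, threshold=25.0)
print(len(high), len(low))
```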

Extreme Wind Speeds: Data Sets
This section focuses primarily on extreme wind speed data. Data sets are provided for both non-hurricane and hurricane-prone regions.
Selected Data Sets for Extreme Wind Speeds	The following data sets for extreme wind speeds are available at this site: (1) Hurricane Wind Speeds; (2) Standardized Non-Hurricane Wind Speeds in the Conterminous United States. An Automated Surface Observing System (ASOS) site list is available for about 180 primary sites listed on http://www.ncdc.noaa.gov/oa/ncdc.html. To access the ASOS site list once you are at the NCDC web site (courtesy of William Brown of the National Climatic Data Center): (1) click on "Start here" along the left margin; (2) click on "Inventories" at the bottom right; (3) click on the fourth link, "Inventories/Surface lists"; (4) click on the second link, "Surface Inventories/Lists". The ASOS site list is the first link. Further information on anemometer elevation history is available. Note 2024/04: The above web site no longer seems to be available. See anemometer elevation history for an alternative site. Several additional archival data sets are available.

Extreme Wind Speeds: Software
This section discusses software for analyzing extreme wind speeds: (1) software for extracting data from ASOS data sets; (2) extreme wind speed maps; (3) general purpose software for analyzing univariate extreme value data; (4) potMax, for estimating the peak value of a stationary process based on a peaks over threshold approach; (5) special purpose archival Fortran and Matlab programs.
ASOS Data Sets	Matlab software is provided for extracting wind speed data from ASOS data sets. The software can extract non-thunderstorm and thunderstorm wind speeds separately.
General Purpose Software for Analyzing Extreme Wind Speeds	This section gives an overview of the approach for analyzing a univariate set of data containing extreme winds. It also provides links to several software programs that can be used for extreme value analysis. When analyzing univariate sets of data consisting of extreme winds, the following tasks typically need to be performed: (1) graph the data; (2) determine an appropriate distributional model for the data; (3) estimate the parameters of the distribution; (4) assess the goodness of fit of the fitted distributional model; (5) use the model to estimate quantities of interest. Later in this section, we discuss software that can be used to perform these tasks.
Graph the Data	The first step in analyzing the data is to graph the data. Useful initial graphs are: (1) A run sequence plot, which shows time dependent patterns in the data. Examples of time dependent patterns include strong autocorrelation, non-constant mean, non-constant variance, and notable outliers. If time dependent patterns are present, they should be addressed before attempting to fit a distributional model. For extreme value analysis, it can be helpful to draw reference lines at certain threshold values. (2) Graphs showing the distributional shape. The most common types of distributional graphs are histograms and kernel density plots.
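A graphical check for autocorrelation can be supplemented with a simple numerical one. The sketch below computes the lag-1 sample autocorrelation of a series; the function name `lag1_autocorrelation` and the two short series are hypothetical. Values near zero are consistent with no strong serial dependence, while values near +1 or -1 suggest a time dependent pattern worth examining in the run sequence plot.

```python
def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation: sum of adjacent deviation products
    divided by the total sum of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denominator = sum(d * d for d in dev)
    numerator = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return numerator / denominator


# Hypothetical series: a steady trend and a strictly alternating pattern.
trend = [float(i) for i in range(1, 11)]
alternating = [1.0, -1.0] * 5

r_trend = lag1_autocorrelation(trend)
r_alternating = lag1_autocorrelation(alternating)
print(r_trend, r_alternating)
```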
Determine an Appropriate Distributional Model	For extreme values, the following are the most commonly used distributions: (1) Extreme Value Type I (Gumbel); (2) Extreme Value Type II (Frechet); (3) Weibull and reverse Weibull; (4) Generalized Pareto; (5) Generalized Extreme Value.
Estimate the Parameters of the Distribution	There are a number of methods for estimating the parameters of a distribution. These include: (1) The Lieblein BLUE method for Extreme Value Type I (Gumbel) estimation. (2) The method of moments. There are refinements to the method of moments, such as the method of modified moments (see Cohen and Whitten (1988), "Parameter Estimation in Reliability and Life Span Models," Marcel Dekker, p. 31 and pp. 341-344) and L moments (see Hosking and Wallis (1997), "Regional Frequency Analysis," Cambridge University Press), that can improve the performance of standard moment estimates. Of the distributions listed above, modified moments estimates are most applicable to the Weibull distribution, while L moments are applicable to the Weibull, Frechet, generalized Pareto, and generalized extreme value distributions. (3) The method of maximum likelihood. Maximum likelihood estimates typically have excellent statistical properties. However, they are known to be problematic for several of the extreme value distributions (in particular, the generalized Pareto and generalized extreme value distributions). (4) The probability plot correlation coefficient (PPCC)/probability plot method. The PPCC/probability plot method works well for the five extreme value distributions considered here. In some cases, using the Kolmogorov-Smirnov plot or Anderson-Darling plot may improve upon the PPCC plot. (5) The method of elemental percentiles (see Castillo, Hadi, Balakrishnan, and Sarabia (2005), "Extreme Value and Related Models with Applications in Engineering and Science", Wiley), which has been used successfully for the generalized Pareto and generalized extreme value distributions. The estimates based on elemental percentiles for these distributions tend to be close to the estimates from the PPCC method. For a given distribution, there may be other specialized approaches for estimating the parameters. For the generalized Pareto distribution, the De Haan and conditional mean exceedance (CME) methods sometimes work well in practice. One issue in developing distributional models for extreme wind data is that we typically want a distributional model for the extreme points (i.e., the points above a given threshold) rather than for the full data set.
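As a minimal sketch of the simplest of these approaches, the standard method of moments for the Extreme Value Type I (Gumbel) distribution for maxima uses the relations mean = location + 0.5772 × scale and variance = (pi^2 / 6) × scale^2. The function names and the synthetic sample below are hypothetical; the sample is built from evenly spaced quantiles of a Gumbel distribution with location 30 and scale 5, so the estimates should land close to those values. This is not the Lieblein BLUE, modified moments, or L moments method.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_moments_fit(maxima):
    """Method of moments estimates (location, scale) for the Gumbel distribution."""
    mean = statistics.fmean(maxima)
    std = statistics.stdev(maxima)
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_ppf(p, location, scale):
    """Percent point function (inverse CDF) of the Gumbel distribution for maxima."""
    return location - scale * math.log(-math.log(p))


# Synthetic sample: 500 evenly spaced quantiles of Gumbel(location=30, scale=5).
n = 500
sample = [gumbel_ppf((i - 0.5) / n, 30.0, 5.0) for i in range(1, n + 1)]

location_hat, scale_hat = gumbel_moments_fit(sample)
print(location_hat, scale_hat)
```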
Assess Goodness of Fit	Once a candidate model has been fit, the next step is to assess the goodness of fit of that model. Some methods for doing this are: (1) The Kolmogorov-Smirnov (KS) goodness of fit test, which can be applied to ungrouped data. The KS test is based on the maximum distance between the empirical cumulative distribution function of the data set and the cumulative distribution function of the theoretical distribution. (2) The Anderson-Darling goodness of fit test, a refinement of the Kolmogorov-Smirnov test. The Anderson-Darling test is generally considered more powerful than the KS test and is, in particular, more sensitive to differences in the tails of the distribution. There are several other variants of the KS test in addition to the Anderson-Darling test. The Cramér-von Mises test is one example. (3) The chi-square goodness of fit test, which can be used for grouped data. If the ungrouped data are available, the KS and Anderson-Darling tests are generally preferred to the chi-square test. (4) The probability plot, which provides a graphical assessment of goodness of fit. (5) A histogram or kernel density plot with the fitted distribution overlaid.
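The KS statistic described above can be sketched as follows. The code computes only the test statistic (the maximum distance between the empirical and theoretical cumulative distribution functions), not a p-value or critical value. The function names, the Gumbel parameters, and the synthetic sample are hypothetical.

```python
import math


def gumbel_cdf(x, location, scale):
    """Cumulative distribution function of the Gumbel distribution for maxima."""
    return math.exp(-math.exp(-(x - location) / scale))


def gumbel_ppf(p, location, scale):
    """Percent point function (inverse CDF) of the Gumbel distribution for maxima."""
    return location - scale * math.log(-math.log(p))


def ks_statistic(sample, cdf):
    """Maximum distance between the empirical CDF of the sample and the given CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Synthetic sample: 20 evenly spaced quantiles of Gumbel(location=30, scale=5).
n = 20
sample = [gumbel_ppf((i - 0.5) / n, 30.0, 5.0) for i in range(1, n + 1)]

d_good = ks_statistic(sample, lambda x: gumbel_cdf(x, 30.0, 5.0))
d_bad = ks_statistic(sample, lambda x: gumbel_cdf(x, 40.0, 5.0))
print(d_good, d_bad)
```

A small statistic indicates close agreement between the data and the candidate model; the model with the wrong location yields a much larger distance.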
Using the Fitted Model	Once an adequate distributional model has been found, it is typically used to estimate quantities of interest. For example: (1) Estimate specific quantiles of the distribution. Quantiles are estimated from the percent point function (also known as the inverse cumulative distribution function). (2) Estimate return intervals and the wind speeds corresponding to a given return interval. The return interval (or mean recurrence interval) of a given wind speed, in years, is defined as the inverse of the probability that the wind speed will be exceeded in any one year: \( R = \frac{1} {1 - F(x)} \) with \( F(x) \) denoting the cumulative distribution function. More often, we would like to compute the wind speed that corresponds to a given mean recurrence interval. Solving the above equation for \( x \) gives \( F(x) = 1 - \frac{1}{R} \), so that \( X_{R} = G(1 - \frac{1}{R}) \) with \( G \) and \( R \) denoting the percent point function and the desired mean recurrence interval, respectively. The above formula is for the case of a set consisting of single yearly maxima. If \( \lambda \) is the mean number of threshold crossings of the extreme speed record per year, the formula is \( X_{R} = G(1 - \frac{1}{\lambda R}) \). See Simiu and Scanlan for a more complete discussion of mean recurrence intervals.
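The return interval formulas above can be sketched in Python. The example uses a Gumbel distribution with hypothetical parameters (location 30, scale 5) purely for illustration; in a peaks over threshold analysis, the percent point function would be that of the distribution fitted to the threshold exceedances. All function names are assumptions made for this sketch.

```python
import math


def gumbel_cdf(x, location, scale):
    """Cumulative distribution function F(x) of the Gumbel distribution for maxima."""
    return math.exp(-math.exp(-(x - location) / scale))


def gumbel_ppf(p, location, scale):
    """Percent point function G(p) of the Gumbel distribution for maxima."""
    return location - scale * math.log(-math.log(p))


def return_level(ppf, interval_years, crossings_per_year=1.0):
    """Wind speed X_R = G(1 - 1/(lambda * R)); lambda = 1 for yearly maxima."""
    return ppf(1.0 - 1.0 / (crossings_per_year * interval_years))


def return_interval(cdf_value, crossings_per_year=1.0):
    """Mean recurrence interval R = 1 / (lambda * (1 - F(x))), in years."""
    return 1.0 / (crossings_per_year * (1.0 - cdf_value))


# Hypothetical fitted Gumbel model for yearly maximum wind speeds.
location, scale = 30.0, 5.0


def fitted_ppf(p):
    return gumbel_ppf(p, location, scale)


# Wind speed with a 50-year mean recurrence interval, then the round trip.
x50 = return_level(fitted_ppf, 50.0)
r_back = return_interval(gumbel_cdf(x50, location, scale))

# Same percent point function with lambda = 2 crossings per year (illustrative only).
x50_two_per_year = return_level(fitted_ppf, 50.0, crossings_per_year=2.0)
print(x50, r_back, x50_two_per_year)
```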
Software for Extreme Wind Speeds	Software is available for performing most of the graphing and distributional modeling tasks described above. For example, Dataplot is a freely available general purpose statistical and data analysis program that supports most of the graphics and distributional modeling capabilities described above. R is a general purpose software environment for statistical computing and graphics. R is widely used in the statistical community and can perform most of the graphics and distributional modeling capabilities described above. Specifically, the Pintar method for the Peaks over Threshold Poisson Process analysis was implemented in R. In particular, the following site documents R packages useful for extreme value analysis. We provide an example of modeling data with the Gumbel distribution using Excel. Note that the above list is not exhaustive. Many commercial statistical/mathematical/spreadsheet programs can be used to analyze extreme value data. The software described above is intended to provide an example of how these analyses can be performed and is not meant to imply that these software programs are the only ones or the best ones for extreme value analysis.
Special Purpose Archival Fortran Programs	Several special purpose archival Fortran and Matlab programs are available: (1) a Fortran-based program for analyzing the hurricane wind speed data sets using the De Haan method; (2) a Fortran-based program for analyzing the daily maximum wind speed data sets using the De Haan method; (3) Fortran-based programs for estimation of along-wind response; (4) Matlab programs for the estimation of peaks from time series.
Date created: 03/05/2004 Last updated: 04/24/2023