1.4.2.9. Fatigue Life of Aluminum Alloy Specimens

## Graphical Output and Interpretation

Goal: The goal of this analysis is to select a probabilistic model to describe the dispersion of the measured values of fatigue life of specimens of an aluminum alloy described in [1.4.2.9.1], from among several reasonable alternatives.

Initial Plots of the Data: Simple diagrams can be very informative about location and spread, and can help detect possibly anomalous data values or particular patterns (clustering, for example). These include dot-charts, boxplots, and histograms. Since building an effective histogram requires a choice of bin size, and this choice can be influential, one may also wish to examine a non-parametric estimate of the underlying probability density.

These several plots variously show that the measurements range from a value slightly greater than 350,000 to slightly less than 2,500,000 cycles. The boxplot suggests that the largest measured value may be an outlier.
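A non-parametric density estimate of the kind mentioned above can be sketched with a Gaussian kernel density estimator. This is a minimal illustration, not the handbook's own computation: the function names are hypothetical, the bandwidth uses the normal reference rule (one common default among several), and the sample is a synthetic stand-in because the measured values are not reproduced in this section.

```python
import math
import random

def normal_reference_bandwidth(sample):
    """Normal reference rule h = 1.06 * sd * n^(-1/5); an assumed default."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates the kernel density estimate at x."""
    scale = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / scale

    return density

# Synthetic stand-in for the 101 measurements (thousands of cycles); NOT the real data.
rng = random.Random(0)
sample = [rng.gauss(1401.0, 389.0) for _ in range(101)]

h = normal_reference_bandwidth(sample)
density = gaussian_kde(sample, h)

# Sanity check: a density estimate should integrate to about 1 (Riemann sum, step 10).
area = sum(density(x) for x in range(-1000, 4001, 10)) * 10
```

Unlike a histogram, the estimate does not depend on where bin edges fall, although the choice of bandwidth plays a comparable role.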

A recommended first step is to check consistency between the data and what is to be expected if the data were a sample from a particular probability distribution. Knowledge about the underlying properties of materials and of relevant industrial processes typically offers clues as to the models that should be entertained. Graphical diagnostic techniques can be very useful at this exploratory stage: foremost among these, for univariate data, is the quantile-quantile plot, or QQ-plot (Wilk and Gnanadesikan, 1968).

Each data point is represented by one point in the QQ-plot. The ordinate of each of these points is one data value; if this data value happens to be the kth order statistic in the sample (that is, the kth smallest value), then the corresponding abscissa is the "typical" value that the kth smallest value should have in a sample of the same size as the data, drawn from a particular distribution. If $F$ denotes the cumulative probability distribution function of interest, and the sample comprises $n$ values, then $F^{-1}[(k - 1/2) / (n + 1/2)]$ is a reasonable choice for that "typical" value, because it is an approximation to the median of the kth order statistic in a sample of size $n$ from this distribution.
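The construction just described can be sketched in a few lines. The helper name `qq_points` and the five illustrative values are assumptions made for this example; the kth order statistic is taken as the kth value in ascending order, and the plotting position is the one given in the text.

```python
from statistics import NormalDist

def qq_points(data, dist=NormalDist()):
    """Pair each order statistic with F^{-1}[(k - 1/2) / (n + 1/2)], k = 1..n.

    Returns (abscissa, ordinate) pairs: theoretical quantile, observed value.
    """
    ordered = sorted(data)
    n = len(ordered)
    return [
        (dist.inv_cdf((k - 0.5) / (n + 0.5)), x)
        for k, x in enumerate(ordered, start=1)
    ]

# Hypothetical values, for illustration only.
data = [1.2, -0.4, 0.3, 2.1, -1.0]
points = qq_points(data)
```

Passing a different `NormalDist(mu, sigma)` only shifts and rescales the abscissas, so the straight-line pattern is unaffected by the choice of Gaussian parameters.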

The following figure shows a QQ-plot of our data relative to the Gaussian (or normal) probability distribution. If the data matched expectations perfectly, then the points would all fall on a straight line.

In practice, one needs to gauge whether the deviations from such perfect alignment are commensurate with the natural variability associated with sampling. This can easily be done by examining how variable QQ-plots of samples from the target distribution may be.

The following figure shows, superimposed on the QQ-plot of the data, the QQ-plots of 99 samples of the same size as the data, drawn from a Gaussian distribution with the same mean and standard deviation as the data.

The fact that the cloud of QQ-plots corresponding to 99 samples from the Gaussian distribution effectively covers the QQ-plot for the data suggests that the data are not inconsistent with the Gaussian model: departures from a straight line as large as those observed arise by chance, in Gaussian samples of this size, more often than 1 time in 100.
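The simulation behind such a cloud can be sketched as follows. This is an illustration under stated assumptions rather than the handbook's procedure: the function names are hypothetical, the data are a synthetic stand-in for the measurements, and the envelope is pointwise (the smallest and largest simulated value of each order statistic), which mirrors the visual check but is not a formal test.

```python
import random
from statistics import mean, stdev

def simulation_envelope(data, n_sim=99, seed=1):
    """Pointwise min and max of each order statistic over n_sim Gaussian samples."""
    rng = random.Random(seed)
    n = len(data)
    mu, sigma = mean(data), stdev(data)
    sims = [sorted(rng.gauss(mu, sigma) for _ in range(n)) for _ in range(n_sim)]
    lower = [min(s[k] for s in sims) for k in range(n)]
    upper = [max(s[k] for s in sims) for k in range(n)]
    return lower, upper

def fraction_inside(data, lower, upper):
    """Share of the sorted data values lying within the simulated envelope."""
    ordered = sorted(data)
    inside = sum(lo <= x <= hi for x, lo, hi in zip(ordered, lower, upper))
    return inside / len(ordered)

# Synthetic stand-in for the 101 measurements (thousands of cycles); NOT the real data.
rng = random.Random(0)
data = [rng.gauss(1401.0, 389.0) for _ in range(101)]

lower, upper = simulation_envelope(data)
covered = fraction_inside(data, lower, upper)
```

A value of `covered` close to 1 corresponds to the situation described above, in which the simulated QQ-plots effectively cover the QQ-plot of the data.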

This proves nothing, of course, because even the rarest of events may happen. However, it is commonly taken to be indicative of an acceptable fit for general purposes. In any case, one may naturally wonder if an alternative model might not provide an even better fit.

The provenance of the data (they portray the strength of a material) strongly suggests examining alternative models, because in many studies of reliability non-Gaussian models tend to be more appropriate than Gaussian models.

Candidate Distributions: There are many probability distributions that could reasonably be entertained as candidate models for the data. However, we will restrict ourselves to the following four, because they have proven useful in reliability studies: Gaussian, gamma, Birnbaum-Saunders, and 3-parameter Weibull.

Approach: A very simple approach amounts to comparing QQ-plots of the data for the candidate models under consideration. This typically involves first fitting the models to the data, for example by the method of maximum likelihood [1.3.6.5.2].

The maximum likelihood estimates, with fatigue life expressed in thousands of cycles, are the following:

• Gaussian: mean 1401, standard deviation 389
• Gamma: shape 11.85, rate 0.00846
• Birnbaum-Saunders: shape 0.310, scale 1337
• 3-parameter Weibull: location 181, shape 3.43, scale 1357

The following figure shows how closely the best-fitting probability densities of the four distributions approximate the non-parametric probability density estimate. This comparison, however, takes into account neither the fact that our sample is fairly small (101 measured values), nor the fact that the fitted models have been estimated from the same data from which the non-parametric estimate was derived.

These limitations notwithstanding, it is worth examining the corresponding QQ-plots, shown below, which suggest that the Gaussian and the 3-parameter Weibull may be the best models.

Model Selection: A more careful comparison of the merits of the alternative models needs to take into account the fact that the 3-parameter Weibull model, precisely because it has three parameters, may be intrinsically more flexible than the others, which all have only two adjustable parameters.

Two criteria can be employed for a formal comparison: Akaike's Information Criterion (AIC), and the Bayesian Information Criterion (BIC) (Hastie et al., 2001). The smaller the value of either model selection criterion, the better the model:

```
      AIC   BIC
GAU  1495  1501
GAM  1499  1504
BS   1507  1512
WEI  1498  1505
```

On this basis (and according to both AIC and BIC), there seems to be no cogent reason to replace the Gaussian model by any of the other three. The values of BIC can also be used to derive an approximate answer to the question of how strongly the data may support each of these models. Doing this involves the application of Bayesian statistical methods [8.1.10].
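For reference, both criteria penalize the maximized log-likelihood by the number of fitted parameters: AIC = 2p - 2 ln L and BIC = p ln(n) - 2 ln L. The sketch below applies these standard definitions to the Gaussian model, using the fitted standard deviation (389) and sample size (101) quoted in this section; the function names are hypothetical. Because the inputs are rounded, the results agree with the tabulated GAU values only approximately.

```python
import math

def gaussian_max_loglik(n_obs, sigma_mle):
    """Maximized Gaussian log-likelihood: -(n/2) * (ln(2*pi*sigma^2) + 1).

    Holds when sigma_mle is the maximum likelihood estimate (divisor n).
    """
    return -0.5 * n_obs * (math.log(2.0 * math.pi * sigma_mle ** 2) + 1.0)

def aic(loglik, n_params):
    """Akaike's Information Criterion."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion."""
    return n_params * math.log(n_obs) - 2.0 * loglik

n = 101          # sample size stated in the text
sigma = 389.0    # fitted Gaussian standard deviation (thousands of cycles)

loglik_gau = gaussian_max_loglik(n, sigma)
aic_gau = aic(loglik_gau, n_params=2)
bic_gau = bic(loglik_gau, n_params=2, n_obs=n)
```

The same two functions apply to the other candidates once their maximized log-likelihoods are available, with `n_params=3` for the 3-parameter Weibull, which is how its extra flexibility is penalized.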

We start from an a priori assignment of equal probabilities to all four models, indicating that we have no reason to favor one over another at the outset, and then update these probabilities based on the measured values of lifetime. The updated probabilities of the four models, called their posterior probabilities, are approximately proportional to exp(-BIC(GAU)/2), exp(-BIC(GAM)/2), exp(-BIC(BS)/2), and exp(-BIC(WEI)/2). The values are 76 % for GAU, 16 % for GAM, 0.27 % for BS, and 7.4 % for WEI.
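The update from equal prior probabilities can be sketched by normalizing the weights exp(-BIC/2). The function name is hypothetical, and the BIC values are the rounded ones from the table, so the resulting percentages differ somewhat from those quoted above, which presumably rest on unrounded values.

```python
import math

# BIC values as tabulated in this section (rounded to integers).
bic_values = {"GAU": 1501, "GAM": 1504, "BS": 1512, "WEI": 1505}

def posterior_from_bic(bic_by_model):
    """Approximate posterior model probabilities under equal prior probabilities."""
    best = min(bic_by_model.values())
    # Subtracting the smallest BIC avoids underflow and leaves the ratios unchanged.
    weights = {m: math.exp(-(b - best) / 2.0) for m, b in bic_by_model.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

posterior = posterior_from_bic(bic_values)
for model, prob in posterior.items():
    print(f"{model}: {100 * prob:.2f} %")
```

Since only differences in BIC matter, a gap of about 2 units between two models corresponds to a ratio of roughly e, or about 2.7, in their posterior probabilities.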

One possible use for the selected model is to answer the question of the age in service by which a part or structure needs to be replaced to guarantee that the probability of failure does not exceed some maximum acceptable value, for example 0.1 %. The answer to this question is the 0.1st percentile of the fitted distribution, that is $G^{-1}(0.001) = 198$ thousand cycles, where, in this case, $G^{-1}$ denotes the inverse of the fitted Gaussian cumulative distribution function.
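As a quick check, the percentile can be computed from the fitted Gaussian parameters quoted earlier (mean 1401, standard deviation 389, in thousands of cycles). Since these are rounded, the result is expected to agree with the value in the text only to within about one thousand cycles.

```python
from statistics import NormalDist

# Fitted Gaussian model, using the rounded estimates quoted in this section.
fitted = NormalDist(mu=1401.0, sigma=389.0)

# 0.1st percentile: the age by which 0.1 % of specimens are expected to have failed.
x_001 = fitted.inv_cdf(0.001)
print(f"G^-1(0.001) is about {x_001:.0f} thousand cycles")
```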

To assess the uncertainty of this estimate one may employ the statistical bootstrap [1.3.3.4]. In this case, this involves drawing a suitably large number of bootstrap samples from the data, and for each of them applying the model fitting and model selection exercise described above, ending with the calculation of $G^{-1}(0.001)$ for the best model (which may vary from sample to sample).

The bootstrap samples should be of the same size as the data, with each being drawn uniformly at random from the data, with replacement. This process, based on 5,000 bootstrap samples, yielded a 95 % confidence interval for the 0.1st percentile ranging from 40 to 366 thousand cycles. The large uncertainty is not surprising given that we are attempting to estimate the largest value that is exceeded with probability 99.9 %, based on a sample comprising only 101 measured values.
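The resampling scheme can be sketched as below. Two simplifications are assumed and should be kept in mind: the Gaussian model is refitted to every bootstrap sample instead of repeating the full model selection, and the data are a synthetic stand-in for the measurements, so the interval produced is not comparable to the one reported above. Function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean, pstdev

def gaussian_quantile(sample, p=0.001):
    """p-th quantile of the Gaussian fitted by maximum likelihood to `sample`."""
    return NormalDist(fmean(sample), pstdev(sample)).inv_cdf(p)

def bootstrap_interval(data, stat, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Each bootstrap sample: n draws from the data, uniformly, with replacement.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    alpha = 1.0 - level
    lo_idx = round(alpha / 2.0 * n_boot)
    hi_idx = round((1.0 - alpha / 2.0) * n_boot) - 1
    return reps[lo_idx], reps[hi_idx]

# Synthetic stand-in for the 101 measurements (thousands of cycles); NOT the real data.
rng = random.Random(0)
data = [rng.gauss(1401.0, 389.0) for _ in range(101)]

point = gaussian_quantile(data)
low, high = bootstrap_interval(data, gaussian_quantile)
```

To follow the procedure in the text more closely, `stat` would fit all four candidate models to each bootstrap sample, pick the one with the smallest criterion value, and return that model's 0.1st percentile.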

Prediction Intervals: One more application in this analysis is to evaluate prediction intervals for the fatigue life of the aluminum alloy specimens. For example, if we were to test three new specimens using the same process, we would want to know a number of cycles that, with 95 % confidence, all three specimens will survive. That is, we need to find a statistical interval [L, ∞) that contains the fatigue life of all three future specimens with 95 % confidence. The desired interval is a one-sided, lower 95 % prediction interval. Since tables of factors for constructing L are widely available for normal models, we use the results corresponding to the normal model here for illustration. Specifically, L is computed as

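$$L = \bar{x} - r\,s$$

where $\bar{x}$ and $s$ are the sample mean and sample standard deviation of the data, and the factor $r$ is given in Table A.14 of Hahn and Meeker (1991) or can be obtained from an R program.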
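When the table is not at hand, a factor of this kind can be approximated by simulation. The sketch below assumes that the lower bound has the usual normal-theory form, sample mean minus r times the sample standard deviation, and that r is chosen so that all m future values exceed the bound with the stated confidence. It is a Monte Carlo approximation with a hypothetical function name, not a substitute for the tabulated values.

```python
import math
import random

def simulate_factor(n=101, m=3, conf=0.95, n_rep=20000, seed=1):
    """Monte Carlo estimate of r with P(all m future values >= xbar - r*s) = conf.

    Uses the fact that (xbar - min(future)) / s has the same distribution for
    every Gaussian mean and standard deviation, so standard normals suffice.
    """
    rng = random.Random(seed)
    pivots = []
    for _ in range(n_rep):
        past = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(past) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in past) / (n - 1))
        future_min = min(rng.gauss(0.0, 1.0) for _ in range(m))
        pivots.append((xbar - future_min) / s)
    pivots.sort()
    # The conf-quantile of the pivot is the required factor.
    return pivots[round(conf * n_rep) - 1]

r_hat = simulate_factor()
print(f"simulated factor r is about {r_hat:.2f}")
```

The simulated factor carries Monte Carlo error in the second decimal place; increasing `n_rep` reduces it at the cost of run time.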