1.4.2.9. Fatigue Life of Aluminum Alloy Specimens


Generation
This data set comprises measurements of fatigue life (thousands of cycles until rupture) of rectangular strips of 6061-T6 aluminum sheeting, subjected to periodic loading with a maximum stress of 21,000 psi (pounds per square inch), as reported by Birnbaum and Saunders (1958).
Purpose of Analysis 
The goal of this case study is to select a probabilistic model,
from among several reasonable alternatives, to describe the
dispersion of the measured values of fatigue life.
The original study, in the field of statistical reliability analysis, was concerned with predicting the failure times of a material subjected to a load that varies in time. It was well known that a structure designed to withstand a particular static load may fail sooner than expected under a dynamic load.

If a realistic model for the probability distribution of lifetime can be found, then it can be used to estimate the time by which a part or structure needs to be replaced to guarantee that the probability of failure while it is in service does not exceed some maximum acceptable value, for example 0.1 %.

The chapter of this eHandbook that is concerned with the assessment of product reliability contains additional material on statistical methods used in reliability analysis. This case study is meant to complement that chapter by showing the use of graphical and other techniques in the model selection stage of such an analysis.

When there is no cogent reason to adopt a particular model, or when none of the models under consideration seems adequate for the purpose, one may opt for a nonparametric statistical method, for example to produce tolerance bounds or confidence intervals. A nonparametric method does not rely on the assumption that the data are like a sample from a particular probability distribution that is fully specified up to the values of some adjustable parameters. For example, the Gaussian probability distribution is a parametric model with two adjustable parameters, its mean and its standard deviation.

The price to be paid when using nonparametric methods is a loss of efficiency, meaning that they may require more data for statistical inference than a parametric counterpart would, if applicable. For example, a nonparametric confidence interval may be considerably wider than the interval that would be needed if the underlying distribution could be identified correctly. Such identification is what we will attempt in this case study.
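To make the efficiency cost concrete, the following Python sketch (the case study itself uses R) applies a standard distribution-free result: for n independent observations from a continuous distribution, the sample minimum lies below the p-th quantile with probability 1 - (1 - p)^n. The function names are hypothetical and chosen only for illustration, and the choice of the sample minimum as a one-sided lower tolerance bound is an assumption of this sketch, not a method prescribed by the case study.

```python
import math


def min_sample_size(p: float, confidence: float) -> int:
    """Smallest n such that the sample minimum is, with the given confidence,
    a lower bound exceeded by at least a fraction 1 - p of the population.

    Uses the distribution-free identity P(F(X_min) <= p) = 1 - (1 - p)**n,
    which assumes independent draws from a continuous distribution.
    """
    if not (0.0 < p < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


def achieved_confidence(p: float, n: int) -> float:
    """Confidence that the minimum of n observations lies below the p-quantile."""
    return 1.0 - (1.0 - p) ** n


# Failure probability of 0.1 % (p = 0.001) at 95 % confidence.
n_required = min_sample_size(0.001, 0.95)

# Confidence attainable with the 101 observations of this case study.
conf_101 = achieved_confidence(0.001, 101)

print(f"observations required: {n_required}")
print(f"confidence with n = 101: {conf_101:.3f}")
```

Under these assumptions, a purely nonparametric bound for a 0.1 % failure probability needs thousands of observations, far more than the 101 available here. This is why identifying an adequate parametric model is attractive when the data support one.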
It should be noted (and this is a point that we will stress later in the development of this case study) that the very exercise of selecting a model often contributes substantially to the uncertainty of the conclusions derived after the selection has been made.

Software
The analyses used in this case study can be generated using R code.
Data 
The following data are used for this case study.
 370  1016  1235  1419  1567  1820
 706  1018  1238  1420  1578  1868
 716  1020  1252  1420  1594  1881
 746  1055  1258  1450  1602  1890
 785  1085  1262  1452  1604  1893
 797  1102  1269  1475  1608  1895
 844  1102  1270  1478  1630  1910
 855  1108  1290  1481  1642  1923
 858  1115  1293  1485  1674  1940
 886  1120  1300  1502  1730  1945
 886  1134  1310  1505  1750  2023
 930  1140  1313  1513  1750  2100
 960  1199  1315  1522  1763  2130
 988  1200  1330  1522  1768  2215
 990  1200  1355  1530  1781  2268
1000  1203  1390  1540  1782  2440
1010  1222  1416  1560  1792
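As a starting point for the analysis, the data can be entered and summarized as in the following minimal Python sketch. The case study's own analyses use R; the variable names here are illustrative assumptions, and only the Python standard library is used. The values are entered row by row, exactly as listed above.

```python
import statistics

# Fatigue life, in thousands of cycles until rupture (Birnbaum and Saunders, 1958),
# entered row by row as listed in the Data section.
fatigue_life = [
    370, 1016, 1235, 1419, 1567, 1820,
    706, 1018, 1238, 1420, 1578, 1868,
    716, 1020, 1252, 1420, 1594, 1881,
    746, 1055, 1258, 1450, 1602, 1890,
    785, 1085, 1262, 1452, 1604, 1893,
    797, 1102, 1269, 1475, 1608, 1895,
    844, 1102, 1270, 1478, 1630, 1910,
    855, 1108, 1290, 1481, 1642, 1923,
    858, 1115, 1293, 1485, 1674, 1940,
    886, 1120, 1300, 1502, 1730, 1945,
    886, 1134, 1310, 1505, 1750, 2023,
    930, 1140, 1313, 1513, 1750, 2100,
    960, 1199, 1315, 1522, 1763, 2130,
    988, 1200, 1330, 1522, 1768, 2215,
    990, 1200, 1355, 1530, 1781, 2268,
    1000, 1203, 1390, 1540, 1782, 2440,
    1010, 1222, 1416, 1560, 1792,
]

# Basic numerical summaries of location and dispersion.
n_obs = len(fatigue_life)
sample_mean = statistics.mean(fatigue_life)
sample_median = statistics.median(fatigue_life)
sample_sd = statistics.stdev(fatigue_life)  # sample (n - 1) standard deviation

print(f"n      = {n_obs}")
print(f"min    = {min(fatigue_life)}")
print(f"max    = {max(fatigue_life)}")
print(f"mean   = {sample_mean:.2f}")
print(f"median = {sample_median}")
print(f"sd     = {sample_sd:.2f}")
```

These summaries only describe the sample. They do not by themselves favor any candidate probability model, which is the question the graphical techniques of the case study are meant to address.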