1. Exploratory Data Analysis
1.2. EDA Assumptions
1.2.5. Consequences


Distributional Analysis  Scientists and engineers routinely use the mean (average) to estimate the "middle" of a distribution. It is less well known that the variability and noisiness of the mean as a location estimator are intrinsically linked to the underlying distribution of the data. For certain distributions, the mean is a poor choice. For any given distribution, there exists an optimal choice, that is, the estimator with minimum variability/noisiness. This optimal choice may be, for example, the median, the midrange, the midmean, the mean, or something else. The implication is to "estimate" the distribution first and then, based on the distribution, choose the optimal estimator. The resulting engineering parameter estimators will have less variability than if this approach is not followed.
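The dependence of estimator variability on the underlying distribution can be illustrated with a small simulation. The following is a minimal sketch, not part of the original text: it assumes NumPy is available, and the sample size, number of replications, seed, choice of distributions, and all function names (`midrange`, `midmean`, `estimator_variances`) are illustrative assumptions. The midmean is approximated here as the mean of the middle half of the sorted sample.

```python
import numpy as np

def midrange(x, axis=-1):
    # Average of the smallest and largest observations.
    return 0.5 * (x.min(axis=axis) + x.max(axis=axis))

def midmean(x, axis=-1):
    # Mean of the middle half of the sorted sample
    # (index-based approximation to the interquartile mean).
    xs = np.sort(x, axis=axis)
    n = xs.shape[axis]
    lo, hi = n // 4, n - n // 4
    return np.take(xs, np.arange(lo, hi), axis=axis).mean(axis=axis)

ESTIMATORS = {
    "mean": np.mean,
    "median": np.median,
    "midrange": midrange,
    "midmean": midmean,
}

# Symmetric distributions centered at 0 (illustrative choices).
SAMPLERS = {
    "normal": lambda rng, size: rng.normal(0.0, 1.0, size),
    "uniform": lambda rng, size: rng.uniform(-1.0, 1.0, size),
    "double exponential": lambda rng, size: rng.laplace(0.0, 1.0, size),
}

def estimator_variances(sampler, n=100, reps=2000, seed=0):
    """Simulated variance of each location estimator over `reps` samples of size `n`."""
    rng = np.random.default_rng(seed)
    samples = sampler(rng, (reps, n))  # one sample per row
    return {name: float(np.var(est(samples, axis=1)))
            for name, est in ESTIMATORS.items()}

for dist_name, sampler in SAMPLERS.items():
    variances = estimator_variances(sampler)
    best = min(variances, key=variances.get)
    print(dist_name, {k: round(v, 5) for k, v in variances.items()}, "lowest:", best)
```

Under these assumptions, the mean should show the lowest simulated variance for normal data and the midrange for uniform data, while for double exponential data the median should be less variable than the mean. The exact figures depend on the sample size, number of replications, and seed.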
Case Studies  The airplane glass failure case study gives an example of determining an appropriate distribution and estimating the parameters of that distribution. The uniform random numbers case study gives an example of determining a more appropriate centrality parameter for a nonnormal distribution.  
Other consequences that flow from problems with distributional assumptions are:  
Distribution 


Model 
 
Process 
