1. Exploratory Data Analysis
1.1. EDA Introduction


What are the EDA Goals?

Primary and Secondary Goals

The primary goal of EDA is to maximize the analyst's insight into a data set and into the underlying structure of a data set, while providing all of the specific items that an analyst would want to extract from a data set, such as:
  1. a good-fitting, parsimonious model
  2. a list of outliers
  3. a sense of robustness of conclusions
  4. estimates for parameters
  5. uncertainties for those estimates
  6. a ranked list of important factors
  7. conclusions as to whether individual factors are statistically significant
  8. optimal settings
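
As a rough illustration of how a few of the items above (a parsimonious model, parameter estimates, their uncertainties, and a list of outliers) might be produced in practice, the following minimal sketch fits a straight line by ordinary least squares. The data values, the function name `fit_line`, and the rule of flagging residuals larger than two residual standard deviations are all assumptions made for this example; they are not taken from the handbook.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns the estimates, their standard errors, the residuals,
    and the residual standard deviation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    resid_sd = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": resid_sd * math.sqrt(1.0 / n + mean_x ** 2 / sxx),
        "se_slope": resid_sd / math.sqrt(sxx),
        "residuals": residuals,
        "resid_sd": resid_sd,
    }

# Hypothetical data: roughly y = 1 + 2x, with one deliberately odd point at x = 8.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 25.0, 19.1, 20.9]

fit = fit_line(x, y)

# Illustrative screening rule (an assumption, not a standard):
# flag points whose residual exceeds two residual standard deviations.
outlier_x = [xi for xi, r in zip(x, fit["residuals"])
             if abs(r) > 2 * fit["resid_sd"]]

print("slope     = %.3f +/- %.3f" % (fit["slope"], fit["se_slope"]))
print("intercept = %.3f +/- %.3f" % (fit["intercept"], fit["se_intercept"]))
print("flagged x values:", outlier_x)
```

Note that the odd point also pulls the fitted line toward itself and inflates the residual standard deviation, which is one reason a sense of the robustness of conclusions appears in the list alongside the estimates themselves.
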

Insight into the Data

Insight implies detecting and uncovering underlying structure in the data. Such underlying structure may not be encapsulated in the list of items above; such items serve as the specific targets of an analysis, but the real insight and "feel" for a data set comes as the analyst judiciously probes and explores the various subtleties of the data. The "feel" for the data comes almost exclusively from the application of various graphical techniques, the collection of which serves as the window into the essence of the data. Graphics are irreplaceable: there are no quantitative analogues that will give the same insight as well-chosen graphics.

To get a "feel" for the data, it is not enough for the analyst to know what is in the data; the analyst must also know what is not in the data. The only way to do that is to draw on the analyst's own pattern-recognition and comparative abilities, applied through a series of judicious graphical techniques.
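
The point that summary numbers alone can hide structure can be sketched with two of the four data sets from Anscombe's quartet (Anscombe, 1973), which are brought in here purely as an illustration and are not part of this section. The helper names `fit_line`, `residual_signs`, and `sign_changes` are hypothetical. Both data sets give nearly the same fitted line, yet the pattern of residual signs, which is what a residual plot would show at a glance, differs sharply.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_signs(x, y):
    """Signs of the residuals from the fitted line, ordered by increasing x."""
    intercept, slope = fit_line(x, y)
    return "".join(
        "+" if yi - (intercept + slope * xi) >= 0 else "-"
        for xi, yi in sorted(zip(x, y))
    )

def sign_changes(signs):
    """Count how often the residual sign flips along the x axis."""
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Anscombe's data sets I and II share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_a = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_b = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

fit_a = fit_line(x, y_a)
fit_b = fit_line(x, y_b)
signs_a = residual_signs(x, y_a)
signs_b = residual_signs(x, y_b)

print("set I : intercept %.2f, slope %.2f, residual signs %s" % (fit_a + (signs_a,)))
print("set II: intercept %.2f, slope %.2f, residual signs %s" % (fit_b + (signs_b,)))
```

In the second data set the residuals stay negative at both ends and positive in the middle, a sign of curvature that the fitted slope and intercept do not reveal. A text string of signs is only a crude stand-in for an actual plot, which would make the same structure obvious immediately.
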
