
Tutorial
Exploratory Data Analysis Techniques in a Science and Engineering Environment
James J. Filliben
Exploratory Data Analysis (EDA) is an approach/philosophy for data analysis which employs a variety of graphical techniques to
The data sets will be drawn primarily from the physical sciences and engineering; we additionally include Clinton/Bush/Perot data from the last election, data from the last Olympics, and a revisiting of some classical textbook data. EDA methods to be discussed include standard, commonly used tools such as histograms, probability plots, box plots, residual plots, Youden plots, and multiplotting. In addition, other less commonly used (but powerful) techniques such as 4-plots, lag plots, PPCC plots, bihistograms, block plots, GANOVA, and interaction effects matrices will be discussed. The link between a data set and the appropriate EDA technique/methodology is, of course, driven by EDA principles. These principles are extremely important and serve as the guidance system for choosing the appropriate technique(s) from an ever-growing collection of EDA methods. Such principles will be discussed along the way in conjunction with each data set.
[ James J. Filliben, Statistical Engineering Div., NIST, Gaithersburg, MD 20899 USA; james.filliben@nist.gov ]
All data sets used in this talk are available over the web for possible use in academic teaching; the URL is http://www.nist.gov/itl/div882/conf/jrc/eda_datasets.html
Date created: 6/5/2001 