3. Production Process Characterization
3.4. Data Analysis for PPC


First Steps

Gather all of the data into one place
After executing the data collection plan for the characterization study, the data must be gathered for analysis. Depending on the scope of the study, the data may reside in one place or in many different places: common factory databases, flat files on individual computers, or handwritten run sheets. Whatever the case, the first step is to collect all of the data from the various sources and enter it into a single data file. The most convenient format for most data analyses is the variables-in-columns format, which has the variable names as column headings and the values for the variables in the rows.
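As a minimal sketch of this step, the Python fragment below merges two differently laid out sources into one variables-in-columns file. The variable names (lot, wafer, site, linewidth), the two sample sources, and the added "source" column are hypothetical and chosen only for illustration; they are not part of the original study.

```python
import csv
import io

# Hypothetical export from a factory database: already one row per measurement.
DB_EXPORT = """lot,wafer,site,linewidth
A1,1,center,2.01
A1,1,edge,1.98
A1,2,center,2.03
"""

# Hypothetical transcribed run sheet: one row per wafer, one column per site.
RUN_SHEET = """lot,wafer,center,edge
A2,1,2.05,1.97
A2,2,2.02,1.99
"""

# Column headings of the single combined file (variables in columns).
COLUMNS = ["lot", "wafer", "site", "linewidth", "source"]


def rows_from_db(text):
    """Yield one record per measurement from the database export."""
    for rec in csv.DictReader(io.StringIO(text)):
        yield {
            "lot": rec["lot"],
            "wafer": int(rec["wafer"]),
            "site": rec["site"],
            "linewidth": float(rec["linewidth"]),
            "source": "database",
        }


def rows_from_run_sheet(text):
    """Unstack the per-site columns so each measurement gets its own row."""
    for rec in csv.DictReader(io.StringIO(text)):
        for site in ("center", "edge"):
            yield {
                "lot": rec["lot"],
                "wafer": int(rec["wafer"]),
                "site": site,
                "linewidth": float(rec[site]),
                "source": "run_sheet",
            }


combined = list(rows_from_db(DB_EXPORT)) + list(rows_from_run_sheet(RUN_SHEET))

# Write the combined records as CSV text; in practice this would go to a file.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
writer.writeheader()
writer.writerows(combined)
combined_csv = out.getvalue()
```

Recording where each row came from makes it easier to trace a suspicious value back to its original record during the quality check.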
Perform a quality check on the data using graphical and numerical techniques
The next step is to perform a quality check on the data. Here we are typically looking for data entry problems, unusual data values, missing data, etc. The two most useful tools for this step are the scatter plot and the histogram. Scatter plots of all of the response variables make most data entry problems easy to identify. Histograms of response variables are also quite useful for this purpose. Histograms of explanatory variables help identify problems with the execution of the sampling plan: if the counts for each level of an explanatory variable are not the same as called for in the sampling plan, the plan may not have been executed as designed. Running numerical summary statistics on all of the variables (both response and explanatory) also helps to identify data problems.
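The numerical side of this check can be sketched as follows. The records, the planned counts per lot, and the helper names are hypothetical. The rule used to flag unusual values (distance from the median in units of the median absolute deviation, with an arbitrary cutoff of 5) is one common screening choice assumed here, not a method prescribed by the text, and it complements the plots rather than replacing them.

```python
from collections import Counter
import statistics

# Hypothetical records in variables-in-columns form; None marks a missing value.
records = [
    {"lot": "A1", "site": "center", "linewidth": 2.01},
    {"lot": "A1", "site": "edge", "linewidth": 1.98},
    {"lot": "A2", "site": "center", "linewidth": 20.3},  # possible misplaced decimal point
    {"lot": "A2", "site": "edge", "linewidth": None},    # value never recorded
    {"lot": "A3", "site": "center", "linewidth": 2.02},
]

# Assumed sampling plan: two measurements per lot.
PLANNED_PER_LOT = {"A1": 2, "A2": 2, "A3": 2}


def check_counts(rows, variable, planned):
    """Return {level: (observed, planned)} for levels whose counts differ."""
    observed = Counter(r[variable] for r in rows)
    return {
        level: (observed.get(level, 0), n)
        for level, n in planned.items()
        if observed.get(level, 0) != n
    }


def find_missing(rows, variable):
    """Return the row indices where the variable has no recorded value."""
    return [i for i, r in enumerate(rows) if r[variable] is None]


def find_suspect(rows, variable, k=5.0):
    """Return row indices whose value lies more than k MADs from the median."""
    present = [(i, r[variable]) for i, r in enumerate(rows) if r[variable] is not None]
    values = [v for _, v in present]
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to scale by; fall back to visual inspection
    return [i for i, v in present if abs(v - med) > k * mad]


count_problems = check_counts(records, "lot", PLANNED_PER_LOT)
missing_rows = find_missing(records, "linewidth")
suspect_rows = find_suspect(records, "linewidth")
```

Flagged rows are candidates for review against the original run sheets or database entries; whether a value is actually wrong is a judgment that the code does not make.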
Summarize data by estimating location, spread and shape
Once the data quality problems are identified and fixed, we should estimate the location, spread and shape for all of the response variables. This is easily done with a combination of histograms and numerical summary statistics.
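A minimal sketch of such a summary is shown below, assuming a single cleaned response variable held in a Python list. The data values and function names are hypothetical. Location is estimated by the mean and median, spread by the sample standard deviation and range, and shape by the moment-based skewness coefficient together with a coarse set of histogram bin counts.

```python
import statistics


def summarize(values):
    """Return simple estimates of location, spread and shape."""
    n = len(values)
    mean = statistics.fmean(values)
    # Second and third central moments (divisor n) for the skewness coefficient.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "n": n,
        "mean": mean,                           # location
        "median": statistics.median(values),    # location, robust to outliers
        "stdev": statistics.stdev(values),      # spread (sample, divisor n - 1)
        "range": max(values) - min(values),     # spread
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,  # shape
    }


def histogram_counts(values, bins=5):
    """Count the values falling in each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin rather than past the end.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical cleaned response data.
linewidths = [2.01, 1.98, 2.03, 2.05, 1.97, 2.02, 1.99, 2.02]

summary = summarize(linewidths)
counts = histogram_counts(linewidths, bins=4)

# Print a text histogram, one row of asterisks per bin.
for i, c in enumerate(counts):
    print(f"bin {i}: {'*' * c}")
```

A skewness near zero together with roughly symmetric bin counts suggests a symmetric distribution; a clearly positive or negative value points to a longer right or left tail.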
