

1.3.5.18.

Yates Analysis

Purpose:
Estimate Factor Effects in a 2-Level Factorial Design
Full factorial and fractional factorial designs are common in designed experiments for engineering and scientific applications.

In these designs, each factor is assigned two levels. These are typically called the low and high levels. For computational purposes, the factors are scaled so that the low level is assigned a value of -1 and the high level is assigned a value of +1. These are also commonly referred to as "-" and "+".

A full factorial design contains all possible combinations of low/high levels for all the factors. A fractional factorial design contains a carefully chosen subset of these combinations. The criterion for choosing the subsets is discussed in detail in the process improvement chapter.

The Yates analysis exploits the special structure of these designs to generate least squares estimates for factor effects for all factors and all relevant interactions.

The mathematical details of the Yates analysis are given in chapter 10 of Box, Hunter, and Hunter (1978).

The Yates analysis is typically complemented by a number of graphical techniques such as the dex mean plot and the dex contour plot ("dex" represents "design of experiments"). This is demonstrated in the Eddy current case study.

Yates Order:
Before performing a Yates analysis, the data should be arranged in "Yates order". That is, given k factors, the kth column consists of 2^(k-1) minus signs (i.e., the low level of the factor) followed by 2^(k-1) plus signs (i.e., the high level of the factor), with this pattern repeated down the column until all runs are filled. For example, for a full factorial design with three factors, the design matrix is

    - - -
    + - -
    - + -
    + + -
    - - +
    + - +
    - + +
    + + +
          

Determining the Yates order for fractional factorial designs requires knowledge of the confounding structure of the fractional factorial design.
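
The Yates order for a full factorial design can be generated mechanically. The following is a minimal Python sketch, not part of any particular package; the function name yates_order is hypothetical. It assumes a full (not fractional) factorial with k factors and uses -1/+1 for the low/high levels.

```python
def yates_order(k):
    """Return the 2**k runs of a full factorial design in Yates order.

    Each run is a tuple of -1/+1 levels, factor 1 first. Column j
    (1-based) alternates blocks of 2**(j-1) low and high levels.
    """
    runs = []
    for run in range(2 ** k):
        # Bit j of the run index sets the level of factor j+1, so
        # factor 1 changes fastest and factor k changes slowest.
        runs.append(tuple(1 if (run >> j) & 1 else -1 for j in range(k)))
    return runs


# Print the three-factor design matrix with "-" and "+" symbols.
for run in yates_order(3):
    print(" ".join("+" if level > 0 else "-" for level in run))
```

For k = 3 this prints the same eight rows as the design matrix shown above.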

Yates Output:
A Yates analysis generates the following output.
  1. A factor identifier (from Yates order). The specific identifier will vary depending on the program used to generate the Yates analysis. Dataplot, for example, uses the following for a 3-factor model.
    1 = factor 1
    2 = factor 2
    3 = factor 3
    12 = interaction of factor 1 and factor 2
    13 = interaction of factor 1 and factor 3
    23 = interaction of factor 2 and factor 3
    123 = interaction of factors 1, 2, and 3

  2. Least squares estimated factor effects ordered from largest in magnitude (most significant) to smallest in magnitude (least significant).

    That is, we obtain a ranked list of important factors.

  3. A t-value for the individual factor effect estimates. The t-value is computed as

      t = e/sd(e)

    where e is the estimated factor effect and sd(e) is the standard deviation of the estimated factor effect.

  4. The residual standard deviation that results from the model with the single term only. That is, the residual standard deviation from the model

      response = constant + 0.5 (Xi)

    where Xi is the estimate of the ith factor or interaction effect.

  5. The cumulative residual standard deviation that results from the model using the current term plus all terms preceding that term. That is,

      response = constant + 0.5 (all effect estimates down to and including the effect of interest)

    This consists of a monotonically decreasing set of residual standard deviations (indicating a better fit as the number of terms in the model increases). The first cumulative residual standard deviation is for the model

      response = constant

    where the constant is the overall mean of the response variable. The last cumulative residual standard deviation is for the model

      response = constant + 0.5*(all factor and interaction estimates)

    This last model will have a residual standard deviation of zero.
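
The quantities in items 1, 2, 4, and 5 can be sketched in a few lines of Python for an unreplicated full factorial design. This is an illustration under stated assumptions, not Dataplot's implementation: the function name yates_table and the example responses are hypothetical, and the residual standard deviation is assumed to divide the residual sum of squares by n minus the number of fitted terms (constant included). Because the sign columns are orthogonal, each term removes n*effect^2/4 from the total sum of squares, so no explicit regression is needed. The t-value in item 3 is omitted because it needs an estimate of sd(e).

```python
import math
from itertools import combinations


def yates_table(y):
    """Effects and residual SDs for an unreplicated 2**k full factorial.

    y holds the responses in Yates order. Returns (mean, rows), where
    each row is (identifier, effect, ressd_single, ressd_cumulative)
    and rows are sorted by decreasing absolute effect.
    """
    n = len(y)
    k = n.bit_length() - 1
    if k < 1 or n != 2 ** k:
        raise ValueError("need 2**k responses with k >= 1")
    mean = sum(y) / n
    sst = sum((v - mean) ** 2 for v in y)  # total sum of squares about the mean

    effects = []
    for size in range(1, k + 1):
        for factors in combinations(range(k), size):
            # Sign column of this term: product of the -1/+1 levels
            # of the factors involved, run by run.
            signs = [math.prod(1 if (r >> j) & 1 else -1 for j in factors)
                     for r in range(n)]
            effect = sum(s * v for s, v in zip(signs, y)) / (n / 2)
            ident = "".join(str(j + 1) for j in factors)
            effects.append((ident, effect))
    effects.sort(key=lambda row: -abs(row[1]))

    def ressd(sse, n_terms):
        dof = n - 1 - n_terms  # assumed convention: one df per fitted term
        return math.sqrt(max(sse, 0.0) / dof) if dof > 0 else 0.0

    rows, explained = [], 0.0
    for m, (ident, effect) in enumerate(effects, start=1):
        ss = n * effect ** 2 / 4  # sum of squares carried by this term
        explained += ss
        rows.append((ident, effect, ressd(sst - ss, 1), ressd(sst - explained, m)))
    return mean, rows


# Hypothetical 2**2 responses in Yates order: (-,-), (+,-), (-,+), (+,+).
mean, rows = yates_table([10.0, 14.0, 12.0, 20.0])
print(f"mean = {mean:.5f}")
for ident, effect, single, cumulative in rows:
    print(f"{ident:>4} {effect:9.5f} {single:9.5f} {cumulative:9.5f}")
```

With these made-up responses the effects are 6, 4, and 2 for terms 1, 2, and 12, and the cumulative residual standard deviation falls to zero once all three terms are included.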

Sample Output:
Dataplot generated the following Yates analysis output for the Eddy current data set:
  
(NOTE--DATA MUST BE IN STANDARD ORDER)
NUMBER OF OBSERVATIONS           =        8
NUMBER OF FACTORS                =        3
NO REPLICATION CASE
  
PSEUDO-REPLICATION STAND. DEV.   =    0.20152531564E+00
PSEUDO-DEGREES OF FREEDOM        =        1
(THE PSEUDO-REP. STAND. DEV. ASSUMES ALL
3, 4, 5, ...-TERM INTERACTIONS ARE NOT REAL,
BUT MANIFESTATIONS OF RANDOM ERROR)
  
STANDARD DEVIATION OF A COEF.    =    0.14249992371E+00
(BASED ON PSEUDO-REP. ST. DEV.)
  
GRAND MEAN                       =    0.26587500572E+01
GRAND STANDARD DEVIATION         =    0.17410624027E+01
  
99% CONFIDENCE LIMITS (+-)       =    0.90710897446E+01
95% CONFIDENCE LIMITS (+-)       =    0.18106349707E+01
99.5% POINT OF T DISTRIBUTION    =    0.63656803131E+02
97.5% POINT OF T DISTRIBUTION    =    0.12706216812E+02
  
IDENTIFIER    EFFECT        T VALUE      RESSD:     RESSD:
                                         MEAN +     MEAN +
                                         TERM    CUM TERMS
----------------------------------------------------------
   MEAN       2.65875                   1.74106    1.74106
      1       3.10250         21.8*     0.57272    0.57272
      2      -0.86750         -6.1      1.81264    0.30429
     23       0.29750          2.1      1.87270    0.26737
     13       0.24750          1.7      1.87513    0.23341
      3       0.21250          1.5      1.87656    0.19121
    123       0.14250          1.0      1.87876    0.18031
     12       0.12750          0.9      1.87912    0.00000
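
The t-values and confidence limits in this output can be checked by hand from the printed quantities. The short Python sketch below copies its numbers from the output above: each t-value is the effect divided by the standard deviation of a coefficient, and each confidence limit is that standard deviation multiplied by the corresponding point of the t distribution. Reading the asterisk as "the effect exceeds the 95% limits" is an assumption made for this illustration; it is consistent with the output but is not stated in it.

```python
# Values copied from the Dataplot sample output above.
sd_effect = 0.14249992371   # STANDARD DEVIATION OF A COEF.
t_975 = 12.706216812        # 97.5% POINT OF T DISTRIBUTION
t_995 = 63.656803131        # 99.5% POINT OF T DISTRIBUTION
effects = {"1": 3.10250, "2": -0.86750, "23": 0.29750, "13": 0.24750,
           "3": 0.21250, "123": 0.14250, "12": 0.12750}

limit_95 = t_975 * sd_effect  # compare with 0.18106349707E+01
limit_99 = t_995 * sd_effect  # compare with 0.90710897446E+01
print(f"95% limits (+-) = {limit_95:.5f}")
print(f"99% limits (+-) = {limit_99:.5f}")

t_values = {}
for ident, effect in effects.items():
    t_values[ident] = effect / sd_effect  # t = e / sd(e)
    # Assumed meaning of the asterisk: |effect| is outside the 95% limits.
    flag = "*" if abs(effect) > limit_95 else " "
    print(f"{ident:>4} {effect:9.5f} {t_values[ident]:6.1f}{flag}")
```

Only factor 1 is flagged, which agrees with the single asterisk in the T VALUE column.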
      
Interpretation of Sample Output:
In summary, the Yates analysis provides the following list of factors and interactions, ranked by the magnitude of their effect estimates.

  1. X1: effect estimate = 3.1025 ohms
  2. X2: effect estimate = -0.8675 ohms
  3. X2*X3: effect estimate = 0.2975 ohms
  4. X1*X3: effect estimate = 0.2475 ohms
  5. X3: effect estimate = 0.2125 ohms
  6. X1*X2*X3: effect estimate = 0.1425 ohms
  7. X1*X2: effect estimate = 0.1275 ohms

Model Selection and Validation:
From the above Yates output, we can define the candidate models: each row of the cumulative residual standard deviation column corresponds to the model containing that term and all terms ranked above it. An important component of a Yates analysis is selecting the best model from these candidates.

Once a tentative model has been selected, the error term should follow the assumptions for a univariate measurement process. That is, the model should be validated by analyzing the residuals.
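
As an illustration of the residual step, the sketch below computes the residuals of one tentative model using only numbers printed in the sample output. Two assumptions are made. First, the responses are reconstructed from the printed mean and effects, which is possible because the model with all terms has a residual standard deviation of zero; they are therefore accurate only to the printed precision. Second, the choice of X1 and X2 as the tentative model is hypothetical and is not a recommendation from the handbook.

```python
import math

# Mean and effects copied from the sample output; keys list the factors in each term.
mean = 2.65875
effects = {(1,): 3.10250, (2,): -0.86750, (2, 3): 0.29750, (1, 3): 0.24750,
           (3,): 0.21250, (1, 2, 3): 0.14250, (1, 2): 0.12750}
tentative = [(1,), (2,)]  # hypothetical tentative model: X1 and X2 only
n = 8                     # runs in the 2**3 design, in Yates order


def predict(run, terms):
    """response = constant + 0.5 * sum(effect * sign) for one run."""
    total = mean
    for term in terms:
        # Sign of the term in this run: product of its factors' -1/+1 levels.
        sign = math.prod(1 if (run >> (j - 1)) & 1 else -1 for j in term)
        total += 0.5 * effects[term] * sign
    return total


# The model with all seven terms fits exactly, so it recovers the responses.
responses = [predict(run, effects) for run in range(n)]
residuals = [responses[run] - predict(run, tentative) for run in range(n)]

# Residual SD with n - 1 - (number of terms) degrees of freedom.
ressd = math.sqrt(sum(r * r for r in residuals) / (n - 1 - len(tentative)))
for run in range(n):
    print(f"run {run + 1}: response = {responses[run]:.4f}, residual = {residuals[run]:+.4f}")
print(f"residual standard deviation = {ressd:.5f}")
```

The residual standard deviation agrees with the cumulative value printed for term 2 in the sample output (0.30429). The residuals themselves would then be examined with the usual graphical checks for a univariate measurement process.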

Graphical Presentation:
Some analysts may prefer a more graphical presentation of the Yates results. In particular, the following plots may be useful:
  1. Ordered data plot
  2. Ordered absolute effects plot
  3. Cumulative residual standard deviation plot

Questions:
The Yates analysis can be used to answer the following questions:
  1. What is the ranked list of factors?
  2. What is the goodness-of-fit (as measured by the residual standard deviation) for the various models?

Related Techniques:
Multi-factor analysis of variance
Dex mean plot
Block plot
Dex contour plot

Case Study:
The Yates analysis is demonstrated in the Eddy current case study.

Software:
Many general purpose statistical software programs, including Dataplot, can perform a Yates analysis.