1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques
1.3.5.18. Yates Analysis

1.3.5.18.1. Defining Models and Prediction Equations

Parameter Estimates Don't Change as Additional Terms Are Added

In most cases of least squares fitting, the coefficient estimates for terms already in the model change when further terms are added. For example, the X1 coefficient might change depending on whether or not an X2 term is included in the model. This is not the case when the design is orthogonal, as is a 2^3 full factorial design. For orthogonal designs, the estimates for the previously included terms do not change as additional terms are added. This means the ranked list of effect estimates (each halved to give a coefficient) simultaneously serves as the least squares coefficient estimates for progressively more complicated models.
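As a minimal sketch of this property (not part of the handbook's Dataplot output), the following Python fits two nested models to a 2^3 full factorial design in standard order and compares the X1 coefficient. It assumes NumPy is available. The responses are rebuilt from the rounded effect estimates in the Yates table on this page rather than read from the original data file, and names such as `fit` and `columns` are illustrative only.

```python
import itertools
import numpy as np

# 2^3 full factorial in standard (Yates) order: X1 varies fastest.
runs = [(x1, x2, x3) for x3, x2, x1 in itertools.product((-1, 1), repeat=3)]
X1, X2, X3 = (np.array(col, dtype=float) for col in zip(*runs))

# Mean and effect estimates copied from the Yates table on this page (rounded).
mean = 2.65875
effects = {"1": 3.1025, "2": -0.8675, "23": 0.2975, "13": 0.2475,
           "3": 0.2125, "123": 0.1425, "12": 0.1275}
columns = {"1": X1, "2": X2, "3": X3, "12": X1 * X2, "13": X1 * X3,
           "23": X2 * X3, "123": X1 * X2 * X3}

# Assumption: responses are rebuilt from the saturated model, which fits exactly.
y = mean + 0.5 * sum(effects[k] * columns[k] for k in effects)

def fit(*terms):
    """Least squares fit of y on an intercept plus the named terms."""
    A = np.column_stack([np.ones(8)] + [columns[t] for t in terms])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

b_small = fit("1")        # intercept, X1
b_large = fit("1", "2")   # intercept, X1, X2

# The X1 coefficient is the same in both fits because the columns are orthogonal.
print(b_small[1], b_large[1])
```

Both fits should give an X1 coefficient of about 1.55125, which is half the X1 effect of 3.1025. With a non-orthogonal design matrix the two values would generally differ.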
Yates Table

For convenience, we list the sample Yates output for the eddy current data set here.
  
(NOTE--DATA MUST BE IN STANDARD ORDER)
NUMBER OF OBSERVATIONS           =        8
NUMBER OF FACTORS                =        3
NO REPLICATION CASE
  
PSEUDO-REPLICATION STAND. DEV.   =    0.20152531564E+00
PSEUDO-DEGREES OF FREEDOM        =        1
(THE PSEUDO-REP. STAND. DEV. ASSUMES ALL
3, 4, 5, ...-TERM INTERACTIONS ARE NOT REAL,
BUT MANIFESTATIONS OF RANDOM ERROR)
  
STANDARD DEVIATION OF A COEF.    =    0.14249992371E+00
(BASED ON PSEUDO-REP. ST. DEV.)
  
GRAND MEAN                       =    0.26587500572E+01
GRAND STANDARD DEVIATION         =    0.17410624027E+01
  
99% CONFIDENCE LIMITS (+-)       =    0.90710897446E+01
95% CONFIDENCE LIMITS (+-)       =    0.18106349707E+01
99.5% POINT OF T DISTRIBUTION    =    0.63656803131E+02
97.5% POINT OF T DISTRIBUTION    =    0.12706216812E+02
  
IDENTIFIER    EFFECT        T VALUE      RESSD:     RESSD:
                                         MEAN +     MEAN +
                                         TERM    CUM TERMS
----------------------------------------------------------
   MEAN       2.65875                   1.74106    1.74106
      1       3.10250         21.8*     0.57272    0.57272
      2      -0.86750         -6.1      1.81264    0.30429
     23       0.29750          2.1      1.87270    0.26737
     13       0.24750          1.7      1.87513    0.23341
      3       0.21250          1.5      1.87656    0.19121
    123       0.14250          1.0      1.87876    0.18031
     12       0.12750          0.9      1.87912    0.00000
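Several of the summary quantities in this output can be reproduced from the effect column. The sketch below assumes the usual relations for an unreplicated two-level design with n runs: each effect contributes n*(effect/2)^2 to the sum of squares, and the standard deviation of an effect is 2*s/sqrt(n). These relations are not stated in the output itself, but they agree with the printed values to within rounding. The t percentage point is copied from the output rather than computed, and the variable names are illustrative.

```python
import math

n = 8                       # number of runs
effect_123 = 0.1425         # the only 3-term interaction, from the table
t_975 = 12.706216812        # 97.5% point of t with 1 degree of freedom (from the output)

# Pseudo-replication SD: pool the sums of squares of 3-term and higher interactions.
pseudo_df = 1
pseudo_sd = math.sqrt(n * (effect_123 / 2) ** 2 / pseudo_df)

# Standard deviation of an effect estimate (labelled "COEF." in the output).
sd_effect = 2 * pseudo_sd / math.sqrt(n)

t_x1 = 3.1025 / sd_effect            # t value for factor 1
limit_95 = t_975 * sd_effect         # 95% confidence limit (+-)

print(round(pseudo_sd, 5), round(sd_effect, 5), round(t_x1, 1), round(limit_95, 4))
```

With a single pseudo-degree of freedom the t percentage points are very large, which is why only the X1 effect exceeds the 95% limit in the table.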
      

The last column of the Yates table gives the residual standard deviation for 8 possible models, each with one more term than the previous model.
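The last column can be recomputed from the effect column alone. The following sketch assumes that, for an orthogonal two-level design with n runs, each omitted term adds n*(effect/2)^2 to the residual sum of squares, and that the residual degrees of freedom equal n minus the number of fitted parameters. The function name `cumulative_ressd` is hypothetical; the effects are the rounded values from the table.

```python
import math

n = 8
# Effects in the ranked order of the Yates table (largest |effect| first).
ranked_effects = [3.1025, -0.8675, 0.2975, 0.2475, 0.2125, 0.1425, 0.1275]

def cumulative_ressd(effects, n_runs):
    """Residual SD for the mean-only model, then for each model with one more term."""
    out = []
    for k in range(len(effects) + 1):          # k = number of terms besides the mean
        omitted = effects[k:]
        ss_resid = sum(n_runs * (e / 2) ** 2 for e in omitted)
        df = n_runs - (k + 1)                  # runs minus fitted parameters
        out.append(math.sqrt(ss_resid / df) if df > 0 else 0.0)
    return out

ressd = cumulative_ressd(ranked_effects, n)
for value in ressd:
    print(f"{value:.5f}")
```

The printed values should agree with the RESSD: MEAN + CUM TERMS column to the displayed precision. The saturated model has no residual degrees of freedom, so the sketch reports 0.0 for it by convention.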

Potential Models

For this example, we can summarize the possible prediction equations using the second and last columns of the Yates table:
  • YHAT = 2.65875

    has a residual standard deviation of 1.74106 ohms. Note that this is the default model. That is, if no factors are important, the model is simply the overall mean.

  • YHAT = 2.65875 + 0.5*(3.1025*X1)

    has a residual standard deviation of 0.57272 ohms. Here, X1 takes the value +1 or -1, as do the other factors; each interaction term is the product of the corresponding factor values.

  • YHAT = 2.65875 + 0.5*(3.1025*X1 - 0.8675*X2)

    has a residual standard deviation of 0.30429 ohms.

  • YHAT = 2.65875 + 0.5*(3.1025*X1 - 0.8675*X2 + 0.2975*X2*X3)

    has a residual standard deviation of 0.26737 ohms.

  • YHAT = 2.65875 + 0.5*(3.1025*X1 - 0.8675*X2 + 0.2975*X2*X3 + 0.2475*X1*X3)

    has a residual standard deviation of 0.23341 ohms.

  • YHAT = 2.65875 + 0.5*(3.1025*X1 - 0.8675*X2 + 0.2975*X2*X3 + 0.2475*X1*X3 + 0.2125*X3)

    has a residual standard deviation of 0.19121 ohms.

  • YHAT = 2.65875 + 0.5*(3.1025*X1 - 0.8675*X2 + 0.2975*X2*X3 + 0.2475*X1*X3 + 0.2125*X3 + 0.1425*X1*X2*X3)

    has a residual standard deviation of 0.18031 ohms.

  • YHAT = 2.65875 + 0.5*(3.1025*X1 - 0.8675*X2 + 0.2975*X2*X3 + 0.2475*X1*X3 + 0.2125*X3 + 0.1425*X1*X2*X3 + 0.1275*X1*X2)

    has a residual standard deviation of 0.0 ohms. Note that the model with all possible terms included will have a zero residual standard deviation. This will always occur with an unreplicated two-level factorial design.
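The eight prediction equations above differ only in how many ranked terms they keep, so they can be written as one function. This is a minimal sketch, assuming coded factor levels of -1 and +1; the names `yhat`, `MEAN`, and `RANKED_TERMS` are illustrative and do not come from the handbook.

```python
# Mean and ranked effects from the Yates table; each effect is paired with
# the product of coded factor levels it multiplies.
MEAN = 2.65875
RANKED_TERMS = [
    (3.1025,  lambda x1, x2, x3: x1),
    (-0.8675, lambda x1, x2, x3: x2),
    (0.2975,  lambda x1, x2, x3: x2 * x3),
    (0.2475,  lambda x1, x2, x3: x1 * x3),
    (0.2125,  lambda x1, x2, x3: x3),
    (0.1425,  lambda x1, x2, x3: x1 * x2 * x3),
    (0.1275,  lambda x1, x2, x3: x1 * x2),
]

def yhat(x1, x2, x3, n_terms):
    """Prediction from the model with the mean plus the first n_terms ranked terms."""
    total = sum(effect * term(x1, x2, x3)
                for effect, term in RANKED_TERMS[:n_terms])
    return MEAN + 0.5 * total   # a coefficient is half of the effect

# Predictions at the run with all factors at their low level, one model per line.
for k in range(len(RANKED_TERMS) + 1):
    print(k, round(yhat(-1, -1, -1, k), 5))
```

Calling `yhat` with `n_terms=0` returns the grand mean, and `n_terms=7` gives the saturated model, which reproduces each observed response up to rounding of the tabulated effects.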

Model Selection

The above step lists all the potential models. From this list, we want to select the most appropriate model. This requires balancing the following two goals.
  1. We want the model to include all important factors.
  2. We want the model to be parsimonious. That is, the model should be as simple as possible.
Note that the residual standard deviation alone is insufficient for determining the most appropriate model: in the list above it decreases with every added term and reaches zero for the full model. The next section describes a number of approaches for determining which factors (and interactions) to include in the model.