 6. Process or Product Monitoring and Control
6.5. Tutorials
6.5.5. Principal Components

## Numerical Example

### Calculation of principal components example

A numerical example may clarify the mechanics of principal component analysis.

### Sample data set

Let us analyze the following 3-variate data set with 10 observations. Each observation consists of 3 measurements on a wafer: thickness, horizontal displacement, and vertical displacement. $${\bf X} = \left[ \begin{array}{ccc} 7 & 4 & 3 \\ 4 & 1 & 8 \\ 6 & 3 & 5 \\ 8 & 6 & 1 \\ 8 & 5 & 7 \\ 7 & 2 & 9 \\ 5 & 3 & 3 \\ 9 & 5 & 8 \\ 7 & 4 & 5 \\ 8 & 2 & 2 \end{array} \right]$$

### Compute the correlation matrix

First compute the correlation matrix. $${\bf R} = \left[ \begin{array}{rrr} 1.00 & 0.67 & -0.10 \\ 0.67 & 1.00 & -0.29 \\ -0.10 & -0.29 & 1.00 \end{array} \right]$$

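
As an illustration, the correlation matrix can be reproduced from the data with a few lines of NumPy. This is a minimal sketch; the variable names `X` and `R` simply mirror the symbols in the text, and NumPy itself is an assumption, since the example does not name the software used.

```python
import numpy as np

# Wafer data from the example: thickness, horizontal displacement,
# vertical displacement (one row per observation).
X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)

# rowvar=False tells NumPy that columns are variables and rows are observations.
R = np.corrcoef(X, rowvar=False)

print(np.round(R, 2))
```

Rounded to two decimals, the printed matrix should agree with the $${\bf R}$$ shown above.
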
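### Solve for the roots of $${\bf R}$$

Next, solve for the eigenvalues of $${\bf R}$$ (the roots of its characteristic equation) using software. The last column of the table gives the cumulative proportion of the total variance accounted for.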

| $$\lambda$$ | value | cumulative proportion |
|---|---|---|
| 1 | 1.769 | 0.590 |
| 2 | 0.927 | 0.899 |
| 3 | 0.304 | 1.000 |

Notice that:
• Each eigenvalue satisfies $$|{\bf R} - \lambda {\bf I}| = 0$$.
• The sum of the eigenvalues $$= 3 = p$$, which is equal to the trace of $${\bf R}$$ (i.e., the sum of the main diagonal elements).
• The determinant of $${\bf R}$$ is the product of the eigenvalues.
• The product is $$\lambda_1 \times \lambda_2 \times \lambda_3 = 0.499$$.
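
The eigenvalues and the properties listed above can be checked numerically. The following is a sketch under the assumption that NumPy is the software in use; `eigvalsh` is NumPy's eigenvalue routine for symmetric matrices, and `cum_prop` is a name introduced here for the cumulative proportion column.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# eigvalsh returns eigenvalues in ascending order; reverse to put the largest first.
eigvals = np.linalg.eigvalsh(R)[::-1]
cum_prop = np.cumsum(eigvals) / eigvals.sum()

print("eigenvalues:          ", np.round(eigvals, 3))
print("cumulative proportion:", np.round(cum_prop, 3))

# Each eigenvalue makes R - lambda*I singular.
dets = [np.linalg.det(R - lam * np.eye(3)) for lam in eigvals]
print("|R - lambda I| = 0:   ", np.allclose(dets, 0.0, atol=1e-10))

# Sum of eigenvalues equals the trace; product equals the determinant.
print("sum equals trace:     ", np.isclose(eigvals.sum(), np.trace(R)))
print("product equals det:   ", np.isclose(eigvals.prod(), np.linalg.det(R)))
```
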
### Compute the first column of the $${\bf V}$$ matrix

Substituting the first eigenvalue, 1.769, and $${\bf R}$$ into $$({\bf R} - \lambda {\bf I}){\bf v} = {\bf 0}$$, we obtain $$\left[ \begin{array}{rrr} -0.769 & 0.670 & -0.100 \\ 0.670 & -0.769 & -0.290 \\ -0.100 & -0.290 & -0.769 \end{array} \right] \left[ \begin{array}{c} v_{11} \\ v_{21} \\ v_{31} \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right] \, .$$ This is the matrix expression for three homogeneous equations in three unknowns. A homogeneous system determines its solution only up to a scale factor; normalizing the solution to unit length yields the first column of $${\bf V}$$: $$\left[ \begin{array}{r} 0.64 \\ 0.69 \\ -0.34 \end{array} \right]$$ (again, a computerized solution is indispensable).

### Compute the remaining columns of the $${\bf V}$$ matrix

Repeating this procedure for the other two eigenvalues yields the matrix $${\bf V}$$. $${\bf V} = \left[ \begin{array}{rrr} 0.64 & 0.38 & -0.66 \\ 0.69 & 0.10 & 0.72 \\ -0.34 & 0.91 & 0.20 \end{array} \right]$$ Notice that if you multiply $${\bf V}$$ by its transpose, the result is an identity matrix, $${\bf V}'{\bf V} = {\bf I}$$.

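
In practice the eigenvectors are obtained in one call rather than by solving each homogeneous system by hand. The sketch below assumes NumPy's `eigh` routine. Eigenvector signs are arbitrary, so the sign rule used here (make the largest-magnitude entry of each column positive) is an assumption chosen for reproducibility; it appears to agree with the signs of the $${\bf V}$$ shown above.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# eigh is for symmetric matrices: eigenvalues come back in ascending order,
# with the matching unit-length eigenvectors as the columns of V.
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]

# Assumed sign convention: largest-magnitude entry of each column is positive.
largest = np.abs(V).argmax(axis=0)
V = V * np.sign(V[largest, np.arange(V.shape[1])])

print(np.round(V, 2))
print("V'V = I:", np.allclose(V.T @ V, np.eye(3)))
```
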
### Compute the $${\bf L}^{1/2}$$ matrix

Now form the matrix $${\bf L}^{1/2}$$, which is a diagonal matrix whose elements are the square roots of the eigenvalues of $${\bf R}$$. Then obtain $${\bf S}$$, the factor structure, using $${\bf S} = {\bf VL}^{1/2}$$. $$\left[ \begin{array}{rrr} 0.64 & 0.38 & -0.66 \\ 0.69 & 0.10 & 0.72 \\ -0.34 & 0.91 & 0.20 \end{array} \right] \left[ \begin{array}{ccc} 1.33 & 0 & 0 \\ 0 & 0.96 & 0 \\ 0 & 0 & 0.55 \end{array} \right] = \left[ \begin{array}{rrr} 0.85 & 0.37 & -0.37 \\ 0.91 & 0.10 & 0.40 \\ -0.45 & 0.88 & 0.11 \end{array} \right]$$ Each element of $${\bf S}$$ is the correlation between a variable (row) and a principal component (column). So, for example, 0.91 is the correlation between the second variable and the first principal component. All entries are shown rounded to two decimals; carried at full precision, the element in row 2, column 2 of $${\bf S}$$ rounds to 0.09, which is the value that appears in the communality computation.

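
To see why $${\bf S}$$ is read as a matrix of correlations, one can compare $${\bf VL}^{1/2}$$ with the sample correlations between the standardized variables and the component scores $${\bf ZV}$$. The following is an illustrative sketch assuming NumPy; the names `scores` and `C` are introduced here and do not come from the text.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]

# Factor structure: scale column k of V by sqrt(lambda_k), i.e. V @ L^(1/2).
S = V * np.sqrt(eigvals)

# Standardize X and project onto the eigenvectors to get component scores.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ V

# Correlations between each variable (rows) and each component (columns).
C = np.corrcoef(np.hstack([Z, scores]), rowvar=False)[:3, 3:]

print(np.round(S, 2))
print("S equals variable-component correlations:", np.allclose(S, C))
```

The signs of whole columns may differ from the matrix printed in the text, because eigenvectors are only defined up to sign; the comparison between `S` and `C` is unaffected.
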
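### Compute the communality

Next compute the communality, using only the first two principal components. Here $${\bf S}$$ denotes the first two columns of the factor structure. $${\bf SS}' = \left[ \begin{array}{rr} 0.85 & 0.37 \\ 0.91 & 0.09 \\ -0.45 & 0.88 \end{array} \right] \left[ \begin{array}{rrr} 0.85 & 0.91 & -0.45 \\ 0.37 & 0.09 & 0.88 \end{array} \right] = \left[ \begin{array}{rrr} 0.8662 & 0.8140 & -0.0606 \\ 0.8140 & 0.8420 & -0.3321 \\ -0.0606 & -0.3321 & 0.9876 \end{array} \right]$$ The product on the right was computed from unrounded loadings, so it differs slightly from what the two-decimal entries on the left would give (for example, $$0.85^2 + 0.37^2 = 0.8594$$ rather than 0.8662).
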
### Diagonal elements report how much of the variability is explained

The communalities are the diagonal elements of $${\bf SS}'$$.

| variable | communality |
|---|---|
| 1 | 0.8662 |
| 2 | 0.8420 |
| 3 | 0.9876 |

This means that the first two principal components "explain" 86.62 % of the variance of the first variable, 84.20 % of the variance of the second variable, and 98.76 % of the variance of the third.
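
The communalities can be recomputed at full precision as a check. This sketch assumes NumPy; `S2` is a name introduced here for the first two columns of the factor structure.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]

# Keep the loadings on the first two components only.
S2 = (V * np.sqrt(eigvals))[:, :2]

# Communality of each variable = diagonal of S2 @ S2', i.e. the row sums
# of squared loadings. Column sign flips in V do not change the result.
communality = np.diag(S2 @ S2.T)

print(np.round(communality, 4))
```

Because the loadings enter only through their squares, the result does not depend on the sign convention chosen for the eigenvectors.
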

### Compute the coefficient matrix

The coefficient matrix, $${\bf B}$$, is formed using the reciprocals of the diagonals of $${\bf L}^{1/2}$$. $${\bf B} = {\bf VL}^{-1/2} = \left[ \begin{array}{rrr} 0.48 & 0.40 & -1.20 \\ 0.52 & 0.10 & 1.31 \\ -0.26 & 0.95 & 0.37 \end{array} \right]$$

### Compute the principal factors

Finally, we can compute the factor scores from $${\bf ZB}$$, where $${\bf Z}$$ is $${\bf X}$$ converted to standard score form. These columns are the principal factors. $${\bf F} = {\bf ZB} = \left[ \begin{array}{rrr} 0.41 & -0.69 & 0.06 \\ -2.11 & 0.07 & 0.63 \\ -0.46 & -0.32 & 0.30 \\ 1.62 & -1.00 & 0.70 \\ 0.70 & 1.09 & 0.65 \\ -0.86 & 1.32 & -0.85 \\ -0.60 & -1.31 & 0.86 \\ 0.94 & 1.72 & -0.04 \\ 0.22 & 0.03 & 0.34 \\ 0.15 & -0.91 & -2.65 \end{array} \right]$$

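
The coefficient matrix and the factor scores can be sketched as follows, assuming NumPy. Two choices here are assumptions rather than statements from the text: the standard scores use the divisor $$n$$ (NumPy's default `ddof=0`), which appears to be what reproduces the printed $${\bf F}$$, and the eigenvector signs are fixed by making the largest-magnitude entry of each column positive.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
n = X.shape[0]
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]

# Assumed sign convention: largest-magnitude entry of each column is positive.
largest = np.abs(V).argmax(axis=0)
V = V * np.sign(V[largest, np.arange(V.shape[1])])

# Coefficient matrix B = V L^(-1/2): divide column k by sqrt(lambda_k).
B = V / np.sqrt(eigvals)

# Standard scores; ddof=0 (divisor n) is an assumption, see the note above.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=0)

# Factor scores: one row per observation, one column per principal factor.
F = Z @ B

print(np.round(B, 2))
print(np.round(F, 2))
print("factors uncorrelated with unit variance:", np.allclose(F.T @ F / n, np.eye(3)))
```

With `ddof=1` instead, every score would shrink by the factor $$\sqrt{(n-1)/n}$$, but the pattern across observations would be unchanged.
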
### Principal factors control chart

These factors can be plotted against the observation index, which could be the time order of the observations. If time is used, the resulting plot is an example of a principal factors control chart.