6. Process or Product Monitoring and Control
6.5. Tutorials
6.5.5. Principal Components


Calculation of principal components example  A numerical example may clarify the mechanics of principal component analysis.  
Sample data set  Let us analyze the following 3-variate data set with 10 observations. Each observation consists of 3 measurements on a wafer: thickness, horizontal displacement, and vertical displacement. $$ {\bf X} = \left[ \begin{array}{ccc} 7 & 4 & 3 \\ 4 & 1 & 8 \\ 6 & 3 & 5 \\ 8 & 6 & 1 \\ 8 & 5 & 7 \\ 7 & 2 & 9 \\ 5 & 3 & 3 \\ 9 & 5 & 8 \\ 7 & 4 & 5 \\ 8 & 2 & 2 \end{array} \right] $$  
Compute the correlation matrix  First compute the correlation matrix. $$ {\bf R} = \left[ \begin{array}{rrr} 1.00 & 0.67 & -0.10 \\ 0.67 & 1.00 & -0.29 \\ -0.10 & -0.29 & 1.00 \end{array} \right] $$  
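As a minimal sketch of this step, assuming NumPy is available (the handbook does not name a software package), the correlation matrix can be computed directly from the data. The variable names `X` and `R` simply mirror the symbols in the text.

```python
import numpy as np

# Wafer data from the text: thickness, horizontal and vertical displacement.
X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)

# rowvar=False tells NumPy that the columns, not the rows, are the variables.
R = np.corrcoef(X, rowvar=False)

# Rounded to two decimals for comparison with the printed matrix.
print(np.round(R, 2))
```

The third variable is negatively correlated with the other two, so the signs of the off-diagonal entries matter in every later step.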
Solve for the roots of \({\bf R}\) 
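Next solve for the roots (eigenvalues) of \({\bf R}\) using software. The three eigenvalues are 1.769, 0.927, and 0.304; they sum to 3, the number of variables, which is also the trace of \({\bf R}\).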
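One way to carry out this step in software is sketched below, assuming NumPy. `numpy.linalg.eigh` is designed for symmetric matrices and returns eigenvalues in ascending order, so they are re-sorted here to put the largest root first.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# eigh returns eigenvalues in ascending order; reverse to get largest first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

print(np.round(eigvals, 3))                  # the roots of R
print(np.round(eigvals / eigvals.sum(), 3))  # proportion of total variance per root
```

Two quick sanity checks are that the eigenvalues sum to the trace of \({\bf R}\) and that their product equals its determinant.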


Compute the first column of the \({\bf V}\) matrix  Substituting the first eigenvalue, 1.769, and \({\bf R}\) into \(({\bf R} - \lambda {\bf I}){\bf v} = {\bf 0}\), we obtain $$ \left[ \begin{array}{rrr} -0.769 & 0.670 & -0.100 \\ 0.670 & -0.769 & -0.290 \\ -0.100 & -0.290 & -0.769 \end{array} \right] \left[ \begin{array}{c} v_{11} \\ v_{21} \\ v_{31} \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right] \, . $$ This is the matrix expression for three homogeneous equations in three unknowns. Its solution, normalized to unit length, is the first column of \({\bf V}\): 0.64, 0.69, -0.34 (again, a computerized solution is indispensable).  
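The homogeneous system can be solved numerically by finding the null space of \({\bf R} - \lambda_1 {\bf I}\). The sketch below, assuming NumPy, takes the right singular vector that belongs to the smallest singular value. This is one of several possible approaches and not necessarily the one used to produce the printed values. An eigenvector is only determined up to sign, so the code adopts the convention that the first entry is positive.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# Largest eigenvalue of R (eigvalsh returns ascending order).
lam1 = np.linalg.eigvalsh(R)[-1]

# Coefficient matrix of the homogeneous system (R - lam1*I) v = 0.
A = R - lam1 * np.eye(3)

# The null vector of A is the right singular vector for the smallest
# singular value, i.e. the last row of Vt. It already has unit length.
_, singular_values, Vt = np.linalg.svd(A)
v1 = Vt[-1]

# Sign convention (an assumption): make the first entry positive.
if v1[0] < 0:
    v1 = -v1

print(np.round(v1, 2))
```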
Compute the remaining columns of the \({\bf V}\) matrix  Repeating this procedure for the other two eigenvalues yields the matrix \({\bf V}\). $$ {\bf V} = \left[ \begin{array}{rrr} 0.64 & 0.38 & -0.66 \\ 0.69 & 0.10 & 0.72 \\ -0.34 & 0.91 & 0.20 \end{array} \right] $$ Notice that if you multiply \({\bf V}\) by its transpose, the result is an identity matrix, \({\bf V}'{\bf V} = {\bf I}\).  
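In practice all three columns are obtained in one call. The sketch below assumes NumPy and checks the orthonormality property \({\bf V}'{\bf V} = {\bf I}\). Because eigenvector signs are arbitrary, it flips each column so that its largest-magnitude entry is positive; that convention is an assumption chosen for illustration.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# Eigen-decomposition, sorted so the largest eigenvalue comes first.
vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1]
lam, V = vals[order], vecs[:, order]

# Flip each column so its largest-magnitude entry is positive.
signs = np.sign(V[np.abs(V).argmax(axis=0), np.arange(3)])
V = V * signs

print(np.round(V, 2))
print(np.round(V.T @ V, 10))  # should be the 3x3 identity matrix
```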
Compute the \({\bf L}^{1/2}\) matrix  Now form the matrix \({\bf L}^{1/2}\), which is a diagonal matrix whose elements are the square roots of the eigenvalues of \({\bf R}\). Then obtain \({\bf S}\), the factor structure, using \({\bf S} = {\bf VL}^{1/2}\). $$ \left[ \begin{array}{rrr} 0.64 & 0.38 & -0.66 \\ 0.69 & 0.10 & 0.72 \\ -0.34 & 0.91 & 0.20 \end{array} \right] \left[ \begin{array}{ccc} 1.33 & 0 & 0 \\ 0 & 0.96 & 0 \\ 0 & 0 & 0.55 \end{array} \right] = \left[ \begin{array}{rrr} 0.85 & 0.37 & -0.37 \\ 0.91 & 0.10 & 0.40 \\ -0.45 & 0.88 & 0.11 \end{array} \right] $$ So, for example, 0.91 is the correlation between the second variable and the first principal component.  
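The interpretation of \({\bf S}\) as a matrix of correlations can be checked numerically. The sketch below, assuming NumPy, builds \({\bf S}\) and then correlates the second variable with the first principal component, computed here as the standardized data multiplied by the first column of \({\bf V}\). The sign convention for \({\bf V}\) is the same illustrative assumption as before.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1]
lam, V = vals[order], vecs[:, order]
V = V * np.sign(V[np.abs(V).argmax(axis=0), np.arange(3)])

# S = V L^{1/2}: broadcasting scales column j of V by sqrt(lam[j]).
S = V * np.sqrt(lam)

# First principal component scores from the standardized data.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
pc1 = Z @ V[:, 0]

# Correlation between the second variable and the first component.
r_21 = np.corrcoef(X[:, 1], pc1)[0, 1]

print(np.round(S, 2))
print(round(r_21, 2), round(S[1, 0], 2))  # the two values should agree
```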
Compute the communality  Next compute the communality, using only the first two eigenvalues, that is, the first two columns of \({\bf S}\). $$ {\bf SS}' = \left[ \begin{array}{rr} 0.85 & 0.37 \\ 0.91 & 0.09 \\ -0.45 & 0.88 \end{array} \right] \left[ \begin{array}{rrr} 0.85 & 0.91 & -0.45 \\ 0.37 & 0.09 & 0.88 \end{array} \right] = \left[ \begin{array}{rrr} 0.8662 & 0.8140 & -0.0606 \\ 0.8140 & 0.8420 & -0.3321 \\ -0.0606 & -0.3321 & 0.9876 \end{array} \right] $$  
Diagonal elements report how much of the variability is explained 
The communalities are the diagonal elements of \({\bf SS}'\).
This means that the first two principal components "explain" 86.62 % of the variance of the first variable, 84.20 % of the second variable, and 98.76 % of the third. 
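A short sketch of the communality calculation, assuming NumPy: keep the first two columns of \({\bf S}\), form \({\bf SS}'\), and read off the diagonal. The result does not depend on the sign chosen for each eigenvector, because every term is a product of two entries from the same column.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1]
lam, V = vals[order], vecs[:, order]

# Factor structure restricted to the first two components.
S2 = (V * np.sqrt(lam))[:, :2]

# Communalities are the diagonal of S2 S2'.
SS = S2 @ S2.T
communality = np.diag(SS)

print(np.round(SS, 4))
print(np.round(communality, 4))
```

Using full precision instead of the two-decimal entries of \({\bf S}\) is what yields four-decimal values; multiplying the rounded matrices by hand gives slightly different numbers.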

Compute the coefficient matrix 
The coefficient matrix, \({\bf B}\),
is formed using the reciprocals of
the diagonals of \({\bf L}^{1/2}\).
$$ {\bf B} = {\bf VL}^{-1/2} =
\left[ \begin{array}{rrr}
0.48 & 0.40 & -1.20 \\
0.52 & 0.10 & 1.31 \\
-0.26 & 0.95 & 0.37
\end{array} \right] $$
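The effect of dividing by the square roots of the eigenvalues can be seen in a small sketch, assuming NumPy: with \({\bf B} = {\bf VL}^{-1/2}\), the product \({\bf B}'{\bf RB}\) is the identity matrix, so scores formed with \({\bf B}\) are uncorrelated and equally scaled. The sign convention for \({\bf V}\) is an illustrative assumption.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1]
lam, V = vals[order], vecs[:, order]
V = V * np.sign(V[np.abs(V).argmax(axis=0), np.arange(3)])

# B = V L^{-1/2}: divide column j of V by sqrt(lam[j]).
B = V / np.sqrt(lam)

print(np.round(B, 2))
print(np.round(B.T @ R @ B, 10))  # should be the 3x3 identity matrix
```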


Compute the principal factors  Finally, we can compute the factor scores from \({\bf ZB}\), where \({\bf Z}\) is \({\bf X}\) converted to standard score form. These columns are the principal factors. $$ {\bf F} = {\bf ZB} = \left[ \begin{array}{rrr} 0.41 & -0.69 & 0.06 \\ -2.11 & 0.07 & 0.63 \\ -0.46 & -0.32 & 0.30 \\ 1.62 & -1.00 & 0.70 \\ 0.70 & 1.09 & 0.65 \\ -0.86 & 1.32 & -0.85 \\ -0.60 & -1.31 & 0.86 \\ 0.94 & 1.72 & -0.04 \\ 0.22 & 0.03 & 0.34 \\ 0.15 & -0.91 & -2.65 \end{array} \right] $$  
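The whole calculation can be collected into one function, sketched below under the assumption that NumPy is available. The helper name `factor_scores` and its `ddof` parameter are hypothetical. The text does not say which divisor is used when converting \({\bf X}\) to standard scores; the printed values appear to correspond to the population divisor (`ddof=0`), while `ddof=1` gives scores whose sample variance is exactly 1.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)


def factor_scores(X, ddof):
    """Return F = Z B for a correlation-matrix PCA of X (hypothetical helper)."""
    R = np.corrcoef(X, rowvar=False)
    vals, vecs = np.linalg.eigh(R)
    order = np.argsort(vals)[::-1]
    lam, V = vals[order], vecs[:, order]
    # Sign convention (an assumption): largest-magnitude entry of each column positive.
    p = V.shape[1]
    V = V * np.sign(V[np.abs(V).argmax(axis=0), np.arange(p)])
    B = V / np.sqrt(lam)
    # Standard scores; ddof selects the population (0) or sample (1) divisor.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=ddof)
    return Z @ B


F_pop = factor_scores(X, ddof=0)
F_samp = factor_scores(X, ddof=1)

print(np.round(F_pop, 2))
print(np.round(np.corrcoef(F_pop, rowvar=False), 10))  # factors are uncorrelated
```

Either choice of divisor gives mutually uncorrelated columns; the two versions differ only by the constant factor \(\sqrt{n/(n-1)}\).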
Principal factors control chart  These factors can be plotted against the indices, which could be times. If time is used, the resulting plot is an example of a principal factors control chart. 