6.5.5.2. Numerical Example

Calculation of principal components example

A numerical example may clarify the mechanics of principal component analysis.

Sample data set

Let us analyze the following 3-variate dataset with 10 observations. Each observation consists of 3 measurements on a wafer: thickness, horizontal displacement, and vertical displacement. $$ {\bf X} = \left[ \begin{array}{ccc} 7 & 4 & 3 \\ 4 & 1 & 8 \\ 6 & 3 & 5 \\ 8 & 6 & 1 \\ 8 & 5 & 7 \\ 7 & 2 & 9 \\ 5 & 3 & 3 \\ 9 & 5 & 8 \\ 7 & 4 & 5 \\ 8 & 2 & 2 \end{array} \right] $$

Compute the correlation matrix

First compute the correlation matrix. $$ {\bf R} = \left[ \begin{array}{rrr} 1.00 & 0.67 & -0.10 \\ 0.67 & 1.00 & -0.29 \\ -0.10 & -0.29 & 1.00 \end{array} \right] $$
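As a rough check on this step, the correlation matrix can be recomputed from the data. The sketch below assumes NumPy is available; the variable names `X` and `R` simply mirror the symbols in the text.

```python
import numpy as np

# Wafer data from the text: thickness, horizontal and vertical displacement.
X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)

# rowvar=False: columns are variables, rows are observations.
R = np.corrcoef(X, rowvar=False)
print(np.round(R, 2))
```

Rounded to two decimals, the result should agree with the matrix ${\bf R}$ shown above.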

Solve for the roots of ${\bf R}$

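Next solve for the characteristic roots (eigenvalues) of ${\bf R}$ using software. The table lists each eigenvalue and the cumulative proportion of the total variance it accounts for.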

$\lambda$	value	cumulative proportion
1	1.769	0.590
2	0.927	0.899
3	0.304	1.000

Notice that:

Each eigenvalue satisfies $|{\bf R} - \lambda {\bf I}| = 0$.
The sum of the eigenvalues $= 3 = p$, which is equal to the trace of ${\bf R}$ (i.e., the sum of the main diagonal elements).
The determinant of ${\bf R}$ is the product of the eigenvalues.
The product is $\lambda_1 \times \lambda_2 \times \lambda_3 = 0.499$.
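The eigenvalues and the properties listed above can be checked numerically. This is a minimal sketch assuming NumPy; `eigvalsh` is used because ${\bf R}$ is symmetric, and the names `eigvals` and `cum_prop` are chosen here for illustration.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# eigvalsh returns eigenvalues in ascending order; reverse to get largest first.
eigvals = np.linalg.eigvalsh(R)[::-1]

# Cumulative proportion of total variance accounted for.
cum_prop = np.cumsum(eigvals) / eigvals.sum()

print(np.round(eigvals, 3))
print(np.round(cum_prop, 3))
print(eigvals.sum(), np.trace(R))         # sum of eigenvalues vs. trace
print(eigvals.prod(), np.linalg.det(R))   # product of eigenvalues vs. determinant
```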

Compute the first column of the ${\bf V}$ matrix

Substituting the first eigenvalue, 1.769, and ${\bf R}$ into $({\bf R} - \lambda {\bf I}){\bf v} = {\bf 0}$, we obtain $$ \left[ \begin{array}{rrr} -0.769 & 0.670 & -0.100 \\ 0.670 & -0.769 & -0.290 \\ -0.100 & -0.290 & -0.769 \end{array} \right] \left[ \begin{array}{c} v_{11} \\ v_{21} \\ v_{31} \end{array} \right] = \left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right] \, . $$ This is the matrix expression for three homogeneous equations in three unknowns. Such a system determines the solution only up to a scale factor, so the solution is normalized to unit length, which yields the first column of ${\bf V}$: $(0.64, \, 0.69, \, -0.34)'$ (again, a computerized solution is indispensable).

Compute the remaining columns of the ${\bf V}$ matrix

Repeating this procedure for the other two eigenvalues yields the matrix ${\bf V}$. $$ {\bf V} = \left[ \begin{array}{rrr} 0.64 & 0.38 & -0.66 \\ 0.69 & 0.10 & 0.72 \\ -0.34 & 0.91 & 0.20 \end{array} \right] $$ Notice that if you multiply ${\bf V}$ by its transpose, the result is an identity matrix, ${\bf V}'{\bf V} = {\bf I}$.
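A sketch of how ${\bf V}$ might be obtained in practice, assuming NumPy. Eigenvectors are only defined up to sign, so the code applies a sign convention (largest-magnitude entry of each column made positive) that is an assumption of this example, chosen because it happens to match the signs printed above.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

# eigh handles symmetric matrices and returns unit-length eigenvectors as columns.
eigvals, V = np.linalg.eigh(R)

# Reorder so the largest eigenvalue (and its eigenvector) comes first.
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Assumed sign convention: flip each column so its largest-magnitude entry is positive.
signs = np.sign(V[np.abs(V).argmax(axis=0), np.arange(V.shape[1])])
V = V * signs

print(np.round(V, 2))
print(np.round(V.T @ V, 6))   # should be the identity matrix
```

Small differences in the second decimal place relative to the printed ${\bf V}$ are possible, since the text reports rounded values.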

Compute the ${\bf L}^{1/2}$ matrix

Now form the matrix ${\bf L}^{1/2}$, which is a diagonal matrix whose elements are the square roots of the eigenvalues of ${\bf R}$. Then obtain ${\bf S}$, the factor structure, using ${\bf S} = {\bf VL}^{1/2}$. $$ \left[ \begin{array}{rrr} 0.64 & 0.38 & -0.66 \\ 0.69 & 0.10 & 0.72 \\ -0.34 & 0.91 & 0.20 \end{array} \right] \left[ \begin{array}{ccc} 1.33 & 0 & 0 \\ 0 & 0.96 & 0 \\ 0 & 0 & 0.55 \end{array} \right] = \left[ \begin{array}{rrr} 0.85 & 0.37 & -0.37 \\ 0.91 & 0.10 & 0.40 \\ -0.45 & 0.88 & 0.11 \end{array} \right] $$ So, for example, 0.91 is the correlation between the second variable and the first principal component.
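The interpretation of ${\bf S}$ as a matrix of correlations can be checked directly. The following is a sketch assuming NumPy; the sign convention for the eigenvectors and the name `r_check` are assumptions of this example, not part of the original procedure.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
# Assumed sign convention: largest-magnitude entry of each column positive.
V = V * np.sign(V[np.abs(V).argmax(axis=0), np.arange(3)])

# Factor structure S = V L^{1/2}: scale column j of V by sqrt(lambda_j).
S = V * np.sqrt(eigvals)
print(np.round(S, 2))

# Check: correlation between the second variable and the first principal component.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
pc1 = Z @ V[:, 0]
r_check = np.corrcoef(X[:, 1], pc1)[0, 1]
print(r_check, S[1, 0])   # the two numbers should agree
```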

Compute the communality

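Next compute the communality, using only the first two columns of ${\bf S}$ (that is, the first two principal components). The entries of ${\bf S}$ are displayed to two decimals, but the product was evidently computed from unrounded values, so multiplying the displayed entries reproduces it only approximately. $$ {\bf SS}' = \left[ \begin{array}{rr} 0.85 & 0.37 \\ 0.91 & 0.09 \\ -0.45 & 0.88 \end{array} \right] \left[ \begin{array}{rrr} 0.85 & 0.91 & -0.45 \\ 0.37 & 0.09 & 0.88 \end{array} \right] = \left[ \begin{array}{rrr} 0.8662 & 0.8140 & -0.0606 \\ 0.8140 & 0.8420 & -0.3321 \\ -0.0606 & -0.3321 & 0.9876 \end{array} \right] $$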

Diagonal elements report how much of the variability is explained

The communalities are the diagonal elements of ${\bf SS}'$.

variable	communality
1	0.8662
2	0.8420
3	0.9876

This means that the first two principal components "explain" 86.62 % of the variance of the first variable, 84.20 % of the second variable, and 98.76 % of the third.
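The communalities can be reproduced from unrounded values with a short sketch, assuming NumPy. The names `S2` and `communality` are illustrative. Eigenvector signs do not matter here because only squared entries are summed.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Keep only the first two columns of the factor structure S = V L^{1/2}.
S2 = (V * np.sqrt(eigvals))[:, :2]

# Communalities are the diagonal of S2 S2', i.e. row-wise sums of squares.
communality = (S2 ** 2).sum(axis=1)
print(np.round(communality, 4))
```

Because all three components together reproduce each variable's unit variance, each communality also equals one minus the squared loading on the dropped third component.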

Compute the coefficient matrix

The coefficient matrix, ${\bf B}$, is formed using the reciprocals of the diagonals of ${\bf L}^{1/2}$. $$ {\bf B} = {\bf VL}^{-1/2} = \left[ \begin{array}{rrr} 0.48 & 0.40 & -1.20 \\ 0.52 & 0.10 & 1.31 \\ -0.26 & 0.95 & 0.37 \end{array} \right] $$
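A minimal sketch of this step, assuming NumPy: dividing column $j$ of ${\bf V}$ by $\sqrt{\lambda_j}$ is the same as post-multiplying by ${\bf L}^{-1/2}$. The eigenvector sign convention is again an assumption of this example.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
# Assumed sign convention: largest-magnitude entry of each column positive.
V = V * np.sign(V[np.abs(V).argmax(axis=0), np.arange(3)])

# Coefficient matrix B = V L^{-1/2}: divide column j of V by sqrt(lambda_j).
B = V / np.sqrt(eigvals)
print(np.round(B, 2))
```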

Compute the principal factors

Finally, we can compute the factor scores from ${\bf ZB}$, where ${\bf Z}$ is ${\bf X}$ converted to standard score form (each column centered at its mean and divided by its standard deviation). The columns of the result are the principal factors. $$ {\bf F} = {\bf ZB} = \left[ \begin{array}{rrr} 0.41 & -0.69 & 0.06 \\ -2.11 & 0.07 & 0.63 \\ -0.46 & -0.32 & 0.30 \\ 1.62 & -1.00 & 0.70 \\ 0.70 & 1.09 & 0.65 \\ -0.86 & 1.32 & -0.85 \\ -0.60 & -1.31 & 0.86 \\ 0.94 & 1.72 & -0.04 \\ 0.22 & 0.03 & 0.34 \\ 0.15 & -0.91 & -2.65 \end{array} \right] $$
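The factor scores can be sketched as follows, assuming NumPy. Two assumptions are made here: the standard scores divide by the population standard deviation (divisor $n$, NumPy's default), which appears to reproduce the tabulated values, and eigenvector signs follow the largest-entry-positive convention.

```python
import numpy as np

X = np.array([
    [7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
    [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2],
], dtype=float)
n = X.shape[0]
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
# Assumed sign convention: largest-magnitude entry of each column positive.
V = V * np.sign(V[np.abs(V).argmax(axis=0), np.arange(3)])
B = V / np.sqrt(eigvals)

# Standard scores; X.std uses divisor n by default (assumption, see lead-in).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Factor scores: one row per observation, one column per principal factor.
F = Z @ B
print(np.round(F, 2))

# With this scaling the factors are uncorrelated with unit variance.
print(np.round(F.T @ F / n, 6))
```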

Principal factors control chart

These factors can be plotted against the observation indices, which could be times. If the index is time, the resulting plot is an example of a principal factors control chart.