Invited Session: Statistics in Chemical Engineering
Process Understanding & Control Using Multivariate Statistics
Albert S. Tam
Many chemical engineering processes can be instrumented to yield large amounts of data, yet still provide insufficient information to monitor and control the quality of the material being processed. Often the data tell us only about the machine, and/or the sheer volume of data prevents careful examination until a process problem occurs, and then usually in a univariate fashion only. Real-time ``intrusive'' measurements of the material state are also needed, but these rarely have straightforward relationships to the material properties of interest. Combining the machine measurements with the ``intrusive'' measurements yields a redundant data set that gives us the potential to infer the material state. Multivariate statistics is a valuable tool for reducing the dimensionality of these data sets and revealing the underlying structure and inner relationships of the process. From this information we can create a ``fingerprint'' for process monitoring. More importantly, we can use this information to improve our understanding of the process mechanisms, especially for processes for which no first-principles understanding exists, and/or those that are to be scaled up from lab to commercial scale.
We will present several examples of applying Principal Components Analysis (PCA), Partial Least Squares (PLS), and Multi-Way PCA (MPCA) to create process understanding and ``fingerprints'', and show how these empirical models can be used in improved control schemes to reduce material variability. We will also address issues in implementing such tools in commercial applications.
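As an illustrative sketch only (not the authors' implementation), a PCA ``fingerprint'' for process monitoring can be built by projecting each new observation onto the principal components of historical data and scoring it with Hotelling's T-squared statistic; the data and the number of retained components below are hypothetical.

```python
import numpy as np

# Hypothetical historical process data: 100 runs x 8 sensor measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))

# Center the data, then extract principal components via SVD.
mu = X.mean(axis=0)
Xc = X - mu
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2                                  # retained components (assumed)
P = Vt[:k].T                           # loadings, shape (8, k)
var = (s[:k] ** 2) / (X.shape[0] - 1)  # variance of each component score

def t2_statistic(x):
    """Hotelling's T^2 of a new observation in the reduced space:
    a scalar 'fingerprint' score; large values flag unusual runs."""
    t = (x - mu) @ P                   # project onto the principal components
    return float(np.sum(t ** 2 / var))
```

In practice the score is tracked against a control limit derived from the F distribution; exceeding it signals that the process has moved away from its historical fingerprint.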
[Albert S. Tam, DuPont Central Research & Development, Route 141 Experimental Station Bldg. 1, Room 207, Wilmington, DE 19880 USA; tamas@A1.esvax.umc.dupont.com ]
Prediction Intervals for Artificial Neural Networks
Richard D. De Veaux
Lyle H. Ungar
Artificial neural networks (ANNs) are being used with increasing frequency as an alternative to traditional statistical models. Unfortunately, ANN models rarely provide any indication of the accuracy or reliability of their predictions. In this paper, we compare two approaches to obtaining prediction limits for ANNs: a frequentist approach, based on standard nonlinear regression theory, and a Bayesian approach, following recent work by MacKay and Neal. We compare, via Monte Carlo methods, the coverage probabilities, computational costs, and practical implementation issues. Real data sets, primarily from the chemical process and banking industries, will also be used to compare the two approaches.
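To illustrate the frequentist idea in miniature (a sketch under assumed data and a toy nonlinear model standing in for a network, not the paper's method), one can fit by Gauss-Newton and form a delta-method prediction interval: the predictive variance combines parameter uncertainty, via the gradient of the model in the parameters, with the residual variance.

```python
import numpy as np

def model(x, theta):
    # Toy nonlinear model (illustrative stand-in for a fitted network).
    a, b = theta
    return a * (1.0 - np.exp(-b * x))

def jacobian(x, theta, eps=1e-6):
    # Numerical Jacobian of the model with respect to the parameters.
    base = model(x, theta)
    return np.stack([(model(x, theta + eps * e) - base) / eps
                     for e in np.eye(len(theta))], axis=1)

# Hypothetical training data.
rng = np.random.default_rng(1)
x = np.linspace(0.1, 5.0, 40)
y = model(x, (2.0, 1.5)) + rng.normal(scale=0.1, size=x.size)

# Fit by Gauss-Newton iterations.
theta = np.array([1.0, 1.0])
for _ in range(50):
    J = jacobian(x, theta)
    r = y - model(x, theta)
    theta = theta + np.linalg.lstsq(J, r, rcond=None)[0]

resid = y - model(x, theta)
dof = x.size - theta.size
s2 = resid @ resid / dof                  # residual variance estimate
J = jacobian(x, theta)
cov = s2 * np.linalg.inv(J.T @ J)         # approximate parameter covariance

def prediction_interval(x0, z=1.96):
    """Approximate 95% prediction interval at x0 (delta method):
    predictive variance = g' cov g + s^2, g = model gradient at x0."""
    g = jacobian(np.array([x0]), theta)[0]
    se = np.sqrt(g @ cov @ g + s2)
    yhat = model(x0, theta)
    return yhat - z * se, yhat + z * se
```

For an ANN the same recipe applies with the network's parameter gradient in place of `g`, though the larger parameter count makes the covariance approximation, and hence the coverage, much more delicate, which is the comparison the paper takes up.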
[Richard D. De Veaux, Dept. of Mathematics, Williams College, Williamstown, MA 01267 USA; deveaux@Williams.EDU ]
Date created: 6/5/2001