Product and Process Comparisons
7.2. Comparisons based on data from one process
7.2.6. What intervals contain a fixed percentage of the population values?
|Definition of a tolerance interval||A confidence interval covers a population parameter with a stated confidence, that is, a certain proportion of the time. There is also a way to cover a fixed proportion of the population with a stated confidence. Such an interval is called a tolerance interval. The endpoints of a tolerance interval are called tolerance limits. An application of tolerance intervals to manufacturing involves comparing specification limits prescribed by the client with tolerance limits that cover a specified proportion of the population.|
|Difference between confidence and tolerance intervals||Confidence limits are limits within which we expect a given population parameter, such as the mean, to lie. Statistical tolerance limits are limits within which we expect a stated proportion of the population to lie.|
|Not related to engineering tolerances||Statistical tolerance intervals have a probabilistic interpretation. Engineering tolerances are specified outer limits of acceptability which are usually prescribed by a design engineer and do not necessarily reflect a characteristic of the actual measurements.|
|Three types of tolerance intervals||
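Three types of questions can be addressed by tolerance intervals.
(1) What interval will contain a proportion p of the population measurements?
(2) What interval guarantees that a proportion p of the population measurements will not fall below a lower limit?
(3) What interval guarantees that a proportion p of the population measurements will not exceed an upper limit?
Question (1) leads to a two-sided interval; questions (2) and (3)
lead to one-sided intervals.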
|Tolerance intervals for measurements from a normal distribution||
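For the questions above, the corresponding tolerance intervals are
defined by lower (L) and upper (U) tolerance limits, which are computed
from a series of measurements $Y_1, \ldots, Y_N$ with sample mean $\bar{Y}$ and sample standard deviation s:
(1) $L = \bar{Y} - k_2 s$ and $U = \bar{Y} + k_2 s$ (two-sided interval)
(2) $L = \bar{Y} - k_1 s$ (one-sided, lower limit)
(3) $U = \bar{Y} + k_1 s$ (one-sided, upper limit)
where the k factors are chosen so that the interval covers at least a proportion p of the population with confidence γ.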
|Calculation of k factor for a two-sided tolerance limit for a normal distribution||
If the data are from a normally distributed population, an
approximate value for the k2 factor as a function
of p and γ for a two-sided tolerance interval
(Howe, 1969) is
$$ k_2 = \sqrt{\frac{\nu \left(1 + \frac{1}{N}\right) z_{(1-p)/2}^2}{\chi^2_{1-\gamma,\,\nu}}} $$
where $\chi^2_{1-\gamma,\,\nu}$ is the critical value of the chi-square distribution with ν degrees of freedom that is exceeded with probability γ, and $z_{(1-p)/2}$ is the critical value of the standard normal distribution associated with cumulative probability (1-p)/2.
The quantity ν represents the degrees of freedom used to estimate the standard deviation. Most of the time the same sample will be used to estimate both the mean and standard deviation so that ν = N - 1, but the formula allows for other possible values of ν.
|Example of calculation||For example, suppose that we take a sample of N = 43 silicon wafers from a lot and measure their thicknesses in order to find tolerance limits within which a proportion p = 0.90 of the wafers in the lot fall with probability γ = 0.99. Since the standard deviation, s, is computed from the same sample of 43 wafers, the degrees of freedom are ν = N - 1 = 42.|
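The wafer example can be checked numerically. The sketch below is a minimal illustration, not the handbook's own Dataplot or R code. It assumes SciPy is installed and that Howe's approximation has the form k2 = sqrt(ν (1 + 1/N) z² / χ²), where z is the normal critical value at cumulative probability (1-p)/2 and χ² is the chi-square value exceeded with probability γ. The function name `howe_k2` and its arguments are chosen here for illustration only.

```python
from math import sqrt

from scipy.stats import chi2, norm


def howe_k2(n, p, gamma, nu=None):
    """Approximate two-sided normal tolerance factor (Howe's approximation)."""
    if nu is None:
        nu = n - 1  # mean and standard deviation estimated from the same sample
    # Normal critical value at cumulative probability (1 - p) / 2; only its square is used.
    z = norm.ppf((1 - p) / 2)
    # Chi-square value exceeded with probability gamma, i.e. the (1 - gamma) quantile.
    chi2_crit = chi2.ppf(1 - gamma, nu)
    return sqrt(nu * (1 + 1 / n) * z**2 / chi2_crit)


k2 = howe_k2(n=43, p=0.90, gamma=0.99)
print(round(k2, 3))  # approximately 2.217
```

Given a sample mean and standard deviation, the two-sided limits would then be the mean minus and plus `k2` times the standard deviation.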
|Use of tables in calculating two-sided tolerance intervals||
Values of the k2 factor as a function of p and
γ are tabulated in some textbooks, such as
Massey (1969). To use the normal and chi-square tables in this
handbook to approximate the k2 factor, follow the
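steps outlined below. For the wafer example (N = 43, p = 0.90, γ = 0.99):
(1) Calculate (1 - p)/2 = 0.05 and ν = N - 1 = 42.
(2) In the table of critical values of the normal distribution, find the value associated with cumulative probability 0.05: z = -1.645.
(3) In the table of lower critical values of the chi-square distribution, find the value for 42 degrees of freedom and lower-tail probability 1 - γ = 0.01: χ² = 23.650.
(4) Calculate k2 = sqrt( 42 (1 + 1/43) (1.645)² / 23.650 ) = 2.217.
The tolerance limits are then computed from the sample mean, $\bar{Y}$, and standard deviation, s, according to case (1).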
The notation for the critical value of the chi-square
distribution can be confusing. Values as tabulated are, in a sense, already
squared; whereas the critical value for the normal distribution must be
squared in the formula above.
Some software can compute tolerance intervals directly from a data set, so that the user does not need to perform the calculations by hand. All the tolerance intervals shown in this section can be computed using both Dataplot code and R code. In addition, R can compute an exact value of the k2 factor, replacing the approximation given above. The R and Dataplot examples include the case where a tolerance interval is computed automatically from a data set.
|Calculation of a one-sided tolerance interval for a normal distribution||
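The calculation of an approximate k factor for one-sided tolerance
intervals comes directly from the following set of formulas (Natrella, 1963):
$$ k_1 = \frac{z_p + \sqrt{z_p^2 - ab}}{a}, \qquad a = 1 - \frac{z_\gamma^2}{2(N-1)}, \qquad b = z_p^2 - \frac{z_\gamma^2}{N} $$
where $z_p$ and $z_\gamma$ are the critical values of the standard normal distribution associated with cumulative probabilities p and γ, respectively.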
|A one-sided tolerance interval example||
For the example above, it may also be of interest to guarantee with
0.99 probability (or 99% confidence) that 90% of the wafers have
thicknesses less than an upper tolerance limit. This problem falls
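under case (3). With $z_p = z_{0.90} = 1.2816$ and $z_\gamma = z_{0.99} = 2.3263$, the calculations
for the k1 factor for a one-sided tolerance interval are:
$$ a = 1 - \frac{z_\gamma^2}{2(N-1)} = 1 - \frac{(2.3263)^2}{2(42)} \approx 0.9356 $$
$$ b = z_p^2 - \frac{z_\gamma^2}{N} = (1.2816)^2 - \frac{(2.3263)^2}{43} \approx 1.5165 $$
$$ k_1 = \frac{z_p + \sqrt{z_p^2 - ab}}{a} \approx 1.8752 $$
(intermediate values carried at full precision).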
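The one-sided factor for this example can also be reproduced with nothing more than an inverse normal CDF. The sketch below assumes the Natrella formulas in their usual form, k1 = (z_p + sqrt(z_p² - ab)) / a with a = 1 - z_γ² / (2(N-1)) and b = z_p² - z_γ² / N, where z_p and z_γ are standard normal quantiles at cumulative probabilities p and γ. The function name `natrella_k1` is hypothetical, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def natrella_k1(n, p, gamma):
    """Approximate one-sided normal tolerance factor (Natrella formulas)."""
    std_normal = NormalDist()
    z_p = std_normal.inv_cdf(p)          # normal quantile at cumulative probability p
    z_gamma = std_normal.inv_cdf(gamma)  # normal quantile at cumulative probability gamma
    a = 1 - z_gamma**2 / (2 * (n - 1))
    b = z_p**2 - z_gamma**2 / n
    return (z_p + sqrt(z_p**2 - a * b)) / a


k1 = natrella_k1(n=43, p=0.90, gamma=0.99)
print(round(k1, 4))  # approximately 1.8752
```

For case (3), the upper tolerance limit would be the sample mean plus `k1` times the sample standard deviation; for case (2), the lower limit subtracts the same quantity.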
|Tolerance factor based on the non-central t distribution||
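The value of k1 can also be computed using the
inverse cumulative distribution function for the non-central
t distribution. This method may give more accurate
results for small values of N. The value of k1 using
the non-central t distribution (using the same example as
above) is
$$ k_1 = \frac{t_{\gamma,\,N-1,\,\delta}}{\sqrt{N}} = 1.8740, \qquad \delta = z_p \sqrt{N} $$
where $t_{\gamma,\,N-1,\,\delta}$ is the critical value of the non-central t distribution with N - 1 degrees of freedom associated with cumulative probability γ, and δ is the non-centrality parameter.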
In this case, the difference between the two computations is negligible (1.8752 versus 1.8740). However, the difference becomes more pronounced as the value of N gets smaller (in particular, for N ≤ 10). For example, if N = 43 is replaced with N = 6, the non-central t method returns a value of 4.4111 for k1, while the method based on the Natrella formulas returns a value of 5.2808.
The disadvantage of the non-central t method is that it depends on the inverse cumulative distribution function for the non-central t distribution. This function is not available in many statistical and spreadsheet software programs, but it is available in Dataplot and R (see Dataplot code and R code). The Natrella formulas only depend on the inverse cumulative distribution function for the normal distribution (which is available in just about all statistical and spreadsheet software programs). Unless you have small samples (say N ≤ 10), the difference in the methods should not have much practical effect.
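The non-central t calculation can be sketched in Python as well, assuming SciPy is available and that the factor takes the usual form k1 = t / sqrt(N), where t is the quantile at cumulative probability γ of a non-central t distribution with N - 1 degrees of freedom and non-centrality parameter δ = z_p sqrt(N). The function name `nct_k1` is hypothetical; `scipy.stats.nct.ppf(q, df, nc)` is SciPy's inverse CDF for the non-central t distribution.

```python
from math import sqrt

from scipy.stats import nct, norm


def nct_k1(n, p, gamma):
    """One-sided normal tolerance factor from the non-central t distribution."""
    delta = norm.ppf(p) * sqrt(n)          # non-centrality parameter
    t_crit = nct.ppf(gamma, n - 1, delta)  # quantile at cumulative probability gamma
    return t_crit / sqrt(n)


# The two sample sizes discussed in the text: N = 43 and N = 6.
for n in (43, 6):
    print(n, round(nct_k1(n, p=0.90, gamma=0.99), 4))
```

The text above reports 1.8740 for N = 43 and 4.4111 for N = 6; small differences in the last digit are possible depending on the numerical routine used.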