|
7.2.4.1. Confidence intervals |
|
| Confidence intervals using the method of Agresti and Coull |
The method recommended by
Agresti and Coull (1998)
and also by
Brown, Cai and DasGupta (2001)
(the methodology was originally developed by Wilson in 1927)
is to use the form of the confidence interval that corresponds to the
hypothesis test given in Section 7.2.4.
That is, solve for the two values of $p_0$ (say,
$p_{\text{upper}}$ and $p_{\text{lower}}$) that result
from setting $z = z_{1-\alpha/2}$ and solving for $p_0 = p_{\text{upper}}$, and
then setting $z = z_{\alpha/2}$ and solving for $p_0 = p_{\text{lower}}$.
(Here, as in Section 7.2.4, $z_{\alpha/2}$
denotes the variate value from the
standard normal
distribution such that the area to the left of the value is
$\alpha/2$.) Although
solving for the two values of p0 might sound
complicated, the appropriate expressions can be obtained by
straightforward but slightly tedious algebra. Such algebraic
manipulation isn't necessary, however, as the appropriate expressions
are given in various sources. Specifically, we have
|
| Formulas for the confidence intervals |
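Carrying out the algebra described above gives the standard Wilson score limits. They are written out here in LaTeX as a reference sketch, assuming $\hat{p}$ denotes the observed proportion defective and $n$ the sample size (notation taken to match Section 7.2.4):

$$p_{\text{upper}} = \frac{\hat{p} + \dfrac{z_{1-\alpha/2}^{2}}{2n} + z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z_{1-\alpha/2}^{2}}{4n^{2}}}}{1 + \dfrac{z_{1-\alpha/2}^{2}}{n}}$$

$$p_{\text{lower}} = \frac{\hat{p} + \dfrac{z_{\alpha/2}^{2}}{2n} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z_{\alpha/2}^{2}}{4n^{2}}}}{1 + \dfrac{z_{\alpha/2}^{2}}{n}}$$

Because $z_{\alpha/2} = -z_{1-\alpha/2}$, the two limits differ only in the sign of the square-root term.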
|
| Procedure does not strongly depend on values of p and n | This approach can be substantiated on the grounds that it is the exact algebraic counterpart to the (large-sample) hypothesis test given in section 7.2.4 and is also supported by the research of Agresti and Coull. One advantage of this procedure is that its worth does not strongly depend upon the value of n and/or p, and indeed was recommended by Agresti and Coull for virtually all combinations of n and p. |
| Another advantage is that the lower limit cannot be negative |
Another advantage is that the lower limit cannot be negative. That is
not true for the confidence expression most frequently used:
$$\hat{p} \pm z_{1-\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
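As a rough illustration of this point, the Python sketch below compares the frequently used normal-approximation (Wald) limits, $\hat{p} \pm z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, with the Wilson-type limits discussed in this section. The function names and the inputs (1 defective in 50 units) are hypothetical and chosen only to show a case where the Wald lower limit falls below zero.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, alpha=0.05):
    """Normal-approximation (Wald) limits: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(x, n, alpha=0.05):
    """Wilson score limits obtained by inverting the large-sample z test."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    center = p_hat + z * z / (2 * n)
    spread = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (center - spread) / denom, (center + spread) / denom


# Hypothetical data: 1 defective unit in a sample of 50.
wald_lo, wald_hi = wald_interval(1, 50)
wilson_lo, wilson_hi = wilson_interval(1, 50)
print(f"Wald:   ({wald_lo:.4f}, {wald_hi:.4f})")
print(f"Wilson: ({wilson_lo:.4f}, {wilson_hi:.4f})")
```

With these inputs the Wald lower limit is negative, while the Wilson lower limit stays inside the interval [0, 1].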
|
| One-sided confidence intervals |
A one-sided confidence interval can also be constructed simply by
replacing each $\alpha/2$ by $\alpha$ in the
expression for the lower or upper limit, whichever is desired. The
95% one-sided interval for p for the example in the
preceding section is:
|
| Example |
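A one-sided lower limit of this kind can be sketched in Python as follows. The function name is hypothetical, and the inputs (26 defectives in a sample of 200) are assumed for illustration only, since the data from the preceding section are not restated here.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower_one_sided(x, n, alpha=0.05):
    """One-sided lower Wilson limit: alpha replaces alpha/2 in the lower-limit formula."""
    z = NormalDist().inv_cdf(1 - alpha)  # about 1.645 when alpha = 0.05
    p_hat = x / n
    center = p_hat + z * z / (2 * n)
    spread = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (center - spread) / (1 + z * z / n)


# Assumed inputs, not taken from this section: 26 defectives out of 200 units.
lower = wilson_lower_one_sided(26, 200)
print(f"95% one-sided lower limit for p: {lower:.5f}")
```

Under these assumed inputs the lower limit falls just below 0.10.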
|
| Conclusion from the example | Since the lower bound does not exceed 0.10, the hypothesized value, the null hypothesis that the proportion defective is at most 0.10 (given in the preceding section) would not be rejected if we used the confidence interval to test the hypothesis. Of course, a confidence interval has value in its own right and does not have to be used for hypothesis testing. |
| Exact Intervals for Small Numbers of Failures and/or Small Sample Sizes | |
| Construction of exact two-sided confidence intervals based on the binomial distribution |
If the number of failures is very small or if the sample size
N is very small, symmetrical confidence limits that are
approximated using the normal distribution may not be accurate enough
for some applications. An exact method based on the binomial
distribution is shown next. To construct a two-sided confidence
interval at the 100(1-α)%
confidence level for the true proportion defective p where
$N_d$ defects are found in a sample of size N,
follow the steps below.
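1. Solve the equation
$$\sum_{k=0}^{N_d} \binom{N}{k} p_U^{k} (1-p_U)^{N-k} = \alpha/2$$
for $p_U$ to obtain the upper 100(1-α)% limit for p.
2. Next, solve the equation
$$\sum_{k=0}^{N_d-1} \binom{N}{k} p_L^{k} (1-p_L)^{N-k} = 1 - \alpha/2$$
for $p_L$ to obtain the lower 100(1-α)% limit for p. |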
| Note |
The interval ($p_L$, $p_U$) is an exact
100(1-α)%
confidence interval for p. However, it is not symmetric about the
observed proportion defective, $\hat{p} = N_d/N$.
|
| Binomial confidence interval example |
The equations above that determine $p_L$ and
$p_U$ can be solved using readily available functions.
Take as an example the situation where twenty units are sampled
from a continuous production line and four items are found to be
defective. The proportion defective is estimated to be
$\hat{p}$ = 4/20 = 0.20.
The steps for calculating a 90 % confidence interval for the true
proportion defective, p, follow.
1. Initialize constants.
alpha = 0.10
Nd = 4
N = 20
2. Define a function for the upper limit (fu) and a function
for the lower limit (fl).
fu = F(Nd,pu,N) - alpha/2
fl = F(Nd-1,pl,N) - (1-alpha/2)
F is the cumulative distribution function for the
binomial distribution.
3. Find the value of pu that corresponds to fu = 0 and
the value of pl that corresponds to fl = 0 using software
to find the roots of a function.
The values of pu and pl for our example are:
pu = 0.401029
pl = 0.071354
Thus, a 90 % confidence interval for the proportion defective, p, is (0.071, 0.401). Whether or not the interval is truly "exact" depends on the numerical accuracy of the software. The calculations used in this example can be performed using both Dataplot code and R code. |
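The three steps above can also be sketched in Python using only the standard library. This is a minimal illustration, not the Dataplot or R code referred to above: the helper names `binom_cdf` and `bisect_root` are hypothetical, bisection is just one possible root finder, and the sketch assumes 0 < Nd < N so that both equations have a root in (0, 1).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); plays the role of F in the steps above."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo, hi, tol=1e-10):
    """Find a root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return (lo + hi) / 2


# Step 1: initialize constants.
alpha, Nd, N = 0.10, 4, 20

# Step 2: define fu and fl; Step 3: find their roots on [0, 1].
pu = bisect_root(lambda p: binom_cdf(Nd, N, p) - alpha / 2, 0.0, 1.0)
pl = bisect_root(lambda p: binom_cdf(Nd - 1, N, p) - (1 - alpha / 2), 0.0, 1.0)
print(f"pl = {pl:.6f}, pu = {pu:.6f}")
```

Bisection is used here because the binomial cumulative distribution function is monotone decreasing in p, so each equation has exactly one root on the interval.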