7.2.4.1. Confidence intervals 

Confidence intervals using the Wilson method  The Wilson method for calculating confidence intervals for proportions (introduced by Wilson (1927), recommended by Brown, Cai and DasGupta (2001) and Agresti and Coull (1998)) is based on inverting the hypothesis test given in Section 7.2.4. That is, solve for the two values of \(p_0\) (say, \(p_{upper}\) and \(p_{lower}\)) that result from setting \(z = z_{1-\alpha/2}\) and solving for \(p_0 = p_{upper}\), and then setting \(z = z_{\alpha/2}\) and solving for \(p_0 = p_{lower}\). (Here, as in Section 7.2.4, \(z_{\alpha/2}\) denotes the variate value from the standard normal distribution such that the area to the left of the value is \(\alpha/2\).) Although solving for the two values of \(p_0\) might sound complicated, the appropriate expressions can be obtained by straightforward but slightly tedious algebra. Such algebraic manipulation isn't necessary, however, as the appropriate expressions are given in various sources. Specifically, we have 
Formulas for the confidence intervals  $$ \large \begin{eqnarray} \mbox{U.L. } & = & \frac{\hat{p} + \frac{z^2_{1-\alpha/2}}{2n} + z_{1-\alpha/2} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2_{1-\alpha/2}}{4n^2} }} {1 + \frac{z^2_{1-\alpha/2}}{n}} \\ & & \\ & & \\ \mbox{L.L. } & = & \frac{\hat{p} + \frac{z^2_{\alpha/2}}{2n} + z_{\alpha/2} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2_{\alpha/2}}{4n^2} }} {1 + \frac{z^2_{\alpha/2}}{n}} \, . \end{eqnarray} $$ 
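As a minimal sketch of how the two limits above might be evaluated, the following Python snippet codes the U.L. and L.L. expressions directly. The function name `wilson_interval` and the input values (\(\hat{p}\) = 0.20, \(n\) = 20, \(\alpha\) = 0.10) are illustrative assumptions, not part of the Handbook; the normal quantiles are taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p_hat, n, alpha=0.05):
    """Return (L.L., U.L.) from the Wilson expressions for a proportion.

    Hypothetical helper: p_hat is the observed proportion defective,
    n the sample size, and 1 - alpha the two-sided confidence level.
    """
    limits = []
    # z_{alpha/2} is negative and z_{1-alpha/2} is positive, so the same
    # "+ z * sqrt(...)" expression gives the lower and then the upper limit.
    for prob in (alpha / 2, 1 - alpha / 2):
        z = NormalDist().inv_cdf(prob)
        centre = p_hat + z**2 / (2 * n)
        spread = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
        limits.append((centre + spread) / (1 + z**2 / n))
    return tuple(limits)


# Illustrative inputs only (assumed for this sketch).
lower, upper = wilson_interval(0.20, 20, alpha=0.10)
print(f"L.L. = {lower:.5f}, U.L. = {upper:.5f}")
```

Because \(z_{\alpha/2}\) is negative, a single expression with a plus sign in front of the square root covers both limits, which is why one loop suffices here.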
Procedure does not strongly depend on values of \(p\) and \(n\)  This approach can be substantiated on the grounds that it is the exact algebraic counterpart to the (large-sample) hypothesis test given in Section 7.2.4 and is also supported by the research of Agresti and Coull. One advantage of this procedure is that its worth does not strongly depend upon the value of \(n\) and/or \(p\), and indeed was recommended by Agresti and Coull for virtually all combinations of \(n\) and \(p\). 
Another advantage is that the lower limit cannot be negative  Another advantage is that the lower limit cannot be negative. That is not true for the confidence expression most frequently used: $$ \hat{p} \pm z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} } \, . $$ A confidence limit approach that produces a lower limit which is an impossible value for the parameter for which the interval is constructed is an inferior approach. This also applies to limits for the control charts that are discussed in Chapter 6. 
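To illustrate the point about negative lower limits, the short Python sketch below evaluates both lower limits for a small observed proportion. The inputs (one defective in a sample of 50, 95 % two-sided level) are hypothetical values chosen for illustration and do not come from the Handbook.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs: 1 defective in 50 units, 95 % two-sided interval.
p_hat, n, alpha = 1 / 50, 50, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}, positive

# Lower limit of the most frequently used expression.
wald_lower = p_hat - z * sqrt(p_hat * (1 - p_hat) / n)

# Wilson lower limit; z_{alpha/2} = -z, hence the minus sign before the root.
wilson_lower = (
    p_hat + z**2 / (2 * n) - z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
) / (1 + z**2 / n)

print(f"Frequently used lower limit: {wald_lower:.4f}")
print(f"Wilson lower limit:          {wilson_lower:.4f}")
```

For these inputs the frequently used expression gives a negative lower limit, while the Wilson lower limit stays above zero.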
One-sided confidence intervals  A one-sided confidence interval can also be constructed simply by replacing each \(z_{\alpha/2}\) by \(z_{\alpha}\) in the expression for the lower or upper limit, whichever is desired. The 95 % one-sided interval for \(p\) for the example in the preceding section is: 
Example  $$ \large \begin{eqnarray} p & \ge & \mbox{lower limit} \\ & & \\ p & \ge & \frac{\hat{p} + \frac{z^2_{\alpha}}{2n} + z_{\alpha} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2_{\alpha}}{4n^2} }} {1 + \frac{z^2_{\alpha}}{n}} \\ & & \\ & & \\ p & \ge & \frac{0.13 + \frac{(1.645)^2}{2(200)} - 1.645 \sqrt{ \frac{0.13(1-0.13)}{200} + \frac{(1.645)^2}{4(200)^2} }} {1 + \frac{(1.645)^2}{200}} \\ & & \\ p & \ge & 0.09577 \, . \end{eqnarray} $$ 
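The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name `wilson_lower_bound` is hypothetical, and the inputs \(\hat{p}\) = 0.13, \(n\) = 200 and the rounded quantile 1.645 are the values that reproduce the stated bound of 0.09577.

```python
from math import sqrt


def wilson_lower_bound(p_hat, n, z):
    """One-sided Wilson lower bound for a proportion.

    z is the positive normal quantile z_{1-alpha}; since z_alpha = -z,
    the square-root term enters with a minus sign.
    """
    z2 = z * z
    root = sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n))
    return (p_hat + z2 / (2 * n) - z * root) / (1 + z2 / n)


bound = wilson_lower_bound(0.13, 200, 1.645)
print(f"p >= {bound:.5f}")  # p >= 0.09577
```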
Conclusion from the example  Since the lower bound, 0.09577, does not exceed the hypothesized value of 0.10, the null hypothesis that the proportion defective is at most 0.10, which was given in the preceding section, would not be rejected if we used the confidence interval to test the hypothesis. Of course, a confidence interval has value in its own right and does not have to be used for hypothesis testing. 
Exact Intervals for Small Numbers of Failures and/or Small Sample Sizes  
Construction of exact two-sided confidence intervals based on the binomial distribution 
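If the number of failures is very small or if the sample size \(N\)
is very small, symmetrical confidence limits that are
approximated using the normal distribution may not be accurate enough
for some applications. An exact method based on the binomial
distribution is shown next. To construct a two-sided confidence
interval at the \(100(1-\alpha)\) %
confidence level for the true proportion defective \(p\),
where \(N_d\)
defects are found in a sample of size \(N\),
follow the steps below.

1. Solve the equation $$ \sum_{k=0}^{N_d} \binom{N}{k} p_U^k (1-p_U)^{N-k} = \alpha/2 $$ for \(p_U\) to obtain the upper limit for \(p\).
2. Solve the equation $$ \sum_{k=0}^{N_d-1} \binom{N}{k} p_L^k (1-p_L)^{N-k} = 1 - \alpha/2 $$ for \(p_L\) to obtain the lower limit for \(p\).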

Note  The interval \((p_L, \, p_U)\) is an exact \(100(1-\alpha)\) % confidence interval for \(p\). However, it is not symmetric about the observed proportion defective, \(\hat{p} = N_d/N\). 
Binomial confidence interval example 
The equations above that determine \(p_L\) and \(p_U\)
can be solved using readily available functions.
Take as an example the situation where twenty units are sampled
from a continuous production line and four items are found to be
defective. The proportion defective is estimated to be \(\hat{p}\)
= 4/20 = 0.20.
The steps for calculating a 90 % confidence interval for the true
proportion defective, \(p\),
follow.
1. Initialize constants.
     alpha = 0.10
     Nd = 4
     N = 20
2. Define a function for the upper limit (fu) and a function for the lower limit (fl).
     fu = F(Nd,pu,20) - alpha/2
     fl = F(Nd-1,pl,20) - (1-alpha/2)
   F is the cumulative distribution function for the binomial distribution.
3. Find the value of pu that corresponds to fu = 0 and the value of pl that corresponds to fl = 0 using software to find the roots of a function.

The values of \(p_U\) and \(p_L\) for our example are:
     pu = 0.401029
     pl = 0.071354
Thus, a 90 % confidence interval for the proportion defective, \(p\), is (0.071, 0.401). Whether or not the interval is truly "exact" depends on the software. The calculations used in this example can be performed using both Dataplot code and R code. 
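The three steps above can also be sketched in Python using only the standard library. This is not the Handbook's Dataplot or R code: the helper names `binom_cdf` and `bisect_root` are hypothetical, and plain bisection is assumed here as the root finder in place of whatever routine a statistics package would use.

```python
from math import comb


def binom_cdf(k, p, n):
    """F(k, p, n): P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, tol=1e-10):
    """Root of f on [lo, hi] by bisection; assumes f(lo) and f(hi) differ in sign."""
    sign_lo = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == sign_lo:
            lo = mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return (lo + hi) / 2


# Step 1: initialize constants.
alpha, Nd, N = 0.10, 4, 20


# Step 2: define the functions whose roots are the limits.
def fu(pu):
    return binom_cdf(Nd, pu, N) - alpha / 2


def fl(pl):
    return binom_cdf(Nd - 1, pl, N) - (1 - alpha / 2)


# Step 3: find the roots on the interval [0, 1].
pu = bisect_root(fu)
pl = bisect_root(fl)
print(f"pl = {pl:.6f}, pu = {pu:.6f}")
```

The printed values can be compared with the pu and pl quoted above; small differences in the last digits would reflect the tolerance of the root finder rather than the method.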
Terminology Note  Previous versions of the Handbook referred to the method described here as the Agresti-Coull method. However, common practice in the statistics literature is to refer to the method given here as the Wilson method and a similar, but different, method described in Brown, Cai, and DasGupta as the Agresti-Coull method (the Agresti-Coull paper refers to this as the "adjusted Wald" method). We have modified our terminology to be consistent with common practice in the statistical literature. 