# AGRESTI COULL

Name:
AGRESTI COULL (LET)
Type:
Let Subcommand
Purpose:
Compute either the lower or the upper Agresti-Coull confidence limit of a one-sided or two-sided confidence interval for the binomial proportion of a variable.
Description:
The binomial proportion is defined as the number of successes divided by the number of trials.

In this context, we define success as "1" and failure as "0". Dataplot actually allows any two distinct values to be used. However, the larger value will always be considered "success" and the smaller value will always be considered "failure". If the variable contains more than two distinct values, an error is reported.
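The encoding rule above can be sketched in Python as follows. This is only an illustration of the stated convention, not Dataplot's implementation; the function name `binomial_proportion` is hypothetical, and the handling of a sample with a single distinct value (falling back to the 0/1 convention) is an assumption, since the text does not cover that case.

```python
def binomial_proportion(values):
    """Return (successes, trials, proportion) for a two-valued sample."""
    distinct = sorted(set(values))
    if not distinct:
        raise ValueError("no observations")
    if len(distinct) > 2:
        # Mirrors the documented rule: more than two distinct values is an error.
        raise ValueError("more than two distinct values")
    if len(distinct) == 2:
        success = distinct[1]  # the larger of the two values counts as success
    else:
        success = 1  # assumption: with one distinct value, use the 0/1 convention
    x = sum(1 for v in values if v == success)
    n = len(values)
    return x, n, x / n


print(binomial_proportion([0, 1, 1, 1]))  # 3 successes out of 4
print(binomial_proportion([2, 5, 5, 2]))  # 5 is treated as success
```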

The BINOMIAL PROPORTION command is used to compute a point estimate of the probability of success. Confidence intervals for the binomial proportion can be computed using a method recommended by Agresti and Coull and also by Brown, Cai and DasGupta (the methodology was originally developed by Wilson in 1927). This method solves for the two values of p0 (say, p_upper and p_lower) that result from setting z = z_{α/2} and solving for p0 = p_upper, and then setting z = -z_{α/2} and solving for p0 = p_lower. Here z is the test statistic of the normal-approximation test of the hypothesis that the proportion equals p0, and z_{α/2} denotes the value from the standard normal distribution such that the area to its right is α/2. Solving for the two values of p0 gives the following confidence limits:

$$U. L. = \frac{\hat{p} + \frac{z_{\alpha/2}^{2}}{2n} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^{2}}{4n^2}}} {1 + z_{\alpha/2}^{2}/n}$$

$$L. L. = \frac{\hat{p} + \frac{z_{\alpha/2}^{2}}{2n} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^{2}}{4n^2}}} {1 + z_{\alpha/2}^{2}/n}$$

An advantage of this method is that the limits always lie within the [0,1] interval. This is not true for the frequently used normal approximation:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
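A small numeric check makes the point about the normal approximation concrete. The sketch below evaluates that formula for a hypothetical sample of 49 successes in 50 trials; the function name `wald_limits` is made up for illustration, and `alpha` is the tail-area α of the formula (0.05 for a 95 % interval), which is not necessarily how Dataplot's own ALPHA parameter is interpreted.

```python
from math import sqrt
from statistics import NormalDist


def wald_limits(x, n, alpha=0.05):
    """Two-sided normal-approximation limits for x successes in n trials."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


lower, upper = wald_limits(49, 50)
print(lower, upper)  # the upper limit is greater than 1
```

With a proportion this close to 1, the upper limit falls outside the range of valid probabilities; the same happens to the lower limit for proportions close to 0.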

A one-sided confidence interval can also be constructed simply by replacing each z_{α/2} by z_α in the expression for the lower or upper limit, whichever is desired.
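The U.L. and L.L. expressions, together with the one-sided variant, can be sketched in Python as follows. This is an illustrative transcription of the formulas, not Dataplot's source code; the function name `wilson_limits` and the `two_sided` flag are hypothetical, and `alpha` is the tail-area α of the formulas (0.05 for a 95 % interval).

```python
from math import sqrt
from statistics import NormalDist


def wilson_limits(x, n, alpha=0.05, two_sided=True):
    """Lower and upper limits from the U.L./L.L. formulas.

    For two_sided=False each returned value is a separate one-sided
    bound, obtained by using z_alpha in place of z_{alpha/2}.
    """
    p_hat = x / n
    tail = alpha / 2 if two_sided else alpha
    z = NormalDist().inv_cdf(1 - tail)
    centre = p_hat + z * z / (2 * n)
    spread = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (centre - spread) / denom, (centre + spread) / denom


print(wilson_limits(39, 50))                   # roughly (0.6476, 0.8725)
print(wilson_limits(39, 50, two_sided=False))  # roughly (0.6707, 0.8606)
```

These values agree with the sample output at the end of this page. That agreement suggests the simulated sample there contained 39 successes in 50 trials, although this is an inference from the printed numbers and is not stated in the output itself.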

Syntax 1:
LET <par> = TWO-SIDED LOWER AGRESTI COULL <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<par> is a parameter where the computed value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This is for the raw data case, and <y> should contain a sequence of 0's and 1's. This returns the lower confidence limit for the two-sided Agresti-Coull interval.

Syntax 2:
LET <par> = TWO-SIDED UPPER AGRESTI COULL <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<par> is a parameter where the computed value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This is for the raw data case, and <y> should contain a sequence of 0's and 1's. This returns the upper confidence limit for the two-sided Agresti-Coull interval.

Syntax 3:
LET <par> = ONE-SIDED LOWER AGRESTI COULL <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<par> is a parameter where the computed value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This is for the raw data case, and <y> should contain a sequence of 0's and 1's. This returns the lower confidence limit for the one-sided lower-tailed Agresti-Coull interval.

Syntax 4:
LET <par> = ONE-SIDED UPPER AGRESTI COULL <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<par> is a parameter where the computed value is saved;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This is for the raw data case, and <y> should contain a sequence of 0's and 1's. This returns the upper confidence limit for the one-sided upper-tailed Agresti-Coull interval.

Examples:
LET A = TWO-SIDED LOWER AGRESTI COULL Y1
LET A = TWO-SIDED UPPER AGRESTI COULL Y1
LET A = ONE-SIDED LOWER AGRESTI COULL Y1
LET A = ONE-SIDED UPPER AGRESTI COULL Y1
LET A = TWO-SIDED LOWER AGRESTI COULL Y1 SUBSET TAG > 2
Note:
There are many methods that have been proposed for the confidence limits for a binomial proportion. In the statistical literature, what we refer to above as the Agresti-Coull method is now commonly referred to as the Wilson method (this method was originally described in a paper by Wilson). What the Agresti-Coull paper referred to as the adjusted Wald method is now commonly referred to as the Agresti-Coull method.

The Brown, Cai, and DasGupta paper studied the coverage properties of various methods. They specifically recommend the Wilson, the adjusted Wald, and a Bayesian method based on a Jeffreys prior as having the best coverage properties. Specifically, they recommend the Wilson and Jeffreys methods for n ≤ 40. For n > 40, the methods have comparable performance. Although they recommend the adjusted Wald in this case, this is primarily for simplicity in classroom presentation.

In any event, the March, 2014 version of Dataplot added the following command:

SET BINOMIAL METHOD <WILSON/ADJUSTED WALD/JEFFREYS>

Whenever an Agresti-Coull interval is invoked in Dataplot, this command specifies which interval will be computed.

The adjusted Wald interval is

$$\tilde{p} \pm \Phi^{-1}(1 - \alpha/2) \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{\tilde{n}}}$$

where

$$\tilde{X} = X + (\Phi^{-1}(1 - \alpha/2))^{2}/2$$ (X is the number of successes)
$$\tilde{n} = n + (\Phi^{-1}(1 - \alpha/2))^{2}$$
$$\tilde{p} = \frac{\tilde{X}} {\tilde{n}}$$
$$\Phi^{-1}$$ is the percent point function of the normal distribution

Note that the adjusted Wald interval is never shorter than the Wilson interval.
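The adjusted Wald definitions translate directly into a few lines of Python. The sketch below is an illustration under stated assumptions rather than Dataplot's implementation: the function name `adjusted_wald_limits` is hypothetical, `alpha` is the tail-area α of the formulas, and the limits are returned exactly as the formula gives them, with no truncation to [0,1].

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_limits(x, n, alpha=0.05):
    """Two-sided adjusted Wald limits for x successes in n trials."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Phi^{-1}(1 - alpha/2)
    x_tilde = x + z * z / 2
    n_tilde = n + z * z
    p_tilde = x_tilde / n_tilde
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half_width, p_tilde + half_width


print(adjusted_wald_limits(39, 50))
```

Expanding the definitions shows that the centre of this interval, X̃/ñ, equals the centre of the Wilson interval, so the two methods differ only in their widths. The widths coincide when X = n/2 and the adjusted Wald interval is wider otherwise.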

The Jeffreys interval (the derivation for this interval is given in the Brown, Cai, DasGupta paper) is

LCL = BETPPF(α/2,X + 0.5,n - X + 0.5)
UCL = BETPPF(1 - α/2,X + 0.5,n - X + 0.5)

where BETPPF is the percent point function of the beta distribution and X is the number of successes.
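For readers without access to BETPPF, the Jeffreys limits can be sketched with the Python standard library alone. In the Brown, Cai, and DasGupta derivation both limits are quantiles of a beta distribution with shape parameters X + 0.5 and n - X + 0.5. The helper names below (`beta_cdf`, `beta_ppf`, `jeffreys_limits`) are hypothetical; the beta distribution function is evaluated with the standard continued-fraction expansion and inverted by bisection, which is a simple stand-in and not a description of how Dataplot computes BETPPF.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        aa = m * (b - m) * x / ((a + m2 - 1.0) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Cumulative distribution function of the Beta(a, b) distribution."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def beta_ppf(q, a, b):
    """Percent point function of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def jeffreys_limits(x, n, alpha=0.05):
    """Two-sided Jeffreys limits for x successes in n trials."""
    a, b = x + 0.5, n - x + 0.5
    return beta_ppf(alpha / 2, a, b), beta_ppf(1 - alpha / 2, a, b)


print(jeffreys_limits(39, 50))
```

Bisection is slow compared with a dedicated quantile routine, but it needs only a monotone distribution function and is easy to check by hand.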

The default method is the Wilson interval.

Note:
To specify the significance level to use for the Agresti-Coull limits, enter the command

LET ALPHA = <value>

The default value of alpha is 0.95.

Note:
Dataplot statistics can be used in a number of commands. For details, enter

HELP STATISTICS

These various commands are actually where the AGRESTI COULL statistics are most commonly used.

Note:
In addition to the commands given here, the following command is also available:

LET AL AU = AGRESTI COULL LIMITS P N ALPHA

This command is a Math Let Subcommand rather than a Statistics LET Subcommand. The distinctions are:

1. The "Statistics" version of the command returns a single parameter value while the "Math" version of the command returns two variables.

2. The "Statistics" version of the command can be used with a number of other commands (see the Note above) while the "Math" version of the command cannot.

For example, the "Statistics" version of the command is most typically used with the FLUCTUATION PLOT, CROSS TABULATE, and STATISTIC PLOT commands.

3. The "Statistics" version of the command expects a single variable (containing a sequence of 1's and 0's). The "Math" version expects summary data (i.e., P and N). The P and N can be either constants, parameters, or variables (or even a mix of these).

Which form of the command to use is determined by the context of what you are trying to do.

For details on the "Math" version of the command, enter

HELP AGRESTI COULL CONFIDENCE LIMITS
Default:
None
Synonyms:
None
Related Commands:
AGRESTI COULL CONFIDENCE LIMITS = Compute Agresti-Coull confidence limits for binomial proportions.
EXACT BINOMIAL = Compute the "exact" confidence limits statistic for binomial proportions.
BINOMIAL PROPORTION = Compute the binomial proportion statistic.
BINOMIAL PROPORTION TEST = Perform a binomial proportions test.
CROSS TABULATE = Perform a cross-tabulation for a specified statistic.
FLUCTUATION PLOT = Generate a fluctuation plot.
STATISTIC PLOT = Generate a statistic versus subset plot.
References:
Agresti, A. and Coull, B. A. (1998), "Approximate is better than "exact" for interval estimation of binomial proportions", The American Statistician, 52(2), 119-126.

Brown, L. D., Cai, T. T., and DasGupta, A. (2001), "Interval estimation for a binomial proportion", Statistical Science, 16(2), 101-133.

Wilson, E. B. (1927), "Probable inference, the law of succession, and statistical inference", Journal of the American Statistical Association, 22, 209-212.

Applications:
Statistics
Implementation Date:
2010/3
2014/3: Support for SET BINOMIAL METHOD command
Program:

LET N = 1
LET P = 0.8
LET ALPHA = 0.95
LET Y = BINOMIAL RANDOM NUMBERS FOR I = 1 1 50
LET AL = ONE-SIDED LOWER AGRESTI COULL Y
LET AU = ONE-SIDED UPPER AGRESTI COULL Y
LET BL = TWO-SIDED LOWER AGRESTI COULL Y
LET BU = TWO-SIDED UPPER AGRESTI COULL Y
PRINT AL AU BL BU

The following output is generated.
AL      --  0.6706774E+00
AU      --  0.8605760E+00
BL      --  0.6475845E+00
BU      --  0.8724608E+00


Date created: 10/05/2010
Last updated: 10/07/2016

Please email comments on this WWW page to alan.heckert@nist.gov.