
CONSENSUS MEAN

Name:
    CONSENSUS MEAN
Type:
    Analysis Command
Purpose:
    Compute an estimate of a consensus mean, and the associated uncertainty, based on the data from multiple laboratories or multiple methods.
Overview:
    The problem of determining a consensus mean based on data from two or more laboratories (or from two or more methods from the same laboratory) is a common one for measurement laboratories. A few specific applications of consensus means are Standard Reference Materials (SRM's), interlaboratory studies, and key comparisons.

    There are a number of approaches to this problem. The Dataplot CONSENSUS MEANS command computes estimates for a variety of methods and does not specify which is the most appropriate method for a given data set. Consult with a statistician for guidance on which method is most appropriate for your data.

Grand Mean Model:
    The simplest model is to assume that there is no lab effect (i.e., we treat the data as if it all came from the same lab).

    In this case, the consensus mean is simply the grand mean of all the data and a confidence interval for the consensus mean is simply the standard t-based confidence interval:

      xbar +/- tppf(1-alpha/2,n-1)*s/sqrt(n)

    where xbar is the overall mean, tppf is the percent point function for the t distribution, s is the standard deviation of all the points, and n is the total number of points.

    The assumption of no lab effect is unrealistic in almost all cases. However, we include the grand mean method as a reference point because it indicates how accounting for lab effects changes the estimate of the consensus mean and its uncertainty.

Mean of Means Model:
    The mean of means model was originally recommended by Churchill Eisenhart and was one of the earliest methods used for SRM's.

    For this method, we compute the mean for each of the k laboratories. Then we compute xbar and s as the mean and standard deviation of these k means. The estimate of the consensus mean is simply xbar and we compute the following confidence interval for the consensus mean:

      xbar +/- tppf(1-alpha/2,k-1)*s/sqrt(k)

    The limitations of this method are discussed in the "An ISO GUM Approach to Combining Results from Multiple Methods" paper (see the Reference section).

    For this method, the consensus mean estimate is an equally weighted mean of the lab means. The advantages of this method are that it is robust and simple to compute. The primary disadvantage is that no consideration is given to possible differences in the within-lab variation and sample sizes.
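    The grand mean and mean of means estimates can be sketched in a few lines of Python. This is an illustration of the two formulas above, not Dataplot's implementation; the function name and the data are hypothetical. Only the standard errors are computed, since the Python standard library has no t percent point function.

```python
import math
import statistics

def grand_mean_and_mean_of_means(values, lab_ids):
    """Return (estimate, standard error) pairs for the two simplest models."""
    # Grand mean model: pool every observation and ignore the lab labels.
    n = len(values)
    grand = statistics.mean(values)
    grand_se = statistics.stdev(values) / math.sqrt(n)

    # Mean of means model: average the k lab means with equal weights.
    by_lab = {}
    for value, lab in zip(values, lab_ids):
        by_lab.setdefault(lab, []).append(value)
    lab_means = [statistics.mean(obs) for obs in by_lab.values()]
    k = len(lab_means)
    mom = statistics.mean(lab_means)
    mom_se = statistics.stdev(lab_means) / math.sqrt(k)
    return (grand, grand_se), (mom, mom_se)

# Hypothetical data: two labs with unequal sample sizes.
y = [10.0, 12.0, 20.0, 22.0, 24.0]
labid = [1, 1, 2, 2, 2]
(grand, grand_se), (mom, mom_se) = grand_mean_and_mean_of_means(y, labid)
print(grand, grand_se)
print(mom, mom_se)
```

    Multiplying each standard error by the matching t percent point (n-1 degrees of freedom for the grand mean, k-1 for the mean of means) gives the half-width of the confidence interval.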

Common Mean Model:
    Assume there are k laboratories, each measuring the unknown underlying (nonrandom) value mu common to all laboratories. The measurements x(ij), i = 1, ..., k; j = 1, ..., n(i), are of the form

      x(ij) = mu + e(ij)

    with independent Gaussian errors e(ij) ~ N(0,kappa(i)^2). All parameters mu and kappa(i)^2, i = 1, ..., k, are unknown and the goal is to estimate mu, determine its standard error, and provide a confidence interval for mu.

    Unbiased estimates of the within lab means and variances sigma(i)^2 = kappa(i)^2/n(i) are:

      x(i) = xbar(i) = SUM[j=1 to n(i)][x(ij)]/n(i)

      s(i)^2 = SUM[j=1 to n(i)][(x(ij) - x(i))^2]/(n(i)*(n(i)-1))

    When the variances sigma(i)^2 are known, the best unbiased estimator of the reference value mu, in terms of mean squared error, is the weighted mean statistic

      xtilde = SUM[i=1 to k][w(i)*x(i)]/SUM[i=1 to k][w(i)]

    with w(i) = 1/sigma(i)^2. The formula for the variance is

      Var(xtilde) = E(xtilde - mu)^2 = 1/SUM[i=1 to k][w(i)]

    In practice, these within lab variances are unknown and so the true weights w(i) are also unknown.

    The Graybill-Deal method is based on this model. In the Graybill-Deal model, the estimate of the consensus mean is

      xtilde = SUM[i=1 to k][x(i)*(1/s(i)**2)]/SUM[i=1 to k][(1/s(i)**2)]

    Dataplot supports four methods for computing the variance of the Graybill-Deal consensus mean.

    1. The naive estimate of the variance is obtained by replacing the sigma(i)^2 with the sample estimates s(i)**2

        Var[xtilde] = 1/SUM[i=1 to k][1/s(i)**2]

      Although this variance is easy to compute and widely used, it is known to underestimate the true variance (and rather badly for small sample sizes).

    2. Sinha proposed the variance estimator

        Var[xtilde] = 1/SUM[i=1 to k][1/s(i)**2]*
                      [1 + 4*SUM[i=1 to k][what(i)*(1-what(i))/(n(i)-1)]]

      with

        what(i) = (1/s(i)**2)/SUM[j=1 to k][1/s(j)**2]

      Zhang performed some simulations that indicate that while this estimate reduces the bias, it still underestimates the true variance.

    3. To reduce the bias further, Zhang proposed the following estimate for the variance

        Var[xtilde] = 1/SUM[i=1 to k][((n(i)-3)/(n(i)-1))*(1/s(i)**2)]

    4. Zhang proposed the following additional estimate for the variance

        Var[xtilde] = 1/SUM[i=1 to k][((n(i)-3)/(n(i)-1))*(1/s(i)**2)]*
                      [1 + 2*SUM[i=1 to k][what(i)*(1-what(i))/(n(i)-1)]]

      where

        what(i) = ((n(i)-3)/(n(i)-1))*(1/s(i)**2)/
                  SUM[j=1 to k][((n(j)-3)/(n(j)-1))*(1/s(j)**2)]

    Dataplot currently generates confidence intervals for the Graybill-Deal method using a method proposed by Rukhin (private communication). This method generates conservative intervals.
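    The Graybill-Deal estimate and the first two variance estimates (naive and Sinha) can be sketched as follows. This Python fragment follows the formulas as written above and is not Dataplot's code; the function name and the input values are hypothetical, and var_of_means is assumed to hold s(i)**2, the estimated variance of each lab mean.

```python
def graybill_deal(lab_means, var_of_means, sample_sizes):
    """Graybill-Deal estimate with the naive and Sinha variance estimates."""
    weights = [1.0 / v for v in var_of_means]
    total = sum(weights)
    xtilde = sum(w * x for w, x in zip(weights, lab_means)) / total

    # Naive variance: plug the sample variances into 1/SUM[w(i)].
    naive_var = 1.0 / total

    # Sinha's estimate: inflate the naive variance by a factor that
    # grows when the sample sizes n(i) are small.
    what = [w / total for w in weights]
    correction = sum(wh * (1.0 - wh) / (n - 1)
                     for wh, n in zip(what, sample_sizes))
    sinha_var = naive_var * (1.0 + 4.0 * correction)
    return xtilde, naive_var, sinha_var

# Hypothetical summary data for two labs with five observations each.
gd_mean, gd_naive, gd_sinha = graybill_deal([10.0, 12.0], [1.0, 4.0], [5, 5])
print(gd_mean, gd_naive, gd_sinha)
```

    In this example the lab with the smaller variance receives 80% of the weight, which illustrates how a single precise lab can dominate the estimate.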

    The Graybill-Deal approach has the following limitations:

    1. It does not take into account between lab effects. If the between lab variance is in fact significant, the Graybill-Deal method may not be the appropriate approach.

    2. Labs with small variances may receive unjustifiably large weights and therefore dominate the estimate of the consensus mean.

One Way Random Effects Model:
    In order to account for between lab variance, we can define a one-way random effects ANOVA model which may be both unbalanced and heteroscedastic:

      x(ij) = mu + b(i) + e(ij)

    where there are i = 1, ..., k labs and j = 1, ..., n(i) observations for each lab. In this model, mu denotes the consensus mean, b(i) is the lab effect, and e(ij) is the error term. The b(i) are distributed as N(0,sigma**2) and the e(ij) are distributed as N(0,sigma(i)**2). That is, sigma(i)**2 are the within lab variances and sigma**2 is the between lab variance.

    For convenience, define the following terms:

      x(i) = mean for lab i
      s(i)**2 = variance for lab i
      tau(i)**2 = sigma(i)**2/n(i) (this was sigma(i)^2 in the common mean model)
      v(i) = n(i) - 1
      gamma(i) = sigma**2/(sigma**2 + tau(i)**2)
      t(i)**2 = s(i)**2/n(i) (= variance of the mean)

    The Mandel-Paule, modified Mandel-Paule, maximum likelihood (ML), DerSimonian-Laird, and generalized confidence interval methods are based on this model. We will discuss each of these in turn.

    1. Mandel-Paule/Modified Mandel-Paule

      The Mandel-Paule estimate of the consensus mean is defined as

        xtilde = SUM[i=1 to k][w(i)*x(i)]/SUM[i=1 to k][w(i)]

      with w(i) denoting the weight function

        w(i) = 1/(y + t(i)**2)

      where y is an estimate of the between lab variance.

      The between lab variance is estimated by iteratively solving the following equation:

        SUM[i=1 to k][(x(i)-xtilde)**2/(y + t(i)**2)] = k-1

      The modified Mandel-Paule procedure uses k on the right hand side instead of (k-1).

      The confidence interval for the consensus mean is computed as (equation 19 in the Rukhin and Vangel paper):

        xtilde +/- norppf(1-alpha/2)*
                   SQRT(SUM[i=1 to k][(x(i)-xtilde)**2/(y + t(i)**2)**2])/
                   SUM[i=1 to k][1/(y + t(i)**2)]

      The Mandel-Paule estimates can be considered an approximation to maximum likelihood estimates, but they are computationally simpler. The Mandel-Paule methods are a reasonable choice when the number of labs is greater than or equal to six. For a smaller number of labs, the uncertainty intervals are generally too small.

      Dataplot uses code provided by Mark Vangel to compute the Mandel-Paule estimates.
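      As an illustration only (this is not the Vangel code that Dataplot uses), the Mandel-Paule equation can be solved for y by bisection, because its left hand side decreases as y increases. The function name, the bisection strategy, and the tolerance below are assumptions of this Python sketch. The standard error uses the squared weights, (x(i)-xtilde)**2/(y + t(i)**2)**2, which is the form that reproduces the "Standard Deviation of Consensus Mean" value in the sample output of the Program section. The input values are the lab means and standard deviations of the mean from Table 1 of that output.

```python
import math

def mandel_paule(lab_means, var_of_means, modified=False, tol=1e-12):
    """Return (consensus mean, between lab variance y, standard error)."""
    k = len(lab_means)
    target = k if modified else k - 1  # modified Mandel-Paule uses k

    def weighted_mean(y):
        w = [1.0 / (y + t2) for t2 in var_of_means]
        return sum(wi * xi for wi, xi in zip(w, lab_means)) / sum(w)

    def lhs(y):
        xt = weighted_mean(y)
        return sum((xi - xt) ** 2 / (y + t2)
                   for xi, t2 in zip(lab_means, var_of_means))

    if lhs(0.0) <= target:
        y = 0.0  # no positive root: between lab variance is set to zero
    else:
        lo, hi = 0.0, 1.0
        while lhs(hi) > target:  # expand until the root is bracketed
            hi *= 2.0
        while hi - lo > tol * max(1.0, hi):
            mid = 0.5 * (lo + hi)
            if lhs(mid) > target:
                lo = mid
            else:
                hi = mid
        y = 0.5 * (lo + hi)

    xt = weighted_mean(y)
    w = [1.0 / (y + t2) for t2 in var_of_means]
    se = math.sqrt(sum(wi ** 2 * (xi - xt) ** 2
                       for wi, xi in zip(w, lab_means))) / sum(w)
    return xt, y, se

# Lab means and standard deviations of the mean from Table 1 of the sample output.
means = [56.7527771, 58.4249992, 56.5000000, 60.0999985, 61.1999969]
sd_of_means = [0.1238590, 0.8400150, 0.2999992, 0.1000004, 0.6000004]
t2 = [s ** 2 for s in sd_of_means]
mp_mean, mp_y, mp_se = mandel_paule(means, t2)
mmp_mean, mmp_y, mmp_se = mandel_paule(means, t2, modified=True)
# Compare with the Mandel-Paule values in the sample output of the Program section.
print(mp_mean, mp_y, mp_se)
```

      Because the modified procedure uses the larger right hand side k, it always yields a between lab variance no larger than the unmodified one.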

    2. Rukhin-Vangel Maximum Likelihood (ML)

      The Rukhin-Vangel paper gives the likelihood function. From this likelihood function, the ML estimate for the consensus mean is obtained from the equation

        xtilde = SUM[i=1 to k][gamma(i)*x(i)]/SUM[i=1 to k][gamma(i)]

      The ML estimate of the between lab variance sigma**2 is obtained from the equation

        sigma**2 = SUM[i=1 to k][gamma(i)*((x(i)-xtilde)**2 +
                   v(i)*t(i)**2/(1-gamma(i)))]/(N+k)

      Rukhin and Vangel give a ML estimate of gamma(i). This estimate is fairly complex and not repeated here. This ML estimate of gamma(i) is solved numerically using an iterative algorithm. The Mandel-Paule estimates are used as starting values for the consensus mean and between lab variance.

      The confidence interval for the ML estimate has the same form as the Mandel-Paule confidence interval. However, the t(i)**2 are replaced with tau(i)**2 in the formula and the ML estimate of the between lab variance is used.

      The mathematical details of the ML procedure are given in the Rukhin-Vangel paper. They also show why the Mandel-Paule estimates provide a good approximation to the ML estimates.

      This is the recommended method of choice when the number of labs is large (>= 6). As with Mandel-Paule, the uncertainty intervals tend to be too small when the number of labs is small (<= 5).

      Dataplot uses code provided by Mark Vangel to compute the maximum likelihood estimates.

    3. DerSimonian-Laird

      The DerSimonian-Laird procedure is as follows:

      • Compute the Graybill-Deal estimate as an initial estimate of the consensus mean (see the above description for Graybill-Deal).

      • Determine a non-negative estimate of the between lab variance from

          YDL = MAX[0,TERM1/(TERM2 - TERM3/TERM2)]

        where

          TERM1 = SUM[i=1 to k][(1/s(i)**2)*(x(i) - xtilde(GD))**2] - k + 1
          TERM2 = SUM[i=1 to k][1/s(i)**2]
          TERM3 = SUM[i=1 to k][1/s(i)**4]
          xtilde(GD) = Graybill-Deal estimate of the consensus mean

      • Estimate the DerSimonian-Laird weights and use them to compute the DerSimonian-Laird consensus mean

          w(i) = 1/(YDL + s(i)**2)

          xtilde(DL) = SUM[i=1 to k][w(i)*x(i)]/SUM[i=1 to k][w(i)]

      • Compute the variance of the DerSimonian-Laird consensus mean estimate

          Var(xtilde(DL)) = SUM[i=1 to k][w(i)**2*(x(i) - xtilde(DL))**2/(1 - w(i))]

      • Compute confidence intervals for the DerSimonian-Laird consensus mean.

        Dataplot computes two confidence limits for the DerSimonian-Laird method. The first is the standard t-based interval:

          xtilde(DL) +/- SQRT(Var(xtilde(DL)))*t(1-alpha/2,k-1)

        The second is an interval proposed by Rukhin. The Rukhin interval is conservative and should maintain the nominal coverage even when the sample sizes and variances for the labs vary widely.
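      The DerSimonian-Laird steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not Dataplot's implementation: the function name and data are hypothetical, variances is assumed to hold the s(i)**2 values used in the formulas, and the weights in the variance formula are assumed to be normalized to sum to one (otherwise the 1 - w(i) term has no natural scale).

```python
def dersimonian_laird(lab_means, variances):
    """Return (consensus mean, between lab variance YDL, variance of the mean)."""
    k = len(lab_means)

    # Step 1: Graybill-Deal estimate as the initial consensus mean.
    gd_w = [1.0 / v for v in variances]
    term2 = sum(gd_w)
    xtilde_gd = sum(w * x for w, x in zip(gd_w, lab_means)) / term2

    # Step 2: non-negative estimate of the between lab variance.
    term1 = sum(w * (x - xtilde_gd) ** 2
                for w, x in zip(gd_w, lab_means)) - k + 1
    term3 = sum(w ** 2 for w in gd_w)
    ydl = max(0.0, term1 / (term2 - term3 / term2))

    # Step 3: DerSimonian-Laird weights and consensus mean.
    w = [1.0 / (ydl + v) for v in variances]
    total = sum(w)
    xtilde_dl = sum(wi * xi for wi, xi in zip(w, lab_means)) / total

    # Step 4: variance of the consensus mean (assumes normalized weights).
    wn = [wi / total for wi in w]
    var_dl = sum(wi ** 2 * (xi - xtilde_dl) ** 2 / (1.0 - wi)
                 for wi, xi in zip(wn, lab_means))
    return xtilde_dl, ydl, var_dl

# Hypothetical data: two labs that disagree, then two labs that agree.
dl_mean, dl_y, dl_var = dersimonian_laird([10.0, 14.0], [1.0, 1.0])
_, dl_y_zero, _ = dersimonian_laird([10.0, 10.1], [1.0, 1.0])
print(dl_mean, dl_y, dl_var, dl_y_zero)
```

      The second call shows the truncation at zero: when the lab means agree more closely than their variances predict, TERM1 is negative and the between lab variance is set to 0.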

    4. Iyer and Wang Generalized Confidence Intervals

      Iyer, Wang, and Mathew have applied the generalized confidence interval approach of Weerahandi to the problem of finding confidence limits for the consensus mean.

      The description of this method is rather involved and not given here. See the Wang, Iyer, and Mathew article listed in the Reference section below. Dataplot uses code provided by Jack Wang to compute the confidence intervals for this approach.

      The primary advantage of this method is that it can be applied to cases where there are a small number of labs. It is also more robust than the Mandel-Paule and maximum likelihood methods when the normality assumptions are violated.

Some Additional Methods:
    1. BOB (Type B on Bias)

      This method is discussed in detail in the "An ISO GUM Approach to Combining Results from Multiple Methods" paper (see the Reference section).

      This method should only be applied if there are between two and five methods. It is based on the type B model of bias (which is where the name BOB comes from).

      BOB is based on the model

        gamma = mu + beta

      where gamma is the unknown value of the measurand, mu is the equally weighted mean of the population means of the methods, and beta is the possible bias of mu as an estimate of gamma. Both mu and beta require estimates and uncertainties of the estimates. The estimate of mu is the sample mean of the set of method (or lab) results.

      We assume the best estimate of beta is 0, but there is uncertainty in this estimate. A probability distribution is placed on the value of beta that best summarizes the available information. A common choice is to use a uniform distribution (the ISO GUM paper provides the rationale for this choice), and that is the distributional model Dataplot uses. That is, we assume a uniform distribution centered at zero with upper and lower bounds of +a and -a. For this uniform distribution, the standard uncertainty is a/SQRT(3). The choice for a is the difference between the minimum and maximum lab (or method) mean divided by 2. This yields (xmax-xmin)/sqrt(12) as the uncertainty for the bias term.

      Dataplot combines these to get the following uncertainty factor:

        KU = 2*SQRT(SW**2 + SB**2)

      where

        SW**2 = SUM[i=1 to k][s(i)**2]/nlab**2

      with s(i) denoting the standard deviation of the ith lab mean and nlab = k the number of labs, and

        SB**2 = (xbar(max) - xbar(min))**2/12

      Here, SW is the within lab variability and SB is the between lab variability.

      The ISO GUM paper discusses some variations of this basic technique. For example, Dataplot uses a factor of 2 for the expanded uncertainty interval. This can be replaced with a t-value where the degrees of freedom are computed using the Welch-Satterthwaite approximation.

      The BOB method is the preferred method for SRM's when the number of labs is small (<= 5).
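      The BOB computation can be sketched in Python as follows. This is an illustration of the formulas above rather than Dataplot's code; the function name is hypothetical and s(i) is taken to be the standard deviation of the ith lab mean, as stated in the text. The input values come from Table 1 of the sample output in the Program section.

```python
import math

def bob(lab_means, sd_of_means):
    """Return (consensus mean, SW**2, SB**2, KU) for the BOB method."""
    k = len(lab_means)
    mean = sum(lab_means) / k  # equally weighted mean of the lab means
    # Within lab component: variance of the average of k independent lab means.
    sw2 = sum(s ** 2 for s in sd_of_means) / k ** 2
    # Between lab component: uniform distribution on [-a, +a] with
    # a = (max - min)/2, whose variance is a**2/3 = (max - min)**2/12.
    sb2 = (max(lab_means) - min(lab_means)) ** 2 / 12.0
    ku = 2.0 * math.sqrt(sw2 + sb2)
    return mean, sw2, sb2, ku

# Lab means and standard deviations of the mean from Table 1 of the sample output.
means = [56.7527771, 58.4249992, 56.5000000, 60.0999985, 61.1999969]
sd_of_means = [0.1238590, 0.8400150, 0.2999992, 0.1000004, 0.6000004]
bob_mean, bob_sw2, bob_sb2, bob_ku = bob(means, sd_of_means)
print(bob_mean, bob_sw2, bob_sb2, bob_ku)
```

      For these data the between lab component is much larger than the within lab component, so the range of the lab means drives the uncertainty.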

    2. Schiller-Eberhardt

      A number of variants of this method have been used. Dataplot implements the method as discussed in the Schiller-Eberhardt paper (see the Reference section below).

      The Schiller-Eberhardt estimate of the consensus mean is:

        xtilde = SUM[i=1 to k][omega(i)*x(i)]

      where x(i) is the mean of the ith lab and omega(i) is the weighting function:

        omega(i) = w(i)/SUM[j=1 to k][w(j)]

      where

        w(i) = 1/(s(i)**2 + sigma(b)**2)

      Here, sigma(b)**2 is the between lab variance and s(i)**2 is the variance of the ith lab. The between lab variance is estimated as the smallest non-negative value that satisfies

        SUM[i=1 to k][w(i)*(x(i) - xtilde)**2]/(k-1) = 1

      This is solved iteratively using the Mandel-Paule algorithm. Note that Dataplot uses the between lab variance computed by the Mandel-Paule method described above.

      The uncertainty interval for Schiller-Eberhardt is defined as

        U = t(1-alpha/2,df)*SQRT(s(xtilde)**2 + sigmah**2) + bias allowance

      where s(xtilde)**2 is the variance of the consensus mean, sigmah**2 is the material variability variance, and bias allowance is defined as

        Bias allowance = MAX|x(i) - xtilde|

      The variance of the Schiller-Eberhardt consensus mean is computed as

        s(xtilde)**2 = SUM[i=1 to k][omega(i)**2*s(i)**2]

      where s(i)**2 is the variance of the ith lab mean and the omega weight function is defined as

        omega(i) = w(i)/SUM[j=1 to k][w(j)]

        w(i) = 1/s(i)**2

      Note that the weight function for s(xtilde)**2 omits the between lab variance term that is included in the weight function for the consensus mean.

      The variance of the material variability is discussed in the Schiller-Eberhardt paper. This is computed independently of the data given to the CONSENSUS MEAN command. To specify a value for sigmah**2, define the parameters SIGMAH and DFH before entering the CONSENSUS MEAN command. SIGMAH contains the value of the variance and DFH contains the corresponding degrees of freedom for the material variance.

      The degrees of freedom for the t percent point function in the uncertainty is computed as

        df(effective) = NUM**2/DEN

      where

        NUM = SUM[i=1 to k][omega(i)**2*s(i)**2] + sigmah**2
        DEN = SUM[i=1 to k][(omega(i)**2*s(i)**2)**2/(n(i)-1)] + sigmah**4/df(sigmah)

      This method has been superseded by the BOB method. As with BOB, the Schiller-Eberhardt method is intended for a small number of labs (<= 5).
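      The core Schiller-Eberhardt quantities can be sketched in Python as follows. This is a simplified illustration, not Dataplot's implementation: the function name and data are hypothetical, the between lab variance sigma_b2 is supplied by the caller (for example from a Mandel-Paule solve), and the material variability term and the effective degrees of freedom are left out.

```python
def schiller_eberhardt(lab_means, var_of_means, sigma_b2):
    """Return (consensus mean, bias allowance, variance of the consensus mean)."""
    # Weights for the consensus mean include the between lab variance.
    w = [1.0 / (s2 + sigma_b2) for s2 in var_of_means]
    omega = [wi / sum(w) for wi in w]
    xtilde = sum(o * x for o, x in zip(omega, lab_means))

    # Bias allowance: largest distance between a lab mean and the consensus mean.
    bias_allowance = max(abs(x - xtilde) for x in lab_means)

    # Weights for the variance omit the between lab variance term.
    w0 = [1.0 / s2 for s2 in var_of_means]
    omega0 = [wi / sum(w0) for wi in w0]
    s2_xtilde = sum(o ** 2 * s2 for o, s2 in zip(omega0, var_of_means))
    return xtilde, bias_allowance, s2_xtilde

# Hypothetical data: two labs with equal variances and sigma(b)**2 = 3.
se_mean, se_bias, se_s2 = schiller_eberhardt([10.0, 14.0], [1.0, 1.0], 3.0)
print(se_mean, se_bias, se_s2)
```

      The two sets of weights are kept separate on purpose, mirroring the note above that the weight function for s(xtilde)**2 omits the between lab variance term.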

Data Analytic Considerations:
    In determining the most appropriate estimate of the consensus mean, the following issues need to be addressed.

    1. What is the definition of the consensus mean? Is it a lab-independent number that represents an absolute physical truth, or is it a lab-dependent "average" across all participating labs?

    2. How many labs are there? Some methods are more appropriate for a small number of labs while others are based on asymptotic results and are thus more appropriate for a larger number of labs.

    3. Do between-lab differences (biases) exist?

    4. Are there differences in within-lab variation?

    5. Are there differences in within lab sample sizes?

    6. Does a lab have more data only because its method is cheaper, and thus potentially of poorer quality, than the methods of other labs?

    7. Are all labs treated equally?

    8. Do "star" labs exist? That is, labs that are known to be either super unbiased or super accurate.

    9. If a lab that is considered equal on engineering grounds tests out to be a statistical outlier, how (a priori) is that lab to be weighted?

    Answers to the above questions will determine how to appropriately weight the labs. The consensus mean will be a weighted mean of the lab means. The weighting can be either fixed (i.e., equal weights) or variable where the variable weights can be based on both engineering and statistical considerations.

    If the engineering decision is made to treat all labs as equal in importance, then from a statistical point of view the analysis consists primarily of the following two steps:

    1. estimation of a consensus mean;
    2. estimation of uncertainty limits for the consensus mean.

    An additional third step is to carry out formal statistical tests to identify potentially outlying labs. A statistically unsolvable problem persists here: just because a lab appears "different" does not necessarily mean that the lab is wrong (i.e., biased). It is a real possibility that all of the mutually consistent labs are well-behaved but biased, and this can only be resolved by engineering judgement.

Description of the Dataplot Input:
    Dataplot can accept data in either one of the following formats:

    1. Raw Data - there should be two columns of data. The first column contains the response values and the second column contains the corresponding lab-id. The data do not need to be sorted by lab-id.

      If your data is in a form where each lab is contained in a separate column, you can do something like the following

        LET Y LABID = STACK Y1 Y2 Y3 Y4 Y5 Y6
        CONSENSUS MEAN Y LABID

      This example takes the data for six labs stored in Y1, Y2, Y3, Y4, Y5, and Y6 and saves it in the variables Y and LABID, a format that can be used by the CONSENSUS MEAN command.

    2. Summary Data - there should be three columns of data. The first column contains the sample means for the labs, the second column contains the sample standard deviations for the labs, and the third column contains the sample sizes for the labs.
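    The two input formats carry the same information for the methods described above. As a rough illustration outside of Dataplot, the following Python sketch reduces raw data (response values plus lab-ids) to the three summary columns. The function name and the data are hypothetical, and each lab is assumed to have at least two observations so that a standard deviation exists.

```python
import statistics

def summarize_by_lab(values, lab_ids):
    """Return (lab ids, lab means, lab standard deviations, lab sample sizes)."""
    by_lab = {}
    for value, lab in zip(values, lab_ids):
        by_lab.setdefault(lab, []).append(value)
    labs = sorted(by_lab)  # the raw data need not be sorted by lab-id
    means = [statistics.mean(by_lab[lab]) for lab in labs]
    sds = [statistics.stdev(by_lab[lab]) for lab in labs]
    sizes = [len(by_lab[lab]) for lab in labs]
    return labs, means, sds, sizes

# Hypothetical raw data, deliberately not sorted by lab-id.
y = [20.0, 10.0, 22.0, 12.0, 24.0]
labid = [2, 1, 2, 1, 2]
labs, ymean, ysd, ni = summarize_by_lab(y, labid)
print(labs, ymean, ysd, ni)
```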
Description of Dataplot Output:
    Dataplot generates the following four sections of output for the consensus means analysis.

    1. The first section prints summary information about the data. Specifically, it prints the overall mean, the number of observations, and the number of labs. It also prints a table giving the sample size, mean, variance, standard deviation, and standard deviation of the mean for each lab. It then prints the pooled within lab variance (and standard deviation). The pooled within lab variance is computed as:

        VARw = SUM[i=1 to k][(n(i)-1)*s(i)**2]/SUM[i=1 to k][n(i)-1]

      with s(i)**2 denoting the variance of the ith lab.

    2. The second section prints the detailed output for each method.

    3. The third section prints a summary table containing the 95% confidence limits for each method.

    4. The fourth section prints summary tables containing the uncertainty and the percent relative uncertainty. Separate tables are printed for standard uncertainty (k = 1) and expanded uncertainty (k = 2).
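    The pooled within lab variance from the first output section is easy to check by hand. The Python sketch below applies the VARw formula to the sample sizes and variances in Table 1 of the sample output in the Program section; the function name is hypothetical and this is not Dataplot's code.

```python
def pooled_within_lab_variance(sample_sizes, variances):
    """Degrees-of-freedom weighted average of the lab variances."""
    numerator = sum((n - 1) * v for n, v in zip(sample_sizes, variances))
    denominator = sum(n - 1 for n in sample_sizes)
    return numerator / denominator

# Sample sizes and variances from Table 1 of the sample output.
ni = [36, 4, 2, 2, 2]
var_i = [0.5522779, 2.8225005, 0.1799991, 0.0200002, 0.7200009]
varw = pooled_within_lab_variance(ni, var_i)
# Compare with "Within Lab (pooled) Variance" in the data summary.
print(varw, varw ** 0.5)
```

    Because lab 1 contributes 35 of the 41 degrees of freedom, the pooled variance stays close to that lab's variance.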
Syntax 1:
    CONSENSUS MEANS <y> <tag>      <SUBSET/EXCEPT/FOR qualification>
    where <y> is a response variable;
                <tag> is a lab id variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax computes the consensus means based on the raw data.

Syntax 2:
    CONSENSUS MEANS <ymean> <ysd> <ni>
                            <SUBSET/EXCEPT/FOR qualification>
    where <ymean> is a variable containing the lab means;
                <ysd> is a variable containing the lab standard deviations;
                <ni> is a variable containing the lab sample sizes;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax computes the consensus means based on the lab means, standard deviations, and sample sizes.

Examples:
    CONSENSUS MEANS Y1 GROUP
    CONSENSUS MEANS Y1 GROUP SUBSET GROUP > 2
    CONSENSUS MEANS YMEAN YSD NI
Note:
    Variations of the above methods are commonly used. For example, the uncertainty intervals may be modified to incorporate external sources of uncertainty. For that reason, the following internal parameters are automatically saved when the CONSENSUS MEAN command is entered. You can use these parameters to compute many variations of the above methods.

      XGRAND = the overall mean.
      S2POOL = the pooled within lab variance.
      T1STDERR = the standard error for the t method using all the data.
      T2STDERR = the standard error for the t method using the lab means.
      SEMEAN = the consensus mean using the Schiller-Eberhardt method.
      SES2 = the variance of the consensus mean using the Schiller-Eberhardt method.
      BIASALLO = the bias allowance for the Schiller-Eberhardt method.
      SEDF = the degrees of freedom for the Schiller-Eberhardt method.
      MPMEAN = the consensus mean using the Mandel-Paule method.
      MPS2 = the between lab variance using the Mandel-Paule method.
      SEMP = the standard error of the Mandel-Paule method.
      MMPMEAN = the consensus mean using the modified Mandel-Paule method.
      MMPS2 = the between lab variance using the modified Mandel-Paule method.
      SEMMP = the standard error of the modified Mandel-Paule method.
      MLMEAN = the consensus mean using the Rukhin-Vangel maximum likelihood method.
      MLS2 = the between lab variance using the Rukhin-Vangel maximum likelihood method.
      SEML = the standard error of the Rukhin-Vangel maximum likelihood method.
      BOBMEAN = the consensus mean for the BOB method.
      BOBS2 = the between lab variance for the BOB method.
      BOBS2W = the within lab variance for the BOB method.
      BOBKU = the uncertainty value for the BOB method.
      GDMEAN = the consensus mean using the Graybill-Deal method.
      GDS2 = the variance of the Graybill-Deal consensus mean.
      GCIMEAN = the consensus mean using the generalized confidence interval approach.
      DERSMEAN = the consensus mean using the DerSimonian-Laird approach.
      DERSVARI = the variance of the DerSimonian-Laird consensus mean.
      DERSSE = the standard error of the DerSimonian-Laird consensus mean.
Note:
    To allow additional analysis, a number of results are written to external files.

    The following variables are written to the file dpst1f.dat. These are the statistics for the labs.

    1. Lab ID
    2. Number of Observations for Lab
    3. Mean for the Lab
    4. Variance for the Lab
    5. Standard Deviation for the Lab
    6. Standard Deviation of Mean of the Lab

    The following variables are written to the file dpst2f.dat. This is the information contained in table 2 of the CONSENSUS MEAN output. These variables can be used to make plots of the consensus mean results.

    1. Consensus mean for the method
    2. Lower 95% confidence limit for the method
    3. Upper 95% confidence limit for the method
    4. Method id

    The following variables are written to the file dpst3f.dat. This is the information contained in table 3 of the CONSENSUS MEAN output. These variables can be used to generate plots of the consensus mean results.

    1. Consensus mean for the method
    2. Standard uncertainty (k = 1) for the method
    3. Percentage relative standard uncertainty
    4. Method id

    The following variables are written to the file dpst4f.dat. This is the information contained in table 4 of the CONSENSUS MEAN output. These variables can be used to generate plots of the consensus mean results.

    1. Consensus mean for the method
    2. Expanded uncertainty (k = 2) for the method
    3. Percentage relative expanded uncertainty
    4. Method id

    The following variables are written to the file dpst5f.dat.

      1. Simulated consensus means from the generalized confidence interval approach
Note:
    The Mandel-Paule and maximum likelihood estimates are generated using code provided by Mark Vangel of the NIST Statistical Engineering Division. The BOB estimates were adapted from a Dataplot macro provided by Stefan Leigh of the NIST Statistical Engineering Division.
Note:
    By default, the 4 tables are generated using an F15.7 format. This uses a column width of 15 with 7 digits to the right of the decimal point. You can specify the number of digits to the right of the decimal point with the command

      SET WRITE DECIMALS <value>

    If you want to use an exponential format (E15.7), enter

      SET WRITE DECIMALS 99
Note:
    You can optionally generate the CONSENSUS MEANS output in HTML, Latex, or Rich Text Format (RTF). Enter

      HELP CAPTURE HTML
      HELP CAPTURE LATEX

    for details.

Default:
    None
Synonyms:
    None
Related Commands:
    MEAN PLOT = Generate a mean plot.
    SD PLOT = Generate a standard deviation plot.
    YOUDEN PLOT = Generate a Youden plot.
    ANOVA = Perform an analysis of variance.
References:
    DerSimonian and Laird (1986), "Meta-analysis in Clinical Trials", Controlled Clinical Trials, 7, pp. 177-188.

    Graybill and Deal (1959), "Combining Unbiased Estimators", Biometrics, 15, pp. 543-550.

    M. S. Levenson, D. L. Banks, K. R. Eberhardt, L. M. Gill, W. F. Guthrie, H. K. Liu, M. G. Vangel, J. H. Yen, and N. F. Zhang (2000), "An ISO GUM Approach to Combining Results from Multiple Methods", Journal of Research of the National Institute of Standards and Technology, Volume 105, Number 4.

    John Mandel and Robert Paule (1970), "Interlaboratory Evaluation of a Material with Unequal Number of Replicates", Analytical Chemistry, 42, pp. 1194-1197.

    Robert Paule and John Mandel (1982), "Consensus Values and Weighting Factors", Journal of Research of the National Bureau of Standards, 87, pp. 377-385.

    Andrew Rukhin (2003), "Two Procedures of Meta-analysis in Clinical Trials and Interlaboratory Studies", Tatra Mountains Mathematical Publications, 26, pp. 155-168.

    Andrew Rukhin and Mark Vangel (1998), "Estimation of a Common Mean and Weighted Means Statistics", Journal of the American Statistical Association, Vol. 93, No. 441.

    Andrew Rukhin, B. Biggerstaff, and Mark Vangel (2000), "Restricted Maximum Likelihood Estimation of a Common Mean and Mandel-Paule Algorithm", Journal of Statistical Planning and Inference, 83, pp. 319-330.

    Susannah Schiller and Keith Eberhardt (1991), "Combining Data from Independent Analysis Methods", Spectrochimica Acta, 46 (12).

    Susannah Schiller (1996), "Standard Reference Materials: Statistical Aspects of the Certification of Chemical SRMs", NIST SP 260-125, NIST, Gaithersburg, MD.

    Bimal Kumar Sinha (1985), "Unbiased Estimation of the Variance of the Graybill-Deal Estimator of the Common Mean of Several Normal Populations", The Canadian Journal of Statistics, Vol. 13, No. 3, pp. 243-247.

    Mark Vangel and Andrew Rukhin (1999), "Maximum Likelihood Analysis for Heteroscedastic One-Way Random Effects ANOVA in Interlaboratory Studies", Biometrics, 55, pp. 129-136.

    Nien-Fan Zhang (2006), "The Uncertainty Associated with The Weighted Mean of Measurement Data", Metrologia, 43, pp. 195-204.

Applications:
    Interlaboratory Studies
Implementation Date:
    2000/10
    2002/10: Support for Latex and HTML output
    2006/3: Reformat output for consistency and clarity
                  Add Tables 3 and 4 to the output
                  Updated the Graybill-Deal method
                  Added the DerSimonian-Laird method
                  Added the generalized confidence intervals method
                  Added support for Rich Text Format (RTF) output
                  Added support for SET WRITE DECIMALS
Program:
     
    SKIP 25
    READ STUTZ86.DAT ALITE JUNK2 JUNK3 JUNK4 JUNK5 LABID
    .
    CONSENSUS MEANS ALITE LABID
        
    The following output is generated:
                       Consensus Means Analysis
                           (Full Sample Case)
      
     Data Summary:
     Response Variable:                      ALITE
     Lab-ID Variable:                        LABID
     Total Number of Observations:                 46
     Grand Mean:                           57.2260857
     Grand Standard Deviation:              1.4274194
     Total Number of Labs:                          5
     Minimum Lab Mean:                     56.5000000
     Maximum Lab Mean:                     61.1999969
     Minimum Lab SD:                        0.1414219
     Maximum Lab SD:                        1.6800299
     Within Lab (pooled) SD:                0.8369111
     Within Lab (pooled) Variance:          0.7004202
      
     Table 1: Summary Statistics by Lab
                                                                         Standard
          Lab                                            Standard       Deviation
           ID    n(i)       Mean          Variance      Deviation         of Mean
     ----------------------------------------------------------------------------
            1      36     56.7527771      0.5522779      0.7431540      0.1238590
            2       4     58.4249992      2.8225005      1.6800299      0.8400150
            3       2     56.5000000      0.1799991      0.4242630      0.2999992
            4       2     60.0999985      0.0200002      0.1414219      0.1000004
            5       2     61.1999969      0.7200009      0.8485287      0.6000004
     ----------------------------------------------------------------------------
      
      
     1.  Method: Mandel-Paule
         Estimate of (unscaled) Consensus Mean:       58.5663223
         Estimate of (scaled) Consensus Mean:          0.4396437
         Between Lab Variance (unscaled):              4.0465660
         Between Lab SD (unscaled):                    2.0116079
         Between Lab Variance (scaled):                0.1831857
         Standard Deviation of Consensus Mean:         0.8317266
         Standard Uncertainty (k = 1):                 0.8317266
         Expanded Uncertainty (k = 2):                 1.6634532
         Expanded Uncertainty (k =  1.9599645):        1.6301546
         Normal PPF of 0.975:                          1.9599645
         Lower 95% (normal) Confidence Limit:         56.9361687
         Upper 95% (normal) Confidence Limit:         60.1964760
         Note: Mandel-Paule Best Usage:
               6 or More Labs
      
      
     2.  Method: Modified Mandel-Paule
         Estimate of (unscaled) Consensus Mean:       58.5590630
         Estimate of (scaled) Consensus Mean:          0.4380985
         Between Lab Variance (unscaled):              3.2046051
         Between Lab SD (unscaled):                    1.7901411
         Between Lab Variance (scaled):                0.1450706
         Standard Deviation of Consensus Mean:         0.8338748
         Standard Uncertainty (k = 1):                 0.8338748
         Expanded Uncertainty (k = 2):                 1.6677495
         Expanded Uncertainty (k =  1.9599645):        1.6343650
         Normal PPF of 0.975:                          1.9599645
         Lower 95% (normal) Confidence Limit:         56.9246979
         Upper 95% (normal) Confidence Limit:         60.1934280
         Note: Modified Mandel-Paule Best Usage:
               6 or More Labs
      
     3.  Method: Vangel-Rukhin Maximum Likelihood
         Estimate of (unscaled) Consensus Mean:       58.5534592
         Estimate of (scaled) Consensus Mean:          0.4369068
         Between Lab Variance (unscaled):              3.2312329
         Between Lab SD (unscaled):                    1.7975631
         Between Lab Variance (scaled):                0.1462760
         Standard Deviation of Consensus Mean:         0.8306379
         Standard Uncertainty (k = 1):                 0.8306379
         Expanded Uncertainty (k = 2):                 1.6612757
         Expanded Uncertainty (k =  1.9599645):        1.6280208
         Normal PPF of 0.975:                          1.9599645
         Lower 95% (normal) Confidence Limit:         56.9254379
         Upper 95% (normal) Confidence Limit:         60.1814804
         Note: Vangel-Rukhin Maximum Likelihood Best Usage:
               6 or More Labs
      
     4.  Method: BOB (Bound on Bias)
         Estimate of Consensus Mean:                  58.5955544
         Within Lab Uncertainty:                       0.2173445
         Between Lab Uncertainty:                      1.3567723
         Standard Uncertainty (k = 1):                 1.3740704
         Expanded Uncertainty (k = 2):                 2.7481408
         Lower 95% (k = 2) Confidence Limit:          55.8474121
         Upper 95% (k = 2) Confidence Limit:          61.3436966
         Note: BOB Best Usage:
               5 or Fewer Labs
      
     5.  Method: Schiller-Eberhardt
         Estimate of Consensus Mean:                  58.5908279
         Estimate of Variance of Mean:                 0.0169179
         Bias Allowance:                               2.6091690
         Sigmah (heterogeneity):                       0.0000000
         Degrees of Freedom for Sigmah:                 1
         Standard Uncertainty (k = 1):                 2.7392378
         Expanded Uncertainty (k = 2):                 2.8693065
         Expanded Uncertainty (k =  2.3645761):        2.9167265
         Degrees of Freedom:                            7
         t Percent Point Value (alpha = 0.05):         2.3645761
         Lower 95% Confidence Limit:                  55.6741028
         Upper 95% Confidence Limit:                  61.5075531
         Note: Schiller-Eberhardt Best Usage:
               5 or Fewer Labs
      
     6.  Method: Mean of Means
         Mean of Lab Means:                           58.5955544
         Standard Deviation of Lab Means:              2.0532134
         Standard Uncertainty (sd/sqrt(n)):            0.9182249
         SD of Consensus Mean (sd/sqrt(n)):            0.9182249
         Standard Uncertainty (k = 1):                 0.9182249
         Expanded Uncertainty (k = 2):                 1.8364499
         Expanded Uncertainty (k =  2.7764461):        2.5494020
         Degrees of Freedom:                            4
         t Percent Point Value (alpha = 0.05):         2.7764461
         Lower 95% (t-value) Confidence Limit:        56.0461540
         Upper 95% (t-value) Confidence Limit:        61.1449547
         Note: Mean of Means Best Usage:
               Any Number of Labs
      
     7.  Method: Graybill-Deal
         Estimate of Consensus Mean:                  58.6732941
         Estimate of Variance (Sinha):                 0.0128360
         Estimate of Variance (Naive):                 0.0055405
         Standard Uncertainty (Sinha) (k = 1):         0.1132961
         Expanded Uncertainty (Sinha) (k = 2):         0.2265923
         Lower 95% (Rukhin) Confidence Limit:         56.2782021
         Upper 95% (Rukhin) Confidence Limit:         61.0683861
         Note: Graybill-Deal Best Usage:
               Any Number of Labs, but no Between Lab Variance
      
     8.  Method: Grand Mean (No Lab Effect)
         Mean of All Data:                            57.2260857
         Standard Deviation of All Data:               2.0532134
         SD of Consensus Mean (sd/sqrt(n)):            0.3027298
         Standard Uncertainty (k = 1):                 0.3027298
         Expanded Uncertainty (k = 2):                 0.6054596
         Expanded Uncertainty (k =  2.0141039):        0.6097292
         Degrees of Freedom:                           45
         t Percent Point Value (alpha = 0.05):         2.0141039
         Lower 95% (t-value) Confidence Limit:        56.6163559
         Upper 95% (t-value) Confidence Limit:        57.8358154
         Note: Grand Mean Best Usage:
               Any Number of Labs, but no Lab-to-Lab Differences
      
     9.  Method: Generalized Confidence Intervals
         Estimate of Consensus Mean:                   58.4525528
         Standard Uncertainty (k = 1):                  1.2792627
         Expanded Uncertainty (k = 2):                  2.5585253
         Lower 95% (Simulation) Confidence Limit:      55.9661942
         Upper 95% (Simulation) Confidence Limit:      61.0020752
         Note: Generalized Confidence Interval Best Usage:
               Any Number of Labs, but no Between Lab Variance
      
     10. Method: DerSimonian Laird
         Estimate of Consensus Mean:                  58.5719872
         Estimate of Variance of Consensus Mean:       0.8636000
         Estimate of Between-Lab Variance:             5.0619205
         Standard Uncertainty (k = 1):                 0.9293008
         Expanded Uncertainty (k = 2):                 1.8586016
         Degrees of Freedom:                            4
         t Percent Point Value:                        2.7764461
         Lower 95% (t-value) Confidence Limit:        55.9918327
         Upper 95% (t-value) Confidence Limit:        61.1521416
         Lower 95% (Rukhin) Confidence Limit:         56.0077209
         Upper 95% (Rukhin) Confidence Limit:         61.1362534
         Note: DerSimonian-Laird Best Usage:
               Any Number of Labs
      
      
     Table 2:  95% Confidence Limits
                                    Consensus            Lower            Upper
     Method                              Mean            Limit            Limit
     --------------------------------------------------------------------------
      1. Mandel-Paule              58.5663223       56.9361677       60.1964770
      2. Modified Mandel-Paule     58.5590630       56.9246980       60.1934279
      3. Vangel-Rukhin ML          58.5534592       56.9254384       60.1814799
      4. BOB                       58.5955544       55.8474121       61.3436966
      5. Schiller-Eberhardt        58.5908279       55.6741015       61.5075544
      6. Mean of Means             58.5955544       56.0461540       61.1449547
      7. Graybill-Deal             58.6732941       56.2782006       61.0683875
      8. Grand Mean                57.2260857       56.6163559       57.8358154
      9. Generalized CI            58.4525528       55.9661958       61.0020750
     10. DerSimonian-Laird (t)     58.5719872       55.9918327       61.1521416
         (Rukhin)                  58.5719872       56.0077202       61.1362541
     --------------------------------------------------------------------------
      
      
     Table 3:  Standard Uncertainties (k = 1)
      
                                                      Standard        Relative
                                    Consensus      Uncertainty        Standard
     Method                              Mean          (k = 1)  Uncertainty (%)
     --------------------------------------------------------------------------
      1. Mandel-Paule              58.5663223        0.8317266        1.4201448
      2. Modified Mandel-Paule     58.5590630        0.8338748        1.4239892
      3. Vangel-Rukhin ML          58.5534592        0.8306379        1.4185975
      4. BOB                       58.5955544        1.3740704        2.3450079
      5. Schiller-Eberhardt        58.5908279        2.7392378        4.6751986
      6. Mean of Means             58.5955544        0.9182249        1.5670557
      7. Graybill-Deal             58.6732941        0.1132961        0.1930966
      8. Grand Mean                57.2260857        0.3027298        0.5290066
      9. Generalized CI            58.4525528        1.2792627        2.1885488
     10. DerSimonian-Laird         58.5719872        0.9293008        1.5865959
     --------------------------------------------------------------------------
      
      
     Table 4:  Expanded Uncertainties (k = 2)
      
                                                      Expanded        Relative
                                    Consensus      Uncertainty        Expanded
     Method                              Mean          (k = 2)  Uncertainty (%)
     --------------------------------------------------------------------------
      1. Mandel-Paule              58.5663223        1.6634532        2.8402896
      2. Modified Mandel-Paule     58.5590630        1.6677495        2.8479784
      3. Vangel-Rukhin ML          58.5534592        1.6612757        2.8371949
      4. BOB                       58.5955544        2.7481408        4.6900158
      5. Schiller-Eberhardt        58.5908279        2.8693066        4.8971944
      6. Mean of Means             58.5955544        1.8364499        3.1341114
      7. Graybill-Deal             58.6732941        0.2265923        0.3861932
      8. Grand Mean                57.2260857        0.6054596        1.0580132
      9. Generalized CI            58.4525528        2.5585253        4.3770976
     10. DerSimonian-Laird         58.5719872        1.8586016        3.1731918
     --------------------------------------------------------------------------
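
The Mean of Means entries in the sample output can be checked by hand from the lab means in Table 1. The Python sketch below (not Dataplot code) does so, assuming the method is the ordinary t-based interval applied to the five lab means. The names lab_means and T_975_DF4 are illustrative, and the t percent point is copied from the output rather than computed.

```python
import math
import statistics

# Lab means copied from Table 1 of the sample output.
lab_means = [56.7527771, 58.4249992, 56.5000000, 60.0999985, 61.1999969]

# t percent point (alpha = 0.05, 4 degrees of freedom), copied from the
# output listing; it is an input here, not something this sketch computes.
T_975_DF4 = 2.7764461

k = len(lab_means)
mean_of_means = statistics.mean(lab_means)
sd_of_means = statistics.stdev(lab_means)   # sample SD, divisor k - 1
std_unc = sd_of_means / math.sqrt(k)        # standard uncertainty (k = 1)
expanded = T_975_DF4 * std_unc              # expanded uncertainty (t-based)
lower = mean_of_means - expanded
upper = mean_of_means + expanded

print(f"Mean of lab means:       {mean_of_means:.7f}")
print(f"SD of lab means:         {sd_of_means:.7f}")
print(f"Standard uncertainty:    {std_unc:.7f}")
print(f"Lower 95% (t) limit:     {lower:.7f}")
print(f"Upper 95% (t) limit:     {upper:.7f}")
```

Because the inputs are the rounded values printed in Table 1, the results should agree with the Mean of Means block only to about five or six decimal places.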
        
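The Mandel-Paule consensus mean and between-lab variance in the sample output can be approximated from the Table 1 summaries alone. The Python sketch below (not Dataplot code) assumes the usual Mandel-Paule formulation: weights 1/(tau2 + s(i)^2/n(i)), with tau2 chosen so that the weighted sum of squared deviations equals the number of labs minus one. The function names, the bisection solver, and the upper bracket of 100 are assumptions made for illustration. Dataplot's own iteration may differ, and the standard deviation of the consensus mean is not reproduced here.

```python
# Lab summaries copied from Table 1: (n_i, mean_i, variance_i).
labs = [
    (36, 56.7527771, 0.5522779),
    (4, 58.4249992, 2.8225005),
    (2, 56.5000000, 0.1799991),
    (2, 60.0999985, 0.0200002),
    (2, 61.1999969, 0.7200009),
]


def weighted_mean(tau2):
    """Return the weighted mean and weights for a given between-lab variance."""
    weights = [1.0 / (tau2 + var / n) for n, _, var in labs]
    total = sum(w * mean for w, (_, mean, _) in zip(weights, labs))
    return total / sum(weights), weights


def mp_excess(tau2):
    """Weighted sum of squared deviations minus (number of labs - 1)."""
    mu, weights = weighted_mean(tau2)
    ss = sum(w * (mean - mu) ** 2 for w, (_, mean, _) in zip(weights, labs))
    return ss - (len(labs) - 1)


def mandel_paule(upper_bracket=100.0, iterations=200):
    """Solve mp_excess(tau2) = 0 by bisection; return 0 if no positive root."""
    if mp_excess(0.0) <= 0.0:
        return 0.0
    lo, hi = 0.0, upper_bracket   # assumed bracket; mp_excess(hi) must be < 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mp_excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


tau2 = mandel_paule()
consensus, _ = weighted_mean(tau2)
print(f"Between-lab variance: {tau2:.5f}")
print(f"Consensus mean:       {consensus:.5f}")
```

Bisection is used only because it is simple and the excess decreases as tau2 grows; with the rounded Table 1 inputs, the two printed values should land close to the unscaled Mandel-Paule entries in the output.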

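The Graybill-Deal estimate and its naive variance can likewise be checked from Table 1. The Python sketch below (not Dataplot code) assumes the standard Graybill-Deal form, a weighted mean of the lab means with weights n(i)/s(i)^2, and takes the naive variance to be the reciprocal of the sum of the weights. The Sinha variance and the Rukhin confidence limits shown in the output are not reproduced.

```python
# Lab summaries copied from Table 1: (n_i, mean_i, variance_i).
labs = [
    (36, 56.7527771, 0.5522779),
    (4, 58.4249992, 2.8225005),
    (2, 56.5000000, 0.1799991),
    (2, 60.0999985, 0.0200002),
    (2, 61.1999969, 0.7200009),
]

# Weight for each lab: reciprocal of the variance of its mean, n_i / s_i^2.
weights = [n / var for n, _, var in labs]

# Graybill-Deal estimate: weighted mean of the lab means.
gd_estimate = sum(w * mean for w, (_, mean, _) in zip(weights, labs)) / sum(weights)

# Naive variance: treats the weights as known constants.
naive_variance = 1.0 / sum(weights)

print(f"Graybill-Deal estimate: {gd_estimate:.5f}")
print(f"Naive variance:         {naive_variance:.7f}")
```

Lab 4 has by far the smallest variance of the mean, so it receives the largest weight, which helps explain why this estimate sits above the other consensus means in Table 2.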

NIST is an agency of the U.S. Commerce Department.

Date created: 6/5/2001
Last updated: 10/13/2015

Please email comments on this WWW page to alan.heckert@nist.gov.