3.3.1 Modeling Heterogeneity When Combining Information
Brad Biggerstaff Statistical Engineering Division, CAML
Richard L. Tweedie Colorado State University
Researchers and policy makers are often faced with the need to quantitatively synthesize information from different, and possibly conflicting, sources when evaluating evidence. Several points of concern have been raised regarding different aspects of the statistical methods developed to allow this synthesis to be done objectively. One critical issue is whether the information being combined is in some way inhomogeneous and, if it is, how this ought to be accounted for when modeling or when making general statements of inference. Indeed, concerns with potential heterogeneity and poor success at accounting for it are major complaints about current methodology.
Efforts by most researchers to deal with homogeneity questions have focused on borrowing from classical linear models by employing random effects or variance components models with a modified interpretation of the sources of variation. Specifically, when using this approach, researchers usually begin by adopting a model of the form
$$
Y_i = \theta_i + e_i, \qquad e_i \sim N(0, \sigma_i^2),
$$
$$
\theta_i = \mu + E_i, \qquad E_i \sim N(0, \tau^2),
$$
where $Y_i$ are the observations, perhaps transformed, from the individual studies; $\sigma_i^2$ are the within-study variances; $\tau^2$ is the between-study variance; $\theta_i$ are the true means of the individual studies; $\mu$ is the overall mean and is typically the parameter of interest; and the $e_i$ and $E_i$ are all mutually independent, normally distributed error terms. This hierarchical model arises in an effort to take account of across-source differences by incorporating unsystematic yet ``true''--rather than sampling--variation, measured by $\tau^2$, in the individual study means.
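As a concrete illustration, the following minimal Python sketch simulates data from this two-stage hierarchical model. The values of $\mu$, $\tau^2$, and the within-study variances are placeholders chosen for illustration only; they are not taken from the studies discussed in the report.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (not from the report) parameter values.
mu = 0.5                                            # overall mean
tau2 = 0.04                                         # between-study variance
sigma2 = np.array([0.01, 0.02, 0.05, 0.03, 0.08])   # within-study variances

# theta_i = mu + E_i, with E_i ~ N(0, tau2): the true study-specific means.
theta = mu + rng.normal(0.0, np.sqrt(tau2), size=sigma2.size)

# Y_i = theta_i + e_i, with e_i ~ N(0, sigma2_i): the observed study estimates.
y = theta + rng.normal(0.0, np.sqrt(sigma2))

print(y)
```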
In applications, estimation of the overall mean $\mu$ is generally of interest. In the frequentist setting, it is standard to use a point estimate $\hat{\tau}^2$ for $\tau^2$ in this model to construct point and confidence interval estimates for $\mu$ using normal distribution theory. This approach ignores variability in $\hat{\tau}^2$ when estimating $\mu$, thereby potentially understating the uncertainty in interval estimates of $\mu$. A readily computed, moment-based estimate for $\tau^2$ suggested by DerSimonian and Laird (1986) has gained favor for this purpose in application (National Research Council report Combining Information, 1992). We have derived the second and third moments of this estimator, allowing us to use gamma and Pearson Type III approximating distributions for $\hat{\tau}^2$ in interval estimation of $\tau^2$. Additionally, the availability of these distributions allows us to incorporate variability in $\hat{\tau}^2$ when estimating $\mu$, thereby addressing a major drawback of current methodology. We have applied these methods (Statistics in Medicine, revision) to a meta-analysis of studies of the use of diuretics in the prevention of pre-eclampsia and to a meta-analysis of studies of the effects of environmental tobacco smoke on lung cancer. In these examples it is seen that incorporating the variability in $\hat{\tau}^2$ into the weights of the individual studies' results produces weighting that does not go as far as standard random effects methods in down-weighting the results of large studies and up-weighting those of small studies, a criticism of the standard approach. Preliminary Monte Carlo investigations suggest the proposed methodology improves on standard techniques in terms of confidence interval coverage and average confidence interval length.
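For context, the sketch below implements the standard frequentist computation described above: the DerSimonian and Laird (1986) moment estimate of $\tau^2$ is plugged into inverse-variance weights to produce a point estimate and a normal-theory confidence interval for $\mu$. This is the conventional approach whose neglect of the variability in $\hat{\tau}^2$ the work here addresses; the gamma/Pearson Type III refinement itself is not reproduced, and the study estimates and variances are placeholders, not data from the report.

```python
import numpy as np
from scipy.stats import norm

def dersimonian_laird(y, sigma2, level=0.95):
    """Standard random-effects meta-analysis using the DL moment estimator."""
    w = 1.0 / sigma2                      # fixed-effect (within-study) weights
    ybar = np.sum(w * y) / np.sum(w)      # fixed-effect pooled mean
    q = np.sum(w * (y - ybar) ** 2)       # Cochran's Q heterogeneity statistic
    k = len(y)

    # DL moment estimate of the between-study variance, truncated at zero.
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2_hat = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights and pooled estimate of the overall mean mu.
    w_star = 1.0 / (sigma2 + tau2_hat)
    mu_hat = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))

    z = norm.ppf(0.5 + level / 2.0)
    return mu_hat, tau2_hat, (mu_hat - z * se, mu_hat + z * se)

# Placeholder data: study estimates and within-study variances (illustrative only).
y = np.array([0.30, 0.55, 0.40, 0.70, 0.45])
sigma2 = np.array([0.02, 0.01, 0.05, 0.03, 0.04])
print(dersimonian_laird(y, sigma2))
```

Because the confidence interval for $\mu$ treats $\hat{\tau}^2$ as if it were known, its length reflects only the estimated variance $1/\sum w_i^*$; this is the source of the understated uncertainty discussed above.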
Date created: 7/20/2001