There is some controversy about correcting effect sizes used in meta-analyses for methodological artifacts. In this section I describe arguments for and against correction, and then attempt to reconcile these two positions.
1. Arguments for Artifact Correction
Probably the most consistent advocates of correcting for study artifacts are John Hunter (now deceased) and Frank Schmidt (see Hunter & Schmidt, 2004; Schmidt & Hunter, 1996; as well as, e.g., Rubin, 1990). Their argument, in simplified form, is that individual primary studies report effect sizes among imperfect measures of constructs, not the constructs themselves. These imperfections in the measurement of constructs can stem from a variety of sources, including unreliability of the measures, imperfect validity of the measures, or imperfect ways in which the variables were handled in primary studies (e.g., artificial dichotomization). Moreover, individual studies contain not only random sampling error (due to their finite sample sizes) but often also biased samples that do not represent the population about which you wish to draw conclusions.
These imperfections of measurement and sampling are inherent to every primary study and provide a limiting frame within which you must interpret the findings. For instance, a particular study does not provide a perfect effect size of the association between X and Y, but rather an effect size of the association between a particular measure of X and a particular measure of Y within the particular sample of that study. The heart of the argument for artifact correction is that we are less interested in these imperfect effect sizes found in primary studies and more interested in the effect sizes between latent constructs (e.g., the correlation between construct X and construct Y).
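To make this concrete, here is a minimal numerical sketch (my own illustration with hypothetical reliability values, not an example from Card's text) of Spearman's classic correction for attenuation, which expresses how unreliable measures shrink the observed correlation below the construct-level correlation that artifact correction aims to recover:

```python
import math

def disattenuate(r_observed, rel_x, rel_y):
    """Spearman's correction for attenuation: estimate the construct-level
    correlation from an observed correlation and the reliabilities of the
    two measures (r_construct = r_observed / sqrt(rel_x * rel_y))."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Hypothetical values: a construct-level correlation of .50 measured with
# reliabilities of .80 and .70 yields an observed correlation of about .37.
r_construct = 0.50
rel_x, rel_y = 0.80, 0.70
r_observed = r_construct * math.sqrt(rel_x * rel_y)      # ~0.374
print(round(r_observed, 3))                               # 0.374
print(round(disattenuate(r_observed, rel_x, rel_y), 3))   # 0.5
```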
This argument seems reasonable and in fact provides much of the impetus for the rise of latent variable techniques such as confirmatory factor analysis (e.g., Brown, 2006) and structural equation modeling (e.g., Kline, 2005) in primary research. The theories we wish to evaluate are almost exclusively about associations among constructs (e.g., aggression and rejection) rather than about associations among measures (e.g., a particular self-report scale of aggression and a particular peer-report method of measuring rejection). As such, it makes sense that we would wish to draw conclusions from our meta-analyses about associations among constructs rather than associations among the imperfect measures of these constructs reported in primary studies; thus, we should correct for artifacts within these studies in our meta-analyses.
A corollary to this focus on associations among constructs (rather than imperfect measures) is that artifact correction makes the variability among studies more likely to reflect substantively interesting differences rather than methodological ones. For example, studies may differ on a variety of features, some of which are substantively interesting (e.g., characteristics of the sample such as age or income, type of intervention evaluated) and others less so (e.g., the use of a reliable versus unreliable measure of a variable). Correcting for these study artifacts (e.g., unreliability of measures) reduces the variability due to these less interesting differences (i.e., noise), allowing moderator analyses (Chapter 9) to illuminate the substantively interesting differences between studies more clearly.
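The following sketch (hypothetical numbers of my own, not drawn from the book) illustrates this corollary: when a set of studies shares the same construct-level correlation but uses measures of differing reliability, the observed effect sizes vary, whereas the corrected effect sizes do not, so any heterogeneity remaining after correction is more plausibly substantive.

```python
import math
import statistics

rho = 0.40  # assumed common construct-level correlation across studies
reliabilities = [(0.90, 0.90), (0.80, 0.70), (0.70, 0.60), (0.60, 0.60), (0.95, 0.50)]

# Observed correlations are attenuated by each study's measurement unreliability.
observed = [rho * math.sqrt(rx * ry) for rx, ry in reliabilities]
# Correcting each study's correlation removes that artifactual variability.
corrected = [r / math.sqrt(rx * ry) for r, (rx, ry) in zip(observed, reliabilities)]

print([round(r, 3) for r in observed], round(statistics.pvariance(observed), 4))
print([round(r, 3) for r in corrected], round(statistics.pvariance(corrected), 4))
# The corrected values are all .40, so the between-study variance drops to zero;
# in real data, the variance remaining after correction is what moderator
# analyses would try to explain.
```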
2. Arguments against Artifact Correction
Despite the apparent logic supporting artifact correction in meta-analysis, there are some who argue against these corrections. Early descriptions of meta-analysis framed the goal of these efforts as integrating the findings of individual studies (e.g., Glass, 1976); in other words, synthesizing the results as reported in primary studies. Although one might argue that these early descriptions simply failed to appreciate the difference between associations among measures and associations among constructs (although this seems unlikely given Glass's expertise in measurement and factor analysis), some modern meta-analysts have continued to oppose artifact adjustment even after the arguments put forth by Hunter and Schmidt. Perhaps most pointedly, Rosenthal (1991) argues that the goal of meta-analysis “is to teach us better what is, not what might some day be in the best of all possible worlds” (p. 25, italics in original). Rosenthal (1991) also cautions that these corrections can yield inaccurate effect sizes, such as when corrections for unreliability yield correlations greater than 1.0.
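A brief numerical illustration of Rosenthal's caution (hypothetical values of my own), using the same attenuation correction sketched above: a moderately large observed correlation combined with low reliability estimates yields a "corrected" correlation above 1.0, an inadmissible value.

```python
import math

# Hypothetical study: sampling error inflates the observed correlation while the
# reliability estimates are low, so the correction overshoots the logical bound.
r_observed = 0.60
rel_x, rel_y = 0.70, 0.50
r_corrected = r_observed / math.sqrt(rel_x * rel_y)
print(round(r_corrected, 3))  # ~1.014, greater than 1.0 and thus impossible
```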
Another, though far weaker, argument against artifact correction is simply that such corrections add another level of complexity to our meta-analytic procedures. I agree that there is little value in making these procedures more complex than is necessary to best answer the substantive questions of the meta-analysis. Furthermore, additional data-analytic complexity often requires lengthier explanation when reporting meta-analyses, and our focus in most of these reports is typically to explain information relevant to our content-based questions rather than data-analytic procedures. At the same time, simplicity alone is not a good guide to our data-analytic techniques. The more important question is whether the cost of additional data-analytic complexity is offset by the improved value of the results yielded.
3. Reconciling Arguments Regarding Artifact Correction
Many of the critical issues surrounding the controversy over artifact correction can be summarized in terms of whether meta-analysts prefer to describe associations among constructs (the position of those favoring correction) or associations as found among variables in the research (the position of those opposing correction). In most cases, our questions involve associations among latent constructs more than associations among imperfectly measured variables. Even when questions involve measurement (e.g., are associations between X and Y stronger when X is measured in certain ways than when X is measured in other ways?), it seems likely that one would wish to base the answer on differences in construct-level associations between the two measurement approaches rather than on the magnitudes of imperfection typical of those approaches. Put bluntly, Hunter and Schmidt (2004) argue that attempting to meta-analytically draw conclusions about constructs without correcting for artifacts “is the mathematical equivalent of the ostrich with its head in the sand: It is a pretense that if we ignore other artifacts then their effects on study outcomes will go away” (p. 81). Thus, if you wish to draw conclusions about constructs, which is usually the case, correcting for study artifacts is generally valuable.
At the same time, one must consider the likely impact of artifacts on the results. If one is meta-analyzing a body of research that consistently uses reliable and valid measures within representative samples, then the benefits of artifact adjustment are likely small. In these cases, the additional complexity of artifact adjustment is likely not warranted. To adapt Rosenthal’s (1991) argument quoted earlier, if what is matches closely with what could be, then there is little value in correcting for study artifacts.
In sum, although I do not believe that all, or even any, artifact adjustments are necessary in every meta-analysis, I do believe it is valuable to always consider each of the artifacts that could bias effect sizes. In meta-analyses in which these artifacts are likely to have a substantial impact on at least some of the included primary studies, it is valuable to at least explore some of the following corrections.
Source: Card, N. A. (2015). Applied Meta-Analysis for Social Science Research. The Guilford Press, annotated edition.