# Inferential Testing and Confidence Intervals of Average Effect Sizes

The key to making inferences regarding statistical significance about, or computing confidence intervals around, this (weighted) mean effect size is to compute a standard error of estimate. Here, I am referring to the standard error of estimating the overall, average effect size, as opposed to the standard error of effect size estimates from each individual study. The standard error of this estimate of average effect size is computed from the following equation:

$$SE_{\bar{ES}} = \sqrt{\frac{1}{\sum w_i}} \qquad \text{(Equation 8.3)}$$

The logic of this equation is that you cumulate the amount of precision across studies to estimate the precision of your estimate of the mean effect size. This logic is clear if you consider the Zr effect size (without artifact corrections), in which the standard error for each study is $1/\sqrt{N - 3}$ and the weight for each study is therefore $N - 3$. If there are many studies with large sample sizes, then the sum of $w$s (i.e., the denominator in Equation 8.3) will be large, and the standard error of estimate of the mean effect size will be small (i.e., the estimate will be precise). In contrast, if a meta-analysis includes just a few studies with small sample sizes, then the sum of $w$s is small and the standard error of the estimate of the mean effect size will be relatively large. Although the equations for standard errors of other effect sizes are not as straightforward (in that they are not as simply related to sample size), they all follow this logic.
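As a sketch of how Equation 8.3 cumulates precision across studies, the snippet below uses the $N - 3$ weighting rule for Fisher's Zr; the function name and the sample sizes are hypothetical, not from the text:

```python
import math

def se_mean_effect_size(weights):
    """Equation 8.3: the SE of the weighted mean effect size is the
    square root of 1 over the sum of the study weights."""
    return math.sqrt(1.0 / sum(weights))

# For Fisher's Zr (without artifact corrections), each study's weight
# is N - 3: the inverse of its squared standard error, 1 / (N - 3).
sample_sizes = [50, 120, 200, 80]  # hypothetical study Ns
weights = [n - 3 for n in sample_sizes]
se = se_mean_effect_size(weights)
```

Adding more studies (or larger ones) only increases the sum of weights, so the standard error of the mean can only shrink as evidence accumulates.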

After computing this standard error of the mean effect size, you can use this value to make statistical inferences and to compute confidence intervals. To evaluate statistical significance, one can perform a Wald test, which simply involves dividing a parameter estimate (i.e., the mean effect size) by its standard error:

$$Z = \frac{\bar{ES}}{SE_{\bar{ES}}} \qquad \text{(Equation 8.4)}$$

This test is evaluated according to the standard normal distribution and is sometimes called the Z test (note that this is different from Fisher's Zr transformation). The statistical significance of this test can be obtained by looking up the value of Z in any table of standard normal deviates (where, e.g., |Z| > 1.96 denotes p < .05). This test can also be modified from a test against an effect size of zero to a test against any other null-hypothesis value, $ES_0$, by changing the numerator to $\bar{ES} - ES_0$.
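A minimal sketch of the Wald test, including the generalization to a nonzero null value; the helper names are mine, and the two-tailed p-value is computed from the standard normal via the complementary error function:

```python
import math

def wald_z(mean_es, se, null_value=0.0):
    """Wald test (Equation 8.4): (mean ES - null value) / SE.
    null_value defaults to zero but can be any other ES0."""
    return (mean_es - null_value) / se

def two_tailed_p(z):
    """Two-tailed p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

For example, `two_tailed_p(1.96)` is approximately .05, matching the familiar critical value.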

The standard error of the mean effect size can also be used to compute confidence intervals. Specifically, you can compute the lower ($ES_{lb}$) and upper ($ES_{ub}$) bounds for the effect size using the following equation:

$$ES_{lb} = \bar{ES} - Z_{\alpha/2}\,SE_{\bar{ES}}, \qquad ES_{ub} = \bar{ES} + Z_{\alpha/2}\,SE_{\bar{ES}} \qquad \text{(Equation 8.5)}$$

This equation can be used to compute any level of confidence interval desired, though 95% confidence intervals (i.e., two-tailed α = .05, so Z = 1.96) are typical. If the effect size you are using is a transformed one (e.g., Zr, ln(o)), you should calculate the mean, lower-bound, and upper-bound effect sizes using these transformed values, and then back-transform each into interpretable effect size metrics (e.g., r, o).
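The compute-in-transformed-metric, then back-transform workflow can be sketched as follows for Fisher's Zr (where the back-transformation to r is the hyperbolic tangent); the function names and input values are illustrative:

```python
import math

def confidence_interval(mean_es, se, z_crit=1.96):
    """Equation 8.5: bounds are the mean effect size plus or minus
    the critical Z value times its standard error."""
    return mean_es - z_crit * se, mean_es + z_crit * se

def zr_to_r(zr):
    """Back-transform Fisher's Zr into the correlation metric."""
    return math.tanh(zr)

# Compute the interval in the transformed (Zr) metric first, then
# back-transform each bound separately for reporting.
lb_zr, ub_zr = confidence_interval(0.50, 0.10)  # hypothetical values
lb_r, ub_r = zr_to_r(lb_zr), zr_to_r(ub_zr)
```

Note that back-transforming the bounds (rather than building the interval directly in the r metric) keeps the interval valid, since the Zr metric is the one with the simple standard error.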

To illustrate these computations using the running example, I refer again to Table 8.1. I have already summed the weights (w) across the 22 studies, so I can apply Equation 8.3 to obtain the standard error of the mean effect size, $SE_{\bar{ES}} = .0118$. I can use this standard error to evaluate the statistical significance of the average effect size (Zr) using the Wald test of Equation 8.4, Z = .387/.0118 = 32.70, p < .001. I would therefore conclude that this average effect size is significantly greater than zero (i.e., there is a positive association between relational aggression and peer rejection). To create a 95% confidence interval, I would compute the lower-bound value of the effect size using Equation 8.5 as $Zr_{lb} = .363$, which would then be transformed (using Equation 5.3) for reporting to a lower-bound r = .348. Similarly, I would compute the upper-bound value, $Zr_{ub} = .410$, which is converted to an upper-bound r = .388 for reporting. To summarize, the mean correlation of this example meta-analysis is .368, which is significantly greater than zero (p < .001), and the 95% confidence interval of this correlation ranges from .348 to .388.
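The example's arithmetic can be checked with a short script. The inputs (.387 and .0118) are the rounded figures reported above, so the reproduced results agree with the text only to within rounding error:

```python
import math

mean_zr = 0.387  # weighted mean Fisher's Zr from Table 8.1 (rounded)
se = 0.0118      # SE of the mean from Equation 8.3 (rounded)

z = mean_zr / se             # Wald test (Equation 8.4)
lb_zr = mean_zr - 1.96 * se  # lower bound in the Zr metric
ub_zr = mean_zr + 1.96 * se  # upper bound in the Zr metric

# Back-transform each value to r for reporting (r = tanh(Zr))
mean_r, lb_r, ub_r = (math.tanh(v) for v in (mean_zr, lb_zr, ub_zr))
```

With these rounded inputs, Z is about 32.8 and the interval in r runs from roughly .349 to .389, matching the text's .348 to .388 to within the precision the rounded inputs allow.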

Source: Card Noel A. (2015), Applied Meta-Analysis for Social Science Research, The Guilford Press; Annotated edition.