Meta‐Regression Models and Observational Research

Published date: 01 October 2017
©2017 The Department of Economics, University of Oxford and John Wiley & Sons Ltd.
doi: 10.1111/obes.12172
Stephan B. Bruns
Department of Economics, University of Göttingen, Humboldtallee 3, 37073 Göttingen,
Germany (e-mail:
Meta-regression models were originally developed for the synthesis of experimental
research where randomization ensures unbiased and consistent estimation of the effect of
interest. Most economics research is, however, observational and specification searches
may often result in estimates that are biased and inconsistent, for example, due to
omitted-variable biases. We show that if the authors of primary studies search for statistically
significant estimates in observational research, meta-regression models tend to make
false-positive findings of genuine empirical effects. More research is needed to better understand
how meta-regression models need to be specified to help identify genuine empirical
effects in observational research.
I. Introduction
Empirical research is often characterized by the selection of statistically significant results.
It has been shown that published p-values cluster just below the widely used significance
thresholds for the leading general-interest journals (Ridley et al., 2007), the top economics
journals (Brodeur et al., 2016), the top sociology journals (Gerber and Malhotra, 2008a) and
the top political science journals (Gerber and Malhotra, 2008b). Meta-regression models
try to address this selection bias by integrating the estimates from multiple primary studies
in order to reveal the presence or absence of genuine effects. We show that meta-regression
analyses of observational research may suffer from a lack of robustness if primary authors
search across different regression specifications for statistically significant estimates.
JEL Classification numbers: C12, C15, C40.
*I am deeply grateful to the editor Jonathan Temple. I also would like to thank David Stern, Guido Bünstorf and
Alessio Moneta as well as two anonymous referees for their helpful comments.
We refer to experimental research if randomization is used to estimate an effect of
interest, whereas observational research denotes research designs without randomization.
While randomization ensures an unbiased and consistent estimate of the effect of interest,
regression analyses based on observational data are characterized by substantial analytical
flexibility due to the multitude of potential regression specifications. Each regression
specification may possibly suffer from biases such as omitted-variable biases resulting in
a biased and inconsistent estimation of the effect of interest. This analytical flexibility was
described as a key threat to the reliability of inferences in observational research (Hendry,
1980; Leamer, 1983; Sims, 1988).
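To make the omitted-variable problem concrete, the following is a minimal simulation sketch that is not taken from the paper. The function name `simulate_ovb`, the parameters `beta` and `gamma`, and the data-generating process are all illustrative assumptions: x has no true effect on y (`beta = 0`), but an omitted variable z influences both x and y. Under these assumptions the slope of the short regression (z omitted) converges to gamma * cov(x, z) / var(x) = 0.4 instead of zero, whereas including z yields a slope close to zero.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for a design matrix X that already contains an intercept."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def simulate_ovb(n=100_000, beta=0.0, gamma=1.0, seed=0):
    """Hypothetical data-generating process (an assumption for illustration only).

    z drives both x and y; x itself has true effect `beta` on y.
    Returns the estimated slope on x without and with z as a control.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    x = 0.5 * z + rng.normal(size=n)      # cov(x, z) = 0.5, var(x) = 1.25
    y = beta * x + gamma * z + rng.normal(size=n)
    ones = np.ones(n)
    slope_short = ols(np.column_stack([ones, x]), y)[1]     # z omitted
    slope_long = ols(np.column_stack([ones, x, z]), y)[1]   # z included
    return slope_short, slope_long

slope_short, slope_long = simulate_ovb()
print(f"slope without z: {slope_short:.3f}, slope with z: {slope_long:.3f}")
```

Because the bias does not shrink as the sample grows, a larger data set only makes the biased estimate more precise, which is why such estimates are described as inconsistent and not merely noisy.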
Various terms have been used to describe the search by authors of primary studies for
estimates with a p-value below the common thresholds of 0.05 or 0.1. We follow the usage
of Simonsohn, Nelson and Simmons (2014) who coined the term ‘p-hacking’ to denote the
selection of statistically significant estimates within each study while using ‘publication
bias’ to refer to researchers placing studies without statistically significant estimates in the
file drawer (Rosenthal, 1979). The classical view on publication bias is that in the worst
case only those 5% of studies that produce statistically significant estimates by chance are
published and the remaining 95% of studies remain in the file drawer. However, this view
ignores that each study may engage in p-hacking resulting in a substantially increased rate
of false-positive findings (Simonsohn et al., 2014). Only those studies that fail to produce
significant estimates after p-hacking may remain in the file drawer.
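The claim that p-hacking raises the false-positive rate above the nominal 5% can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's own design: the function names, the number of candidate controls, and the data-generating process are hypothetical. There is no genuine effect of x on y, and each simulated author either reports one pre-specified regression (no controls) or searches across all subsets of three candidate controls and reports a significant estimate if any specification produces one.

```python
import itertools
import numpy as np

def t_stat_on_x(y, x, controls):
    """t-statistic of the coefficient on x in an OLS regression of y on x and controls."""
    n = len(y)
    X = np.column_stack([np.ones(n), x] + controls)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def false_positive_rates(n_studies=2000, n_obs=100, n_controls=3, crit=1.96, seed=0):
    """Share of studies reporting a 'significant' effect when the true effect is zero."""
    rng = np.random.default_rng(seed)
    # All subsets of candidate controls; the first subset is the empty set.
    subsets = [s for r in range(n_controls + 1)
               for s in itertools.combinations(range(n_controls), r)]
    single = hacked = 0
    for _ in range(n_studies):
        z = rng.normal(size=(n_obs, n_controls))
        x = z.sum(axis=1) + rng.normal(size=n_obs)  # x is correlated with the controls
        y = rng.normal(size=n_obs)                  # y is unrelated to x and z
        t_values = [abs(t_stat_on_x(y, x, [z[:, j] for j in s])) for s in subsets]
        single += t_values[0] > crit     # one pre-specified regression
        hacked += max(t_values) > crit   # best result across all specifications
    return single / n_studies, hacked / n_studies

single_rate, hacked_rate = false_positive_rates()
print(f"single specification: {single_rate:.3f}, specification search: {hacked_rate:.3f}")
```

In this setup the controls are irrelevant for y, so the inflation comes purely from choosing the most favourable of several correlated tests. Specification searches that also exploit omitted-variable biases could be expected to inflate the rate further.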
p-hacking is prevalent in both experimental and observational research and may be more
frequent in economics than in other disciplines, at least for impact evaluation studies (Vivalt,
2015). p-hacking probably originates in the incentive structure of academic publishing and
limits the reliability of inferences that can be drawn from published empirical studies
(Ioannidis, 2005; Glaeser, 2008). Researchers engaging in p-hacking usually look for
estimates that are not only significant but also confirm the theory or hypothesis of interest.
Fanelli (2010) shows that the probability that a paper finds support for its hypothesis
is high across all research disciplines. The pressure to provide significant and
theory-confirming results is increased by declining acceptance rates in top economics
journals and the need to publish in these journals in order to start or advance an academic
career (Card and DellaVigna, 2013). As a result, Young, Ioannidis and Al-Ubaydli (2008) compare
the publication process to the winner’s curse in auction theory. The most spectacular or
exaggerated results are rewarded with publication in the top journals, although in this case
it is the scientific community rather than the author that is cursed.
In extreme cases, strong theoretical presumptions may lead authors to search for
theory-confirming results (Card and Krueger, 1995). As soon as potentially false theories
become established, empirical research may be characterized by the selection of results that
meet the anticipated expectations of reviewers (Frey, 2003) rather than those that falsify the
false theory. Null results may only be considered for publication if a series of articles previously
established the presence of a genuine effect (De Long and Lang, 1992).
The combination of flexible observational research designs in economics and incentives
to select for specific results may introduce severe biases in published empirical findings.
Experimental sciences improve the reliability of inferences by using meta-analyses
that integrate the evidence of multiple studies while controlling for publication bias (e.g.
Sutton et al., 2000). Such meta-analytic tools are increasingly being used to synthesize
observational research in economics. The Precision-Effect Test (PET) that relates the
t-value of an estimated regression coefficient to the precision of the estimate (Stanley,
2008) is commonly used (e.g. Doucouliagos and Stanley, 2009; Doucouliagos, Stanley and
Viscusi, 2014). If a genuine effect is present, the coefficient’s t-value and its precision are
associated and this relation is used to test for the presence of a genuine effect. However,
such an association between a coefficient’s t-value and its precision might also occur in the
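The Precision-Effect Test described above can be sketched in a few lines. The code assumes the common formulation in which the t-value of each primary estimate is regressed on its precision, t_i = b0 + b1 * (1 / SE_i) + v_i, and a slope b1 different from zero is read as evidence of a genuine effect. The function name `precision_effect_test` and the simulated primary studies are hypothetical; the simulation contains a true effect of 0.2 and no selection of results, so it shows only the benign case in which the test works as intended.

```python
import numpy as np

def precision_effect_test(estimates, std_errors):
    """Regress t-values on precision (1 / SE); return the slope and its t-statistic."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    t_values = estimates / std_errors
    precision = 1.0 / std_errors
    X = np.column_stack([np.ones_like(precision), precision])
    coef, *_ = np.linalg.lstsq(X, t_values, rcond=None)
    resid = t_values - X @ coef
    sigma2 = resid @ resid / (len(t_values) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1], coef[1] / np.sqrt(cov[1, 1])

# Simulated primary studies (assumption): unbiased estimates of a true effect of 0.2,
# with standard errors that shrink as the sample size grows.
rng = np.random.default_rng(0)
n_studies = 200
sample_sizes = rng.integers(30, 1000, size=n_studies)
std_errors = 1.0 / np.sqrt(sample_sizes)
true_effect = 0.2
estimates = true_effect + std_errors * rng.normal(size=n_studies)

slope, slope_t = precision_effect_test(estimates, std_errors)
print(f"PET slope: {slope:.3f}, t-statistic: {slope_t:.1f}")
```

With unbiased primary estimates the slope recovers the underlying effect. The simulation deliberately leaves out specification searches in the primary studies, which is the setting the paper is concerned with.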
