Sharp Bounds on Causal Effects under Sample Selection

DOI: http://doi.org/10.1111/obes.12056
Published date: 01 February 2015
Authors: Giovanni Mellace, Martin Huber
©2013 The Department of Economics, University of Oxford and John Wiley & Sons Ltd.
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 77, 1 (2015) 0305–9049
Sharp Bounds on Causal Effects under Sample
Selection*
Martin Huber and Giovanni Mellace
Department of Economics, University of St. Gallen, Varnbuelstrasse 14, CH 9000, St. Gallen,
Switzerland (e-mail: martin.huber@unisg.ch; giovanni.mellace@unisg.ch)
Abstract
In many empirical problems, the evaluation of treatment effects is complicated by sample
selection so that the outcome is only observed for a non-random subpopulation. In the
absence of instruments and/or tight parametric assumptions, treatment effects are not point
identified, but can be bounded under mild restrictions. Previous work on partial identification
has primarily focused on the ‘always observed’ (observed irrespective of the treatment). This
article complements those studies by considering further populations, namely the ‘compliers’
(observed only if treated) and the observed population. We derive sharp bounds under
various assumptions and provide an empirical application to a school voucher experiment.
I. Introduction
The sample selection problem, see for instance Gronau (1974) and Heckman (1974), arises
when the outcome of interest is only observed for a non-randomly selected subpopulation.
This may flaw causal analysis and is a ubiquitous phenomenon in many fields where treatment
effect evaluations are conducted, such as labour, health and educational economics.
For example, in estimating the returns to training, sample selection is an issue when only a
selective subgroup of training participants and non-participants finds employment, which
is a condition for observing earnings. Similar problems are inherent in clinical trials when
some of the participants in medical treatments pass away (‘truncation by death’) before the
health outcome is measured. As a further example, consider the effect of randomly provided
private schooling on college entrance examinations. The sample selection problem
arises if only a non-random subgroup of students takes the exam.
In sample selection models in economics (see the seminal work of Heckman, 1974,
1976, 1979), identification commonly relies on tight functional form restrictions and the
availability of a valid instrument for selection. Although the literature has recently moved
towards more flexible models, see for instance Das et al. (2003) and Newey (2009), it
typically imposes strong assumptions on the unobserved terms unlikely to hold in many
applications, see Huber and Melly (2012), or uses invalid instruments, see Huber and
Mellace (2013). Similar arguments apply to many studies in the related field of missing
data problems, which often use regression or weighting adjustments (assuming selection on
observables) to control for missing outcomes, see for instance Hausman and Wise (1979),
Robins et al. (1995) and Wooldridge (2007). In the absence of unattractive parametric
restrictions or instruments for sample selection, treatment effects are not point identified,
but upper and lower bounds can still be obtained under fairly mild restrictions.
*We have benefited from comments by Michael Lechner, Fabrizia Mealli, Franco Peracchi, Christoph Rothe,
seminar/conference participants in St. Gallen (research seminar) and Pisa (4th Italian Congress of Econometrics and
Empirical Economics), and two anonymous referees.
JEL Classification numbers: C14, C21, C24.
Partial identification of economic parameters in general goes back to Manski (1989,
1994) and Robins (1989). In the context of the sample selection (or missing outcome data)
problem, several contributions in the fields of principal stratification, see Frangakis and
Rubin (2002), and econometrics derive bounds on treatment effects. Building on Horowitz
and Manski (1998), Horowitz and Manski (2000) consider the partial identification of the
average treatment effect in the entire population assuming a binary outcome and also
allowing for missing covariate information. Zhang and Rubin (2003) (see also Zhang,
Rubin and Mealli, 2008) bound the average treatment effects for one subpopulation,
namely the ‘always observed’, whose outcomes are observed both under treatment and
non-treatment. They impose two assumptions both separately and jointly: (i) monotonicity
of selection in the treatment and (ii) stochastic dominance of the potential outcomes
of the always observed over those of other populations. Imai (2008) shows that the
bounds of Zhang and Rubin (2003) are sharp and additionally considers the identification
of quantile treatment effects. Lee (2009) invokes monotonicity of selection (but not
stochastic dominance) when assessing the average earnings effects of Job Corps, a
training program for disadvantaged youths in the US, on the always observed and proves
the sharpness of the bounds. Blanco et al. (2011) evaluate the same program, but add
assumptions on the order of mean potential outcomes within and across subpopulations to
obtain tighter bounds. In contrast to the aforementioned contributions, Lechner and Melly
(2007) bound the effects on those treated and observed, which is a mixed population
consisting of always observed and ‘compliers’ who are observed under treatment, but
would not be without treatment.[1]
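
To make the trimming logic behind monotonicity-based bounds on the always observed concrete, the following is a minimal Python sketch in the spirit of the Lee (2009) approach mentioned above. It is an illustration under stated assumptions, not the procedure of any of the cited papers: treatment is taken to be randomly assigned, selection is assumed to be weakly increased by treatment, and the function name `trimming_bounds`, its arguments and the toy data are hypothetical. Under these assumptions the treated and observed group mixes always observed and compliers, with the always observed making up the share P(S=1|D=0)/P(S=1|D=1), so retaining only the lowest or the highest outcomes in that proportion gives the worst and best case for their mean.

```python
from statistics import mean


def trimming_bounds(y, d, s):
    """Sample-analogue trimming bounds on the mean effect for the always observed.

    y: outcomes (values where s == 0 are ignored)
    d: 0/1 treatment indicators
    s: 0/1 selection indicators (1 = outcome observed)

    Assumptions (not checked beyond the share condition): random treatment
    assignment and monotonicity of selection, S(1) >= S(0).
    """
    p1 = mean(si for di, si in zip(d, s) if di == 1)  # P(S=1 | D=1)
    p0 = mean(si for di, si in zip(d, s) if di == 0)  # P(S=1 | D=0)
    if not 0 < p0 <= p1:
        raise ValueError("requires 0 < P(S=1|D=0) <= P(S=1|D=1)")

    # Share of always observed among the treated and observed.
    share = p0 / p1

    y1 = sorted(yi for yi, di, si in zip(y, d, s) if di == 1 and si == 1)
    y0 = [yi for yi, di, si in zip(y, d, s) if di == 0 and si == 1]

    # Number of treated outcomes retained after trimming; the rounding rule
    # is one possible finite-sample convention, chosen here for simplicity.
    k = max(1, round(share * len(y1)))

    lower = mean(y1[:k]) - mean(y0)   # keep the k smallest treated outcomes
    upper = mean(y1[-k:]) - mean(y0)  # keep the k largest treated outcomes
    return lower, upper


# Hypothetical toy data: all treated units are observed, half of the controls are.
y = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 0.0, 0.0]
d = [1, 1, 1, 1, 0, 0, 0, 0]
s = [1, 1, 1, 1, 1, 1, 0, 0]
print(trimming_bounds(y, d, s))  # (0.0, 2.0)
```

In the toy data half of the treated and observed units are always observed, so the bounds compare the mean of the two lowest and of the two highest treated outcomes with the observed control mean. The sketch covers only the always observed; it does not implement bounds for compliers, defiers or the observed population.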
The main contribution of this article is the derivation of sharp bounds (the tightest
feasible bounds given the assumptions imposed and the information available) on average
treatment effects among compliers, ‘defiers’ (outcomes observed if not treated and
not observed if treated) and the observed population, which have not been considered
in previous work. We show that under the monotonicity and/or stochastic dominance
assumptions, informative bounds can be derived even when the outcomes of particular
subpopulations are only observed in one treatment arm. For instance, one useful result
is that under both assumptions, the lower bound on the observed population coincides
with that on the always observed. This is relevant for many applications where particular
interest lies in whether the lower bound rules out a zero effect. Thus, the assumptions may
bear considerable identifying power, which is demonstrated in an application to a school
voucher experiment in Colombia previously analyzed by Angrist et al. (2006).
[1] This definition is not to be confused with the local average treatment effect framework (see Imbens and
Angrist, 1994), where compliers are those who are treated if assigned to treatment and not treated if assigned to
control in a randomized trial.