Bayesian Inference in Spatial Sample Selection Models

Published date: 01 February 2018
DOI: http://doi.org/10.1111/obes.12187
© 2017 The Department of Economics, University of Oxford and John Wiley & Sons Ltd.
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 80, 1 (2018) 0305–9049
doi: 10.1111/obes.12187
Osman Doğan and Süleyman Taşpınar
Economics Program, University of Illinois, Illinois, United States (email: odogan10@gmail.com)
Economics Program, Queens College, The City University of New York, New York, United
States (email: STaspinar@qc.cuny.edu)
Abstract
In this study, we consider Bayesian methods for the estimation of a sample selection model
with spatially correlated disturbance terms. We design a set of Markov chain Monte Carlo
algorithms based on the method of data augmentation. The natural parameterization for
the covariance structure of our model involves an unidentified parameter that complicates
posterior analysis. The unidentified parameter – the variance of the disturbance term in the
selection equation – is handled in different ways in these algorithms to achieve identification
for other parameters. The Bayesian estimator based on these algorithms can account
for the selection bias and the full covariance structure implied by the spatial correlation.
We illustrate the implementation of these algorithms through a simulation study and an
empirical application.
I. Introduction
A typical sample selection model consists of (i) a selection equation that models the
selection mechanism through which we observe the level of outcome, and (ii) an outcome
equation that describes the process that is generating the outcome. The model structure is
characterized by the correlation between the disturbances of these equations, for which
estimation requires special methods (Lee, 1978, 1994; Heckman, 1979, 1990; Olsen,
1980; Newey, 2009). Besides the cross-equation correlation in the disturbance terms,
spatial correlation may also be present in the disturbance terms of each equation. The spatial
sample selection model considered in the present study accommodates both types of
correlation.
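For concreteness, the structure described above can be written out as a generic Type-II Tobit model with first-order spatial autoregressive disturbances. The notation below is illustrative and is an assumption on our part (the symbols s, y, x, β, δ, λ, ρ, W, M and Σ are not taken from the text above); it is a sketch of the model class, not necessarily the exact specification used later in the paper.

```latex
% Illustrative notation only; all symbols are assumptions.
\begin{align*}
s_i^{*} &= x_{1i}'\beta + u_{1i}, & s_i &= \mathbf{1}(s_i^{*} > 0)
  && \text{(selection equation)} \\
y_i^{*} &= x_{2i}'\delta + u_{2i}, & y_i &= y_i^{*} \ \text{observed only if } s_i = 1
  && \text{(outcome equation)} \\
u_1 &= \lambda W u_1 + \varepsilon_1, & u_2 &= \rho M u_2 + \varepsilon_2
  && \text{(spatial correlation)} \\
(\varepsilon_{1i}, \varepsilon_{2i})' &\sim N(0, \Sigma), &
\Sigma &= \begin{pmatrix} \sigma_1^{2} & \sigma_{12} \\ \sigma_{12} & \sigma_2^{2} \end{pmatrix}
  && \text{(cross-equation correlation)}
\end{align*}
```

In this sketch, W and M are spatial weights matrices, σ₁₂ carries the cross-equation correlation that generates selection bias, and λ and ρ carry the spatial correlation within each equation. Because only the sign of the latent selection variable is observed, the scale σ₁² cannot be recovered from the data, which is the identification issue mentioned in the abstract.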
Selection models, or more generally Type-I or Type-II Tobit models, may arise frequently
in urban economics, regional science, labour economics, agricultural economics,
and social interaction models. It is natural to consider a notion of spatial correlation in the
unobservables so long as data is organized by a notion of location in the relevant space.
The presence of common shocks, factors and cluster effects provides a natural motivation.
JEL Classification numbers: C13, C21, C31.
For example, McMillen (1995) studies residential land values in urban areas through a
sample selection model. He conjectures that unobserved variables that make a parcel more
likely to receive residential zoning may increase the value of residential land. It is also
plausible to allow for spatially correlated disturbance terms because nearby parcels are
likely to be affected by the same neighbourhood factors and spillovers. Büchel and van
Ham (2003) study overeducation – a job seeker’s overqualification for a job she accepts due
to her location constraints – through a Heckit model. The selection problem arises because
overeducation is observed only for the employed. The authors state that the risk of overeducation
is largely determined by the spatial flexibility of a job seeker in combination with
the spatial heterogeneity in suitable job opportunities, relative to the place of residence.
They try to control for spatial correlation in the disturbance terms by clustering methods.
Flores-Lagunes and Schnier (2012) study the spatial production of fisheries in the Eastern
Bering Sea through a sample selection model (for details, see our empirical illustration).
Since a negative shock that affects the fish population in a certain location would affect the
production of all vessels in other locations by displacing fishing effort into more efficient
surrounding locations, they allow the disturbance terms to be spatially correlated.
Another motivation for considering spatial correlation is related to measurement error.
The mismatch between the spatial unit of observations (e.g. census tract, county, state, peer
groups, farm location, fishing zone, etc.) and the unit of a study (e.g. student, household,
housing market, labour market, farm, fishing vessel, etc.) can cause measurement errors
in the variables of interest (Anselin, 2007). Since these measurement errors may vary
systematically across space, the disturbance terms of a regression model over the same
space are likely to be correlated. Ward, Florax and Flores-Lagunes (2014) consider a sample
selection model of cereal production where the selection equation specifies a farmer’s
endogenous decision about whether to plant cereals. They employ a first-order spatial
autoregressive model for the disturbance terms, because data on yields or climate are
aggregated for large administrative units, and correlation among unobservables may be
driven by unobserved environmental, geographical and climatological clusters. Rabovič
and Čížek (2016) provide another example in the context of peer effects, where the outcome
equation models a student’s achievement on a test and the selection equation models the
student’s decision whether to take the exam. It is plausible that the decision to take the
exam and the score from the exam may depend on a student’s ability, which is likely to be
similar to her peers’ abilities. Therefore, the social interaction literature often incorporates what
are known as ‘correlated effects’ in the model (Lee, Liu and Lin, 2010).
The limited dependent variable models that accommodate spatial dependence have been
studied in terms of both estimation and testing issues. The maximum likelihood estimator
(MLE) of a probit model with a spatial autoregressive process requires evaluation of
a multivariate normal cumulative distribution function, which often leads to numerical
estimation problems. To circumvent this shortcoming of the MLE, alternative methods
have been suggested in the literature. For example, McMillen (1992) uses an expectation
maximization (EM) algorithm to circumvent the evaluation of the multivariate normal
distribution function and suggests a tractable iterative estimation approach. Beron and
Vijverberg (2004) use the recursive importance sampling (RIS) simulator to approximate
the log-likelihood function of the spatial probit model. Pinkse and Slade (1998) formulate
moment functions from the score vector of a partial MLE for the generalized method
of moments (GMM) estimation of a spatial probit model. Instead of using the entire joint
distribution of observations implied by the spatial dependence, Wang, Iglesias and
Wooldridge (2013) formulate a partial MLE based on the partial joint distribution of observations
to reduce computational difficulties.
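To see why spatial dependence makes the probit likelihood hard to evaluate, it helps to look at the covariance matrix implied by a first-order spatial autoregressive disturbance. The sketch below is a minimal illustration under assumed names (the function `sar_error_covariance`, the four-unit weights matrix `W` and the value `lam=0.5` are hypothetical and do not come from the paper). It shows that the implied covariance is neither diagonal nor homoskedastic, so the joint probability of the observed outcomes does not factor into univariate normal probabilities.

```python
import numpy as np

def sar_error_covariance(W, lam, sigma2=1.0):
    """Covariance of u when u = lam * W u + eps and eps ~ N(0, sigma2 * I).

    Solving for u gives u = (I - lam * W)^{-1} eps, so
    Var(u) = sigma2 * (I - lam * W)^{-1} (I - lam * W)^{-T}.
    """
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - lam * W)
    return sigma2 * A_inv @ A_inv.T

# Hypothetical layout: four units on a line, each bordering its neighbours.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = W / W.sum(axis=1, keepdims=True)  # row-normalise the weights

omega = sar_error_covariance(W, lam=0.5)
print(np.round(omega, 3))
```

The off-diagonal entries of `omega` are non-zero and the diagonal entries differ across units, which is why full-information likelihood evaluation involves a multivariate normal probability whose dimension grows with the sample size.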
McMillen (1995) extends the EM algorithm suggested in McMillen (1992) to a sample
selection model that has a first-order spatial autoregressive process in the disturbance
term. The estimation scheme requires inversion of an n × n matrix at each iteration, which
makes this approach impractical for large samples. Flores-Lagunes and Schnier (2012)
combine the GMM approach in Pinkse and Slade (1998) and Kelejian and Prucha (1998)
and suggest a GMM method for a sample selection model that has a first-order spatial
autoregressive process in the disturbance terms. The moment functions for the estimation
of the selection equation are the ones suggested by Pinkse and Slade (1998) for the probit
model. These moment functions are combined with the moment functions formulated for
the outcome equation to form a joint GMM estimator (GMME). The simulation studies reported in
Flores-Lagunes and Schnier (2012) show that the bias present in the selection equation parameters
adversely affects the estimation of the parameter of the outcome equation. Rabovič and
Čížek (2016) extend the partial maximum likelihood (ML) method of Wang et al. (2013)
to a sample selection model with a spatial lag of a latent dependent variable and a sample
selection model with spatially correlated disturbance terms. They establish the large sample
properties of the proposed partial MLE and provide a finite sample bias and precision
comparison to the Heckit and the GMM approach of Flores-Lagunes and Schnier (2012).
Overall, the proposed partial MLE outperforms the Heckit and the GMM-based estimator.
In this paper, we consider a sample selection model that has a first-order spatial autoregressive
process for the disturbance terms of the selection and outcome equations.¹ We
consider the Bayesian Markov chain Monte Carlo (MCMC) estimation approach for this
model with various alternative Gibbs samplers. In comparison with the GMM and partial
ML approaches, the Bayesian approach with data augmentation formulates estimators that
can exploit the full information on the spatial correlation structure. The data augmentation
method treats the underlying latent variable as an additional parameter to be estimated,
which in turn facilitates the posterior simulation through an MCMC sampler (Albert and
Chib, 1993; van Dyk and Meng, 2001). The natural parameterization for the covariance of
our model involves an unidentified parameter which can complicate posterior analysis. The
unidentified parameter, i.e. the variance of the disturbance terms in the selection equation,
is handled in different ways in our suggested Gibbs samplers.
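The data augmentation idea can be illustrated on a much simpler model than the one studied here. The sketch below is a minimal Gibbs sampler for an ordinary probit model in the spirit of Albert and Chib (1993), with no spatial correlation and no outcome equation; it is not one of the algorithms proposed in this paper. The function name `probit_gibbs`, the normal prior with variance `prior_var` and the simulated data are all assumptions made for illustration. The latent disturbance variance is fixed at one, which is the simplest way of dealing with an unidentified scale; the samplers discussed in this paper handle that parameter in other ways.

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(y, X, n_iter=1000, burn_in=200, prior_var=100.0, seed=0):
    """Data-augmentation Gibbs sampler for y_i = 1(x_i'beta + e_i > 0), e_i ~ N(0, 1).

    Assumed prior: beta ~ N(0, prior_var * I). Returns the post-burn-in draws.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    # Var(beta | z) does not depend on z because the latent variance is fixed at 1.
    V = np.linalg.inv(X.T @ X + np.eye(k) / prior_var)
    chol_V = np.linalg.cholesky(V)
    beta = np.zeros(k)
    draws = np.empty((n_iter - burn_in, k))
    for it in range(n_iter):
        mu = X @ beta
        # Step 1: draw the latent z_i | beta, y_i from N(mu_i, 1),
        # truncated to (0, inf) when y_i = 1 and to (-inf, 0] when y_i = 0.
        lower = np.where(y == 1, -mu, -np.inf)
        upper = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lower, upper, size=n, random_state=rng)
        # Step 2: draw beta | z from N(V X'z, V), a standard linear-model update.
        beta = V @ (X.T @ z) + chol_V @ rng.standard_normal(k)
        if it >= burn_in:
            draws[it - burn_in] = beta
    return draws

# Simulated data with hypothetical true coefficients.
rng = np.random.default_rng(42)
n = 400
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta_true = np.array([0.5, -1.0])
y = (X @ beta_true + rng.standard_normal(n) > 0).astype(int)

draws = probit_gibbs(y, X)
print("posterior mean:", draws.mean(axis=0))
```

Treating the latent variable as an extra block to be sampled turns each step into a draw from a standard distribution, which is what makes the approach attractive when the likelihood itself is hard to evaluate.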
In the first algorithm, we specify prior distributions for the identified parameters and
consider the method suggested in Nobile (2000) that can be used to construct a Markov
chain that fixes the unidentified parameter during the posterior analysis. In the second
algorithm, we consider the re-parameterization approach suggested in Li (1998), McCulloch,
Polson and Rossi (2000) and van Hasselt (2011) for the covariance matrix of the
model. Given the bivariate normal distribution assumption for the disturbance terms, the
covariance matrix is re-parameterized in terms of conditional variance and covariance of
disturbance terms. In the third algorithm, we consider a different blocking scheme for the
¹ To the best of our knowledge, McMillen (1995), Flores-Lagunes and Schnier (2012) and Rabovič and Čížek
(2016) are the only studies that consider estimation of sample selection models with spatial dependence.