A Zero‐Inflated Regression Model for Grouped Data

Date: 01 December 2015
Authors: Sarah Brown, Alan Duncan, Mark N. Harris, Karl Taylor, Jennifer Roberts
DOI: http://doi.org/10.1111/obes.12086
Published date: 01 December 2015
©2014 The Department of Economics, University of Oxford and John Wiley & Sons Ltd.
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 77, 6 (2015) 0305–9049
doi: 10.1111/obes.12086
A Zero-Inflated Regression Model for Grouped Data*
Sarah Brown, Alan Duncan, Mark N. Harris§, Jennifer Roberts† and Karl Taylor
Department of Economics, University of Sheffield, 9 Mappin Street, Sheffield S1 4DT, UK
(e-mail: sarah.brown@sheffield.ac.uk)
Bankwest Curtin Economics Centre, Curtin University, Perth, Australia
§School of Economics and Finance, Curtin University, Perth, Australia
Abstract
We introduce the (panel) zero-inflated interval regression (ZIIR) model, which is ideally
suited when data are in the form of groups, and there is an ‘excess’ of zero observations.
We apply our new modelling framework to the analysis of visits to the general practitioner
(GP) using individual-level data from the British Household Panel Survey. The ZIIR model
simultaneously estimates the probability of visiting the GP and the frequency of visits
(defined by given numerical intervals in the data). The results show that different socio-
economic factors influence the probability of visiting the GP and the frequency of visits.
I. Introduction and background
In this paper, we introduce the zero-inflated interval regression (ZIIR) model which is
ideally suited when the variable of interest is grouped in some way and there is an excess
of zero observations. The standard approach to modelling grouped data is the interval
regression approach (see, for example, Greene and Hensher, 2010), which is based on the
ordered probit model but with known boundary parameters. A key advantage of this
approach is that it is now possible to identify the scale of the dependent variable (in contrast
to the ordered probit approach). However, there are circumstances in which outcomes at
the extensive margin may be driven by different processes than those that dictate positive
outcomes. Grouped-dependent data that exhibit a build-up of ‘excess’ zeros is one likely
manifestation of such a situation. It is therefore necessary to introduce a more flexible
parametric specification into the standard interval regression to accommodate such divergent
processes in order to avoid the potential for biased and inconsistent estimates. In such a
case, we propose generalizing the interval regression framework along the lines suggested
by Harris and Zhao (2007) for ordered dependent variables.
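
To make the idea concrete, the following is a minimal Python sketch of how zero inflation can be layered on top of an interval regression. It is an illustration under stated assumptions, not the authors' specification: the function names, the boundary values and the parameter values are all hypothetical, the participation and frequency equations are assumed to have independent standard normal errors, and the lowest group is assumed to correspond to the observed zero outcome. The paper's own likelihood may differ (for example, by allowing correlated errors or panel effects).

```python
import math


def norm_cdf(x):
    """Standard normal CDF, with explicit handling of infinite arguments."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_probs(xb, sigma, bounds):
    """Group probabilities for a plain interval regression.

    The latent outcome is y* = xb + sigma * e with e ~ N(0, 1).
    `bounds` holds the KNOWN, increasing group boundaries, so the groups are
    (-inf, b1], (b1, b2], ..., (b_last, +inf). Because the boundaries are
    known rather than estimated, the scale sigma is identified.
    """
    cuts = [-math.inf] + list(bounds) + [math.inf]
    cdf = [norm_cdf((c - xb) / sigma) for c in cuts]
    return [cdf[j + 1] - cdf[j] for j in range(len(cuts) - 1)]


def ziir_probs(zg, xb, sigma, bounds):
    """Group probabilities with zero inflation (independent-errors assumption).

    A probit participation equation with index zg decides whether the
    individual is a potential participant. Non-participants always fall in
    group 0; participants follow the interval regression, which can itself
    generate a zero.
    """
    p_participate = norm_cdf(zg)
    probs = [p_participate * p for p in interval_probs(xb, sigma, bounds)]
    probs[0] += 1.0 - p_participate  # extra mass from non-participants
    return probs


def ziir_loglik(groups, zg_list, xb_list, sigma, bounds):
    """Sum of log group probabilities over observations."""
    total = 0.0
    for g, zg, xb in zip(groups, zg_list, xb_list):
        total += math.log(ziir_probs(zg, xb, sigma, bounds)[g])
    return total


if __name__ == "__main__":
    bounds = [0.5, 2.5, 5.5]  # hypothetical known group boundaries
    print(interval_probs(1.2, 2.0, bounds))
    print(ziir_probs(0.3, 1.2, 2.0, bounds))
```

In this sketch the probability of the zero group always exceeds its plain interval regression counterpart whenever the participation probability is below one, which is how the 'excess' zeros are absorbed. Because the two indices (here `zg` and `xb`) can depend on different covariates, the extensive and intensive margins are allowed to be driven by different factors.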
JEL Classification numbers: C33; C35; I12.
*Funding from the Australian Research Council is kindly acknowledged. We are grateful to the UK Data Service
for supplying the British Household Panel Survey waves 1 to 18 and to the Institute for Economic and Social Research,
University of Essex, for granting access to the urban/rural indicators.
