Twin Data vs. Longitudinal Data to Control for Unobserved Variables in Earnings Functions – Which Are the Differences?*

DOI: http://doi.org/10.1111/j.1468-0084.2006.00197.x
Author: Gunnar Isacsson
Published date: 01 June 2007
Gunnar Isacsson
Transport Economics Research Unit, Swedish National Road and Transport Research
Institute, Borlänge, Sweden (e-mail: gunnar.isacsson@vti.se)
Abstract
This paper empirically compares two approaches to controlling for unobserved
characteristics when estimating the effect of marriage on male and female earnings:
the longitudinal and the twins approach. The estimates were obtained by exploiting
the longitudinal dimension of a large sample of Swedish twins, so that longitudinal
and twin-based estimates could be obtained in the same sample. The two approaches
lead to different conclusions regarding both the role of unobserved characteristics in
the cross-sectional earnings–marriage relationship and the effect of marriage on
earnings. The paper investigates three potential explanations of this difference.
JEL Classification numbers: C23, J12, J31.
*The Swedish Twin Registry is administrated through the Division of Epidemiology at the Institute of
Environmental Medicine, Karolinska Institute and is supported by grants from the John D. and Catherine T.
MacArthur Foundation, and the Swedish Council for Planning and Coordination of Research (FRN).
Discussions with Orley Ashenfelter, Anders Björklund, Per Johansson, Alan Krueger, and Erik Mellander that
inspired the work presented here are gratefully acknowledged. Comments from Anders Björklund, Per
Johansson, Erik Mellander, Hessel Oosterbeek, Oddbjørn Raaum, Håkan Regnér, Katarina Richardson,
Robert Wright, an anonymous referee and seminar participants at SNF University of Oslo, the Institute for
Social Research Stockholm University, the Research Institute of Industrial Economics, and the Tinbergen
Institute on earlier versions of the paper have been very useful. I am also grateful to Håkan Malmström and
Dan Svensson at the Swedish Twin Registry for providing the data and to Christian Kjellström at the Swedish
Institute for Social Research for providing the necessary data from the Swedish Level of Living Survey. Any
errors are, of course, mine.
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 69, 3 (2007) 0305-9049
doi: 10.1111/j.1468-0084.2006.00197.x
© Blackwell Publishing Ltd and the Department of Economics, University of Oxford, 2006. Published by Blackwell Publishing Ltd,
9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.
I. Introduction
In labour economics, concerns are often expressed that individual characteristics that
are unobserved by the researcher, such as ‘ability’, ‘ambition’, ‘career orientation’ or
‘beauty’, may bias cross-sectional estimators applied to empirical models, e.g. wage
equations. Different approaches on how to address such biases have been suggested
in the literature, however. One of them is the conventional fixed-effects estimator
applied to longitudinal data, where the same individual is observed at different points
in time. Assuming that the unobserved characteristics are time-invariant, the effect of
the unobservables in the empirical model may be accounted for by estimating
the model in first differences taken over time. This approach has become something
of a standard tool to control for unobservables in virtually every field of empirical
labour economics.¹
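The first-differencing argument above can be illustrated with a small simulation. The following is only a sketch under assumed values: the data-generating process, the parameter values (a marriage effect of 0.10, the weight of 0.5 on the unobservable) and the names `simulate_panel` and `ols_slope` are hypothetical and are not taken from the paper.

```python
import numpy as np

def simulate_panel(n=5000, beta=0.10, seed=0):
    """Simulate two periods of log earnings with a time-invariant unobservable.

    All values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)  # unobserved by the researcher, fixed over time
    # Marital status depends on the unobservable, so cross-sectional OLS is biased.
    married = (ability[:, None] + rng.normal(size=(n, 2)) > 0).astype(float)
    noise = rng.normal(scale=0.2, size=(n, 2))
    log_earn = beta * married + 0.5 * ability[:, None] + noise
    return married, log_earn

def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    x = x - x.mean()
    y = y - y.mean()
    return float(x @ y / (x @ x))

married, log_earn = simulate_panel()

# Cross-sectional estimate in period 0: contaminated by the unobservable.
cross_section = ols_slope(married[:, 0], log_earn[:, 0])

# Within-individual estimate: differencing over time removes the fixed unobservable.
within_individual = ols_slope(married[:, 1] - married[:, 0],
                              log_earn[:, 1] - log_earn[:, 0])

print(f"cross-section:     {cross_section:.3f}")
print(f"within-individual: {within_individual:.3f}")
```

In this sketch only individuals whose marital status changes between the two periods contribute to the within-individual estimate, which is why the approach needs time variation in the explanatory variable.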
Another approach to deal with unobservables assumes that such variables are
equal for siblings or twins at a given point in time. Consequently, the effect of the
unobserved characteristics is accounted for by estimating the model in first
differences, with the differences taken between pairs of siblings or twins. Hence, this
approach requires a data set of siblings or twins and it has most often been used
when estimating the economic return to schooling.² Thus, two alternative approaches
have been used to control for fixed effects in empirical models: one that assumes that
the unobservables are individual-specific and another that assumes that they are
sibling-specific.
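The difference between the two assumptions can be made concrete with a simulated twin sample. This is a minimal sketch, not the paper's estimator: the function `within_pair_estimate`, the parameter `indiv_share` (the fraction of the unobservable's variance that is specific to each twin rather than shared by the pair) and all numerical values are hypothetical. When the unobservable is fully shared, differencing between twins removes it; when part of it is individual-specific, some bias can remain.

```python
import numpy as np

def within_pair_estimate(indiv_share, n_pairs=5000, beta=0.10, seed=1):
    """Within-pair estimate of a hypothetical marriage effect on log earnings.

    indiv_share = 0 means the unobservable is identical within each twin pair;
    larger values make part of it specific to each twin (illustrative only).
    """
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n_pairs, 1))   # pair-specific component
    own = rng.normal(size=(n_pairs, 2))      # twin-specific component
    unobs = np.sqrt(1 - indiv_share) * shared + np.sqrt(indiv_share) * own
    married = (unobs + rng.normal(size=(n_pairs, 2)) > 0).astype(float)
    noise = rng.normal(scale=0.2, size=(n_pairs, 2))
    log_earn = beta * married + 0.5 * unobs + noise
    # Difference between the two twins in each pair.
    d_married = married[:, 0] - married[:, 1]
    d_earn = log_earn[:, 0] - log_earn[:, 1]
    # No intercept: the ordering of twins within a pair is arbitrary.
    return float(d_married @ d_earn / (d_married @ d_married))

fully_shared = within_pair_estimate(indiv_share=0.0)
partly_individual = within_pair_estimate(indiv_share=0.5)

print(f"unobservable fully shared by twins: {fully_shared:.3f}")
print(f"half of it individual-specific:     {partly_individual:.3f}")
```

The same logic applies in reverse to the within-individual estimator, which would leave bias from any component of the unobservable that varies over time.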
The purpose of this paper is to compare these two approaches to control for
unobserved characteristics in an empirical application to the effect of marriage on
earnings. The results are obtained by exploiting the longitudinal structure of a large
sample of Swedish twins, so that estimates produced with the two approaches are
obtained in the same sample.³ In the following, I refer to the fixed-effects estimator
applied to a longitudinal data set as the within-individual estimator, and to the
fixed-effects estimator applied to a sibling or twin data set as the within-pair
estimator. The paper investigates the
following: Do the two approaches lead to the same conclusion about the effect of
unobserved characteristics on the cross-sectional estimator? If the within-individual
estimates are different from the within-pair estimates, can this be explained by:
(i) untenable assumptions regarding the structure of the unobserved characteristics in
the equation, (ii) classification error in the explanatory variables or (iii) heterogeneity
in the effect of marriage on earnings? Question (i) is addressed through a difference-
in-difference estimator that combines the longitudinal and twin dimensions of the
data. A conventional estimator that adjusts for classification error is used to address
¹ Freeman (1984) presents, for example, an application pertaining to the union wage premium and Glaeser
and Maré (2001) provide an application to the metropolitan wage premium. Angrist and Krueger (1999) and
Baltagi (2001) contain further references.
² See Behrman et al. (1980), Ashenfelter and Krueger (1994), Behrman, Rosenzweig and Taubman (1994),
Miller, Mulvey and Martin (1995, 1997), Ashenfelter and Zimmerman (1997), Ashenfelter and Rouse (1998),
Rouse (1999), Behrman and Rosenzweig (1999) and Isacsson (1999, 2004) for applications regarding the
economic returns to education. One reason for using the within-pair estimator and not the within-individual
estimator to estimate the returns to education is that educational attainment is approximately time-invariant
once the individual has left school.
³ Large sibling data sets have been created in Sweden, Norway, Denmark and Finland (see Björklund et al.,
2002). Ermisch and Francesconi (2001) also use sibling data that they obtain from the British Household
Panel Survey. US data from the NLS also contain information on siblings (see Neumark and Korenman, 1994,
for example). Thus, there are a number of data sets that could be used as substitutes for ordinary longitudinal
data sets to obtain within estimates of different parameters of interest.