Dynamic Factor Models with Jagged Edge Panel Data: Taking on Board the Dynamics of the Idiosyncratic Components

Author: Francisco Dias, António Rua, Maximiano Pinheiro
DOI: http://doi.org/10.1111/obes.12006
Date: 01 February 2013
Published date: 01 February 2013
©Blackwell Publishing Ltd and the Department of Economics, University of Oxford 2012. Published by Blackwell Publishing Ltd,
9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 75, 1 (2013) 0305-9049
doi: 10.1111/obes.12006
Maximiano Pinheiro*†, António Rua* and Francisco Dias*
*Banco de Portugal, Avenida Almirante Reis no. 71, 1150-165 Lisboa, Portugal (emails:
mpinheiro@bportugal.pt; antonio.rua@bportugal.pt; fadias@bportugal.pt)
†ISEG, Technical University of Lisbon, Rua do Quelhas no. 6, 1200-781 Lisboa, Portugal
Abstract
As macroeconomic data are released with different delays, one has to handle unbalanced
panel data sets with missing values at the end of the sample period when estimating
dynamic factor models. We propose an EM algorithm which copes with such data sets
while accounting for autoregressive common factors and allowing for serial correlation in
the idiosyncratic components. Based on Monte Carlo simulations, we find that taking on
board the dynamics of the idiosyncratic components significantly improves the accuracy
of the estimation of both the missing values and the common factors at the end of the
sample period.
I. Introduction
The literature on dynamic factor models in economics and finance goes back to Geweke
(1977), Sargent and Sims (1977), Geweke and Singleton (1981) and Watson and Engle
(1983). In a factor model, the data generating process of each variable is the sum of a
common component, driven by a small number of latent common factors, and an idiosyncratic
component. In the classical formulation, the idiosyncratic components are cross-sectionally
and serially independent and also uncorrelated with the common factors. In addition,
the common factors are generated by a finite order vector autoregression. For a fixed
cross-sectional dimension, the model can be consistently estimated by Gaussian maximum
likelihood. In the early literature, the analysis was limited to panels with a small
number of variables and the model was estimated by maximum likelihood using either
frequency or time domain approaches.
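As a compact restatement of the classical formulation described above, the model can be written as follows. The notation is an assumption introduced here for illustration (x_{it} for variable i at time t, \lambda_i for its loadings, f_t for the r common factors, e_{it} for the idiosyncratic component, A_1, ..., A_p for the autoregressive matrices); it is not taken from the paper.

```latex
\begin{aligned}
x_{it} &= \lambda_i' f_t + e_{it}, \qquad i = 1,\dots,N,\quad t = 1,\dots,T, \\
f_t    &= A_1 f_{t-1} + \dots + A_p f_{t-p} + u_t, \\
e_{it} &\ \text{independent of}\ e_{js}\ \text{for}\ (i,t) \neq (j,s), \qquad
\operatorname{E}[f_t\, e_{is}] = 0 \ \text{for all}\ i, t, s.
\end{aligned}
```

The first line splits each variable into a common component and an idiosyncratic component, the second gives the finite order vector autoregression for the factors, and the third collects the classical restrictions on the idiosyncratic components.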
JEL Classification numbers: C32, C33, C53.
In the context of growing data availability, the existence of large panel data sets led
to the development of a non-parametric estimation approach based on least squares. The
resulting principal components estimator avoided the feasibility issues and the increased
technical complexity of the maximum likelihood estimator when dealing with large
cross-sections. Connor and Korajczyk (1986, 1988, 1993) discussed the consistency of the
principal components estimator when the number of variables tends to infinity and the
time dimension remains fixed. When both panel dimensions tend to infinity, Stock and
Watson (1998, 2002b), Bai and Ng (2002), Bai (2003) and Amengual and Watson (2007)
have shown that, under slightly different sets of assumptions regarding the data generating
processes of the factors and of the idiosyncratic components, the first principal components
span the factor space, even if there is some heteroskedasticity and limited dependence of
the idiosyncratic components in both dimensions, as well as moderate correlation between
the latter and the factors. Related work includes Forni and Reichlin (1998), Forni and Lippi
(2001), Forni et al. (2000, 2004, 2005), using frequency domain methods.
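The principal components estimator discussed above can be sketched in a few lines for a balanced panel. This is a minimal illustration, not the procedure of any of the cited papers: the function name `pc_factors`, the normalisation F'F/T = I, the standardisation step and the simulated data are all assumptions made for the example.

```python
import numpy as np

def pc_factors(X, r):
    """Principal components estimate of r factors from a balanced T x N panel.

    Hypothetical helper for illustration only.
    """
    # Standardise each series (a common, but not mandatory, preliminary step)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    T = Z.shape[0]
    # Singular value decomposition Z = U S V'
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    # Estimated factors, normalised so that F'F / T is the identity
    F = np.sqrt(T) * U[:, :r]
    # Loadings obtained by least squares of Z on F (N x r)
    Lam = Z.T @ F / T
    return F, Lam

# Simulated panel with two factors (purely illustrative numbers)
rng = np.random.default_rng(0)
T, N, r = 200, 50, 2
F_true = rng.standard_normal((T, r))
Lam_true = rng.standard_normal((N, r))
X = F_true @ Lam_true.T + 0.5 * rng.standard_normal((T, N))

F_hat, Lam_hat = pc_factors(X, r)

# Factors are identified only up to rotation, so compare spaces:
# regress each true factor on the estimated factors and compute an R-squared.
coef, *_ = np.linalg.lstsq(F_hat, F_true, rcond=None)
resid = F_true - F_hat @ coef
r2 = 1 - resid.var(axis=0) / F_true.var(axis=0)
print(r2)
```

Because the factors are identified only up to an invertible transformation, the check compares the space spanned by the estimates with the space spanned by the simulated factors rather than the factors themselves.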
Doz, Giannone and Reichlin (2012) reconciled the classical factor model estimated
by Gaussian maximum likelihood with the strand of literature on factor models for large
cross-sections. In a quasi-maximum likelihood approach (in the sense of White, 1982),
they treat the classical model as a possibly misspecified model which is used for
estimation purposes, henceforth the ‘estimation model’. Imposing the classical assumptions
on the estimation model makes Gaussian maximum likelihood estimation feasible for
large cross-sections. They show that the factor space is estimated consistently when both
panel dimensions tend to infinity even if the underlying data set is generated by a model
with heteroskedastic and serially correlated idiosyncratic components. More recently, the
estimation model has been generalized to allow for serially correlated idiosyncratic
components (Jungbacker and Koopman, 2008; Reis and Watson, 2010; Banbura and Modugno,
2010; among others).
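One simple way to picture serially correlated idiosyncratic components is a first-order autoregression for each variable. The AR(1) form and the symbols below (e_{it} for the idiosyncratic component of variable i at time t, \rho_i for its autoregressive coefficient, \varepsilon_{it} for the innovation) are assumptions chosen for illustration; the cited papers may use different orders or parameterisations.

```latex
e_{it} = \rho_i\, e_{i,t-1} + \varepsilon_{it}, \qquad |\rho_i| < 1, \qquad
\varepsilon_{it} \sim \text{i.i.d. } (0, \sigma_i^2).
```

Setting \rho_i = 0 for every i recovers the serially independent idiosyncratic components of the classical formulation.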
In practice, macroeconomic data become available with different delays, that is, one
has to handle unsynchronized data releases for a large number of variables. In fact, if one
waited until all data were available, it would take a few months to
estimate the factors for the current period. The staggered release of information results in
an unbalanced panel data set with missing values located at the end of the sample period. The
presence of missing values at the end of the sample is by and large the most practically
relevant issue for macroeconomic forecasting, nowcasting and policy analysis. Typically,
for data of the same frequency, there are no missing values in the middle of the sample,
whereas if they are located at the beginning, one can always shorten the sample and still
have long time series in most cases. In light of this, the jagged edge panel data feature
is clearly the most challenging feature that one has to deal with. Giannone, Reichlin and
Small (2008) address this issue in the framework of a dynamic factor model and a large
cross-section. They refer to panels with this specific unbalanced feature as having a jagged
edge across the most recent periods of the sample. Other authors refer to this problem
as ragged edge data (see e.g. Wallis, 1986, and more recently Schumacher and Breitung,
2008; Marcellino and Schumacher, 2010; Kuzin, Marcellino and Schumacher, 2011).
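To make the jagged edge pattern concrete, the sketch below takes a complete panel and blanks out the most recent observations of each series according to a publication delay. The helper `jagged_edge`, the delays and the panel dimensions are hypothetical and serve only to illustrate the data structure described above.

```python
import numpy as np

def jagged_edge(X, delays):
    """Return a copy of the T x N panel X in which the last delays[i]
    observations of series i are set to NaN (hypothetical helper)."""
    X = np.array(X, dtype=float)  # np.array copies, so the input is untouched
    T = X.shape[0]
    for i, d in enumerate(delays):
        if d > 0:
            X[T - d:, i] = np.nan
    return X

rng = np.random.default_rng(1)
X_full = rng.standard_normal((8, 4))  # 8 periods, 4 variables
delays = [0, 1, 2, 3]                 # assumed publication lags, in periods

X_jag = jagged_edge(X_full, delays)
missing = np.isnan(X_jag)

# Index of the last available observation of each series
last_obs = (~missing).cumsum(axis=0).argmax(axis=0)
# Length of the largest balanced panel that discards the jagged edge
T_bal = last_obs.min() + 1

print(missing.astype(int))
print(T_bal)
```

Truncating the sample at `T_bal` yields a balanced panel, but only at the cost of discarding the most recent observations of the timelier series.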
The estimation model considered by Giannone et al. (2008) is a dynamic factor model
with cross-sectionally orthogonal, white noise idiosyncratic components.¹ As
mentioned above, misspecifying the autocorrelation of the idiosyncratic components does
not jeopardize the consistent estimation of the factor space, but consistency is not the
only issue at stake. A more accurate estimation of factors at the end of the sample is key
¹ They do not estimate the model by maximum likelihood. Instead, they use the two-step estimator based on Kalman
filtering suggested by Doz, Giannone and Reichlin (2007).