A Better Understanding of Granger Causality Analysis: A Big Data Environment

Author: Xiaojun Song, Abderrahim Taamouti
DOI: http://doi.org/10.1111/obes.12288
Date: 01 August 2019
Published date: 01 August 2019
©2018 The Department of Economics, University of Oxford and John Wiley & Sons Ltd.
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 81, 4 (2019) 0305–9049
doi: 10.1111/obes.12288
A Better Understanding of Granger Causality
Analysis: A Big Data Environment*
Xiaojun Song† and Abderrahim Taamouti
Department of Business Statistics and Econometrics, Guanghua School of Management
and Center for Statistical Science, Peking University, Beijing, 100871, China (e-mail:
sxj@gsm.pku.edu.cn)
Department of Economics and Finance, Durham University Business School, Mill Hill
Lane Durham, DH1 3LB, UK (e-mail: abderrahim.taamouti@durham.ac.uk)
Abstract
This paper aims to provide a better understanding of the causal structure in a multivariate
time series by introducing several statistical procedures for testing indirect and spurious
causal effects. In practice, detecting these effects is a complicated task, since the auxiliary
variables that transmit/induce indirect/spurious causality are very often unknown. The
availability of hundreds of economic variables makes this task even more difficult since it
is generally infeasible to find the appropriate auxiliary variables among all the available
ones. In addition, including hundreds of variables and their lags in a regression equation
is technically difficult. The paper proposes several statistical procedures to test for the
presence of indirect/spurious causality based on big data analysis. Furthermore, it suggests
an identification procedure to find the variables that transmit/induce the indirect/spurious
causality. Finally, it provides an empirical application where 135 economic variables were
used to study a possible indirect causality from money/credit to income.
JEL Classification numbers: C12; C32; C38; C53; E60.
*We thank Anindya Banerjee (Editor) and two anonymous referees for several useful comments. We also thank
Han Hong and the participants at the Second Conference on High-Dimensional Statistics in the Age of Big Data
in Beijing, 2017 Asian Meeting of the Econometric Society in Hong Kong and IAAE 2017 Annual Conference in
Sapporo (Japan) for their very useful comments. Special thanks to Raffaella Giacomini for her discussion which
helped us write this paper. All errors are our own. Financial support from the National Natural Science Foundation
of China (Grant No. 71532001) is acknowledged.
I. Introduction
The concept of causality introduced by Wiener (1956) and Granger (1969) constitutes a
basic notion for analysing dynamic relationships between time series. In studying
Wiener–Granger causality, predictability is the central issue, hence its importance to economists
and policymakers. In practice, Granger-causality is often investigated for bivariate
processes. However, different conclusions may be reached when more than two variables are
considered. If more than two variables are present, non-causality conditions become more
complicated; see e.g. Lütkepohl (1993) and Dufour and Renault (1998). In other words,
even if a variable is Granger-causal in a bivariate model, it may not be Granger-causal in
a larger model involving more variables. In this case, we talk about an indirect causality
transmitted through third variable(s), hereafter referred to as auxiliary variable(s). For
instance, there may be a variable that drives both variables in the bivariate process, such that
when this variable is included in the model, a bivariate causal structure may disappear.
In turn, it is also possible that a variable is non-causal for another one in a bivariate model
and becomes causal if the information set is extended to include other variables as well.
The latter situation corresponds to what is known as spurious causality. Ignoring these
causal effects can lead to wrong economic analysis, and consequently to inaccurate policy
decisions. In this paper, we borrow from Hsiao (1982) and the literature on factor analysis
to introduce statistical procedures that help us detect indirect and spurious causal effects.
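
The first scenario above, in which a variable drives both series of a bivariate process, can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the testing procedure proposed in this paper: the data-generating process, the lag order `p = 2`, and the helper names `lagmat` and `granger_f` are hypothetical choices made for illustration, and the statistic is the textbook F statistic for excluding the lags of one variable from a least-squares regression.

```python
import numpy as np


def lagmat(series, p):
    """Return lags 1..p of a 1-D series as columns, aligned with series[p:]."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])


def granger_f(y, x, controls=(), p=2):
    """F statistic for H0: lags of x do not help predict y,
    given p lags of y and p lags of each control series."""
    target = y[p:]
    restricted = np.column_stack(
        [np.ones(len(target)), lagmat(y, p)] + [lagmat(c, p) for c in controls]
    )
    unrestricted = np.column_stack([restricted, lagmat(x, p)])

    def rss(design):
        # Residual sum of squares from an ordinary least-squares fit.
        coef = np.linalg.lstsq(design, target, rcond=None)[0]
        return np.sum((target - design @ coef) ** 2)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)


# Assumed data-generating process: z drives x with one lag and y with two lags.
# There is no direct link from x to y.
rng = np.random.default_rng(0)
T = 2000
e, u, v = rng.standard_normal((3, T))
z = np.zeros(T)
for t in range(1, T):
    z[t] = 0.5 * z[t - 1] + e[t]
x = np.zeros(T)
y = np.zeros(T)
x[1:] = z[:-1] + u[1:]
y[2:] = z[:-2] + v[2:]

# Bivariate information set (y, x): x appears to Granger-cause y.
f_bivariate = granger_f(y, x, p=2)
# Trivariate information set (y, x, z): the apparent effect of x vanishes.
f_trivariate = granger_f(y, x, controls=[z], p=2)

print(f"F statistic, bivariate model : {f_bivariate:.2f}")
print(f"F statistic, trivariate model: {f_trivariate:.2f}")
```

In this design the lagged `x` is a noisy proxy for the lagged `z`, so it is highly informative about `y` in the bivariate regression and uninformative once the lags of `z` are included. The sketch presumes that the auxiliary variable `z` is known and observed, which is exactly what is usually unavailable in practice.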
The literature on Granger causality analysis is extensive and many tests and measures
have been introduced to detect and quantify both linear and nonlinear Granger causality;
for a review, see Dufour and Taamouti (2010), Bouezmarni, Rombouts and Taamouti (2012),
and Song and Taamouti (2018). The original definition of Granger (1969) that has been
adopted in this literature implicitly assumes that all the relevant information is available
and used for the causality analysis. However, in practice, only very limited information is
considered and the omission of key variables (auxiliary variables) could lead to a spurious
causality or might not help detect a possible indirect causality between the variables of
interest. The relevance of the information set for Granger causality analysis was first pointed
out by Hsiao (1982) (see also Eichler, 2007, 2012), who formally introduced the concept of
indirect/spurious causality in a trivariate model. Hsiao (1982) provides a basic framework
to explain the causal relationships in a multivariate time series model based on the
Wiener–Granger notion of causality. He focuses on establishing a Granger causal ordering of the
events and on the reconciliation of the disparity between the results obtained from the
bivariate and multivariate analysis. He generalizes Granger’s concept of causality to make
some provision for spurious/indirect causality which may arise in multivariate analysis. In
particular, he shows that a certain type of spurious causality vanishes when the information
set is reduced. This observation leads to a strengthened definition of (direct) causality by
requiring an improvement in prediction irrespective of the used information set. Finally,
Hsiao (1982) characterizes the indirect/spurious causality in the context of VAR models
and discusses how to test these causal effects in the presence of known auxiliary variables.
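
To fix ideas, the distinction between direct, indirect and spurious causality can be written down for a generic trivariate VAR(p) with a known auxiliary variable. The notation below (the series y_t, x_t, z_t, the coefficients a_k, b_k, c_k, and their bivariate counterparts with tildes) is assumed here purely for illustration; it is not taken from Hsiao (1982), and it covers only one-step-ahead, linear causality.

```latex
% Equation for y_t in a trivariate VAR(p) (illustrative notation)
y_t = \mu + \sum_{k=1}^{p} a_k\, y_{t-k} + \sum_{k=1}^{p} b_k\, x_{t-k}
          + \sum_{k=1}^{p} c_k\, z_{t-k} + \varepsilon_t .

% No direct (one-step-ahead) causality from x to y, given the past of z:
H_0^{\mathrm{direct}} :\; b_1 = \cdots = b_p = 0 .

% Linear projection of y_t on its own past and the past of x only
% (the bivariate information set; it may require infinitely many lags):
y_t = \tilde{\mu} + \sum_{k \ge 1} \tilde{a}_k\, y_{t-k}
                  + \sum_{k \ge 1} \tilde{b}_k\, x_{t-k} + \tilde{\varepsilon}_t .
```

In this notation, the indirect case described in the Introduction corresponds to some nonzero bivariate coefficient on lagged x while all b_k are zero, and the spurious case corresponds to all bivariate coefficients on lagged x being zero while some b_k is nonzero. When z_t is known and observed, the restriction on the b_k can be examined with a standard exclusion test in the regression for y_t.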
It is worth mentioning that indirect/spurious causality might be linked to the omitted
variables bias problem. In the context of the vector moving average model, Sims (1980)
points out that the Granger causal relations may appear in the model because of the omitted
variables problem. Furthermore, Lütkepohl (1982) shows that on the one hand
Granger-causality in a bivariate system may be due to omission of relevant variables, and on the other
hand non-causality in a bivariate system may theoretically result from neglected variables.
For Lütkepohl (1982), the structure of the causal relation between the variables of interest
can only be obtained by including all relevant variables in the model. He adds that ‘since
many economic variables are important in the sense that they interact, high-dimensional
time series model-building seems to be required’, but he also recognizes that the latter ‘does
not seem to be an easy task’. This paper aims to use big data analysis techniques to propose
statistical procedures that help to test for the presence of indirect/spurious causality.