Using deep learning to interpolate the missing data in time-series for credit risks along supply chain

DOI: https://doi.org/10.1108/IMDS-08-2022-0468
Published date: 27 February 2023
Date: 27 February 2023
Pages: 1401-1417
Subject Matter: Information & knowledge management, Information systems, Data management systems, Knowledge management, Knowledge sharing, Management science & operations, Supply chain management, Supply chain information systems, Logistics, Quality management/systems
Author: Wenfeng Zhang, Ming K. Lim, Mei Yang, Xingzhi Li, Du Ni
Wenfeng Zhang
School of General Education, Chongqing Polytechnic Institute,
Chongqing, PR China and
Chongqing University, Chongqing, China
Ming K. Lim
Adam Smith Business School, University of Glasgow, Glasgow, UK
Mei Yang
Chongqing University, Chongqing, China
Xingzhi Li
Chongqing Jiaotong University, Chongqing, China, and
Du Ni
School of Management, Nanjing University of Posts and Telecommunications,
Nanjing, China
Abstract
Purpose: As the supply chain is a highly integrated infrastructure in modern business, risks in the supply
chain are becoming highly contagious for the target company. This motivates researchers to
continuously add new features to the datasets for credit risk prediction (CRP). However, adding new
features can easily lead to missing data.
Design/methodology/approach: Based on the gaps summarized from the literature in CRP, this study first
introduces the approaches to the building of datasets and the framing of the algorithmic models. Then, this study tests
the interpolation effects of the algorithmic model in three artificial datasets with different missing rates and compares
its predictability before and after the interpolation in a real dataset with missing data in irregular time-series.
Findings: The algorithmic model of the time-decayed long short-term memory (TD-LSTM) proposed in this study
can monitor the missing data in irregular time-series by capturing more and better time-series information and
interpolating the missing data efficiently. Moreover, the algorithmic model of the Deep Neural Network can be used in
the CRP for datasets with missing data in irregular time-series after the interpolation by the TD-LSTM.
Originality/value: This study fully validates the TD-LSTM interpolation effects and demonstrates that the
predictability of the dataset after interpolation is improved. Accurate and timely CRP can assist a
target company in avoiding losses. Identifying credit risks and taking preventive measures ahead of time,
especially in the case of public emergencies, can help the company minimize losses.
Keywords: Deep learning, Credit risk prediction, Interpolation, Missing data in irregular time-series, Supply chain
Paper type: Research paper
Received 3 August 2022
Revised 10 December 2022, 18 January 2023
Accepted 25 January 2023
Industrial Management & Data Systems, Vol. 123 No. 5, 2023, pp. 1401-1417
© Emerald Publishing Limited, 0263-5577
DOI 10.1108/IMDS-08-2022-0468
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/0263-5577.htm
Declaration: This work was supported by the 2022 Scientific Research Startup Fund of Chongqing
Jiaotong University [Grant No. F1210045]. The authors declare no competing interests.
1. Introduction
Credit risk prediction (CRP) refers to the process of predicting whether a company will default
in the future from the company's historical data (Bu et al., 2018; Kousenidis et al., 2019; Ma
and Lv, 2019). Accurate and timely CRP can help companies avoid losses. Early indicators for
CRP are obtained mainly from the financial reports of the target companies (Tian et al., 2020;
Trustorff et al., 2011; Wu and Brynjolfsson, 2015), but since the global subprime mortgage crisis
in 2008, it has been questioned whether methods that rely solely on financial data from one
company for CRP (Trustorff et al., 2011) are timely (Savitri et al., 2019) and reliable (Ashraf et al.,
2020; Holder-Webb and Sharma, 2010). Researchers have therefore turned to information mined from
data indirectly related to the company (Bakoben et al., 2020; Zhang et al., 2022). Supply chain
data, although only indirectly related to the company, play as important a role in CRP as direct
data, because the supply chain is tightly integrated and corporate credit risk can easily spread
along it, leading to a chain of credit disasters (Agca et al., 2022; Osadchiy et al., 2016).
In essence, most of the data used in CRP are time-series with historical information
(Chang et al., 2018; Tian et al., 2020), and these data can be collected from different sources.
However, some CRP indicators are collected at a fixed sampling frequency, while
others are not (Kreindler and Lumsden, 2016; Mikalsen et al., 2018). As a result, it is almost
impossible to align all the indicators with one another when the data from different sources are not
collected at a fixed sampling frequency for the same company. Worse still, missing data
break the continuity of the time-series, which seriously affects
subsequent data analysis and processing in CRP (Tian et al., 2018). Interpolation (Sun et al.,
2021; Wubetie, 2017) is the most common way of dealing with missing data. Recently,
machine learning models such as Deep
Neural Networks (DNN), Generative Adversarial Networks (GAN) and Long Short-Term
Memory (LSTM) have been employed to interpolate missing data in time-series (Ma et al., 2020; Yoon et al., 2019), but for
irregular time-series, few of these algorithms take the missing data into consideration (Pratama
et al., 2016; Wang et al., 2020). As a result, treating the irregular sampling times as an independent
time function during interpolation requires more consideration.
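To make the role of irregular sampling concrete, the following minimal sketch fills a missing value by decaying the last observation toward the series mean as the time gap grows. It is an illustrative heuristic only and is not the TD-LSTM proposed in this study; the function name `time_decay_impute`, the decay constant `tau` and the sample data are assumptions introduced for this example.

```python
import math

def time_decay_impute(times, values, tau=30.0):
    """Fill None entries by decaying the last observation toward the series mean.

    times  : increasing timestamps (e.g. days), possibly irregularly spaced
    values : observations aligned with times; None marks a missing value
    tau    : hypothetical decay constant, in the same unit as times
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    mean = sum(observed) / len(observed)

    filled = []
    last_value, last_time = None, None
    for t, v in zip(times, values):
        if v is not None:
            # Observed point: keep it and remember when it was seen.
            last_value, last_time = v, t
            filled.append(v)
        elif last_value is None:
            # Nothing observed yet: fall back to the mean.
            filled.append(mean)
        else:
            # The weight of the last observation shrinks as the gap grows.
            gamma = math.exp(-(t - last_time) / tau)
            filled.append(gamma * last_value + (1.0 - gamma) * mean)
    return filled

# Illustrative data: a short gap at t=1 and a long gap at t=40.
print(time_decay_impute([0, 1, 2, 40], [10.0, None, 14.0, None]))
```

With a short gap the filled value stays close to the last observation, while with a long gap it drifts toward the mean. This is one simple way of letting the elapsed time, rather than the position in the sequence, drive the interpolation.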
To improve the interpolation effects, this study first summarizes the relevant literature on
CRP and on the interpolation of missing data in time-series, and then illustrates the
approaches to the building of datasets and the framing of an algorithmic model that can
interpolate the missing data in irregular time-series with high predictability. Finally, this
study tests the interpolation effects of the algorithmic model in artificial datasets with
different missing rates and compares its predictability before and after the interpolation in a
real dataset with missing data in irregular time-series.
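The evaluation idea described above (hide known values at a chosen missing rate, interpolate, then score the result on the hidden positions) can be sketched as follows. This is a minimal illustration under stated assumptions: the synthetic signal, the missing rates of 20%, 50% and 80%, the linear interpolation used as a stand-in imputer, and all function names are hypothetical and are not taken from the study.

```python
import math
import random

def mask_at_rate(values, missing_rate, seed=0):
    """Return a copy of values with about missing_rate of the interior entries set to None."""
    rng = random.Random(seed)
    interior = list(range(1, len(values) - 1))  # keep endpoints so interpolation is defined
    n_missing = round(missing_rate * len(interior))
    hidden = set(rng.sample(interior, n_missing))
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    return masked, sorted(hidden)

def linear_time_interpolate(times, values):
    """Fill None entries by linear interpolation over the actual timestamps."""
    known = [i for i, v in enumerate(values) if v is not None]
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            left = max(k for k in known if k < i)
            right = min(k for k in known if k > i)
            w = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled

def rmse_on_hidden(truth, filled, hidden):
    """Root-mean-square error restricted to the positions that were hidden."""
    if not hidden:
        return 0.0
    return math.sqrt(sum((truth[i] - filled[i]) ** 2 for i in hidden) / len(hidden))

# Irregular timestamps and a smooth synthetic signal (illustrative data only).
times = [0, 1, 3, 4, 8, 9, 12, 15, 16, 20, 23, 24]
truth = [math.sin(t / 5.0) for t in times]

for rate in (0.2, 0.5, 0.8):
    masked, hidden = mask_at_rate(truth, rate, seed=42)
    filled = linear_time_interpolate(times, masked)
    print(f"missing rate {rate:.0%}: RMSE = {rmse_on_hidden(truth, filled, hidden):.4f}")
```

Because the ground truth is known at the hidden positions, any imputer can be swapped in for `linear_time_interpolate` and compared under the same masks.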
2. Literature review
This section will review the literature closely related to this study from two aspects: the
previous studies in CRP and the methods for interpolating the missing data in time-series.
2.1 Credit risk prediction
Credit risk reflects a company's solvency and its likelihood of default, but it has
always been difficult to predict credit risk (Chi and Li, 2017; Scannella and Polizzi, 2021)
with proper timeliness (Savitri et al., 2019) and reliability (Ashraf et al., 2020; Holder-Webb and
Sharma, 2010). For this reason, many scholars have made great efforts to develop powerful
algorithms for better CRP. In general, CRP mainly involves two types of algorithms (Chi and Li,
2017): linear discriminant analysis based on statistics (Mahmoudi and Duman,
2015) and machine learning algorithms based on computer programs (Ma and Lv, 2019;
Yu et al., 2022). Unlike statistical methods, which require researchers to manually estimate the
parameters for CRP, machine learning algorithms allow computers to parse the data
and extract useful knowledge from the data by themselves (Bhatore et al., 2020;
Pandey et al., 2017). Of the machine learning models, the most commonly used
include support vector machines (SVM), neural networks (NN) and random forests (RF) (Arora
and Kaur, 2020; Teles et al., 2020; Trustorff et al., 2011). In addition, various variants of deep
learning, such as deep belief networks (DBN) and DNN, have gained widespread popularity in CRP