Machine learning facilitated business intelligence (Part I)
Neural networks learning algorithms and applications
Waqar Ahmed Khan and S.H. Chung
Department of Industrial and Systems Engineering,
The Hong Kong Polytechnic University, Kowloon, Hong Kong
Muhammad Usman Awan
Institute of Quality and Technology Management,
University of the Punjab, Lahore, Pakistan, and
Xin Wen
Department of Industrial and Systems Engineering,
The Hong Kong Polytechnic University, Kowloon, Hong Kong
Abstract
Purpose – The purpose of this paper is to conduct a comprehensive review of the noteworthy contributions made in the area of the feedforward neural network (FNN) to improve its generalization performance and convergence rate (learning speed); to identify new research directions that will help researchers design new, simple and efficient algorithms and help users implement optimally designed FNNs for solving complex problems; and to explore the wide range of applications of the reviewed FNN algorithms in solving real-world management, engineering and health sciences problems, demonstrating the advantages of these algorithms in enhancing decision making for practical operations.
Design/methodology/approach – The FNN has gained much popularity during the last three decades. Therefore, the authors have focused on algorithms proposed during that period. The selected databases were searched with the popular keywords "generalization performance," "learning rate," "overfitting" and "fixed and cascade architecture." Combinations of the keywords were also used to obtain more relevant results. Duplicate articles in the databases, non-English articles, and articles that matched the keywords but were out of scope were discarded.
Findings – The authors studied a total of 80 articles and classified them into six categories according to the nature of the algorithms proposed in these articles, all of which aimed at improving the generalization performance and convergence rate of FNNs. Reviewing and discussing all six categories would make the paper too long. Therefore, the authors further divided the six categories into two parts (i.e. Part I and Part II). The current paper, Part I, investigates two categories that focus on learning algorithms (i.e. gradient learning algorithms for network training and gradient-free learning algorithms). The remaining four categories, which mainly explore optimization techniques, are reviewed in Part II (i.e. optimization algorithms for learning rate, bias and variance (underfitting and overfitting) minimization algorithms, constructive topology neural networks and metaheuristic search algorithms). For the sake of simplicity, the paper entitled "Machine learning facilitated business intelligence (Part II): Neural networks optimization techniques and applications" is referred to as Part II. This results in a division of the 80 articles into 38 for Part I and 42 for Part II. After discussing the FNN algorithms with their technical merits and limitations, along with real-world management, engineering and health sciences applications for each individual category, the authors suggest seven new future directions (three in Part I and the other four in Part II) which can contribute to strengthening the literature.
Research limitations/implications – The FNN contributions are numerous and cannot be covered in a single study. The authors remain focused on learning algorithms and optimization techniques, along with their application to real-world problems, that aim to improve the generalization performance and convergence rate of FNNs by computing optimal hyperparameters, connection weights and hidden units, selecting an appropriate network architecture rather than relying on trial-and-error approaches, and avoiding overfitting.
Industrial Management & Data Systems
Vol. 120 No. 1, 2020, pp. 164-195
© Emerald Publishing Limited
ISSN 0263-5577
DOI 10.1108/IMDS-07-2019-0361
Received 11 July 2019; Revised 11 October 2019; Accepted 28 October 2019
This work was supported by a grant from the Research Committee of The Hong Kong Polytechnic University under account code RLKA, and by the RGC (Hong Kong) GRF under project number PolyU 152131/17E.
Practical implications – This study will help researchers and practitioners to understand in depth the merits and limitations of existing FNN algorithms, the research gaps, the application areas and how the research has changed over the last three decades. Moreover, users who gain in-depth knowledge of how these algorithms are applied in the real world may select appropriate FNN algorithms to obtain optimal results in the shortest possible time, and with less effort, for their specific application problems.
Originality/value – Existing literature surveys are limited in scope: they compare algorithms within a single class, examine the application areas of algorithms or focus on specific techniques. This implies that the existing surveys study only particular algorithms or their applications (e.g. pruning algorithms, constructive algorithms, etc.). In this work, the authors propose a comprehensive review of the different categories of algorithms that may affect FNN generalization performance and convergence rate, along with their real-world applications. This makes the classification scheme novel and significant.
Keywords Data analytics, Machine learning, Learning algorithm, Feedforward neural network,
Industrial management
Paper type Research paper
1. Introduction
The widespread popularity of the feedforward neural network (FNN) for solving problems lies in its universal approximation capability (Ferrari and Stengel, 2005; Hornik et al., 1989; Huang, Chen and Siew, 2006). It can solve complex non-linear problems that are difficult to handle with classical statistical techniques (Kumar et al., 1995; Tkáč and Verner, 2016; Tu, 1996). The FNN's range of applications is wide; some areas include regression estimation (Chung et al., 2017; Deng et al., 2019; Kummong and Supratid, 2016; Teo et al., 2015), image processing (Dong et al., 2016; Mohamed Shakeel et al., 2019), image segmentation (Chen et al., 2018), video processing (Babaee et al., 2018), speech recognition (Abdel-Hamid et al., 2014), text classification (Kastrati et al., 2019; Zaghloul et al., 2009), face classification and recognition (Yin and Liu, 2018), human action recognition (Ijjina and Chalavadi, 2016), risk analysis (Nasir et al., 2019) and many others. Business intelligence makes use of data analytics techniques to generate useful information from high-dimensional data that may support better-informed decisions. Machine learning is gaining popularity in all aspects from data gathering to knowledge discovery, and its role in enhancing business decisions is attracting significant interest (Bottani et al., 2019; Hayashi et al., 2010; Kim et al., 2019; Lam et al., 2014; Li et al., 2018; Mori et al., 2012; Wang et al., 2005; Wong et al., 2018). This study explores the machine learning FNN and its application in facilitating business intelligence. Applying the FNN to diverse topics is not simple, and extensive knowledge is required to build an optimal network that achieves the intended results in the shortest possible time. In its simplest form, an FNN with a single hidden layer is powerful enough to solve many problems, given a sufficient number of hidden units in the layer (Nguyen and Widrow, 1990).
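As a concrete illustration of this simplest form, the following is a minimal sketch (not taken from the reviewed articles) of a single-hidden-layer FNN forward pass in NumPy; the layer sizes and the sigmoid activation are illustrative assumptions rather than recommendations:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation applied elementwise to the hidden units."""
    return 1.0 / (1.0 + np.exp(-z))

def fnn_forward(x, W1, b1, W2, b2):
    """Forward pass of a single-hidden-layer FNN.

    x  : input vector, shape (n_in,)
    W1 : input-to-hidden weights, shape (n_hidden, n_in)
    b1 : hidden biases, shape (n_hidden,)
    W2 : hidden-to-output weights, shape (n_out, n_hidden)
    b2 : output biases, shape (n_out,)
    """
    h = sigmoid(W1 @ x + b1)  # hidden-layer activations
    return W2 @ h + b2        # linear output layer

# Illustrative sizes: with enough hidden units, this architecture can
# approximate any continuous function on a compact domain.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 32, 1
W1 = rng.standard_normal((n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_out, n_hidden))
b2 = np.zeros(n_out)
y = fnn_forward(rng.standard_normal(n_in), W1, b1, W2, b2)
```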
The importance of the FNN grows every day owing to its ability to process large volumes of non-linear data in a manner analogous to the human brain. It can discover hidden patterns in data: raw data enter at the input layer and pass layer by layer in the forward direction until they arrive at the output. The model is trained to correctly estimate unseen data (also known as test data), an ability referred to as the generalization performance of the FNN. An ideal FNN achieves better generalization performance and requires less learning time (also known as the convergence rate) to train the model. Generalization performance can be defined as the ability of an algorithm to accurately predict values for previously unseen samples (Yeung et al., 2007), whereas learning time can be defined as how quickly the algorithm trains a model. Both generalization performance and learning time are key performance indicators for FNNs and are used by researchers to demonstrate the effectiveness of their proposed algorithms. The major drawbacks that influence FNN generalization performance and learning speed (time) are listed below (the training sketch after the list illustrates several of them):
(1) being trapped in a local minimum when the global minimum is far away;
(2) encountering saddle points;
(3) convergence slowing on plateau surfaces;
(4) network performance being affected by hyperparameter initialization and adjustment;
(5) reliance on trial-and-error approaches and expert involvement;
(6) repeated tuning of connection weights; and
(7) adjustment of hidden units and layers.
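To make several of these drawbacks concrete, below is a minimal batch gradient-descent training sketch (an illustrative assumption, not an algorithm from the reviewed articles; the network, loss and stopping threshold are all hypothetical choices). Items (1)-(3) show up as a shrinking gradient norm that stalls training, item (4) as sensitivity to the learning rate `eta` and the initial weights, and item (6) as the repeated weight updates inside the loop:

```python
import numpy as np

def train_gd(X, y, n_hidden=8, eta=0.1, epochs=1000, seed=0):
    """Batch gradient descent on a single-hidden-layer FNN with MSE loss.

    X : training inputs, shape (n_samples, n_in)
    y : training targets, shape (n_samples,)
    """
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # Hyperparameter initialization (drawback 4): results depend on it.
    W1 = rng.standard_normal((n_hidden, n_in)) * 0.5
    b1 = np.zeros(n_hidden)
    w2 = rng.standard_normal(n_hidden) * 0.5
    b2 = 0.0
    for epoch in range(epochs):
        # Forward pass.
        Z = X @ W1.T + b1             # pre-activations, (n_samples, n_hidden)
        H = 1.0 / (1.0 + np.exp(-Z))  # sigmoid hidden activations
        pred = H @ w2 + b2
        err = pred - y                # MSE gradient factor
        # Backward pass: chain rule through output and hidden layers.
        grad_w2 = H.T @ err / len(y)
        grad_b2 = err.mean()
        dH = np.outer(err, w2) * H * (1 - H)
        grad_W1 = dH.T @ X / len(y)
        grad_b1 = dH.mean(axis=0)
        # Repeated connection-weight tuning (drawback 6).
        W1 -= eta * grad_W1; b1 -= eta * grad_b1
        w2 -= eta * grad_w2; b2 -= eta * grad_b2
        # Near a local minimum, saddle point or plateau (drawbacks 1-3),
        # the gradient norm shrinks and convergence stalls.
        gnorm = np.sqrt((grad_W1**2).sum() + (grad_w2**2).sum())
        if gnorm < 1e-6:
            break
    return W1, b1, w2, b2
```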
These drawbacks can be avoided, and the FNN improved so that it can approximate any complex non-linear problem, by implementing suitable algorithms. The above drawbacks stem from several open questions (the configuration sketch after this list shows how these choices surface in practice):
(1) What should be the network size and depth, i.e. shallow or deep?
(2) How many hidden units should be generated by each hidden layer?
(3) How many hidden layers will be sufficient for deep learning?
(4) What should the network's initial connection weights and learning rate be?
(5) How should the hyperparameters be adjusted?
(6) What should be the size of the data set during network training?
(7) Which learning algorithm should be implemented?
(8) Which network topology is more efficient, i.e. fixed or cascade?
(9) What should be the criteria for increasing or decreasing the global and local
hyperparameters?
(10) What type of activation function should be used in hidden units?
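The sketch below is illustrative only: the class name, fields and default values are hypothetical assumptions, not recommendations from the reviewed literature. It shows how these design questions surface as explicit hyperparameters that must be fixed before training begins, which is precisely where trial-and-error effort accumulates:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FNNConfig:
    """Design choices from the questions above, expressed as hyperparameters."""
    hidden_layers: List[int] = field(default_factory=lambda: [64, 32])  # Q1-Q3: depth and width
    init_scale: float = 0.1        # Q4: scale of initial connection weights
    learning_rate: float = 0.01    # Q4-Q5, Q9: global rate, possibly adapted during training
    batch_size: int = 32           # Q6: how much data per training step
    algorithm: str = "sgd"         # Q7: e.g. a gradient or gradient-free learning algorithm
    topology: str = "fixed"        # Q8: "fixed" vs "cascade" (grown during training)
    activation: str = "sigmoid"    # Q10: hidden-unit activation function

# Trial-and-error search over such configurations is exactly what the
# algorithms reviewed in this paper aim to reduce or automate.
config = FNNConfig(hidden_layers=[128], learning_rate=0.001)
```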
In the literature, the answers to the above questions are not straightforward. Researchers have proposed several learning algorithms and optimization techniques that improve the FNN, with the main motivation of achieving better generalization performance in the shortest possible network training time. In the existing literature surveys, several authors have reviewed FNN algorithms by comparing different algorithms within the same class (for instance, comparisons of constructive algorithms on benchmark data), by studying an application area (for instance, business or engineering) or by surveying a specific class of methods (for instance, network ensembles). For instance, Zhang (2000) surveyed the development of neural networks for classification problems. The review examined the link between neural and conventional classifiers and demonstrated that neural networks are a competitive alternative to traditional classifiers. Other contributions include examining the issues of posterior probability estimation, feature selection and the trade-off between learning and generalization.
Hunter et al. (2012) performed a comparative study of different types of learning algorithms and network topologies so as to select a proper neural network size and architecture for better generalization performance. LeCun et al. (2015) reviewed deep learning and provided in-depth coverage of backpropagation, convolutional neural networks and recurrent neural networks; a strength of deep learning is that it requires little engineering by hand, and new learning algorithms are expected to accelerate its progress further. Tkáč and Verner (2016) provided a systematic review of neural network applications over two decades and found that the most common application areas included financial distress and bankruptcy. Cao et al. (2018) presented a survey on tuning-free random-weight neural networks from the perspective of deep learning: traditional iterative deep learning algorithms are far slower and suffer from the problem of local minima, and the survey suggests that combining traditional deep learning with tuning-free random-weight neural networks increases computing efficiency.