An approach for fault prediction in SOA-based systems using machine learning techniques

Publication Date 03 September 2019
Author Guru Prasad Bhandari, Ratneshwer Gupta, Satyanshu Kumar Upadhyay
Subject Library & information science
Guru Prasad Bhandari
DST-CIMS, Banaras Hindu University, Varanasi, India
Ratneshwer Gupta
School of Computer and Systems Sciences,
Jawaharlal Nehru University, New Delhi, India, and
Satyanshu Kumar Upadhyay
DST-CIMS, Banaras Hindu University, Varanasi, India
Purpose Software fault prediction is an important concept that can be applied at an early stage of
the software life cycle. Effective prediction of faults may improve the reliability and testability of software
systems. As service-oriented architecture (SOA)-based systems become more complex, the
interactions between participating services increase, and the component services may generate
enormous volumes of reports and fault information. Although considerable research has focused on developing
fault-proneness prediction models in service-oriented systems (SOS) using machine learning (ML) techniques,
there has been little work on assessing how effective source code metrics are for fault prediction.
The paper aims to discuss this issue.
Design/methodology/approach In this paper, the authors have proposed a fault prediction framework
to investigate fault prediction in SOS using metrics of web services. The effectiveness of the model has been
explored by applying six ML techniques, namely, Naïve Bayes, Artificial Neural Networks (ANN), Adaptive
Boosting (AdaBoost), decision tree, Random Forests and Support Vector Machine (SVM), along with five
feature selection techniques to extract the essential metrics. The authors have used accuracy,
precision, recall, f-measure and the area under the receiver operating characteristic curve (AUC) as
performance measures.
Findings The experimental results show that the proposed system can classify the fault-proneness of web
services, whether the service is faulty or non-faulty, as a binary-valued output automatically and effectively.
Research limitations/implications One possible threat to internal validity in the study is the unknown
effect of undiscovered faults. Specifically, the authors have injected possible faults into the classes using
the Jaca C3.0 tool, and only fixed faults are injected into the classes. However, considering the Jaca C3.0
community of development, testing and use, the authors can generalize that the undiscovered faults should be
few and have little impact on the results presented in this study. The results may also be limited to the
complexity metrics investigated and the ML techniques used.
Originality/value In the literature, only a few studies concentrate directly on
metrics-based fault-proneness prediction of SOS using ML techniques. Most of the contributions
concern fault prediction for general systems rather than SOS. A majority of them have considered
reliability, changeability and maintainability using a logging/history-based approach and mathematical
modeling rather than fault prediction in SOS using metrics. Thus, the authors have extended the above
contributions further by applying supervised ML techniques to web services metrics and measuring their
capability by employing fault injection methods.
Keywords Fault, Fault prediction, Service-oriented systems, Web services, Metrics, Fault injection,
Machine learning
Paper type Research paper
Data Technologies and
Vol. 53 No. 4, 2019
pp. 397-421
© Emerald Publishing Limited
DOI 10.1108/DTA-03-2019-0040
Received 15 March 2019
Revised 4 May 2019
Accepted 10 May 2019
1. Introduction
Service-oriented architecture (SOA)-based systems (SOSs) are gaining popularity because
they are composed of loosely coupled services. The quality of an SOS depends
on the quality of individual services and how well these services interact with each other in
order to form a system. By predicting the fault-proneness of software services at an early
stage of development, the quality of the system can be enhanced. Such prediction can support
decisions at various stages, such as the testing and maintenance phases.
Many fault prediction models have been proposed in the literature. Source code
metrics-based fault-proneness prediction models are very effective and used by many
researchers. The advantage of these specific models is that they can be easily
implemented and understood by software engineering experts and developers. In addition,
they can provide valuable and simple insights on why a specific class (module) is
classified as fault-prone by indicating which metrics have problematic values and need to
be adjusted (Hu et al., 2008). Predicting faults in procedural and object-oriented (OO)
software by utilizing source code metrics is a common area that has attracted several
researchers' attention. However, predicting faults in SOS in terms of defects in the Web Services
Description Language (WSDL) interface, using metrics of the source code implementing the
services, is a relatively less explored area.
A fault is an incorrect functioning or inability of an entity of the system. It can be
an inherent weakness of the design or implementation and can lead the system to failure
(Avižienis et al., 2004). A failure, on the other hand, is a state or condition of not meeting
a desired or intended objective, i.e. a service stops performing its required operation
(Khan and Yairi, 2018) when a fault occurs.
Web service code is an important source for diagnosing system and service faults.
However, as the size of the system increases, it becomes difficult to extract useful information from
the system effectively and efficiently and to locate the fault accurately. Manual extraction of
useful information from vast amounts of data using conventional techniques would
seriously delay fault recovery. Thus, an automatic fault diagnosis
system is required to enhance performance.
A number of review works (Abaei and Selamat, 2014; Shatnawi et al., 2010; Wong et al.,
2016; Hall et al., 2012) have explored the utilization of machine learning (ML) in software
engineering as a mature technique using well-known supervised learning algorithms.
Various supervised ML-based techniques have been investigated in several fields of
systems and software engineering, such as software fault prediction (Li et al., 2017; Shatnawi et al.,
2010), effort estimation (Moeyersoms et al., 2015; Baskeles et al., 2007) and software quality
(Ding et al., 2016). Many previous studies, such as Kumar et al. (2018), Ding et al. (2016),
Xu et al. (2014) and Rosá et al. (2017), aimed to incorporate fault prediction in systems,
pointing to the reliability and maintainability problems of SOS.
In this paper, we propose a methodology to predict faults in the WSDL files of web
services using experimentally generated data sets from the benchmarked systems. This
model can provide detailed and helpful information for developers to diagnose faulty
services. Unlike prior works, we use a fault injection tool (Jaca C3.0; Lúcia et al., 2003) to
inject faults into web services artificially. Several ML techniques (Naïve Bayes (NB),
artificial neural networks (ANN), Adaptive Boosting (AdaBoost), decision tree (DT), random
forest (RF) and support vector machine (SVM)) are applied, and their performance measures
are calculated to observe and compare their effectiveness. Mateos and Zunino (Coscia et al., 2012)
have suggested that there is a statistically significant correlation between OO metrics and
WSDL-related metrics that can measure complexity and quality at the service level
in WSDL files. We have therefore used the Chidamber and Kemerer (CK) metrics suite, which is based on
OO metrics. The WSDL files of web services are converted into Java class files, and 21
source code metrics are extracted according to CK (Chidamber and Kemerer, 1992) using the
CKJM-extended tool (Jureczko and Spinellis, 2010).
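The binary classification step described above can be illustrated with a minimal sketch. It assumes a hand-written Gaussian Naïve Bayes (one of the six techniques named in the paper, implemented here from scratch rather than with the authors' tooling) and a tiny invented data set that uses only three CK metrics (WMC, CBO, RFC) instead of the 21 metrics extracted in the study. All values, names and functions below are hypothetical.

```python
import math
from statistics import mean, pvariance

# Hypothetical toy data: each row is (WMC, CBO, RFC) for one Java class
# generated from a WSDL file; label 1 = faulty, 0 = non-faulty.
# The numbers are invented for illustration only.
TRAIN = [
    ((5.0, 2.0, 12.0), 0),
    ((7.0, 3.0, 15.0), 0),
    ((6.0, 2.0, 14.0), 0),
    ((21.0, 9.0, 48.0), 1),
    ((18.0, 8.0, 40.0), 1),
    ((25.0, 11.0, 55.0), 1),
]


def fit_gaussian_nb(rows):
    """Estimate the prior, per-feature mean and variance for each class."""
    model = {}
    for label in {y for _, y in rows}:
        vectors = [x for x, y in rows if y == label]
        columns = list(zip(*vectors))
        model[label] = {
            "prior": len(vectors) / len(rows),
            "means": [mean(c) for c in columns],
            # A small constant avoids division by zero for constant features.
            "vars": [pvariance(c) + 1e-9 for c in columns],
        }
    return model


def log_gaussian(x, mu, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def predict(model, x):
    """Return the class label with the highest log-posterior score."""
    scores = {}
    for label, params in model.items():
        scores[label] = math.log(params["prior"]) + sum(
            log_gaussian(xi, mu, var)
            for xi, mu, var in zip(x, params["means"], params["vars"])
        )
    return max(scores, key=scores.get)


model = fit_gaussian_nb(TRAIN)
print(predict(model, (6.0, 3.0, 13.0)))    # low metric values
print(predict(model, (22.0, 10.0, 50.0)))  # high metric values
```

On this toy data, the first query is classified as non-faulty (0) and the second as faulty (1). A real experiment would replace `TRAIN` with the metric vectors produced by a metrics extraction tool and labels obtained from fault injection.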
Experiments are conducted to observe the effectiveness of various performance measures
like accuracy, precision, recall, f-measure, area under curve (AUC) values, etc. First, all the 21
source code metrics are considered for the fault prediction; after that, only the selected relevant
metrics are measured to see the result by applying five different feature-selection algorithms
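
The performance measures named above can be computed as in the following sketch. It uses the standard textbook definitions of accuracy, precision, recall and f-measure, and computes AUC as the fraction of (faulty, non-faulty) pairs that the classifier ranks correctly. The labels, scores and the 0.5 decision threshold are invented assumptions, not results from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = faulty, 0 = non-faulty)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_measures(y_true, y_pred):
    """Return accuracy, precision, recall and f-measure as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def auc(y_true, scores):
    """AUC as the share of (faulty, non-faulty) pairs ranked correctly.

    Ties between a faulty and a non-faulty score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and classifier scores for eight services.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(classification_measures(y_true, y_pred))
print(auc(y_true, scores))
```

With these invented inputs, all four threshold-based measures equal 0.75 and the AUC is 0.9375. The pairwise AUC formulation is quadratic in the number of examples, which is fine for a sketch but not for large data sets.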