Identifying financial statement fraud with decision rules obtained from Modified Random Forest

Pages	235-255
DOI	https://doi.org/10.1108/DTA-11-2019-0208
Date	11 May 2020
Published date	11 May 2020
Author	Byungdae An, Yongmoo Suh
Subject Matter	Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet


Byungdae An and Yongmoo Suh

Korea University Business School, Seoul, Republic of Korea

Abstract

Purpose – Financial statement fraud (FSF) committed by a company implies that the company’s current status may not be healthy. It is therefore important to detect FSF, since such companies tend to conceal bad information, which causes great losses to various stakeholders. The objective of this paper is to propose a novel approach to building a classification model for identifying FSF that shows high classification performance and from which human-readable rules can be extracted to explain why a company is likely to commit FSF.

Design/methodology/approach – Having prepared multiple sub-datasets to cope with the class imbalance problem, we build a set of decision trees for each sub-dataset; select a subset of that set as the model for the sub-dataset by removing each tree whose performance is below the average accuracy of all trees in the set; and then select, among these models, the one that shows the best accuracy. We call the resulting model MRF (Modified Random Forest). Given a new instance, we extract rules from the MRF model to explain whether the company corresponding to the new instance is likely to commit FSF or not.
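The procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: one-split stumps stand in for full decision trees, each sub-dataset is assumed to pair all fraud rows with an equal-size random sample of non-fraud rows, accuracy is assumed to be measured on a held-out validation set, and the surviving trees are assumed to be combined by majority vote. All function names and the synthetic data are hypothetical.

```python
import random

def n_correct(predict, rows):
    """Number of (features, label) rows that `predict` labels correctly."""
    return sum(predict(x) == y for x, y in rows)

def make_sub_datasets(fraud, normal, n_sets, rng):
    """Assumed balancing scheme: all fraud rows plus an equal-size sample of normal rows."""
    return [fraud + rng.sample(normal, len(fraud)) for _ in range(n_sets)]

def train_stump(rows, rng):
    """Stand-in for a decision tree: one split on one random feature,
    fitted to a bootstrap sample of `rows`."""
    sample = [rng.choice(rows) for _ in rows]
    feature = rng.randrange(len(rows[0][0]))
    candidates = []
    for x, _ in sample:
        for sign in (1, -1):
            # Predict fraud (1) when sign * (value - threshold) > 0.
            rule = lambda v, t=x[feature], s=sign: int(s * (v[feature] - t) > 0)
            candidates.append((n_correct(rule, sample), rule))
    return max(candidates, key=lambda c: c[0])[1]

def majority_vote(trees):
    """Combine trees into one classifier; ties are resolved as non-fraud (0)."""
    return lambda x: int(2 * sum(tree(x) for tree in trees) > len(trees))

def build_mrf(sub_datasets, validation, n_trees, rng):
    """Per sub-dataset: grow trees, drop below-average ones, then keep the best model."""
    best_trees, best_acc = None, -1.0
    for rows in sub_datasets:
        trees = [train_stump(rows, rng) for _ in range(n_trees)]
        hits = [n_correct(tree, validation) for tree in trees]
        # Keep trees at or above the mean (integer form of hits_i >= mean(hits)).
        kept = [t for t, h in zip(trees, hits) if h * len(hits) >= sum(hits)]
        acc = n_correct(majority_vote(kept), validation) / len(validation)
        if acc > best_acc:
            best_trees, best_acc = kept, acc
    return best_trees, best_acc

# Synthetic rows for illustration: feature 0 is informative, feature 1 is noise.
rng = random.Random(0)
fraud = [((rng.gauss(2, 1), rng.random()), 1) for _ in range(20)]
normal = [((rng.gauss(0, 1), rng.random()), 0) for _ in range(200)]
validation = fraud[:5] + normal[:50]          # held-out rows
subsets = make_sub_datasets(fraud[5:], normal[50:], n_sets=3, rng=rng)
mrf_trees, mrf_acc = build_mrf(subsets, validation, n_trees=9, rng=rng)
print(len(mrf_trees), "trees kept; validation accuracy", round(mrf_acc, 3))
```

Comparing integer hit counts rather than floating-point accuracies ensures that at least one tree always survives the below-average filter, even when all trees score identically.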

Findings – Experimental results show that the MRF classifier outperformed the benchmark models. The results also revealed that all the variables related to profit are among the most important indicators of FSF, and identified two new variables related to gross profit that had not been examined in previous studies on FSF.

Originality/value – This study proposes a method of building a classification model that shows outstanding performance and provides decision rules that can be used to explain the classification results. In addition, it suggests a new way to resolve the class imbalance problem.

Keywords Financial statement fraud, Random forest, Decision rules, Feature importance, Machine learning, Predictive model

Paper type Research paper

1. Introduction

A financial statement shows the overall financial status of a company. Financial statement fraud (FSF) therefore causes great losses to various stakeholders, such as investors, creditors and even the company’s own employees. Companies should disclose their financial information through various kinds of official announcements, including financial statements, company-related news, external audit reports and regulatory filings (Healy and Palepu, 2001). However, executives can easily be tempted to falsify or delay their financial information, because a company’s financial problem, once released, could cause the company serious losses, such as a stock market crash, divestment and damage to its reputation. Even experts such as auditors and professional investors have considerable difficulty in detecting FSF in advance, and it gives rise to much greater losses than other kinds of fraud, such as asset misappropriation, bribery and illegal gratuities (Rezaee, 2005).

According to the 2018 Report to the Nations on Occupational Fraud and Abuse, published by the Association of Certified Fraud Examiners (ACFE), the estimated loss due to FSF amounts to about 5% of a company’s annual revenue on average. The report also found that FSF is one of the occupational frauds with the greatest median loss per case, $800,000 (ACFE, 2018). The huge losses and bankruptcies provoked by FSF committed by Enron, WorldCom, Qwest and Tyco are well known (Rezaee, 2005). More recently, in 2016, a late public statement by Hanmi Pharmaceutical Firm, a major Korean pharmaceutical company suspected of intentionally delaying the statement after a large volume of stock had been sold short, resulted in a loss of over $3 million (Kim, 2016; Lee, 2016).

This research is partially supported by the Korea University Business School Research Grant.

The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/2514-9288.htm

Received 18 November 2019; revised 25 March 2020; accepted 11 April 2020.

Data Technologies and Applications, Vol. 54 No. 2, 2020, pp. 235-255, ISSN 2514-9288, DOI 10.1108/DTA-11-2019-0208

FSF committed by a company implies that the current status of the company is likely to be unhealthy. Such companies tend to conceal bad information and usually show poor performance before receiving a forewarning notice from a public supervisor (Chung et al., 2014). Additionally, stock returns tend to decline before and after firms’ unfaithful behavior (Han et al., 2014). The more unfaithful disclosures a company is associated with, the higher the loan interest rate and the lower the credit rating applied to it (Lee et al., 2008). Hence, it is essential to detect FSF before a forewarning notice is received, in order to distinguish financially unhealthy companies from healthy ones.

Because of the serious impact of FSF, there have been many studies of this kind of fraud. Beginning with work on earnings or accrual manipulation, many researchers have studied the causes, effects and motivations of FSF, while others have focused on identifying factors that have a significant effect on FSF (see Section 2.2).

Nowadays, applying data mining techniques to diverse financial issues has become imperative because of their capability to extract useful knowledge from a large number of instances. In line with this trend, auditors, who are responsible for finding problematic firms, would do well to use these techniques, since some limitations of human experts, such as bias and subjective judgment, can thereby be avoided (Ravisankar et al., 2011). Besides, it is difficult for human experts to track the time-varying importance of some financial variables in determining FSF (Ngai et al., 2009). Hence, more recently, several researchers have tried to create classification models using data mining techniques to predict FSF in advance (see Section 2.3).

However, previous studies that used data mining techniques to detect FSF leave something to be desired. First, it is worthwhile to expand the scope of FSF to include more kinds of fraud, which will in turn mitigate the class imbalance problem; most previous studies dealt with only two types of fraud, misstatement and restatement. Second, previous studies focused on building a classification model with high performance, but it is also important to derive rules that can be used to explain the classification result (Huang et al., 2014; Pai et al., 2011).

The objective of the paper is, therefore, twofold: (1) to build a classification model with high performance to detect four types of FSF: misstatement, restatement, delayed disclosure and cancelled disclosure of financial statements; and (2) to extract human-readable rules by which we can explain why a company is likely to commit FSF. To that end, we used a joined dataset of Korean companies obtained from the Korea Investor’s Network for Disclosure (KIND) system and the KIS-Value (Korea Investors Service-Value) database, and we modified the Random Forest (RF) algorithm.
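To make the second objective concrete, the following Python sketch shows one way a human-readable rule could be read off a single decision tree for a given company. It is an illustration only: the nested-dictionary tree, the feature names (`gross_profit_ratio`, `debt_ratio`), the thresholds and the class labels are all hypothetical and are not taken from the paper's data or results.

```python
# Hypothetical tree: feature names, thresholds and labels are illustrative only.
TREE = {
    "feature": "gross_profit_ratio", "threshold": 0.1,
    "left": {
        "feature": "debt_ratio", "threshold": 0.6,
        "left": {"label": "non-FSF"},
        "right": {"label": "FSF"},
    },
    "right": {"label": "non-FSF"},
}

def extract_rule(tree, instance):
    """Follow `instance` from root to leaf and return the path as an IF-THEN rule."""
    conditions = []
    node = tree
    while "label" not in node:
        name, threshold = node["feature"], node["threshold"]
        if instance[name] <= threshold:
            conditions.append(f"{name} <= {threshold}")
            node = node["left"]
        else:
            conditions.append(f"{name} > {threshold}")
            node = node["right"]
    # A tree consisting of a single leaf has no conditions to report.
    return "IF " + (" AND ".join(conditions) or "TRUE") + " THEN " + node["label"]

company = {"gross_profit_ratio": 0.05, "debt_ratio": 0.9}
print(extract_rule(TREE, company))
# prints: IF gross_profit_ratio <= 0.1 AND debt_ratio > 0.6 THEN FSF
```

In a forest, the same traversal would presumably be applied to each retained tree, yielding one rule per tree for the instance being explained.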

The rest of the paper is organized as follows. Section 2 examines the definitions of several types of FSF and reviews previous work related to FSF. Section 3 describes the research method, including the dataset, the financial ratio variables used to build a classification model and the modification of the Random Forest algorithm to achieve better classification performance and human-readable rules. Section 4 presents the experimental results, including classification results, their statistical verification and examples of rules. Finally, Section 5 concludes the paper with a summary, contributions and future work.
