Deep learning-based detection of tax frauds: an application to property acquisition tax

DOI: https://doi.org/10.1108/DTA-06-2021-0134
Published date: 11 October 2021
Pages: 329-341
Subject Matter: Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet
Author: Changro Lee
Changro Lee
Department of Real Estate, Kangwon National University,
Chuncheon, Republic of Korea
Abstract
Purpose: Sampling taxpayers for audits has always been a major concern for policymakers in tax
administration. This study proposes a systematic method for selecting a small number of
taxpayers with a high probability of tax fraud.
Design/methodology/approach: An efficient method of sampling taxpayers for an audit is investigated in
the context of a property acquisition tax. An autoencoder, a popular unsupervised learning algorithm, is
applied to 2,228 tax returns, and reconstruction errors are calculated to determine the probability of tax
deficiencies for each return. The reasonableness of the estimated reconstruction errors is verified using the
Apriori algorithm, a well-known marketing tool for identifying patterns in sets of purchased items.
Findings: The sorted reconstruction scores are reasonably consistent with actual fraudulent/non-fraudulent
cases, indicating that the reconstruction errors can be utilized to select suspected taxpayers for an audit in a
cost-effective manner.
Originality/value: The proposed deep learning-based approach is expected to be applicable to real-world tax
administration, promoting voluntary compliance by taxpayers and reinforcing the self-assessing acquisition
tax system.
Keywords: Autoencoder, Apriori algorithm, Reconstruction error, Deep learning, Unsupervised learning,
Tax audit
Paper type: Research paper
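The reconstruction-error idea summarized in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the study: the data are synthetic, the three feature names (assessed, reported, area) are hypothetical, and the tiny one-unit autoencoder trained with plain NumPy gradient descent stands in for whatever architecture and library were actually used. The sketch only shows the general mechanism: fit an autoencoder to all returns without labels, then rank returns by how poorly they are reconstructed.

```python
import numpy as np

def train_autoencoder(X, hidden=1, epochs=2000, lr=0.02, seed=0):
    """Fit a tiny autoencoder (tanh encoder, linear decoder) by gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # encode to the bottleneck
        R = H @ W2 + b2                   # decode back to feature space
        G = 2.0 * (R - X) / n             # gradient of the squared-error loss w.r.t. R
        dH = (G @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def reconstruction_errors(X, params):
    """Mean squared reconstruction error per row (one score per tax return)."""
    W1, b1, W2, b2 = params
    R = np.tanh(X @ W1 + b1) @ W2 + b2
    return ((R - X) ** 2).mean(axis=1)

# Synthetic returns (hypothetical features, NOT the study's data):
# for ordinary returns the reported price tracks the assessed value;
# the last few rows report a price far below what the other features imply.
rng = np.random.default_rng(42)
n_normal, n_fraud = 200, 5
n = n_normal + n_fraud
assessed = rng.normal(0.0, 1.0, n)
reported = assessed + 0.1 * rng.normal(size=n)
area = 0.8 * assessed + 0.2 * rng.normal(size=n)
is_fraud = np.zeros(n, dtype=bool)
is_fraud[-n_fraud:] = True
reported[is_fraud] -= 3.0                 # simulated underreporting

X = np.column_stack([assessed, reported, area])
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature

params = train_autoencoder(X)
errors = reconstruction_errors(X, params)
audit_order = np.argsort(errors)[::-1]    # largest error first
print("Rows to audit first:", audit_order[:10])
```

Under these assumptions, returns whose reported price is inconsistent with their other attributes are reconstructed poorly and therefore move toward the top of the audit list. The labels in is_fraud are used only to check the ranking afterwards, never for training.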
1. Introduction
Those who have acquired real estate by purchase, exchange, gift or new construction are
subject to property acquisition tax. The taxable value is the purchase price reported by a
taxpayer, and the tax amount is determined by multiplying the taxable value by the relevant
tax rate. Because the tax due rises with the reported price, taxpayers tend to underreport
their purchase prices when they file tax returns for property acquisition. The underreporting
of transaction prices by taxpayers and investigations into deficient taxpayers by the tax office
have been frequently highlighted by the National Tax Service (NTS) and mass media.
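The arithmetic behind this incentive can be made concrete with a short sketch. The rate and both prices below are hypothetical values chosen for illustration; they are not figures from the study or from any tax schedule. The rate is held in basis points so that the calculation stays in exact integers.

```python
def acquisition_tax(taxable_value, rate_bp):
    """Tax due = taxable value x tax rate, with the rate given in basis points."""
    return taxable_value * rate_bp // 10_000

RATE_BP = 400                  # hypothetical rate of 4%, for illustration only
true_price = 500_000_000       # hypothetical actual purchase price
reported_price = 450_000_000   # hypothetical underreported price

tax_due = acquisition_tax(true_price, RATE_BP)
tax_paid = acquisition_tax(reported_price, RATE_BP)
shortfall = tax_due - tax_paid  # revenue lost to underreporting

print(tax_due, tax_paid, shortfall)
```

Because the tax is proportional to the reported price, every unit of price left unreported lowers the tax by the rate times that amount, which is the shortfall an audit aims to recover.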
Therefore, filtering deficient taxpayers efficiently and focusing expensive tax audit
resources on them is crucial to maintaining a self-assessing tax system, such as individual
income tax and property acquisition tax. However, despite the importance of detecting
deficient taxpayers for a tax audit, the current practice of selecting suspected taxpayers
remains rudimentary: the tax office randomly selects taxpayers for the audit, chooses
those who have not been audited for a long time or occasionally selects those who were
spotlighted by investigative journalists for tax fraud.
Consequently, devising a feasible audit-sampling strategy has become a pressing issue in
audit practice to increase the administrative efficiency of a self-assessing tax system.
This study aims to explore data-oriented solutions in the machine-learning literature and
provide a practical solution to the problem of sampling taxpayers for audits.
This study proposes a systematic approach to sampling taxpayers for audits. First, the tax
returns on property acquisitions in the study area were prepared for the analysis. Second, an
autoencoder, a popular algorithm for the implementation of deep learning, was applied to the
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2514-9288.htm
Received 3 June 2021
Revised 17 July 2021 and 4 September 2021
Accepted 24 September 2021
Data Technologies and Applications
Vol. 56 No. 3, 2022
pp. 329-341
© Emerald Publishing Limited
2514-9288
DOI 10.1108/DTA-06-2021-0134
