Predicting corporate credit rating based on qualitative information of MD&A transformed using document vectorization techniques

Published date	13 March 2020
DOI	https://doi.org/10.1108/DTA-08-2019-0127
Date	13 March 2020
Pages	151-168
Author	Jinwook Choi, Yongmoo Suh, Namchul Jung
Subject Matter	Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet

Predicting corporate credit rating based on qualitative information of MD&A transformed using document vectorization techniques

Jinwook Choi and Yongmoo Suh

Korea University Business School, Seoul, Republic of Korea, and

Namchul Jung

School of Business Administration, Hongik University, Seoul, Republic of Korea

Abstract

Purpose – The purpose of this study is to investigate the effectiveness of qualitative information extracted from a firm’s annual report in predicting corporate credit rating. Qualitative information, represented by published reports or management interviews, is known to be an important source, in addition to quantitative information represented by financial values, in assigning corporate credit ratings in practice. Nevertheless, prior studies leave room for further research in that they rarely employed qualitative information in developing prediction models of corporate credit rating.

Design/methodology/approach – This study adopted three document vectorization methods, Bag-of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec), to transform unstructured textual data into numeric vectors so that Machine Learning (ML) algorithms can accept them as input. For the experiments, we used the corpus of the Management’s Discussion and Analysis (MD&A) sections of 10-K financial reports as well as financial variables and corporate credit rating data.

Findings – Experimental results from a series of multi-class classification experiments show that the predictive models trained on both financial variables and vectors extracted from MD&A data outperform the benchmark models trained only on traditional financial variables.

Originality/value – This study proposed a new approach to corporate credit rating prediction by using qualitative information extracted from MD&A documents as an input to ML-based prediction models. Also, this research adopted and compared three textual vectorization methods in the domain of corporate credit rating prediction and showed that BOW mostly outperformed Word2Vec and Doc2Vec.

Keywords Corporate credit rating, Qualitative information, MD&A, Document vectorization, Machine learning, Predictive model

Paper type Research paper

1. Introduction

A credit rating provided by bond rating agencies[1] is an opinion about the credit quality of a bond issuer. The roles of credit rating include valuation and contract facilitation (Frost, 2007). The former disseminates information about the default risk or creditworthiness of bond issuers to capital market participants, thereby supporting their decision-making. The latter facilitates contracts between bond investors and issuers by reducing information asymmetry related to the credit risk of borrowers. As a result, credit rating has been an important benchmark for issuers to reduce the cost of capital, for investors to avoid the default risk of their investees, and for regulatory bodies to achieve regulatory objectives such as determining rating-based criteria. Thus, it is not surprising that many studies on corporate credit rating prediction have been conducted in academia.

Research on predicting credit rating is important for the following reasons. First, predicting credit rating could provide an early warning of financial distress of firms (Hajek and Michalak, 2013). Second, since ratings by bond rating agencies may not reflect default risk in a timely manner (Kim and Ahn, 2012), it is necessary for lenders such as financial institutions to estimate the credit rating of borrowers independently. Third, assessing and updating a credit rating through rating agencies is very costly, because agencies require considerable time and effort to perform in-depth analysis of the company (Huang et al., 2004).

This research was supported by the Korea University Business School Research Grant.

The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/2514-9288.htm

Received 4 August 2019. Revised 24 December 2019. Accepted 13 January 2020.

Data Technologies and Applications, Vol. 54 No. 2, 2020, pp. 151-168, 2514-9288, DOI 10.1108/DTA-08-2019-0127

Machine Learning (ML) algorithms have been the primary methods for developing prediction models for corporate credit rating. Although early studies of this issue concentrated on statistical models such as linear regression, ML algorithms such as Artificial Neural Networks (ANN) and Support Vector Machines (SVM) emerged as a new solution to corporate credit rating prediction, because ML algorithms showed better performance than traditional statistical methods (Chen and Shih, 2006; Huang et al., 2004; Lee, 2007). All ML algorithms build prediction models using a training dataset. The more relevant, high-quality information the training data include, the better a predictive model performs. Thus, it is crucial to decide what kinds of information to use as input to ML algorithms.

It is generally known that the credit rating process takes into consideration both financial risk (e.g. financial characteristics, capital structure and financial liquidity) and business risk (e.g. industry characteristics, management integrity, and the firm’s strategic position and competitiveness). Therefore, predicting a corporate credit rating using both qualitative information representing business risk and quantitative information representing financial risk would be meaningful for the following reasons. First, bond rating agencies such as S&P in practice use analyst-driven ratings obtained from various information sources, such as published reports and management interviews, as well as model-driven ratings based on mathematical models (Standard and Poor’s, 2018). Second, while a credit rating is forward-looking, most quantitative financial data are backward-looking. As such, a rating based only on financial data may need adjustments by domain experts, which involve subjective judgement. Third, financial data from accounting numbers generated under generally accepted accounting principles (GAAP) may not fully reflect the economic circumstances that firms face.

Nevertheless, earlier studies on identifying the determinants of corporate credit rating mainly focused on quantitative structured information (“hard facts”), that is, financial values from financial statements (Horrigan, 1966; Kaplan and Urwitz, 1979; West, 1970). Recently, a few studies have attempted to consider qualitative information (“soft facts”), which refers to unstructured disclosures included in firms’ annual reports or to non-financial factors (Bonsall and Miller, 2017; Bozanic et al., 2018; Bozanic and Kraft, 2014; Lehmann, 2003). However, those studies leave room for further research in that they only examined the association between variables extracted from qualitative information and corporate credit rating. Therefore, exploring the effectiveness of qualitative information as an input feature to the prediction model of corporate credit rating would complement existing studies. Using features based on soft facts when building prediction models might be more appropriate, since soft facts are used when determining credit ratings in practice.

In this study, we propose a novel approach to predicting corporate credit rating, which takes advantage of qualitative information extracted from a firm’s annual report. Specifically, the proposed method makes use of the Management’s Discussion and Analysis (MD&A) section in 10-K financial reports required by the Securities and Exchange Commission (SEC). We employ Bag-of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec) to transform unstructured textual data into numeric vectors. We examine the usefulness of qualitative information extracted from the MD&A document in predicting corporate credit rating and also conduct several experiments under special conditions to scrutinize whether using both quantitative and qualitative information could enhance the performance of a prediction model.
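
As an illustration only, the idea of turning MD&A text into a Bag-of-Words vector and appending it to financial variables can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: the tokenizer, the function names (tokenize, build_vocabulary, bow_vector, combine_features), the two sample sentences and the financial values are all hypothetical, and raw term counts stand in for whatever weighting a real pipeline might apply.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token in the corpus to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bow_vector(text, vocabulary):
    """Return raw term counts for one document, ordered by vocabulary index."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # tokens unseen when the vocabulary was built are ignored
            vector[vocabulary[tok]] = n
    return vector


def combine_features(financial, text_vector):
    """Concatenate quantitative variables and the text vector into one feature row."""
    return list(financial) + list(text_vector)


# Hypothetical toy inputs: two MD&A-like snippets and two made-up financial ratios per firm.
mdna_docs = [
    "Revenue increased and liquidity remained strong.",
    "Liquidity risk increased as revenue declined.",
]
financial_vars = [[0.42, 1.8], [0.71, 0.9]]

vocab = build_vocabulary(mdna_docs)
features = [
    combine_features(fin, bow_vector(doc, vocab))
    for fin, doc in zip(financial_vars, mdna_docs)
]
```

Each row of features could then be passed, together with the firm's rating label, to any multi-class classifier. Word2Vec and Doc2Vec would replace bow_vector with a dense, learned representation, while the concatenation step would stay the same.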

The remainder of the paper is structured as follows. Section 2 reviews the previous literature relevant to corporate credit rating prediction and the MD&A section of the 10-K report.

