Discovering research topics from library electronic references using latent Dirichlet allocation

DOI: https://doi.org/10.1108/LHT-06-2017-0132
Published date: 17 September 2018
Pages: 400-410
Authors: Debin Fang, Haixia Yang, Baojun Gao, Xiaojun Li
Debin Fang, Haixia Yang and Baojun Gao
Economics and Management School, Wuhan University, Wuhan, China, and
Xiaojun Li
College of Accounting, Yunnan University of Finance and Economics,
Kunming, China
Abstract
Purpose: Discovering research topics and trends from a large quantity of library electronic references is
essential for scientific research. Current research of this kind depends mainly on human judgment.
The purpose of this paper is to demonstrate how to identify research topics and the evolution of their trends from
library electronic references efficiently and effectively by employing automatic text analysis algorithms.
Design/methodology/approach: The authors used latent Dirichlet allocation (LDA), a probabilistic
generative topic model, to extract latent topics from a large quantity of research abstracts. Then, the
authors conducted a regression analysis on the document-topic distributions generated by LDA to identify
hot and cold topics.
Findings: First, this paper discovers 32 significant research topics from the abstracts of 3,737 articles
published in the six top accounting journals during the period of 1992-2014. Second, based on the document-topic
distributions generated by LDA, the authors identified seven hot topics and six cold topics from the 32 topics.
Originality/value: The topics discovered by LDA are highly consistent with the topics identified by
human experts, indicating the validity and effectiveness of the methodology. Therefore, this paper provides
novel knowledge to the accounting literature and demonstrates a methodology and process for topic
discovery with lower cost and higher efficiency than the current methods.
Keywords: Academic libraries, Big data, Accounting research, Latent Dirichlet allocation (LDA),
Topic model, Topic trends
Paper type: Research paper
1. Introduction
In the era of big data, the increasing availability of electronic libraries helps scholars obtain
literature more easily. The huge volume of electronic references, however, presents scholars with
the problem of information overload. Faced with the large quantity of electronic references
recommended by retrieval systems based on the title, abstract, or keywords of scientific
works, scholars have to spend considerable time determining the most relevant works. It is also
difficult for scholars entering a new research field to identify the trends in research and
the hot topics.
Many studies have been devoted to discovering the research topics and trends of a
specific field. For example, Cushing (1989), Dunbar and Weber (2014), and Farcas (2015) are
all studies of this kind in the field of accounting. Historically, this kind of task has
depended mainly on human judgment, a process that is both time-consuming and labor-intensive.
As such, employing automatic text analysis algorithms and tools to identify research
topics and their trends more efficiently is necessary.
Latent Dirichlet allocation (LDA) (Blei et al., 2003; Blei, 2012; Griffiths and Steyvers, 2004)
is a probabilistic topic model that aims to discover the latent topics in a text corpus.
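As a rough illustration of the approach outlined in the abstract, the following Python sketch fits LDA to a toy corpus with a collapsed Gibbs sampler, in the spirit of Griffiths and Steyvers (2004), and then regresses a topic's yearly mean proportion on the year. It is not the authors' implementation: the function names (`lda_gibbs`, `yearly_topic_means`, `trend_slope`), the hyperparameters `alpha` and `beta`, the toy documents and years, and the use of yearly means as the regression input are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns document-topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Assign a random initial topic to every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # Smoothed estimate of each document's topic proportions.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]


def yearly_topic_means(theta, doc_years, topic):
    """Average one topic's proportion over the documents of each year."""
    by_year = {}
    for row, year in zip(theta, doc_years):
        by_year.setdefault(year, []).append(row[topic])
    years = sorted(by_year)
    return years, [sum(by_year[y]) / len(by_year[y]) for y in years]


def trend_slope(years, values):
    """Ordinary least-squares slope of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


# Hypothetical toy corpus: tokenized "abstracts" with publication years.
docs = [
    "audit fee auditor audit quality".split(),
    "auditor independence audit fee".split(),
    "earnings forecast analyst earnings".split(),
    "analyst forecast earnings accuracy".split(),
]
doc_years = [1992, 1993, 1994, 1995]

theta = lda_gibbs(docs, n_topics=2)
years, means = yearly_topic_means(theta, doc_years, topic=0)
slope = trend_slope(years, means)
print(f"topic 0 slope per year: {slope:+.4f}")
```

Under these assumptions, a topic with a positive slope would be a candidate hot topic and one with a negative slope a candidate cold topic. A real analysis would also test whether each slope is statistically significant before labeling the topic.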
Library Hi Tech
Vol. 36 No. 3, 2018
pp. 400-410
© Emerald Publishing Limited
0737-8831
DOI 10.1108/LHT-06-2017-0132
Received 30 June 2017
Revised 20 November 2017
20 November 2017
29 December 2017
Accepted 29 December 2017
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm
Conflicts of interest: the authors declare no conflict of interest.
The authors would like to thank the National Natural Science Foundation
Programs of China (NSFC) for their support (71771182, 71673210, 71725007).
