Coauthorship network-based literature recommendation with topic model

Publication Date: 12 June 2017
Author: San-Yih Hwang, Chih-Ping Wei, Chien-Hsiang Lee, Yu-Siang Chen
Subject: Library & information science, Information behaviour & retrieval, Collection building & management, Bibliometrics, Databases, Information & knowledge management, Information & communications technology, Internet, Records management & preservation, Document management
San-Yih Hwang
Department of Information Management,
National Sun Yat-sen University, Kaohsiung, Taiwan
Chih-Ping Wei
Department of Information Management,
National Taiwan University, Taipei, Taiwan, and
Chien-Hsiang Lee and Yu-Siang Chen
Department of Information Management,
National Sun Yat-sen University, Kaohsiung, Taiwan
Purpose: The information needs of the users of literature database systems often come from the task at
hand, which is short term and can be represented as a small number of articles. Previous works on
recommending articles to satisfy users' short-term interests have utilized article content, usage logs, and, more
recently, coauthorship networks. The usefulness of coauthorship has been demonstrated by some research
works, which, however, tend to adopt a simple coauthorship network that records only the strength of
coauthorships. The purpose of this paper is to enhance the effectiveness of coauthorship-based
recommendation by incorporating scholars' collaboration topics into the coauthorship network.
Design/methodology/approach: The authors propose a latent Dirichlet allocation (LDA)-coauthorship-network-based
method that integrates topic information into the links of the coauthorship networks using
LDA, and a task-focused technique is developed for recommending literature articles.
Findings: The experimental results using information systems journal articles show that the proposed
method is more effective than the previous coauthorship network-based method over all scenarios examined.
The authors further develop a hybrid method that combines the results of content-based and
LDA-coauthorship-network-based recommendations. The resulting hybrid method achieves greater or comparable
recommendation effectiveness under all scenarios when compared to the content-based method.
Originality/value: This paper makes two contributions. The authors first show that a topic model is indeed
useful and can be incorporated into the construction of a coauthorship network to improve literature
recommendation. The authors subsequently demonstrate that coauthorship-network-based and
content-based recommendations are complementary in their hit article rank distributions, and then devise a hybrid
recommendation method to further improve the effectiveness of literature recommendation.
Keywords: Topic modelling, Academic literature, Coauthorship network, Recommender system
Paper type: Research paper
1. Introduction
In recent years, recommender systems have found their way into many aspects of our lives and
are now perhaps one of the most common applications of big data (Finger, 2014). Retail stores
such as pioneered the use of recommender systems, which customize users'
personalized webpages and recommend books and products that they have not seen but are
likely to appreciate. Insurance companies and finance firms have adopted recommender
systems to suggest coverage plans and investment opportunities for their customers, and even
law firms can rely on recommender systems to arrive at better decisions (Booker, 2013).
Online Information Review
Vol. 41 No. 3, 2017
pp. 318-336
© Emerald Publishing Limited
DOI 10.1108/OIR-06-2016-0166
Received 25 June 2016
Revised 22 November 2016
Accepted 27 February 2017
The current issue and full text archive of this journal is available on Emerald Insight.
This work was supported in part by the Aim for the Top University Plan of National Sun
Yat-sen University in Taiwan and by grants from the Ministry of Science and Technology of Taiwan under
Grant Nos MOST 104-2410-H-110-039-MY2 and MOST 104-2410-H-002-143-MY3.
With the advent of Web 2.0, recommender systems have been further employed by social
networking sites, such as LinkedIn, Twitter, Facebook, and ResearchGate, to recommend
people, communities, posts, and articles that users might be interested in. In fact, recommender
systems are needed whenever the number of choices becomes overwhelming and when the best
decisions strongly depend on personal preferences or situational contexts.
Our focus in this paper is literature article recommendation. Unlike most existing
recommendation works that utilize users' historical profiles (e.g. ratings or purchasing/browsing
history) to discover personal preferences for recommendation decisions, this work draws on
users' short-term interest (i.e. a set of recently accessed articles that together constitute a task
profile) for making recommendations. This is often called a task-focused approach in the
literature (Herlocker and Konstan, 2001). The reason for adopting the task-focused approach
for literature recommendation is twofold. First, we are often obliged to gather a number of
related documents when performing a given task, which is possibly unrelated to our long-term
interests. Second, most literature databases available on the internet do not require users to
identify themselves, making users' long-term interests unavailable.
Task-focused literature recommendation methods employ various types of information,
such as content (Hwang and Chuang, 2004; Mobasher et al., 2000), usage logs (Hwang and
Chuang, 2004; Mobasher et al., 1999, 2000), and citation networks (Yang and Lin, 2013).
Specifically, given an active user's task profile, these methods recommend articles that
either have similar content or are often co-accessed (co-cited) with the articles in the given
task profile. The usefulness of coauthorships has also been demonstrated in previous
research. For example, Hwang, Wei and Liao (2010) employ a coauthorship network for
recommending literature articles and show that it is more effective than a content-based
method when articles in a task profile as specified by a user are similar in their content.
Wallace et al. (2012) examine citation practices of academic articles from eight disciplines in
relation to their coauthorship network and find that papers which tend to cite collaborators
will also tend to cite collaborators of collaborators. Martin et al. (2013) observe that articles
in physics have especially high rates of citation between coauthors.
Although prior research (Hwang, Wei and Liao, 2010) has shown the superior
performance of the proposed coauthorship network-based literature recommendation
method in some scenarios, the adopted coauthorship network is quite simple, and it does not
associate any information with coauthoring relations. While there could be multiple factors
that pertain to a coauthoring relation, the topics shared by the coauthors reveal their
common interests, which, if properly used, may further improve recommendation accuracy.
For example, a user who has shown interest in a data mining article A is less likely to be
interested in another article B about the technology acceptance model (TAM) even if the authors
of A and B have a strong coauthoring relationship (in their other articles).
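The example above can be made concrete with a toy sketch. It assumes, purely for illustration, that each coauthoring link stores a strength and a distribution over collaboration topics, and that a link's relevance to an article is its strength scaled by the overlap between the two topic distributions. The names (`Link`, `topic_overlap`, `plain_score`, `topic_aware_score`), the weighting rule, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """A coauthoring relation between two scholars (hypothetical structure)."""
    strength: float   # e.g. a normalized count of coauthored articles
    topics: dict      # topic name -> share of the collaboration


def topic_overlap(p, q):
    """Overlap of two topic distributions: sum of per-topic minima."""
    return sum(min(p.get(t, 0.0), q.get(t, 0.0)) for t in set(p) | set(q))


def plain_score(link):
    """Strength-only network: the link counts fully whatever the topic."""
    return link.strength


def topic_aware_score(link, article_topics):
    """Assumed weighting: strength discounted by topic mismatch."""
    return link.strength * topic_overlap(link.topics, article_topics)


# Assumption: the authors of A and B coauthor often, but mostly on TAM.
link_ab = Link(strength=0.9, topics={"TAM": 0.8, "data mining": 0.2})
article_a = {"data mining": 1.0}  # the article the user has shown interest in

print(plain_score(link_ab))
print(topic_aware_score(link_ab, article_a))
```

Under these made-up numbers the strength-only score keeps the full link weight, whereas the topic-aware score shrinks it because little of the collaboration concerns data mining.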
This study aims to address this problem by integrating topic information into a
coauthorship network so as to manifest scholars' shared research interests. Specifically, we
develop a method that automatically identifies research topics involved in the collaboration
between scholars from the content of coauthored articles and incorporates the information
on such research topics into a coauthorship network. Based on the new coauthorship
network, we represent an article as a vector of scholars whose entries represent the strengths of
its extended authors. By computing the similarity of scholar vectors between the articles from a
literature database and the articles recently accessed by an active user, our proposed method
ranks the articles and selects the top N similar articles for a task-focused literature
recommendation. We also conduct experiments to evaluate the effectiveness of our proposed
method across different scenarios, using the more than 10,000 articles that have appeared in
major information systems (IS) journals in the last 20 years. The evaluation results show
that our proposed method outperforms the existing coauthorship network-based method
over all scenarios. From the experimental results, we also find that the rank distribution of
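
As a rough illustration of the ranking step described in this section, the sketch below represents each article as a sparse vector of scholar weights, averages the vectors of the recently accessed articles into a task profile, and returns the top N candidates by cosine similarity. How the scholar weights are derived from the topic-enriched network is not specified here, so the weights, the averaging step, the choice of cosine similarity, and the function names (`cosine`, `recommend`) are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(task_profile, candidates, n):
    """Rank candidate articles against the task profile; return the top n ids."""
    # Assumed aggregation: average the scholar vectors of the profile articles.
    profile = {}
    for vec in task_profile:
        for scholar, w in vec.items():
            profile[scholar] = profile.get(scholar, 0.0) + w / len(task_profile)
    ranked = sorted(candidates,
                    key=lambda a: cosine(profile, candidates[a]),
                    reverse=True)
    return ranked[:n]


# Hypothetical scholar weights: direct authors weigh 1.0, extended authors less.
task_profile = [{"s1": 1.0, "s2": 0.6}, {"s2": 1.0, "s3": 0.4}]
candidates = {
    "art1": {"s2": 1.0, "s1": 0.5},
    "art2": {"s4": 1.0},
    "art3": {"s3": 1.0, "s2": 0.2},
}
top = recommend(task_profile, candidates, n=2)
print(top)
```

In this toy data, `art2` shares no scholars with the task profile and so receives a similarity of zero, which places it last.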
