Long story short: finding health advice with informative summaries on health social media

Pages: 821-840
Published date: 18 November 2019
DOI: https://doi.org/10.1108/AJIM-02-2019-0048
Author: Yi-Hung Liu, Xiaolong Song, Sheng-Fong Chen
Subject Matter: Library & information science, Information behaviour & retrieval, Information & knowledge management, Information management & governance, Information management
Yi-Hung Liu
Department of Business Administration,
Zhejiang University of Technology, Hangzhou, China
Xiaolong Song
Department of Management Science and Engineering,
Dongbei University of Finance and Economics, Dalian, China, and
Sheng-Fong Chen
Department of Tropical Agriculture and International Cooperation,
National Pingtung University of Science and Technology, Pingtung, Taiwan
Abstract
Purpose: Whether automatically generated summaries of health social media can aid users in managing their
diseases appropriately is an important question. The purpose of this paper is to introduce a novel text summarization
approach for acquiring the most informative summaries from online patient posts accurately and effectively.
Design/methodology/approach: Data sets of diabetes and HIV posts were collected from two online disease
forums, respectively. The proposed summarizer uses a graph-based method to generate
summaries by considering social network features, text sentiment and sentence features. Representative
health-related summaries were identified, and summarization performance as well as user judgments were analyzed.
Findings: The findings show that weighting sentences without using all of the incorporated features
decreases summarization performance compared with the classic summarization method and comparison
approaches. The proposed summarizer significantly outperformed the comparison baseline.
Originality/value: This study contributes to the literature on health knowledge management by analyzing
patients' experiences and opinions through the health summarization model. The research additionally develops
a new mindset to design abstractive summarization weighting schemes from health user-generated content.
Keywords: Sentiment analysis, Social network, Health information management,
Health social media analytics, Patient forums, Text summarization
Paper type: Research paper
Introduction
Health and our lives are intimately connected, and numerous daily events that involve
health issues are becoming serious concerns in modern life. Social media is a general concept
that includes many types of services, such as social networking websites (SNWs), online
community platforms, and online forums, which provide users with an accessible and
efficient means to share experiences. The emergence of health-centered SNWs and
applications has rendered the internet the primary means for seeking relevant healthcare
information. Patient-centric online forums, such as Patient (2018) and Patientslikeme (2018),
have served as public platforms for patients discussing or sharing their experiences of
diseases, lifestyles and treatments through numerous posts daily. These patient forums
have also attracted the attention of medical professionals and researchers.
The use of social media and search engines by patients and healthcare professionals to
search for information has increased significantly in recent years (Cole et al., 2016; Deng and
Liu, 2017). Although numerous posts can be immediately accessed on health forums, this
abundance causes information overload, thereby impeding health information seekers in finding
relevant information (Cline et al., 2001). Condensed post content can provide users with the
most crucial health information efficiently during the search process. Therefore, the extraction
of useful and helpful health information for user reference has gained considerable attention.
Aslib Journal of Information Management
Vol. 71 No. 6, 2019
pp. 821-840
© Emerald Publishing Limited
2050-3806
DOI 10.1108/AJIM-02-2019-0048
Received 5 March 2019
Revised 16 June 2019, 28 June 2019
Accepted 19 July 2019
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/2050-3806.htm
The authors highly appreciate the editors and anonymous reviewers for their insightful comments and
suggestions. In addition, the authors would like to thank Dr Wen-Long Shiau for his help and valuable
comments. This research is supported by the Social Science Foundation of China (Grant No. 18BGL249).
Text summarization, which is a data abstraction process, enables users to procure a
sense of the content of a document without reading every single sentence in the full text
(Mani and Maybury, 1999; Gambhir and Gupta, 2017). Over the past decade, several
advancements in text summarization types have been reported, including single-document
(Zhongyuwei and Gao, 2014; Cheng and Lapata, 2016), multi-document (Piwowarski et al.,
2012; Aker and Gaizauskas, 2015), feature-based opinion (Huang and Cheng, 2015) and
biomedical (Plaza et al., 2011; Mishra et al., 2014) text summarization. Only a few studies
have considered extracting summaries from health social media. Cao et al. (2011) developed
a question-answering (QA) system to accomplish semantic analysis on clinical questions;
this system returned question-focused extracted summaries as answers. However, the
majority of summarization systems are designed as general-purpose systems and
do not consider the particular properties of each domain. Therefore, this study used existing
lexicon-based entity extraction approaches because of the considerable availability of
medical lexicons in the healthcare domain, such as the Unified Medical Language System
(UMLS, 2018) and Consumer Health Vocabulary (2018).
Posts and reviews existing under the same discussion topic in online forums can be
regarded as multi-document type. Studies on multi-document summarization of online health
forums have mainly focused on the analysis of documents or texts written by a single author
at one time. Moreover, research has rarely focused on conflicting opinions, post creation time
and post influence. In particular, multiple users in health social media forums (e.g. patient
forums) can express their opinions on the same topic. User perspectives vary because of the
time of posting and experiences of users. For time of posting, the more recent a user opinion is,
the higher is its reference value. For post content, the different experiences of users may result
in conflicting opinions. For example, patients often share their true experiences regarding
adverse drug events (ADEs) for treatments of different diseases in online forums. In general, a
single drug may be released at different times, and the formulation of this drug may vary over
time. In this case, an early user may have a different feeling and reaction to the drug compared
with a later user, thereby potentially leading to conflicting opinions between them. In addition,
post influence has a specific role determined jointly by the features of the post (e.g. post
content and creation time) and the authors. Therefore, this study investigated the effects of
post creation time, conflicting opinions and post influence.
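The three factors named above can be illustrated with a minimal sketch. It is not the weighting scheme developed in this study: the exponential half-life decay, the product used to combine recency with author influence, the sentiment threshold, and all names (Post, recency_weight, post_weight, has_conflict) are hypothetical choices made only to show how recency, conflicting opinions and post influence might be represented.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    text: str
    created: date
    sentiment: float         # hypothetical polarity score in [-1, 1]
    author_influence: float  # hypothetical influence score in [0, 1]


def recency_weight(created: date, today: date, half_life_days: float = 365.0) -> float:
    """Assumed exponential decay: weight halves every half_life_days."""
    age_days = max((today - created).days, 0)
    return 0.5 ** (age_days / half_life_days)


def post_weight(post: Post, today: date) -> float:
    """Illustrative combination of recency and author influence (a plain product)."""
    return recency_weight(post.created, today) * post.author_influence


def has_conflict(posts: list[Post], threshold: float = 0.3) -> bool:
    """True when a thread holds both clearly positive and clearly negative posts."""
    positive = any(p.sentiment >= threshold for p in posts)
    negative = any(p.sentiment <= -threshold for p in posts)
    return positive and negative


# Two made-up posts about the same drug, written before and after a reformulation.
today = date(2019, 1, 1)
posts = [
    Post("The drug gave me headaches.", date(2017, 1, 1), -0.6, 0.8),
    Post("The new formulation works well for me.", date(2018, 12, 1), 0.7, 0.8),
]
ranked = sorted(posts, key=lambda p: post_weight(p, today), reverse=True)
```

With these made-up inputs, the more recent post ranks first and the thread is flagged as containing conflicting opinions, which mirrors the early-user versus later-user situation described above.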
This study used a semantic graph-based approach (Plaza et al., 2011) as the basis to
specifically develop a multi-document summarizer framework for online patient forums.
This summarizer framework can discover the most informative sentences in different topics
to simplify patients' opinions. Given the traits of online patient forums, our custom-designed
framework relies on four considerations: post influence (posts may have different influences
because of the authors' identity, and their reliability may be determined by users' expertise
(Chen et al., 2014)); post creation time (more recent posts usually provide users with the latest
information and have higher authenticity and more value); text sentiment (different users
have their own experiences or opinions regarding a topic, which may cause conflict of
opinion (Turney and Littman, 2003)); and sentence features (crucial in determining the
factors of text summarizer accuracy (McDonald and Chen, 2006), such as cue phrases,
position and sentence length). First, we incorporated four SNW features, sentence features
and degree of sentiment to determine the importance and sentiment strength of sentences.
Second, the modified semantic graph-based approach was used to select the top N sentences
as summaries. To the best of our knowledge, no prior studies have focused on these four