Use of multi-lexicons to analyse semantic features for summarization of touring reviews
DOI | https://doi.org/10.1108/EL-11-2018-0215 |
Pages | 185-206 |
Published date | 04 February 2019 |
Author | Hei Chia Wang, Yu Hung Chiang, Yi Feng Sun |
Hei Chia Wang and Yu Hung Chiang
Department of Industrial and Information Management and Institute of
Information Management, National Cheng Kung University, Tainan, Taiwan, and
Yi Feng Sun
Department of Information Management, National Cheng Kung University,
Tainan, Taiwan
Abstract
Purpose – This paper aims to improve a sentiment analysis (SA) system to help users (i.e. customers or
hotel managers) understand hotel evaluations. There are three main purposes in this paper: designing an
unsupervised method for extracting online Chinese features and opinion pairs, distinguishing different
intensities of polarity in opinion words and examining the changes in polarity in the time series.
Design/methodology/approach – In this paper, a review analysis system is proposed to automatically
capture feature opinions experienced by other tourists presented in the review documents. In the system, a
feature-level SA is designed to determine the polarity of these features. Moreover, an unsupervised method
using a part-of-speech pattern clarification query and multi-lexicons SA to summarize all Chinese reviews is
adopted.
Findings – The authors expect this method to help travellers search for what they want and make decisions
more efficiently. The experimental results show the F-measure of the proposed method to be 0.628. It thus
outperforms the methods used in previous studies.
Originality/value – The study is useful for travellers who want to quickly retrieve and summarize helpful
information from the pool of messy hotel reviews. Meanwhile, the system will assist hotel managers to
comprehensively understand service qualities with which guests are satisfied or dissatisfied.
Keywords Sentiment analysis, Tourism industry, Text mining, Feature extraction, Multi-lexicons
Paper type Research paper
1. Introduction
Web 2.0 technology is a vital travel shopping channel (Mohseni et al., 2018). Global online
travel sales grew 15.4 per cent in 2015. In the following year, online travel sales cumulatively
generated US$564.9bn (Statista, 2018). It has been found that more than 60 per cent of
consumers consult customer feedback before making purchases (Lightspeed Research,
2011). These changes have produced a shift in marketing strategies and business
administration of companies, especially those in the hotel industry.
In Web 2.0, electronic word-of-mouth (eWOM) has developed significantly. Customers
can make their thoughts, opinions and personal feelings easily accessible to the virtual
community of internet users (Dellarocas, 2003; Inversini et al., 2010). In terms of eWOM,
The research is based on the work supported by Taiwan Ministry of Science and Technology under
grant number MOST 103-2410-H-006-055-MY3.
Received 1 November 2018
Revised 23 January 2019
Accepted 5 February 2019
The Electronic Library
Vol. 37 No. 1, 2019
pp. 185-206
© Emerald Publishing Limited
0264-0473
DOI 10.1108/EL-11-2018-0215
user-generated content has grown rapidly. Min et al. (2015) pointed out that over 75 per cent
of customers will read online reviews before they reserve a hotel. Furthermore, Gretzel et al.
(2008) indicated that virtual communities (TripAdvisor, VirtualTourist and LonelyPlanet)
are the most used travel websites (92.3 per cent), especially for gathering information,
evaluating alternatives, avoiding unenjoyable places and providing ideas. Moreover, the
percentage of consumers that consult online travel reviews before purchasing is increasing
(Book et al., 2018). Therefore, online consumer reviews about tourism services have become
important sources of information for travellers (Mauri and Minazzi, 2013).
As the number of customer reviews expands, it becomes more difficult for users to obtain
a comprehensive view of the opinions of previous travellers about various aspects of
products through a manual analysis. Consequently, proper analysis and summarization of
consumer reviews can further enable potential consumers to visualize previous positive,
negative or neutral opinions about specific product features. Moreover, product features
subsequently impact the identification of opinions and provide polarity clarification
(Bagheri et al., 2013; Jin et al., 2016). Therefore, developing an automatic and accurate review
analysis system is important for travellers.
A sentiment analysis (SA) system can analyse the opinions stated in consumer reviews
(Bagheri et al., 2013; Quan and Ren, 2016). SA is also called opinion mining, and for online
customer reviews, it has attracted a great deal of attention from data mining and natural
language processing researchers. SA techniques basically include two branches:
document-level SA and feature-level SA. Document-level SA focuses on determining an overall
opinion about a document. Feature-level SA aims to extract product features (e.g. hotel
equipment, staff attitude) from reviews, and then determine opinions that are linked with each
product feature. Compared with a document-level SA, a feature-level SA can provide a more
fine-tuned SA for certain opinion targets and also has a wider range of applications
(Qiu et al., 2011).
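The difference between the two branches can be illustrated with a minimal sketch. It is not the authors' system: the toy lexicon, the feature vocabulary, the window size and the English example review are all assumptions made for illustration, and the nearest-feature rule is only one simple way to link opinions to features.

```python
# Toy polarity lexicon (hypothetical values, not taken from the paper).
LEXICON = {"clean": 1, "friendly": 1, "noisy": -1, "rude": -1}
# Hypothetical feature vocabulary for the hotel domain.
FEATURES = {"room", "staff", "street"}


def document_level(tokens):
    """Return one overall polarity score for the whole review."""
    return sum(LEXICON.get(t, 0) for t in tokens)


def feature_level(tokens, window=2):
    """Attach each opinion word to the nearest feature within `window` tokens."""
    scores = {}
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        # Candidate features close to the opinion word, as (distance, feature).
        nearby = [
            (abs(i - j), tokens[j])
            for j in range(max(0, i - window), min(len(tokens), i + window + 1))
            if tokens[j] in FEATURES
        ]
        if nearby:
            _, feature = min(nearby)  # smallest distance wins
            scores[feature] = scores.get(feature, 0) + LEXICON[tok]
    return scores


review = "the room was clean but the staff was rude".split()
overall = document_level(review)     # one number for the whole review
per_feature = feature_level(review)  # one number per mentioned feature
```

For this review the document-level score cancels out to zero, while the feature-level result keeps the positive opinion about the room separate from the negative opinion about the staff, which is the finer granularity the paragraph above describes.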
In a feature-level SA, it is a basic problem to find the product feature that has been
extracted in the reviews (Quan and Ren, 2016). Existing product feature extraction can
broadly be classified into two major approaches: supervised and unsupervised. Supervised
feature detection approaches require a set of pre-labelled training data. Therefore, product
feature extraction can be seen as a problem of domain-specific entity recognition. However,
most of the existing methods used for domain-specific entity recognition often rely on
domain-specific knowledge to improve system performance, but such knowledge is
expensive to build and maintain for various domains, and it is difficult to extend a
domain-dependent method to other domains (Bagheri et al., 2013; Wang et al., 2018). Therefore,
extracting domain-specific product features in a generic and unsupervised manner is
desirable for a feature-level SA.
Previous studies (Bagheri et al., 2013; Quan and Ren, 2014) have developed feature-level
SA methods in English in which a part-of-speech (POS) pattern is used to find the candidate
features (i.e. nouns + adjectives, nouns + nouns), but it is doubtful that applying a standard
Western-style phrase structural analysis to Chinese will work because of the differences
between Chinese and Indo-European languages. Furthermore, product feature extraction is
a critical task for the polarity clarification of opinion words, and the extraction methods
must be robust and easily transferable between domains (Bagheri et al., 2013; Wang et al.,
2018). Opinion words have different degrees of polarity intensity (e.g. anger and rage)
(Appel et al., 2016; Bagheri et al., 2013). If it were possible to detect more detailed opinion
words, it would be possible for customers to understand the thoughts, opinions and personal
feelings of previous customers. Therefore, herein, the intensity of opinion words (called a
sentiment score) is distinguished while using knowledge ontology architecture.
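As a rough illustration of the English-style POS patterns and of graded opinion intensity mentioned above, the following sketch matches adjacent noun + adjective and noun + noun pairs in pre-tagged text and looks up a signed intensity for each adjective. The tag names, the example tokens and the intensity values are hypothetical. It shows the general pattern-matching idea from the earlier studies, not the Chinese-specific method proposed in this paper.

```python
# Pre-tagged tokens as (word, tag) pairs; the tag names are illustrative.
tagged = [("service", "N"), ("terrible", "ADJ"), ("and", "CONJ"),
          ("breakfast", "N"), ("buffet", "N"), ("good", "ADJ")]

# Hypothetical intensity lexicon: the sign gives polarity, the magnitude strength.
INTENSITY = {"good": 0.5, "excellent": 1.0, "bad": -0.5, "terrible": -1.0}

# POS patterns for candidate features: noun + adjective, noun + noun.
PATTERNS = {("N", "ADJ"), ("N", "N")}


def match_patterns(tagged_tokens):
    """Return adjacent word pairs whose tags match one of PATTERNS."""
    pairs = []
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if (t1, t2) in PATTERNS:
            pairs.append((w1, w2, (t1, t2)))
    return pairs


def score_pairs(pairs):
    """Keep noun + adjective pairs and look up the adjective's intensity."""
    return {
        noun: INTENSITY[adj]
        for noun, adj, pattern in pairs
        if pattern == ("N", "ADJ") and adj in INTENSITY
    }


candidates = match_patterns(tagged)  # three adjacent pairs match the patterns
scores = score_pairs(candidates)     # feature -> signed sentiment score
```

Because the lexicon stores a magnitude as well as a sign, "terrible" and "bad" are both negative but receive different sentiment scores, which mirrors the anger versus rage distinction in the text.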