Providing consumers with a representative subset from online reviews

Pages	877-899
Published date	09 October 2017
DOI	https://doi.org/10.1108/OIR-05-2016-0125
Author	Jin Zhang,Ming Ren,Xian Xiao,Jilong Zhang
Subject Matter	Library & information science,Information behaviour & retrieval,Collection building & management,Bibliometrics,Databases,Information & knowledge management,Information & communications technology,Internet,Records management & preservation,Document management


Jin Zhang

School of Business, Renmin University of China, Beijing, China

Ming Ren

School of Information Resource Management, Renmin University of China, Beijing, China

Xian Xiao

Guanghua School of Management, Peking University, Beijing, China, and

Jilong Zhang

School of Business, Renmin University of China, Beijing, China

Abstract

Purpose – The purpose of this paper is to find a representative subset from large-scale online reviews for consumers. The subset is small in size, but covers most of the information in the original reviews and contains little redundant information.

Design/methodology/approach – A heuristic approach named RewSel is proposed to successively select representatives until the number of representatives meets the requirement. To reveal the advantages of the approach, extensive data experiments and a user study are conducted on real data.

Findings – The proposed approach outperforms the benchmarks in terms of coverage and redundancy. People show a preference for the representative subsets provided by RewSel. The proposed approach also has good scalability, and is more adaptive to big data applications.

Research limitations/implications – The paper contributes to the literature of review selection, by proposing a heuristic approach which achieves both high coverage and low redundancy. This study can be applied as the basis for conducting further analysis of large-scale online reviews.

Practical implications – The proposed approach offers a novel way to select a representative subset of online reviews to facilitate consumer decision making. It can also enhance the existing information retrieval system to provide representative information to users rather than a large amount of results.

Originality/value – The proposed approach finds the representative subset by adopting the concept of relative entropy and sentiment analysis methods. Compared with state-of-the-art approaches, it offers a more effective and efficient way for users to handle a large amount of online information.

Keywords Online reviews, Redundancy, Coverage, Heuristic approach, Representative subset

Paper type Research paper

1. Introduction

Recent years have witnessed the advent of large volumes of online reviews on e-commerce websites. Online reviews are an important source of information for consumers to make informed decisions when purchasing a product, booking a flight or making a hotel reservation (Archak et al., 2011; Matute et al., 2016; Neirotti et al., 2016; Wang et al., 2016). As an important form of user-generated content, online reviews reflect consumers' genuine experiences and describe aspects of a product that are not disclosed in official channels (Pan and Zhang, 2011; Chen, 2016). Therefore, consumers trust online reviews more than expert reviews on traditional media (Chen, 2008; Archak et al., 2011; Chiou et al., 2014; Yeh, 2015). Online reviews are also an important source of information for vendors to identify opportunities to improve their products or launch new products (Lee and Yang, 2015; Qi et al., 2016).

Online Information Review

Vol. 41 No. 6, 2017

pp. 877-899

1468-4527

DOI 10.1108/OIR-05-2016-0125

Received 6 May 2016

Revised 8 May 2017

Accepted 15 June 2017

The current issue and full text archive of this journal is available on Emerald Insight at:

www.emeraldinsight.com/1468-4527.htm

The work is supported by Fundamental Research Funds for the Central Universities, and the Research

Funds of Renmin University of China (14XNI012).


While online reviews are useful for both consumers and vendors, the immense volume of online reviews has caused a problem of information overload (Bawden and Robinson, 2009; Krishen et al., 2011). For example, a product may have hundreds or even thousands of reviews on an e-commerce platform (Hu and Liu, 2004; Park and Lee, 2008). Due to the limited processing capacity, consumers cannot correctly and comprehensively process all the reviews, and thus may form biased opinions (Bawden and Robinson, 2009; Oulasvirta et al., 2009).

The problem of information overload gives rise to the need to find a small set of high-quality information (Bawden and Robinson, 2009; Zhang et al., 2012), especially because nowadays people use mobile devices (e.g. smart phones) that have limited screen sizes and low navigability (Chuang et al., 2012). One well-known approach is to provide the top-k reviews according to a rank based on some criteria, such as posting time or helpfulness. However, the top-k approach is susceptible to two major issues. First, it may not cover various aspects of all the review content, and the overlooked aspects could be important to consumers and corporations to make informed decisions. Second, the top-k reviews may contain redundant information, because reviews that rank high are often similar.

To help users quickly grasp a large number of online reviews, it is preferable to provide a representative subset of reviews (Pan et al., 2005), which is small in size but captures most of the information in the original data set (i.e. high coverage) and has little redundant information (i.e. low redundancy). For example, Pan et al. (2005) proposed a greedy algorithm to extract a representative subset from a database with binary tuples. Later, Zhang et al. (2014) extended the work of Pan et al. to text data, and proposed a heuristic approach to extract a subset from web search results. However, these methods cannot be readily applied to online reviews, which have some distinct characteristics, such as expressing opinions on specific features. Meanwhile, there are studies on review selection that formulate the problem as a maximum coverage problem (Tsaparas et al., 2011; Nguyen et al., 2015), but those studies have not taken redundancy into consideration, which is another essential property of representative information and can strongly affect user experience (Pan et al., 2005).

This paper aims to find a representative subset from a large number of online reviews in terms of both high coverage and low redundancy. A heuristic approach called RewSel is proposed to select representatives one by one until the number of representatives meets the requirement. To be concrete, in order to include as much information as possible, the first selected representative is the one that is most similar to the original review set. When the representative subset contains one or more reviews, the next one that brings the least redundant information to the existing representative subset is selected. Such a representative also contributes to coverage by including additional information. Extensive experiments demonstrate that the selected subsets possess desirable properties of high coverage and low redundancy, which are two essential characteristics of representative information (Pan et al., 2005).
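The one-by-one selection described above can be illustrated with a minimal sketch. It is not the authors' RewSel implementation: the bag-of-words representation, the cosine similarity measure, and the names bag_of_words, cosine and select_representatives are all assumptions made for illustration. The sketch only mirrors the two stated steps: start with the review most similar to the whole set, then repeatedly add the review that overlaps least with what has already been selected.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lower-cased whitespace tokenisation (an illustrative simplification)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_representatives(reviews, k):
    """Greedy sketch: pick k reviews one by one (hypothetical helper)."""
    if not reviews or k <= 0:
        return []
    vectors = [bag_of_words(r) for r in reviews]

    # Word counts of the whole review set.
    corpus = Counter()
    for v in vectors:
        corpus.update(v)

    # Step 1: the first representative is the review most similar to the whole set.
    first = max(range(len(reviews)), key=lambda i: cosine(vectors[i], corpus))
    chosen = [first]
    selected = Counter(vectors[first])

    # Step 2: repeatedly add the review least similar to the current subset,
    # i.e. the one assumed to bring the least redundant information.
    while len(chosen) < min(k, len(reviews)):
        remaining = [i for i in range(len(reviews)) if i not in chosen]
        nxt = min(remaining, key=lambda i: cosine(vectors[i], selected))
        chosen.append(nxt)
        selected.update(vectors[nxt])

    return [reviews[i] for i in chosen]


reviews = ["great battery", "great battery", "poor screen"]
print(select_representatives(reviews, 2))
```

In this toy input the duplicated review is skipped in the second step, so the selected pair covers both the battery and the screen opinions.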

In short, the main contributions of the work can be summarized as follows:

(1) A novel representative online review selection problem taking into consideration both coverage and redundancy is proposed in this study. To the best of our knowledge, although the issue of redundancy within selected subsets can be found in some document extraction studies, it still has not been adequately studied by review selection literature.

(2) A heuristic approach that achieves high coverage and low redundancy is proposed to select a representative subset from online reviews to facilitate consumer decision making. In the proposed approach, relative entropy is applied as a metric to measure the differences between review sets.
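As a rough illustration of how relative entropy can compare two review sets, the sketch below computes the Kullback-Leibler divergence between their word distributions. How the paper actually builds its distributions is not stated in this section, so the word-level representation, the additive smoothing parameter alpha, and the function names word_distribution and relative_entropy are assumptions made for illustration.

```python
from collections import Counter
import math


def word_distribution(texts, vocabulary, alpha=1.0):
    """Smoothed word distribution of a review set over a fixed vocabulary.

    alpha is an additive-smoothing constant (an assumption), so that words
    missing from a subset still receive non-zero probability.
    """
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def relative_entropy(p, q):
    """D(P || Q) = sum over w of P(w) * log(P(w) / Q(w))."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


all_reviews = ["great battery", "poor screen"]
vocabulary = {w for t in all_reviews for w in t.lower().split()}

p_all = word_distribution(all_reviews, vocabulary)
q_partial = word_distribution(["great battery"], vocabulary)
q_full = word_distribution(["great battery", "poor screen"], vocabulary)

# A subset that leaves out the screen opinion diverges more from the full set.
print(relative_entropy(p_all, q_partial) > relative_entropy(p_all, q_full))
```

A smaller divergence from the full review set suggests that a subset preserves more of the original word distribution, which is one way to read "difference between review sets" under these assumptions.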
