A collaborative trend prediction method using the crowdsourced wisdom of web search engines

DOI: https://doi.org/10.1108/DTA-08-2021-0209
Published date: 28 March 2022
Pages: 741-761
Subject Matter: Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet
Author: Ze-Han Fang, Chien Chin Chen
Ze-Han Fang and Chien Chin Chen
Department of Information Management, National Taiwan University,
Taipei, Taiwan
Abstract
Purpose: The purpose of this paper is to propose a novel collaborative trend prediction method to estimate
the status of trending topics by crowdsourcing the wisdom in web search engines. Government officials and
decision makers can take advantage of the proposed method to effectively analyze various trending topics and
make appropriate decisions in response to fast-changing national and international situations or popular
opinions.
Design/methodology/approach: In this study, a crowdsourced-wisdom-based feature selection method
was designed to select representative indicators showing trending topics and concerns of the general public.
The authors also designed a novel prediction method to estimate the trending topic statuses by crowdsourcing
public opinion in web search engines.
Findings: The authors' proposed method achieved better results than traditional trend prediction methods
and successfully predicted trending topic statuses by using the crowdsourced wisdom of web search engines.
Originality/value: This paper proposes a novel collaborative trend prediction method and applies it to
various trending topics. The experimental results show that the authors' method can successfully estimate the
trending topic statuses and outperform other baseline methods. To the best of the authors' knowledge, this is
the first such attempt to predict trending topic statuses by using the crowdsourced wisdom of web search
engines.
Keywords: Collaborative trend prediction, Crowdsourced wisdom, Search behavior, Crowd query and
trending topic status matrix, Topic trend prediction
Paper type: Research paper
1. Introduction
In response to rapidly changing situations at the national and international levels, it is
important for decision makers to monitor the development of trending topics that are
associated with long-running events that affect people's lives and activities. Considerable
evidence shows that estimating the status of trending topics is indispensable: without such
surveillance systems, decision-making processes can be misguided and problematic
(Lu et al., 2009; Ketter et al., 2009).
Recently, due to the development of Internet technology and big data analytics, Internet-
generated data, such as online search data and social media data, have been widely used to
analyze trends in several fields. Many studies have focused on building prediction systems
by using information of these data sources (e.g. Ginsberg et al., 2009; Fang et al., 2010;
Altshuler et al., 2012; Li et al., 2014; Afkhami et al., 2017; Hu et al., 2018; Zadeh et al., 2019; Dong
et al., 2019; Volchek et al., 2019; Lin et al., 2020; Hisada et al., 2020; Wang et al., 2021a, b;
Lampos et al., 2021; Poongodi et al., 2021; Rho et al., 2021), and one of the most important
challenges that has emerged is to determine which indicators should be selected for modeling
the phenomenon of trending topics. While approaches of previous studies have been shown
to provide good results in various domains, most of these works select indicators by scanning
large-scale data or by relying on the intuition and prior knowledge of domain experts. Both
processes, however, are too costly for the trend surveillance methods to be adapted readily to
different topics.
This research was supported in part by MOST 109-2410-H-002-071-MY2 from the Ministry of Science
and Technology, Republic of China.
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2514-9288.htm
Received 8 August 2021
Revised 6 January 2022
Accepted 7 March 2022
Data Technologies and
Applications
Vol. 56 No. 5, 2022
pp. 741-761
© Emerald Publishing Limited
2514-9288
DOI 10.1108/DTA-08-2021-0209
Crowdsourcing is the act of engaging a distributed network of individuals to
collaboratively solve a problem that was once handled by employees. In recent years,
crowdsourcing has become increasingly prevalent in many fields and is used in a variety of
tasks such as improving web search engine usage (Kazai, 2011; Wakamiya et al., 2011;
Moradi, 2019). As shown by Zhitomirsky-Geffet et al. (2016), aggregating judgments for a
single task from several nonexpert individuals can generate results at the same level as those
created by experts. Following the stream of research on crowdsourcing, we propose a
crowdsourced wisdom-based method that bridges this methodological gap in the indicator
selection process. Moreover, we have also developed a novel prediction method to estimate
the trending topic statuses by crowdsourcing public opinion in web search engines.
The proposed indicator selection method provides an efficient and effective way to
select representative terms that reflect popular concerns of the general public and that come
from a small corpus by making use of the ranking order of documents in a web search engine.
Representative terms and their search trends are then applied to the proposed prediction
method to enhance the effectiveness of trend prediction. We apply the user-based
collaborative filtering technique (Lü et al., 2012) to estimate a status, and the Serfling
method (SM) is incorporated to capture the periodic tendency of trending topics.
Evaluations based on trending topics of different domains demonstrate that the
proposed method can achieve better or similar prediction accuracy compared with many
trend prediction methods.
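To make the idea of estimating a status with user-based collaborative filtering more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes, hypothetically, that each row of a small matrix is a time period (playing the role of a "user"), that the first columns hold search volumes of representative terms, and that the last column holds the topic status, which is unknown for the most recent period. The function names, the cosine similarity measure, the neighbourhood size k and the toy numbers are all illustrative assumptions; the Serfling component is not shown.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity over positions observed (not None) in both rows."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_u = sqrt(sum(a * a for a, _ in pairs))
    norm_v = sqrt(sum(b * b for _, b in pairs))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_missing(matrix, target_row, target_col, k=2):
    """Estimate matrix[target_row][target_col] as the similarity-weighted
    average of that column over the k most similar rows."""
    target = matrix[target_row]
    neighbours = []
    for i, row in enumerate(matrix):
        # Skip the target itself and rows that lack the value we need.
        if i == target_row or row[target_col] is None:
            continue
        neighbours.append((cosine_similarity(target, row), row[target_col]))
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = [(s, v) for s, v in neighbours[:k] if s > 0]
    total = sum(s for s, _ in top)
    if total == 0:
        return None
    return sum(s * v for s, v in top) / total


# Hypothetical toy data: [volume of term 1, volume of term 2, status].
periods = [
    [0.90, 0.80, 100.0],
    [0.20, 0.10, 20.0],
    [0.80, 0.90, 90.0],
    [0.85, 0.85, None],  # most recent period, status unknown
]
estimate = predict_missing(periods, target_row=3, target_col=2, k=2)
print(estimate)
```

In this toy example the two periods whose search profiles most resemble the latest one dominate the estimate. Cosine similarity ignores overall magnitude, so a different similarity measure may suit real search data better.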
In sum, we show how machine learning and inference can be harnessed to leverage the
complementary strengths of humans (collective intelligence in the form of online search data)
and computational agents (web search engine) to solve trending topic prediction tasks. The
remainder of this paper is organized as follows. Section 2 provides a review of related works,
and then in Section 3, we present the proposed method. In Section 4, we evaluate the proposed
method, and finally in Section 5, we summarize our conclusions.
2. Related work
We begin this section with a review of the prediction methods using web search engines
(Section 2.1) and social media data (Section 2.2). Section 2.3 provides a review of the benefits of
crowdsourcing for search engines. In Section 2.4, we review recommendation systems and
pay special attention to user-based collaborative filtering. And finally, in Section 2.5 we
review the SM.
2.1 Trend prediction using web search engines
Search engine information has been widely adopted to predict trend statuses in different
domains. In Ginsberg et al.'s (2009) examination of Google's search database for extracting
query terms, 45 query terms out of 50 million candidate queries were extracted as indicators
to develop a linear regression (LR) model to predict the periodic inflection number of
influenza cases. Li et al. (2014) proposed an ontology-based framework by consulting
domain experts to capture useful features to predict the unemployment rate. They tested
each search query of the domain concepts and exploited a support vector regression (SVR)
model to boost the accuracy of unemployment rate prediction. Dimpfl and Jank (2016) found
a strong co-movement of the Dow Jones' realized volatility and the volume of search queries.
They incorporated daily search query data in several prediction models for realized
volatility and augmented those models with search query leads for more precise forecasts.
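The general pattern shared by these studies (pick a few query terms as indicators, then regress the target series on their search volume) can be sketched as follows. This is an illustrative simplification and not a reproduction of any cited model: the ranking by Pearson correlation, the summing of selected volumes into one signal, the function names and the toy series are all assumptions made for the example.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_queries(candidates, target, top_n):
    """Return the top_n query terms whose volumes correlate best with target."""
    ranked = sorted(
        candidates,
        key=lambda q: pearson(candidates[q], target),
        reverse=True,
    )
    return ranked[:top_n]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


# Hypothetical toy data: a target series and candidate query volumes.
target = [10.0, 20.0, 30.0, 40.0]
candidates = {
    "flu symptoms": [1.0, 2.0, 3.0, 4.0],
    "fever remedy": [2.0, 4.0, 6.0, 8.0],
    "movie tickets": [5.0, 1.0, 4.0, 2.0],
}

selected = select_queries(candidates, target, top_n=2)
# Combine the selected indicators into one signal by summing their volumes.
combined = [sum(candidates[q][t] for q in selected) for t in range(len(target))]
slope, intercept = fit_line(combined, target)
print(selected, slope, intercept)
```

Scanning every candidate in this way is exactly the step that becomes costly at the scale of millions of queries, which is the motivation for the crowdsourced indicator selection proposed in this paper.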