Location impact on source and linguistic features for information credibility of social media

Date: 11 February 2019
Pages: 89-112
Published date: 11 February 2019
DOI: https://doi.org/10.1108/OIR-03-2018-0087
Author: Suliman Aladhadh, Xiuzhen Zhang, Mark Sanderson
Subject Matter: Library & information science, Information behaviour & retrieval, Collection building & management, Bibliometrics, Databases, Information & knowledge management, Information & communications technology, Internet, Records management & preservation, Document management
Suliman Aladhadh
Computer Science, School of Science, RMIT University, Melbourne, Australia and
College of Computer, IT Department, Qassim University, Qassim, Saudi Arabia, and
Xiuzhen Zhang and Mark Sanderson
RMIT University, Melbourne, Australia
Abstract
Purpose: Social media platforms provide a source of information about events. However, this information
may not be credible, and the distance between an information source and the event may affect that
credibility. The purpose of this paper is therefore to examine the relationship between
sources, their physical distance from an event and the resulting impact on credibility in social media.
Design/methodology/approach: In this paper, the authors focus on the impact of location on the
distribution of content sources (informativeness and source) for different events, and identify the semantic
features of the sources and the content of different credibility levels.
Findings: The study found that source location affects the number of sources across different events.
Location also affects the proportion of semantic features in social media content.
Research limitations/implications: This study illustrated the influence of location on credibility in
social media. The study provided an overview of the relationship between content types, including semantic
features, and the source and event locations. However, the authors will use the findings of this study to build
a credibility model in future research.
Practical implications: The results of this study provide a new understanding of the reasons behind the
overestimation problem in current credibility models when applied to different domains: such models need to
be trained on data from the same place as the event, as that can make the model more stable.
Originality/value: This study investigates several events, including crisis, politics and entertainment, with
a consistent methodology. This gives new insights about the distribution of sources, credibility and other information
types within and outside the country of an event. Also, this study used the power of location to find alternative
approaches to assess credibility in social media.
Keywords: Statistical analysis, Social media, Credibility, Information source, Semantic analysis
Paper type: Research paper
Introduction
Social media is an important source of information, particularly about national and
international events. Majorities of Facebook and Twitter users (66 and 59 percent,
respectively) get their primary news from social media (Gottfried and Shearer, 2016). In total, 85 percent of Twitter
posts have news-based content (Kwak et al., 2010). In crisis situations, such as the Japan
earthquakes, Twitter posts give alarms faster than a national news agency (Sakaki et al., 2010).
However, the credibility of social media content about an event is a major concern.
Studies show that fake news and rumors spread on Twitter during events, such as the Chile
earthquake in 2010 (Mendoza et al., 2010) and Hurricane Sandy in 2012 (Gupta et al., 2013).
These rumors spread quickly and are difficult to detect in a timely manner, thereby having a
negative impact on decision making (Oh et al., 2011).
Noisy social media content mixes information of high and low credibility, so assessing
information credibility is challenging. While many sources are likely to contribute information
about an event, some are more credible than others. Credibility of information can be measured
through knowledge about the source. For example, identifiable news sources will be more
credible than anonymous sources.
Online Information Review
Vol. 43 No. 1, 2019
pp. 89-112
© Emerald Publishing Limited
1468-4527
DOI 10.1108/OIR-03-2018-0087
Received 15 March 2018
Revised 5 June 2018
16 September 2018
Accepted 17 September 2018
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/1468-4527.htm
This paper forms part of a special section "Social media mining for journalism".
Users in social media research are generally classified based on their location as local
or remote. Local users are able to get first-hand information from the event site and are
called eyewitnesses. Remote users share information about the event from a distance.
Eyewitnesses are most likely to give first-hand information about the event; however,
many limitations exist for such accounts (see related work section). Increasing the
number of credible sources that can be used to find credible information is essential in
order to assess social media information quality.
Knowledge that a source is local or remote can help enhance credibility assessment. First,
we can estimate the level of credibility of sources in each location. Second, local people are able
to understand and interpret the event in terms of geographic, cultural and political impacts.
Currently, it is common practice for traditional stakeholders (e.g. national press) to contact,
via social media, sources who are close to the place of an event to get an update (Dailey and
Starbird, 2014). So, information coming from the same region of the event is likely to be
richer than remote content.
The language of tweets generated from within an event's region differs from the language of
tweets from outside that region (Morstatter et al., 2014; Kumar et al., 2014; Cheng et al., 2010).
Previous research on social media shows that content of the same credibility level (whether
credible or not) shares common practices (Castillo et al., 2013). At the same time, previous
research reports overestimation of prediction in current credibility models when they are applied to a
different domain (Boididou et al., 2014; Aker et al., 2017). However, no study has investigated
the effect of location on the behavior of different sources and credibility content.
The research questions that we investigated in this paper are:
RQ1. What are the types of sources expected in different events from both inside and
outside the country of an event?
RQ2. How do linguistic features differ among sources of different type, credibility level
and location?
Related work
In this section, we review the research in the areas of credibility, linguistics, information
source and user location in social media. All of these areas are related to this research, and
for each one of them we show the research gap in relation to our work.
Microblog credibility
Credibility research in social media has diverse directions based on the methods used.
Research has considered tweet content by choosing a number of content, user and
network-based features (Castillo et al., 2013; Gupta et al., 2014). Other research has focused
on content features, such as the sentiment of tweets, and then used those features to train
a model to predict the credibility of a tweet (Mitra et al., 2017b; O'Donovan et al., 2012).
Other research has used the source of the tweet, the tweet's author, to assess credibility
(Ghosh et al., 2012; Gupta and Kumaraguru, 2012). Rumor detection and
credibility research in social media has been surveyed (Zubiaga et al., 2017).
Rumors about an event and their spreading have been studied (Zeng et al., 2016;
Aker et al., 2017; Kwon et al., 2013; Qazvinian et al., 2011). Detecting the rumor is achieved
by analyzing linguistic features, sentiment and part-of-speech tags. Recent work shows
that credibility prediction models cannot be generalized across different events
(Boididou et al., 2014; Aker et al., 2017), as the accuracy of the classifiers drops when
they are applied to events from a different domain.