Research on the generalization of social bot detection from two dimensions: feature extraction and detection approaches

DOI: https://doi.org/10.1108/DTA-02-2022-0084
Published date: 21 April 2023
Pages: 177-198
Subject Matter: Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet
Author: Ziming Zeng, Tingting Li, Jingjing Sun, Shouqiang Sun, Yu Zhang
School of Information Management, Wuhan University, Wuhan, China
Abstract
Purpose: The proliferation of bots in social networks has profoundly affected the interactions of legitimate
users. Detecting and rejecting these unwelcome bots has become part of the collective Internet agenda.
Unfortunately, as bot creators use more sophisticated approaches to avoid being discovered, it has become
increasingly difficult to distinguish social bots from legitimate users. Therefore, this paper proposes a novel
social bot detection mechanism that adapts to new and different kinds of bots.
Design/methodology/approach: This paper proposes a research framework to enhance the
generalization of social bot detection from two dimensions: feature extraction and detection approaches.
First, 36 features are extracted from four views for social bot detection. Then, this paper analyzes the feature
contribution for different kinds of social bots and identifies the features with stronger generalization.
Finally, this paper introduces outlier detection approaches to enhance the detection of ever-changing social bots.
Findings: The experimental results show that the more important features generalize more effectively
to different social bot detection tasks. Compared with the traditional binary-class classifier, the
proposed outlier detection approaches adapt better to ever-changing social bots, with a performance
of 89.23 per cent measured using the F1 score.
Originality/value: Based on the visual interpretation of the feature contribution, the features with
stronger generalization in different detection tasks are found. The outlier detection approaches are
introduced for the first time to enhance the detection of ever-changing social bots.
Keywords: Social bots, Generalization, Multi-view features, Feature importance, Outlier detection
approaches
Paper type: Research paper
1. Introduction
The rapid development of online social networks (OSNs) enables people to
communicate in real time for free (Orabi et al., 2020). People share their daily lives, follow
others and stay in touch with friends on OSNs (Luo et al., 2021). Large numbers of active users
produce diverse information, which affects our lives in different ways as it
spreads rapidly. Many illegal organizations take advantage of these characteristics of OSNs
and register large numbers of social bots to imitate legitimate users, conduct marketing or
manipulate public opinion (Wu et al., 2020). These malicious social bots greatly disrupt
the network environment and can even undermine national stability (Guo et al., 2020).
Social bots interfere with the normal network environment in three ways. The first kind
performs simple actions such as retweeting, commenting or liking (Lingam et al., 2019).
The second kind, known as spammers, carries out specific tasks including
brand promotion, publishing malicious uniform resource locators (URLs) and spreading
rumors (Woo et al., 2018). The third kind injects information into OSNs to
interfere with political candidates, manipulating information to change the public's
perception or obscure the facts, as has been reported in cases such as Brexit (Mostrous
et al., 2017) and the Singaporean elections in 2020 (Uyheng et al., 2021). It is vital for social
platforms to suspend and close the accounts of these unwelcome social bots.
Funding: This work was supported by the National Social Science Fund of China (grant number 21BTQ046).
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2514-9288.htm
Received 26 February 2022
Revised 8 June 2022
Accepted 12 August 2022
Data Technologies and Applications, Vol. 57 No. 2, 2023, pp. 177-198
© Emerald Publishing Limited 2514-9288
Many studies have addressed the automatic detection of social
bots. Automated bot detection works under the premise that the behavior of
a human differs from that of a bot. These differences can be measured by
representative features, such as users' settings, daily posting rate and number of followers.
Most research on bot detection has focused on creating new feature
representations and applying new classifiers. These studies usually train binary-class
classifiers to separate legitimate users from suspicious accounts based on the extracted
features (Loyola-González et al., 2019).
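The kind of features named above can be illustrated with a minimal Python sketch. It is not the authors' 36-feature pipeline: the `Account` record, its field names and the use of a default-profile flag as a stand-in for "users' settings" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass
class Account:
    """Hypothetical account record; every field name here is illustrative."""
    created: date              # account creation date
    observed: date             # date on which the account was sampled
    n_posts: int               # total posts published so far
    n_followers: int           # current follower count
    has_default_profile: bool  # stand-in for a "user settings" feature


def extract_features(acc: Account) -> Dict[str, float]:
    """Turn one account record into a flat numeric feature vector."""
    # Use at least one day of age to avoid dividing by zero for new accounts.
    age_days = max((acc.observed - acc.created).days, 1)
    return {
        "daily_posting_rate": acc.n_posts / age_days,
        "followers": float(acc.n_followers),
        "default_profile": 1.0 if acc.has_default_profile else 0.0,
    }


example = Account(date(2022, 1, 1), date(2022, 1, 11), 50, 200, True)
features = extract_features(example)
```

A binary-class classifier would then be trained on such vectors labelled as legitimate or bot.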
Two critical limitations exist in current social bot detection. The first is that different
organizations create bots for diverse purposes, so the bots vary in nature and
complexity (Pan et al., 2016). Although existing research has extracted various
features to distinguish legitimate users from social bots, the generalization of the feature
combinations is unsatisfactory when detecting new or different kinds of social bots (Yang
et al., 2020). The second is that most researchers use existing social bots to train the binary-class
classifier, but the trained classifier is difficult to generalize to the detection of new bots
(Cresci et al., 2017). Given that social bots are of great variety and will continue to
evolve, with bot creators modifying their behaviors to adapt to different tasks,
a new strategy for automatic bot detection is needed (Rodríguez-Ruiz et al., 2020).
In response to these limitations, the authors approach the problem from the
perspective of detecting evolving social bots (including new and different kinds of social
bots), a perspective that has rarely been taken, and propose a research framework to enhance the
generalization of social bot detection from two dimensions: feature extraction and detection
approaches. The main contributions of this research are as follows:
(1) Through the visual interpretation of the extracted feature contribution, it is revealed
that the same feature has different importance in different bot detection tasks.
Based on the analysis of feature importance, the features with stronger
generalization in different detection tasks are found.
(2) Outlier detection approaches are introduced to enhance the detection of ever-changing
social bots. Outlier detection needs no abnormal
instances during training, so it can be used to effectively
detect evolving social bots.
(3) The performance of popular binary-class detection approaches is compared with
that of the proposed outlier detection approaches. The experimental results show that
outlier detection approaches can consistently discern between legitimate and bot
accounts, while the binary-class classifiers cannot effectively discern between
legitimate users and other kinds of bots (i.e. bots that were not used for training the classifiers).
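The idea behind contribution (2), fitting a model on legitimate users only and flagging anything that deviates, can be sketched as follows. This is a toy z-score detector written for illustration, not the outlier detection approaches evaluated in the paper (this section does not name them). The class name, the threshold of 3.0 and the sample data are assumptions. An `f1_score` helper is included because the F1 score is the metric reported in the abstract.

```python
from statistics import mean, pstdev


class ZScoreOutlierDetector:
    """Toy one-class detector: fitted on legitimate users only (assumption)."""

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold  # illustrative cut-off, not from the paper
        self.means = []
        self.stds = []

    def fit(self, legit_rows):
        """Learn per-feature mean and spread from legitimate accounts."""
        columns = list(zip(*legit_rows))
        self.means = [mean(c) for c in columns]
        # Fall back to 1.0 when a feature has zero variance.
        self.stds = [pstdev(c) or 1.0 for c in columns]
        return self

    def score(self, row):
        """Largest absolute z-score across features; higher is more anomalous."""
        return max(abs(x - m) / s
                   for x, m, s in zip(row, self.means, self.stds))

    def is_bot(self, row):
        return self.score(row) > self.threshold


def f1_score(y_true, y_pred):
    """F1 for the positive (bot) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up rows of [daily_posting_rate, followers] for legitimate users only.
legit_rows = [[2.0, 150], [3.0, 200], [1.5, 180], [2.5, 220], [3.5, 170]]
detector = ZScoreOutlierDetector().fit(legit_rows)
```

Because no bot examples enter `fit`, an account is flagged when it departs from legitimate behavior, rather than when it resembles a bot type seen during training.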
2. Related work
Social bot detection has been studied on many network platforms, such as Facebook
and Weibo. In this paper, the authors focus on the detection of bots on Twitter. This section
briefly reviews the related research on social bots on Twitter and on feature extraction and
detection approaches for social bots.