Research on the generalization of social bot detection from two dimensions: feature extraction and detection approaches

DOI	https://doi.org/10.1108/DTA-02-2022-0084
Published date	21 April 2023
Pages	177-198
Subject Matter	Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet
Author	Ziming Zeng, Tingting Li, Jingjing Sun, Shouqiang Sun, Yu Zhang

Ziming Zeng, Tingting Li, Jingjing Sun, Shouqiang Sun and Yu Zhang

School of Information Management, Wuhan University, Wuhan, China

Abstract

Purpose – The proliferation of bots in social networks has profoundly affected the interactions of legitimate users. Detecting and rejecting these unwelcome bots has become part of the collective Internet agenda. Unfortunately, as bot creators use more sophisticated approaches to avoid being discovered, it has become increasingly difficult to distinguish social bots from legitimate users. Therefore, this paper proposes a novel social bot detection mechanism to adapt to new and different kinds of bots.

Design/methodology/approach – This paper proposes a research framework to enhance the generalization of social bot detection from two dimensions: feature extraction and detection approaches. First, 36 features are extracted from four views for social bot detection. Then, this paper analyzes the feature contribution in different kinds of social bots, and the features with stronger generalization are proposed. Finally, this paper introduces outlier detection approaches to enhance the detection of ever-changing social bots.

Findings – The experimental results show that the more important features can be more effectively generalized to different social bot detection tasks. Compared with the traditional binary-class classifier, the proposed outlier detection approaches can better adapt to the ever-changing social bots with a performance of 89.23 per cent measured using the F1 score.

Originality/value – Based on the visual interpretation of the feature contribution, the features with stronger generalization in different detection tasks are found. The outlier detection approaches are first introduced to enhance the detection of ever-changing social bots.

Keywords Social bots, Generalization, Multi-view features, Feature importance, Outlier detection approaches

Paper type Research paper

1. Introduction

Nowadays, the rapid development of online social networks (OSNs) enables people to communicate in real time for free (Orabi et al., 2020). People share their daily lives, follow others and stay in touch with friends on OSNs (Luo et al., 2021). The large number of active users produces diverse information, which has different impacts on our lives as the information spreads rapidly. Many illegal organizations take advantage of these characteristics of OSNs to register a large number of social bots for imitating legitimate users, marketing or manipulating public opinion (Wu et al., 2020). These malicious social bots greatly disrupt the network environment and even national stability (Guo et al., 2020).

Social bots interfere with the normal network environment in three ways. The first kind of social bot performs simple actions such as retweeting, commenting or liking (Lingam et al., 2019). Another kind of social bot, known as spammers, may conduct specific tasks including brand promotion, publishing malicious uniform resource locators (URLs) and spreading rumors (Woo et al., 2018). The third kind of social bot injects information into OSNs to interfere with political candidates, manipulating information to change the public’s perception or obscuring the facts, as has been reported in cases such as Brexit (Mostrous et al., 2017) and the Singaporean elections in 2020 (Uyheng et al., 2021). It is vital for social platforms to suspend and close these unwelcome social bots.

Funding: This work was supported by the National Social Science Fund of China (grant number 21BTQ046).

The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/2514-9288.htm

Received 26 February 2022
Revised 8 June 2022
Accepted 12 August 2022

Data Technologies and Applications, Vol. 57 No. 2, 2023, pp. 177-198, 2514-9288, DOI 10.1108/DTA-02-2022-0084

At present, there have been various research works on the automatic detection of social bots. Automated bot detection works under the premise that the behavior expressed by a human differs from that of a bot. These differences can be measured by some representative features, such as user’s settings, daily posting rate, number of followers, etc. Most research works on bot detection have focused on creating new feature representations and applying new classifiers. These research works usually train binary-class classifiers to classify legitimate users and suspicious accounts based on the extracted features (Loyola-González et al., 2019).
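As a rough illustration of what such representative features might look like in practice, the following sketch converts one account record into a small numeric vector. The `Account` fields, the feature choices beyond those named above and the helper `extract_features` are all hypothetical; they are not the 36 features used in this paper.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Account:
    """Hypothetical account record; field names are illustrative assumptions."""
    created: date          # account creation date
    observed: date         # date on which the profile was collected
    tweet_count: int       # total number of posts
    followers: int         # number of followers
    friends: int           # number of accounts this user follows
    default_profile: bool  # True if the default user settings were never changed


def extract_features(acc: Account) -> List[float]:
    """Turn one account into a small numeric feature vector."""
    # Clamp to at least one day so a brand-new account does not divide by zero.
    age_days = max((acc.observed - acc.created).days, 1)
    daily_posting_rate = acc.tweet_count / age_days
    follower_friend_ratio = acc.followers / max(acc.friends, 1)
    return [
        daily_posting_rate,
        float(acc.followers),
        follower_friend_ratio,
        float(acc.default_profile),
    ]


example = Account(date(2015, 1, 1), date(2020, 1, 1), 3652, 300, 250, False)
print(extract_features(example))
```

A binary-class classifier would then be trained on such vectors labeled as legitimate or bot; the vector layout here is only one possible choice.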

Two fatal limitations exist in current social bot detection. The first is that different organizations create bots for diverse purposes, so the bots vary in their nature and complexity (Pan et al., 2016). Although the existing research works have extracted various features to distinguish legitimate users from social bots, the generalization of the feature combinations is unsatisfactory when detecting new or different kinds of social bots (Yang et al., 2020). The second is that most researchers use existing social bots to train the binary-class classifier, but the trained classifier is difficult to generalize to the detection of new bots (Cresci et al., 2017). Given that social bots are of great variety and will continue to evolve in the future, with bot creators modifying their behaviors to adapt to different tasks, a new strategy for automatic bot detection is needed (Rodríguez-Ruiz et al., 2020).

In response to the above research limitations, the authors approach the problem from the perspective of detecting evolving social bots (including new and different kinds of social bots), which has rarely been done, and propose a research framework to enhance the generalization of social bot detection from two dimensions: feature extraction and detection approaches. The main contributions of this research are as follows:

(1) Through the visual interpretation of the extracted feature contribution, it is revealed that the same feature has different importance in the different bot detection tasks. Based on the analysis of feature importance, the features with stronger generalization in different detection tasks are found.

(2) The outlier detection approaches are introduced to enhance the detection of ever-changing social bots. Outlier detection needs no abnormal instances during the training process, so it can be used to effectively detect the evolving social bots.

(3) The performance of the popular binary-class detection approaches is compared with that of the proposed outlier detection approaches. The experimental results show that outlier detection approaches can consistently discern between legitimate and bot accounts, while the binary-class classifiers cannot effectively discern between legitimate users and other kinds of bots (i.e. bots of kinds that were not used for training the classifiers).
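The idea behind contribution (2), training without any abnormal instances, can be sketched as follows. This excerpt does not name the specific outlier detection approaches the authors use, so the per-feature z-score detector below is only a simple stand-in; the class name, the threshold of 3.0 and the toy feature vectors are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


class ZScoreOutlierDetector:
    """Illustrative one-class detector: it is fitted on legitimate users only."""

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold  # assumed cut-off, in standard deviations
        self.means: List[float] = []
        self.stds: List[float] = []

    def fit(self, legit: Sequence[Sequence[float]]) -> "ZScoreOutlierDetector":
        """Estimate the per-feature mean and spread of legitimate accounts."""
        n = len(legit)
        dims = len(legit[0])
        self.means = [sum(row[j] for row in legit) / n for j in range(dims)]
        self.stds = []
        for j in range(dims):
            var = sum((row[j] - self.means[j]) ** 2 for row in legit) / n
            self.stds.append(math.sqrt(var) or 1.0)  # guard against zero spread
        return self

    def score(self, x: Sequence[float]) -> float:
        """Largest absolute z-score of x across all features."""
        return max(abs(v - m) / s for v, m, s in zip(x, self.means, self.stds))

    def is_outlier(self, x: Sequence[float]) -> bool:
        return self.score(x) > self.threshold


# Toy feature vectors: [daily posting rate, follower/friend ratio]
legit_users = [[2.0, 1.1], [3.5, 0.9], [1.0, 1.4], [4.0, 0.8], [2.5, 1.0]]
detector = ZScoreOutlierDetector().fit(legit_users)

print(detector.is_outlier([3.0, 1.2]))     # profile close to the legitimate users
print(detector.is_outlier([250.0, 0.01]))  # high-volume account unlike any seen in training
```

Because no bot examples enter `fit`, the detector makes no assumption about what a bot looks like; any account far from the legitimate population is flagged, including kinds of bots that did not exist when the model was fitted.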

2. Related work

Social bot detection has been researched on many network platforms, such as Facebook and Weibo. In this paper, the authors focus on the detection of bots on Twitter. This section briefly reviews the related research on social bots on Twitter and on feature extraction and detection approaches for social bots.
