Techniques to detect terrorists/extremists on the dark web: a review

DOI: https://doi.org/10.1108/DTA-07-2021-0177
Published date: 06 January 2022
Pages: 461-482
Subject Matter: Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet
Authors: Hanan Alghamdi, Ali Selamat
Hanan Alghamdi
Umm Al-Qura University, AlQunfidhah, Saudi Arabia, and
Ali Selamat
Universiti Teknologi Malaysia, Skudai, Malaysia
Abstract
Purpose: With the proliferation of terrorist/extremist websites on the World Wide Web, it has become
progressively more crucial to detect and analyze the content on these websites. Accordingly, the volume of
previous research focused on identifying the techniques and activities of terrorist/extremist groups, as revealed
by their sites on the so-called dark web, has also grown.
Design/methodology/approach: This study presents a review of the techniques used to detect and process
the content of terrorist/extremist sites on the dark web. Forty of the most relevant data sources were examined,
and various techniques were identified among them.
Findings: Based on this review, it was found that methods of feature selection and feature extraction can be
used for topic modeling, together with content analysis and text clustering.
Originality/value: At the end of the review, the current state of the art and certain open issues
associated with Arabic dark web content analysis are presented.
Keywords: Arabic, Topic model, Contents analysis, Extremist web, Technique, Textual feature
Paper type: Literature review
1. Introduction
The Internet infrastructure affords terrorist and extremist groups an easily accessible
setting, anonymity of communication, an inexpensive development and maintenance
atmosphere and a huge potential audience (Zhang et al., 2010). These features enable these
kinds of groups to spread their propaganda, persuade people to adopt their ideas, distribute
instructions, resources and support, and conduct fundraising (Chen, 2006).
Following the attacks of September 11, 2001, researchers in the field of information technology
began to study effective methods for tracing these groups' online communications in order to
monitor terrorist activities across the globe (Roberts, 2011). Members of the well-known
Hamburg Cell, the group found responsible for the planning and training for the 9/11 attack,
used the Internet extensively (Corbin, 2003).
The World Wide Web's covert area used by terrorist and extremist groups is referred to
as the dark web (Chen, 2006). It is a corner of the web where terrorists and extremists share
and spread their beliefs and ideologies. For these groups, the Internet is their main medium
of communication to achieve these goals and to organize and plan illegal activities.
The use of the dark web, regarded as the web's underside, by terrorists and extremists is
clearly an abuse of the Internet (Chen et al., 2008a,b,c). Research on the dark web
contributes to the development of the science of Intelligence and Security Informatics (ISI),
whose aim is to employ methodologies, models and algorithms as well as tools and systems
of advanced information technologies to monitor applications related to national security
and international societies.
ISI encompasses research in the field of terrorism informatics following the exponential
growth of terrorist and extremist groups' websites. Last et al. (2008) stated that the total
number of terrorist websites increased from only 12 in 2001 to nearly 5,000 sites by 2006.
Received 12 July 2021; revised 22 November 2021; accepted 17 December 2021
Data Technologies and Applications, Vol. 56 No. 4, 2022, pp. 461-482
© Emerald Publishing Limited, 2514-9288
DOI 10.1108/DTA-07-2021-0177
Anand et al. (2009) reported that the number of dark websites burgeoned from 5,000 in 2006 to
almost 50,000 in 2007. The figures comprise websites, blogs, social networking sites,
videos, multimedia sites and forums. In 2009, 300 terrorist forums were uncovered with more
than 30,000 members and nearly a million stored messages. Around a million images and
15,000 videos were collected from some sites, written in more than 15 languages (Anand et al.,
2009). With this growing number of terrorist and extremist groups' websites available, it has
become increasingly important to detect and analyze the websites' content.
Gathering information about the communication, movement and activities of these groups could
help prevent terrorist attacks: it could have a significant impact on detecting terrorists
and their influence, as well as on recognizing who actually stands behind the materials published
online, such as forums and websites (Elovici et al., 2008; Larson and Chen, 2009).
There is a need to review existing studies in the literature to examine the various
techniques used to identify terrorist/extremist groups employing the dark web. This study
reports the results of a literature review conducted to gather the techniques used to identify
such groups. The paper is organized as follows: section 2 lists the main institutions concerned
with studying terrorist content and the main terrorist organizations by geographical area;
section 3 reviews dark web studies from different perspectives, such as online users'
behavior, sentiment analysis and social interaction analysis, as well as discovery and genre
classification of content related to improvised explosive devices (IEDs); section 4 describes
the major web mining techniques used for detecting terrorist/extremist sites on the web;
textual feature sets and techniques used in 30 studies are explained in section 5; section 6
reports the open issues and challenges extracted from the reviews; and section 7 discusses
the conclusions drawn from the study.
2. Main organizations focusing on dark web studies
Few established organizations are dedicated to detecting and monitoring the content of such
websites. Some examples are:
(1) Internet Archive [1] has an archive for open access to hypertext markup language
(HTML) pages.
(2) Anti-Terrorism Coalition (ATC) [2] collects 448 extremist websites and forums.
(3) The Middle East Media Research Institute (MEMRI) [3] studies terrorism.
(4) The Search for International Terrorist Entities (SITE) Institute [4] has a broad
collection of 1,000 files.
(5) Artificial Intelligence (AI) Lab at the University of Arizona [5] does web-crawling to
gather data from international extremist and terrorist websites. It includes 1,000
websites, grouped based on their origins, namely United States (US) domestic, Latin
American and Middle Eastern.
2.1 Studies relating to terrorists' different geographical regions
This section discusses studies of materials on the dark web according to geographical region
and a comparison of Middle Eastern groups with US domestic groups and Latin American
groups. Studies by Qin et al. (2007), Abbasi and Chen (2007, 2008) and Abbasi et al. (2008)
categorize terrorist organizations into three regions: Middle Eastern, US domestic and Latin
American. A summary of these studies is illustrated in Table 1. The studies focus mainly on
investigations of Middle Eastern groups and compare them with the other two groups.
Table 1 presents three columns labeled "References," "Objectives" and "Findings." The
source studies consulted are listed as references; the objectives column provides details of the