Social information landscapes. Automated mapping of large multimodal, longitudinal social networks

Pages: 1724-1751
DOI: https://doi.org/10.1108/IMDS-02-2015-0055
Published date: 19 October 2015
Author: Eugene Ch'ng
Subject matter: Information & knowledge management, Information systems, Data management systems
REGULAR PAPER
Eugene Ch'ng
School of Computer Science,
University of Nottingham Ningbo China, Ningbo, China
Abstract
Purpose: The purpose of this paper is to present a Big Data solution as a methodological approach to
the automated collection, cleaning, collation, and mapping of multimodal, longitudinal data sets from
social media. The paper constructs social information landscapes (SIL).
Design/methodology/approach: The research presented here adopts a Big Data methodological
approach for mapping user-generated content in social media. The methodology and algorithms
presented are generic and can be applied to diverse types of social media or user-generated content
involving user interactions, such as blogs, comments on product pages, and other forms of
media, so long as the formal data structure proposed here can be constructed.
Findings: The sequential presentation of content listings within social media
and Web 2.0 pages, as viewed in web browsers or on mobile devices, does not necessarily reveal
a hidden property of the medium: every participant, from content producers to
consumers to followers and subscribers, together with the content they produce or subscribe to, is
intrinsically connected in a hidden but massive network. Such networks, when mapped, could be
quantitatively analysed using social network analysis (e.g. centralities), and their semantics and
sentiments could equally reveal valuable information with appropriate analytics. The difficulty lies in
the traditional approach to collecting, cleaning, collating, and mapping such data sets into a
sample large enough to yield important insights into the community structure and
the direction and polarity of interaction on diverse topics. This research solves this particular strand
of the problem.
Research limitations/implications: The automated mapping of extremely large networks
involving hundreds of thousands to millions of nodes, encapsulating high-resolution and contextual
information over a long period of time, could possibly assist in proving or disproving
theories. The goal of this paper is to demonstrate the feasibility of using automated approaches for
acquiring massive, connected data sets for academic inquiry in the social sciences.
Practical implications: The methods presented in this paper, together with the Big Data
architecture, can assist individuals and institutions on a limited budget with practical approaches to
constructing SIL. The integrated software-hardware architecture uses open source software, and
the SIL mapping algorithms are easy to implement.
Originality/value: The majority of research in the literature uses traditional approaches for
collecting social network data. Traditional approaches can be slow and tedious, and they do not yield
a sample size adequate to be of significant value for research. Whilst traditional approaches collect only
a small percentage of the data, the original methods presented here are able to collect and collate entire
data sets from social media owing to the automated and scalable mapping techniques.
Keywords: Online communities, Big Data, Longitudinal network, Multimodal network,
Social network analysis, Social information landscapes
Paper type: Research paper
Industrial Management & Data Systems
Vol. 115 No. 9, 2015
pp. 1724-1751
© Emerald Group Publishing Limited
ISSN 0263-5577
DOI 10.1108/IMDS-02-2015-0055
Received 19 February 2015
Revised 17 July 2015
Accepted 1 August 2015
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0263-5577.htm
The author acknowledges the financial support from the International Doctoral Innovation
Centre, Ningbo Education Bureau, Ningbo Science and Technology Bureau, China's MoST and
The University of Nottingham. The project is partially supported by NBSTB Project 2012B10055.
1. Introduction
What are social information landscapes (SIL), and how will they be useful for studying social
networks? Here, SIL can be defined as the automated mapping of large topological
networks from instantaneous contents, sentiments and users reconstructed from social
media channels, events and user-generated contents within blogs and websites,
presented virtually as a graph that encompasses, within a timescale, contextual
information and all connections between followers, active users, comments and
conversations within a social rather than a physical space. The key phrase
"automated mapping" is essential here: very large networks need to be mapped because
network behaviour may differ in massive networks in relation to their
emergent behaviour and the way they tend to self-organise. Larger networks are also
ideal as subjects for studying geodesic distance, centrality, and density. SIL was first
mentioned in an article dealing with a Big Data funnelling approach (Ch'ng, 2014), and
subsequently in corresponding articles on the study of online community formation
and decline when research results were published (Ch'ng, 2015b, c). In comparison to
traditional networks, SILs carry much more information. SILs encapsulate activities,
which complement the traditional follower-followee network format; that format contains
only human nodes, as opposed to activity nodes that are multimodal, e.g., contents and
context. As the mapping is novel, the "landscapes" created will be new in
the sense that they contain a higher resolution and a broader context of information.
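To make the contrast with a follower-followee network concrete, the following is a minimal Python sketch rather than the paper's implementation. It assumes a toy multimodal graph in which user nodes and activity nodes (here hypothetical tweets) share one undirected adjacency structure; the node names and helper functions are illustrative only. It computes the three measures named above: density, degree centrality, and geodesic distance.

```python
from collections import deque

# Hypothetical toy data: node id -> node type ("user" or "content").
nodes = {
    "alice": "user",
    "bob": "user",
    "carol": "user",
    "tweet1": "content",
    "tweet2": "content",
}

# Undirected edges linking users to activity nodes or to other users.
edges = [
    ("alice", "tweet1"),  # alice authored tweet1
    ("bob", "tweet1"),    # bob replied to tweet1
    ("bob", "tweet2"),    # bob authored tweet2
    ("carol", "tweet2"),  # carol commented on tweet2
    ("alice", "bob"),     # follower relation between two users
]


def build_adjacency(nodes, edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def density(adj):
    """Share of possible undirected edges that are actually present."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(neighbours) for neighbours in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(neighbours) / (n - 1) for v, neighbours in adj.items()}


def geodesic_distance(adj, source, target):
    """Shortest path length via breadth-first search; None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


adj = build_adjacency(nodes, edges)
print(density(adj))                            # 0.5
print(degree_centrality(adj)["bob"])           # 0.75
print(geodesic_distance(adj, "alice", "carol"))  # 3
```

In a users-only network, alice and carol would be unconnected; here they are three steps apart because the content nodes carry the interaction between them.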
The onset of the internet age has made our world smaller. As offline communities
connect to virtual communities, and become virtual communities, space and time are, in
a sense, compressed to within the social medium that facilitates community interaction.
Insights into the behaviour of virtual communities in the age of social media require a
Big Data approach to mapping the interactions as social networks, for the interaction of
online communities on a single topic can occur over large geographical distances and
may span many months, involving thousands to millions of participants from highly
diverse demographics. Such interactions may also be multimodal, involving not only
the users but also the content of the interactions linked between multiple parties. As
such, the traditional approach of manually mapping such networks can be tedious and
would not necessarily yield a large enough sample for social network analysis. This is
because the data sets to be collected, collated, and pre-processed are
frequently too large to manage with manual data processing. This paper presents a Big
Data solution as a methodological approach to the automated collection, cleaning,
collation, and mapping of multimodal, longitudinal data sets from social media. The
methods and algorithms presented here are generic and can be applied to diverse types
of social media or user-generated contents involving user interactions, such as within
blogs, comments on product pages and other forms of media, so long as a formal data
structure can be constructed.
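One way to picture such a formal data structure is sketched below. It is an assumption made for illustration, not the structure the paper goes on to define: each interaction is a record with the hypothetical fields `actor`, `content_id`, `kind`, and `timestamp`, and records from any source (a tweet, a blog comment, a product review) are mapped to typed, time-stamped edges that can then be grouped into monthly windows for longitudinal study.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interaction:
    """Hypothetical source-agnostic record of one user action."""
    actor: str           # user who performed the action
    content_id: str      # item acted upon (post, comment, product page)
    kind: str            # e.g. "author", "comment", "share"
    timestamp: datetime  # when the action happened


def to_edges(records):
    """Map each record to an edge between a user node and a content node."""
    return [
        (
            ("user", r.actor),
            ("content", r.content_id),
            {"kind": r.kind, "time": r.timestamp},
        )
        for r in records
    ]


def monthly_snapshots(edges):
    """Group edges by (year, month) so the network can be studied over time."""
    snapshots = defaultdict(list)
    for u, v, attrs in edges:
        t = attrs["time"]
        snapshots[(t.year, t.month)].append((u, v, attrs))
    return dict(snapshots)


# Illustrative records only; no real data is implied.
records = [
    Interaction("alice", "post-1", "author", datetime(2015, 1, 5, 9, 0)),
    Interaction("bob", "post-1", "comment", datetime(2015, 1, 6, 14, 30)),
    Interaction("carol", "post-1", "comment", datetime(2015, 2, 2, 8, 15)),
]

edges = to_edges(records)
snapshots = monthly_snapshots(edges)
```

Tagging each node with its type keeps users and content in one graph without name collisions, and storing the timestamp on the edge is what lets the same data be sliced into any time window later.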
The mapping of extremely large networks involving hundreds of thousands to
millions of nodes could possibly prove or disprove theories in the social sciences. The
goal of this paper is to demonstrate the feasibility of using automated approaches for
acquiring massive, connected data sets for academic inquiry in the social sciences.
The paper begins with a background of the research. It continues with the
methodology for mapping social networks, covering the need for a Big Data
architecture, and identifying suitable asynchronous and distributable open source
technologies that are scalable in terms of volume of data and velocity of incoming data
streams. The latter section of the methodology presents the focus of this paper: the
mapping of SIL. The section describes the data structures and algorithms for mapping,