A multi-dimensional city data embedding model for improving predictive analytics and urban operations

DOI: https://doi.org/10.1108/IMDS-01-2022-0020
Published date: 14 June 2022
Pages: 2199-2216
Subject Matter: Information & knowledge management, Information systems, Data management systems, Knowledge management, Knowledge sharing, Management science & operations, Supply chain management, Supply chain information systems, Logistics, Quality management/systems
Author: Zhe Jing, Yan Luo, Xiaotong Li, Xin Xu
Zhe Jing
Department of Management & Marketing,
Faculty of Business, The Hong Kong Polytechnic University, Hong Kong, China
Yan Luo
Department of Computing, The Hong Kong Polytechnic University,
Hong Kong, China
Xiaotong Li
College of Business, The University of Alabama in Huntsville,
Huntsville, Alabama, USA, and
Xin Xu
Department of Management & Marketing,
Faculty of Business, The Hong Kong Polytechnic University, Hong Kong, China
Abstract
Purpose: A smart city is a potential solution to the problems caused by the unprecedented speed of
urbanization. However, the increasing availability of big data is a challenge for transforming a city into a smart
one. Conventional statistics and econometric methods may not work well with big data. One promising
direction is to leverage advanced machine learning tools in analyzing big data about cities. In this paper, the
authors propose a model to learn region embedding. The learned embedding can be used for more accurate
prediction by representing discrete variables as continuous vectors that encode the meaning of a region.
Design/methodology/approach: The authors use the random walk and skip-gram methods to learn
embedding and update the preliminary embedding generated by a graph convolutional network (GCN).
The authors apply this model to a real-world dataset from Manhattan, New York, and use the learned
embedding for crime event prediction.
Findings: This study's results show that the proposed model can learn multi-dimensional city data more
accurately. Thus, it helps cities transform themselves into smarter ones that are more sustainable and efficient.
Originality/value: The authors propose an embedding model that can learn multi-dimensional city data for
improving predictive analytics and urban operations. This model can learn more dimensions of city data, reduce
the amount of computation and leverage distributed computing for smart city development and transformation.
Keywords: Smart city, Big data, Machine learning, Region embedding, Graph convolutional network (GCN)
Paper type: Research paper
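The random walk and skip-gram step named in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical four-region adjacency graph and hypothetical helper names (`random_walk`, `skipgram_pairs`). It does not reproduce the authors' graph construction, the GCN-generated preliminary embedding or their training procedure, none of which are specified in this excerpt. It only shows how walks over a region graph could yield the (center, context) pairs on which a skip-gram model is typically trained.

```python
import random

# Hypothetical adjacency between four regions; illustrative only,
# not data from the paper.
REGION_GRAPH = {
    "r0": ["r1", "r2"],
    "r1": ["r0", "r2"],
    "r2": ["r0", "r1", "r3"],
    "r3": ["r2"],
}


def random_walk(graph, start, length, rng):
    """Return a uniform random walk of up to `length` nodes from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def skipgram_pairs(walk, window):
    """Return (center, context) pairs for positions within `window` steps."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


rng = random.Random(0)  # fixed seed so the sketch is repeatable
walks = [random_walk(REGION_GRAPH, node, 5, rng) for node in REGION_GRAPH]
pairs = [p for w in walks for p in skipgram_pairs(w, window=2)]
```

In a fuller pipeline, pairs like these would be passed to a skip-gram trainer so that regions appearing in similar walk contexts receive similar vectors. The walk length, window size and uniform neighbor choice here are arbitrary assumptions.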
Author contribution: Zhe Jing and Yan Luo contributed equally to this paper.
Funding: The research was funded by the Key Program of NSFC-FRQ Joint Project
(no: 72061127002); Hong Kong Research Grants Council, The Diffusion of Online Learning
Technology (no: 15503719); Natural Science Foundation of Guangdong Province
(no: 2019A1515012095) and Shenzhen Humanities & Social Sciences Key Research Bases.
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/0263-5577.htm
Received 10 January 2022
Revised 17 May 2022
Accepted 17 May 2022
Industrial Management & Data Systems, Vol. 122 No. 10, 2022, pp. 2199-2216
© Emerald Publishing Limited, 0263-5577
DOI 10.1108/IMDS-01-2022-0020
1. Introduction
With the continuous growth of the global population and the fast development of urbanization, the
urban population is increasing rapidly (World Bank, 2014). The ever-increasing urban population
and rapidly changing demographics complicate the urban structure (Boeing, 2018). At the same
time, the natural environment is susceptible to various threats, such as energy shortages, air
pollution and global warming (Dong et al., 2018; Ogura and Jakovljevic, 2018). Nowadays, many
people living in cities face various risks and problems, such as the shortage of water resources and
the unbalanced distribution of medical resources (Hadadin et al., 2010). Therefore, better
developing and managing urban areas has become increasingly important in addressing these
problems.
A smart city is a potential solution to the problems caused by the unprecedented speed of city
development and urbanization (Hall et al., 2000). The concept of smart cities can be traced back to
1974, when the first big data project for cities was created in Los Angeles (Los Angeles Community
Analysis Bureau, 1974). Since then, academia and industry have invested time and effort in
advancing smart city research. IBM proposed that policymakers treat a city as a complex
interconnected network that can proactively predict and solve problems, maximize resources and
use the information to make better decisions (Wiig, 2015). Many academic studies in this domain
have explored the constituent elements of smart cities and the interrelationships among them
(Hollands, 2008; Allwinkle and Cruickshank, 2011; Lombardi et al., 2012; Chourabi et al., 2012).
Based on the definitions and concepts of smart cities put forward by different scholars, a smart city
should be able to make conscious efforts to use information systems strategically, seeking to
achieve prosperity, effectiveness and competitiveness at multiple levels of the urban society
(Angelidou, 2014). While the goals of a smart city are relatively straightforward, the approaches to
transforming a city into a smart one remain unclear.
In recent years, the development of the digital infrastructure, such as the Internet of things (IoT)
and information and communication technologies (ICT), has enabled the rapid growth of big data
at the city level (Batty, 2013; Hashem et al., 2016; Chen et al., 2017). The big data of a city can be
considered a spontaneous, objective and accurate recording of the multi-dimensional
characteristics of the city. Big data is usually generated by the passive recording of various
people's activities (Rathore et al., 2016). As a result, big data can more comprehensively, objectively
and accurately capture the information of city residents and other physical objects (George et al.,
2014; Chen et al., 2015). Therefore, the availability of multi-dimensional city data provides a
valuable opportunity to develop smart cities.
However, the increasing availability of big data is also a challenge for transforming a city into a
smart one (Li et al., 2019). One of the biggest challenges scholars and governments face is the
diversity and hierarchy of data sources (sensors, mobile phone apps, social media, web activity
history and tracking devices), all of which can generate enormous amounts of data (Ghosh et al.,
2016). Thus, leveraging big data to achieve smart city transformation has become an influential
research topic. In the past, many scholars focused on studying how to develop sustainable and
smart cities by analyzing data with statistical and econometric tools. For example, Neirotti et al.
(2014) perform a regression analysis of a sample of 70 international cities to identify the crucial
factors that influence the coverage index, which measures the impacts on the development of smart
city initiatives. Their study helps policymakers under budget constraints prioritize smart city
initiatives, thereby maximizing the return of smart city investments. Liu et al. (2021) use a spatial
econometric model to identify key factors influencing smart city development with a sample of 83
Chinese cities. Their results show that governmental support, innovativeness, economic
development and human capital are the four key factors that help policymakers make decisions
to develop smart cities.
However, conventional statistics and econometric methods may not work well with big data
(Varian, 2014). First, the massive dynamic data render data manipulation tools in econometrics
useless. Second, in many cases, people have to select appropriate predictors from a large number of
available variables to improve predictive accuracy. However, this task cannot be efficiently
achieved with conventional econometric models. Third, linear models often do not accurately
reflect the relationships among variables in big data. Thus, we need to introduce more flexible
models to examine the complex relationships among many variables.