How “accessible” is open data? Analysis of context-related information and users’ comments in open datasets

Pages: 19-36
Published date: 29 January 2020
DOI: https://doi.org/10.1108/ILS-08-2019-0086
Authors: Engida H. Gebre, Esteban Morales
Subject Matter: Library & information science, Librarianship/library management, Library & information services
Engida H. Gebre
Faculty of Education, Simon Fraser University, Vancouver, Canada, and
Esteban Morales
Department of Language and Literacy Education,
The University of British Columbia, Vancouver, Canada
Abstract
Purpose: This paper aims to examine the nature and sufficiency of descriptive information included in
open datasets and the nature of comments and questions users write in relation to specific datasets. Open
datasets are provided to facilitate civic engagement and government transparency. However, making the data
available does not guarantee usage. This paper examined the nature of context-related information provided
together with the datasets and identified the challenges users encounter while using the resources.
Design/methodology/approach: The authors extracted descriptive text provided together with (often
at the top of) datasets (N = 216) and the questions and comments users post in relation to the dataset.
They then segmented text descriptions and user comments into idea units and applied open coding with
the constant comparison method. This allowed them to come up with thematic issues that descriptions focus on
and the challenges users encounter.
Findings: Results of the analysis revealed that context-related descriptions are limited and normative.
Users are expected to figure out how to use the data. Analysis of user comments/questions revealed four areas
of challenge they encounter: organization and accessibility of the data, clarity and completeness, usefulness
and accuracy, and language (spelling and grammar). Data providers can do more to address these issues.
Research limitations/implications: The purpose of the study is to understand the nature of open
data provision and suggest ways of making open data more accessible to non-expert users. As such, it is not
focused on generalizing about open data provision in various countries, as such provision may differ
by jurisdiction.
Practical implications: The study provides insight into ways of organizing open datasets so that the
resource can be accessible to the general public. It also provides suggestions about how open data providers
could consider users’ perspectives, including providing continuous support.
Originality/value: Research on open data often focuses on technological, policy and political
perspectives. Arguably, this is the first study to analyze context-related information in open datasets.
Datasets do not speak for themselves because they require context for analysis and interpretation.
Understanding the nature of context-related information in open datasets is an original idea.
Keywords: Accessibility, Open data, Data literacy, Civic engagement, Open data usage,
Data and context
Paper type: Research paper
This work was supported by the National Science Foundation [grant number IIS-1441561] and SFU/
SSHRC Small grant [Number 632223] provided to the first author.
Received 25 August 2019
Revised 8 November 2019
Accepted 12 December 2019
Information and Learning Sciences, Vol. 121 No. 1/2, 2020, pp. 19-36
© Emerald Publishing Limited
2398-5348
DOI 10.1108/ILS-08-2019-0086
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2398-5348.htm
Open data are resources that can be used to foster civic engagement, build open government
and develop public sector transparency for citizens (Kitchin, 2014; Mergel et al., 2018).
In many countries, open data are open public sector information (Zuiderwijk and Janssen, 2014)
that is created, collected, organized, preserved and made freely available for public use through
government and public sector institutions. Such data are provided through open data portals,
often organized and managed by data custodians at local, provincial/state and national levels of
government. One of the assumptions behind open data is that easy, free-of-charge access to data
will engage citizens and governments in an innovative and collaborative
process of addressing community needs (European Commission, 2015; Nugroho et al., 2015;
Zuiderwijk et al., 2014). Accordingly, there are considerable initiatives that focus on building
infrastructure and establishing standards of publication, including quality guidelines and
protection of privacy.
However, availability of open data does not guarantee usage and critical engagement
among the public. Currently, use of open data is generally low (Donker and van Loenen,
2017; Ruijer et al., 2017; Zuiderwijk et al., 2016) and usage is often limited to researchers, data
scientists and professionals in the private sector (Government of Canada, 2017). Two related
challenges can be identified in relation to the use of open data by the public. The first is
the organization and presentation of data, including insufficient metadata for
datasets (Zuiderwijk et al., 2012). The second is the lack of data literacy skills
among actual and potential users. Understanding and using data requires access to
information about the nature of the data and how the data were captured and
organized (Bowne-Anderson, 2018). Data are surrogates for and representations of
phenomena, and using data requires a specific social and technological context for
analysis and interpretation (Boyd and Crawford, 2012; Kitchin, 2014). However, open
datasets are often provided as spreadsheets and CSV files with limited
context-related information about the collection and possible use of the resources.
Ruijer et al. (2017) argued that open data designs do not consider the fact that
various groups of users process and use datasets for different purposes. There is
also a mismatch between available datasets and users’
requirements for data (Donker and van Loenen, 2017).
Studies have examined policies, standards and barriers of publishing, accessibility and
future direction of open data systems (Chu and Tseng, 2016; Conradie and Choenni, 2014;
Donker and van Loenen, 2017; Hossain et al., 2016; Okamoto, 2017). Researchers have also
examined the challenges that experienced users encounter when accessing and using open
data (Janssen et al., 2012; Zuiderwijk et al., 2012). Zuiderwijk et al. (2016) rightly criticized
existing open data platforms for serving as data repositories with limited, if any,
opportunity to engage users in creative and value-adding interactions. Open data standards
and related research have focused on the general structure of open data portals as opposed
to dataset-level organization. Accordingly, it is not easy to determine the value and
usefulness of specific datasets (Lourenço, 2015).
The second challenge is the lack of data literacy skills among actual and potential users.
Despite the promise of engaging citizens and addressing community-related issues, open
data users are mainly researchers and data scientists. Lack of technical skills among users
and the absence of opportunities to actively engage users in creative collaborations are the
reasons for the limited use of open data (Gebre, 2018; Eberhardt and Silveira, 2018;
Gascó-Hernández et al., 2018). In the context of secondary education, for example, open data can be
used to develop context-oriented data literacy programs where students work on project
topics that are relevant to their community and personal life (Gebre and Polman, 2016).
Although training and data literacy are essential components for a productive use of open
data, this paper focuses on the first challenge, the organization of open datasets, with a specific
emphasis on context-related information. This is because there are studies, such as that of
Gascó-Hernández et al. (2018), that have outlined training requirements related to the use of open data.
