A systematic review on the use of best practices for publishing linked data

Pages: 107-123
Published date: 12 February 2018
DOI: https://doi.org/10.1108/OIR-11-2016-0322
Author: Danila Feitosa, Diego Dermeval, Thiago Ávila, Ig Ibert Bittencourt, Bernadette Farias Lóscio, Seiji Isotani
Subject Matter: Library & information science, Information behaviour & retrieval, Collection building & management, Bibliometrics, Databases, Information & knowledge management, Information & communications technology, Internet, Records management & preservation, Document management
Danila Feitosa
Computing Institute, Federal University of Alagoas, Maceio, Brazil
Diego Dermeval
Penedo Educational Unity, Federal University of Alagoas, Penedo, Brazil
Thiago Ávila and Ig Ibert Bittencourt
Computing Institute, Federal University of Alagoas, Maceio, Brazil
Bernadette Farias Lóscio
Informatics Center, Federal University of Pernambuco, Recife, Brazil, and
Seiji Isotani
Institute of Mathematics and Computer Science,
University of São Paulo, São Carlos, Brazil
Abstract
Purpose: Data providers have been increasingly publishing content as linked data (LD) on the Web. This
process includes guidelines (i.e. good practices) to publish, share, and connect data on the Web. Several people in
different areas, for instance, sciences, medicine, governments and so on, use these practices to publish data.
The LD community has been proposing many practices to aid the publication of data on the Web. However,
discovering these practices is a costly and time-consuming task, considering the practices that are produced by
the literature. Moreover, the community still lacks a comprehensive understanding of how these practices are
used for publishing LD. Thus, the purpose of this paper is to investigate and better understand how best
practices support the publication of LD and to identify to what extent they have been applied to this field.
Design/methodology/approach: The authors conducted a systematic literature review to identify the
primary studies that propose best practices to address the publication of LD, following a predefined review
protocol. The authors then identified the motivations for recommending best practices for publishing LD and
looked for evidence of the benefits of using such practices. The authors also examined the data formats and
areas addressed by the studies as well as the institutions that have been publishing LD.
Findings: In summary, the main findings of this work are: there is empirical evidence of the benefits of
using best practices for publishing LD, especially for defining standard practices, integrability and uniformity
of LD; most of the studies used RDF as data format; there are many areas interested in disseminating data in a
connected way; and there is a great variety of institutions that have published data on the Web.
Originality/value: The results presented in this systematic review can be very useful to the semantic web
and LD community, since the review gathers pieces of evidence from the primary studies it includes,
forming a body of knowledge regarding the use of best practices for publishing LD and pointing out interesting
opportunities for future research.
Keywords: Systematic review, Linked data, Web, Best practices for publishing linked data
Paper type: Literature review
Introduction
An increasing number of data providers around the world are publishing linked data (LD)
on the Web. They are publishing data that can be used for many purposes, for example, to
be consumed or integrated into their applications or to produce more informed business
decisions. Hence, this process is leading to the creation of a global data space that contains
billions of pieces of information.
Data have been published on the Web in different formats, for instance, PDF, TIFF, CSV,
spreadsheets, embedded tables in documents, and many other forms of plain text.
Online Information Review
Vol. 42 No. 1, 2018
pp. 107-123
© Emerald Publishing Limited
1468-4527
DOI 10.1108/OIR-11-2016-0322
Received 10 November 2016
Revised 4 June 2017
Accepted 15 June 2017
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/1468-4527.htm
These files are usually linked using HTML page links. However, this kind of data usage
has an important drawback: it is formatted for human consumption and often requires a
specialized ability to consume data (Wood and Marsha Zaidman, 2014). For this reason, it is
not an easy task to access, search, or re-use this data in automatic ways.
LD includes a set of practices to publish, share, and connect data on the Web using
international standards of the World Wide Web Consortium (W3C) (Wood and Marsha
Zaidman, 2014). The LD community is increasingly proposing guidelines to aid the
development and delivery of data using LD concepts. These practices are mainly developed
by the W3C and are called Data on the Web best practices[1]. Following these practices,
governments and researchers have been using these standards for publishing LD according to
different needs (Consoli et al., 2014; Galiotou and Fragkou, 2013; Kaschesky and Selmi, 2013;
Marden et al., 2013).
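To make the contrast with files formatted for human consumption concrete, the following is a minimal sketch of turning a small table into RDF statements serialized as N-Triples. It is an illustration only, not a method taken from the reviewed studies: the base URI, the vocabulary namespace (both under example.org), the sample table, and the function names are all assumptions introduced here.

```python
import csv
import io

# Hypothetical namespaces; example.org is reserved for illustration.
BASE = "http://example.org/city/"
VOCAB = "http://example.org/vocab#"

# A small table as it might be published in a CSV file (illustrative sample).
CSV_TEXT = """id,name,state
1,Maceio,Alagoas
2,Recife,Pernambuco
"""


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal.

    Simplification: other escapes (e.g. newlines) are not handled here.
    """
    return value.replace("\\", "\\\\").replace('"', '\\"')


def csv_to_ntriples(text):
    """Emit one N-Triples statement per non-id cell, keyed by a per-row URI."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row['id']}>"
        for column, value in row.items():
            if column == "id":
                continue  # the id is already encoded in the subject URI
            predicate = f"<{VOCAB}{column}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(CSV_TEXT):
        print(line)
```

Each row becomes a resource identified by a URI, so that other datasets can refer to it. In this sketch all values are plain literals; a fuller version would presumably replace values such as the state name with URIs of resources in other datasets, which is what makes the data linked.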
In this way, currently, there are several practices for publishing structured data on the
Web that may be used in different contexts. However, to the best of our knowledge, there is
no investigation that provides a comprehensive understanding of how these practices are
being used in the literature.
Hence, the objective of this work is to conduct a systematic review of the literature to find
out which are the best practices for publishing LD. We investigate whether there is real
evidence to justify the use of the best practices. Moreover, we also investigate: the
motivations for proposing best practices; which data formats are being used together with
best practices; which knowledge areas have been addressed using these practices; and
which institutions are using them.
In this paper, we use the systematic literature review (SLR) method (Kitchenham and
Charters, 2007) to identify, evaluate, interpret, and synthesize the available studies to
answer research questions on the use of best practices for publishing LD and to establish
the state of evidence with in-depth analysis.
This paper is organized as follows. We first describe the SLR method used in this review
and the context within which it is situated. Next, we present the results of the quality assessment
and an overview of the studies. We then report the findings of the review along with a
detailed analysis and discussion of each research question. Afterward, we summarize the
best practices found after analyzing the papers of this SLR. Finally, we present our
conclusions and future work.
Related work
In the literature, there are several works that aim to systematically review the literature to
study the state of the art in a given domain.
Figueroa et al. (2015) presented a systematic review of the state of the art in
recommender systems (RS) that used structured data published as LD. In that paper, the authors
considered the problems that RS intend to solve, how LD was used to address those problems,
their contributions, domains of application, and evaluation approaches. As a result,
the authors highlighted current limitations and possible directions for future work. Another
systematic review about LD was made by Tew et al. (2017), who sought to understand the
use of hospital data for research in Australia in the last two decades. In this study, the
authors identified that administrative hospital data linked with other data has the potential
to be a cost-effective method to significantly improve health policies. The authors
highlighted that LD avoids the losses to follow-up that are common in longitudinal studies
because it allows patients to be retrospectively traced. Calero Valdez et al. (2016) aimed to
provide a systematic review of recommender systems to identify how they are applied in
health scenarios. In this sense, the authors also sought to identify structures that help to
create better health recommender systems. The structure suggested by the authors aims to
guide a developer to get a view of the restrictions about medical applications. In this sense,