Whither the retention schedule in the era of big data and open data?

Published: 15 July 2014
DOI: https://doi.org/10.1108/RMJ-01-2014-0010
Pages: 99-121
Authors: John McDonald, Valerie Léveillé
Subject matter: Information & knowledge management; Information management & governance
John McDonald
Information Management Consulting and Education, Ottawa,
Canada, and
Valerie Léveillé
School of Library, Archival and Information Studies,
University of British Columbia, Vancouver, Canada
Abstract
Purpose – This article, which is one of the products of an international collaborative research initiative
called iTrust, aims to explore the retention and disposition questions raised by big data and open data
initiatives and offer suggestions concerning how the issues they raise can be addressed.
Design/methodology/approach – The article describes the results of the first stage in a multi-stage
research project leading to methods for developing retention and disposition specifications and formal
schedules for open data and big data initiatives. A fictitious organization is used to describe the
characteristics of open data and big data initiatives, the gap between current approaches to setting
retention and disposition specifications and schedules and what is required, and how that gap can be
closed. The landscape described as a result of this stage in the research will be tested in case studies
established in the second stage of the project.
Findings – The argument is made that the business processes supporting open data and big data
initiatives could serve as the basis for developing enhanced standards and procedures that are relevant
to the characteristics of these two kinds of initiatives. The point is also made, however, that addressing
the retention and disposition issues requires knowledge and leadership, both of which are in short
supply in many organizations. The characteristics, the issues and the approaches will be tested through
case studies and consultations with those involved with managing and administering big data and open
data initiatives.
Originality/value – There is very little, if any, current literature that addresses the impact of big data
and open data on the development and application of retention schedules. The outcome of the research
will benefit those who are seeking to establish processes leading to formally approved retention and
disposition specifications, as well as an instrument – the approved retention and disposal schedule –
designed to ensure the ongoing integrity of the records and data associated with big data and open data
initiatives.
Keywords Big data, Open data, Retention, Business process, Schedule, Disposition
Paper type Conceptual paper
This project was realized through the support of the SSHRC-funded InterPARES Trust project
based at the School of Library, Archival and Information Studies (SLAIS) at the University of
British Columbia. The authors are grateful for the comments and editorial suggestions made by
the InterPARES Trust’s Project Director, Dr Luciana Duranti; the ITrust project coordinator,
Corrine Rogers; and Hans Hofman of the National Archives of The Netherlands (retired).
Received 20 January 2014; Revised 28 April 2014; Accepted 2 June 2014
Records Management Journal, Vol. 24 No. 2, 2014, pp. 99-121
© Emerald Group Publishing Limited, ISSN 0956-5698
DOI 10.1108/RMJ-01-2014-0010
Introduction
As big data and open data initiatives are being established by organizations around the
world, concerns are being raised about their impact on the management of records[1].
Within the context of these overall concerns, questions are also being raised about the
development of records retention and disposition specifications and the establishment
of formal records retention schedules. What are the thought processes and criteria
influencing retention and disposition specifications for these types of initiatives? To
what extent are current approaches to developing, approving and implementing formal
schedules relevant to the worlds of big data and open data? What will have to change?
These and related questions are being addressed by a research project established under
InterPARES Trust (ITrust), an international research collaboration designed to develop:
[…] a framework for local, national and international networks of policies, procedures,
regulations, standards and legislation concerning digital records entrusted to the Internet, to
ensure public trust grounded on evidence of good governance, a strong digital economy, and a
persistent digital memory.
This article, which is one of the products of the research project, explores these questions
by identifying the issues at the heart of establishing retention and disposition
specifications within the context of big data and open data initiatives. Through an
analysis of the evolution of big data and open data concepts and of the approaches that
have been adopted for the establishment of retention and disposition schedules, the
article will offer suggestions on how these issues may be addressed, drawing
specifically from a foundation based on the analysis of business processes and
workflow.
Big data and open data: concepts and evolution[2]
Big data and open data have become the “next new things” in the online environment.
While the two concepts are similar with regard to the nature of their data, each has
distinct characteristics and each has followed a distinct evolutionary path supported by
distinct objectives and communities. Thus, if the issues concerning retention and
disposition are to be understood, one must first understand the characteristics that are
associated with big data and open data initiatives.
Open data
The concept of open data is twofold; it couples the act of proactive disclosure of
government-generated information in the form of open datasets, normally within the
context of open government policy, and the intended audience’s ability to access that
data. The concept is rooted in the objective to increase government transparency,
generate public input and interest and stimulate social and economic development.
Traditionally, the notion of open data derives from the original concept of “open
information”. According to the Open Knowledge Foundation (OKF), “Open
Information” can be described by the following statement: “A piece of data is open if
anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement
to attribute and/or share-alike”. Based on this understanding of “open information”, the
OKF has developed the following definition for open data: “Open Data is data that can be
used freely, shared and built upon by anyone, anywhere, for any purpose” and has
four defining characteristics (James, 2013):