Context from the data reuser’s point of view

DOI: https://doi.org/10.1108/JD-08-2018-0133
Pages: 1274-1297
Published date: 26 September 2019
Author: Ixchel M. Faniel, Rebecca D. Frank, Elizabeth Yakel
Subject Matter: Library & information science, Records management & preservation, Document management, Classification & cataloguing, Information behaviour & retrieval, Collection building & management, Scholarly communications/publishing, Information & knowledge management, Information management & governance, Information management, Information & communications technology, Internet
Ixchel M. Faniel
OCLC Research, Dublin, Ohio, USA, and
Rebecca D. Frank and Elizabeth Yakel
School of Information, University of Michigan, Ann Arbor, Michigan, USA
Abstract
Purpose: Taking the researchers' perspective, the purpose of this paper is to examine the types of context
information needed to preserve data's meaning in ways that support data reuse.
Design/methodology/approach: This paper is based on a qualitative study of 105 researchers from
three disciplinary communities: quantitative social science, archaeology and zoology. The study focused on
researchers' most recent data reuse experience, particularly what they needed when deciding whether to
reuse data.
Findings: Findings show that researchers mentioned 12 types of context information across three broad
categories: data production information (data collection, specimen and artifact, data producer, data analysis,
missing data, and research objectives); repository information (provenance, reputation and history, curation
and digitization); and data reuse information (prior reuse, advice on reuse and terms of use).
Originality/value: This paper extends digital curation conversations to include the preservation of context
as well as content to facilitate data reuse. When compared to prior research, findings show that there is some
generalizability with respect to the types of context needed across different disciplines and data sharing and
reuse environments. It also introduces several new context types. Relying on the perspective of researchers
offers a more nuanced view that shows the importance of the different context types for each discipline and
the ways disciplinary members thought about them. Both data producers and curators can benefit from
knowing what to capture and manage during data collection and deposit into a repository.
Keywords: User studies, Research work, Information studies, Data sharing, Data curation, Data reuse
Paper type: Research paper
1. Introduction
Context is a critical component for data reuse. Defined as "the interrelated conditions in
which something exists or occurs" (Merriam Webster Online Dictionary,
http://www.merriam-webster.com/dictionary/context), the technical aspects of context necessary for
reliable, long-term access to digital content (e.g. cultural objects, data, images, software, etc.)
have been explored in the digital preservation and curation literature. Less attention has
focused on the context necessary for preserving content's meaning to support reuse. Yet,
preserving meaning has become increasingly important as scholarly communities look to
share and reuse research data to replicate and reproduce research as well as perform novel
inquiries. Although researchers have proposed frameworks (Baker and Yarmey, 2009;
Beaudoin, 2012a, b; Lee, 2011), engaging with data reusers to specify the context necessary
to support meaning making has occurred less frequently.
Lee's (2011) framework for context information in digital collections outlines three context
types for any given target entity (e.g. object, data, record): representations or relational entities
of the target entity, factors external to but potentially acting upon the target entity, and
perceptions of the user of a target entity. Drawing from the cataloging and classification,
archives, and information needs literature, he presents a framework comprised of nine classes
of contextual entities. Although Lee provides a broad view of context, the framework primarily
focuses on the context information necessary to ensure that digital objects can be rendered over
time. According to Lee (2011, p. 104), the entities provide an exhaustive documentation of a
digital object and the framework is intended to inform the creation, capture and curation of the
contextual information within a repository, which can help to understand, make sense of,
analyze and use a particular target digital object. Albeit a valid approach and important
contribution, Lee's framework primarily speaks to digital curators' activities, providing very
few details about the context necessary for meaning making during reuse.

Journal of Documentation, Vol. 75 No. 6, 2019, pp. 1274-1297
© Emerald Publishing Limited, 0022-0418
DOI 10.1108/JD-08-2018-0133
Received 16 August 2018; revised 3 January 2019; accepted 10 January 2019
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0022-0418.htm
The Dissemination Information Packages for Information Reuse (DIPIR) Project was made possible
in part by a National Leadership Grant from the Institute of Museum and Library Services,
LG-06-10-0140-10, with additional support from OCLC and University of Michigan. The authors thank
members of the DIPIR team, including University of Michigan students, research fellows, institutional
partners and individual collaborators.

Based on a review of the digital preservation literature, Beaudoin (2012a) develops a
multi-dimensional framework. The framework goes beyond technical aspects of context to
include utilization, physical, intangible, curatorial, authentication, authorization, and
intellectual aspects. Taken together these elements enable successful retrieval, assessment,
management, access, and use of preserved digital content (Beaudoin, 2012a, Abstract). In
subsequent work, Beaudoin (2012b) develops a questionnaire intended to generate a set of
metadata elements to describe an object's context according to the framework. Unfortunately,
the questions fall short of defining audience needs beyond several high-level categories,
such as educational, leisure, legal, medical and youth.
Expressing user and reuser needs at this level of abstraction is common. Studies
acknowledge that capturing data context is difficult, noting context's tacitness, the human
labor costs required to create and manage it, and data creators' and information
professionals' lack of expertise and knowledge about what should be captured (Berg and
Goorman, 1999; Birnholtz and Bietz, 2003; Edwards et al., 2011). Interestingly, few studies in
research and practice have asked researchers about their data reuse needs in general or their
needs for context information about the data specifically. Yet, a key premise of the Open
Archival Information System reference model for the curation and preservation of digital
objects is identifying designated communities to determine whether the context information
needed to support reuse is being provided (Consultative Committee for Space Data Systems,
2012). Many librarians and archivists take issue with the term designated communities,
which can encompass a broad public, making the task of preserving meaning such that it
can be understood by everyone unfathomable (Bettivia, 2016). Even for those who have a
more focused designated community in mind, it is rare that the community is defined
explicitly, and its members' input systematically collected and incorporated into digital
repository design (Bettivia, 2016). For these reasons, we draw from the data sharing and
reuse literature, particularly Chin and Lansing's (2004) study, which offers the most
comprehensive view of different context types a data reuser may require.
Using participatory analysis and design sessions, Chin and Lansing (2004) asked biologists
to describe different scenarios of collaborative data sharing and reuse, including the kinds of
information and computational capabilities needed to support data practices within the
Biological Science Collaboratory (BSC). Drawing from the context types Chin and Lansing
identified, we consider whether their model can be generalized to researchers in three other
disciplines, whose data reuse is not the result of joint work in a collaboratory: quantitative
social science (hereafter referred to as social science), archaeology and zoology. The research
questions posed are as follows:
RQ1. What types of context information do researchers need when deciding whether to
reuse data?
RQ2. How do researchers' needs for different types of context information vary across
disciplinary communities?
Our findings confirm and expand Chin and Lansing's conclusions about context, particularly
as it relates to what researchers need to evaluate data for reuse. Before discussing our findings