What do we mean by “data”? A proposed classification of data types in the arts and humanities

DOI: https://doi.org/10.1108/JD-07-2022-0146
Published date: 12 December 2022
Pages: 51-71
Subject Matter: Library & information science, Records management & preservation, Document management, Classification & cataloguing, Information behaviour & retrieval, Collection building & management, Scholarly communications/publishing, Information & knowledge management, Information management & governance, Information management, Information & communications technology, Internet
Author: Bianca Gualandi, Luca Pareschi, Silvio Peroni
Bianca Gualandi
Research Services Coordination Unit, Research Services Division (ARIC),
University of Bologna, Bologna, Italy
Luca Pareschi
School of Economics, University of Rome Tor Vergata, Roma, Italy, and
Silvio Peroni
Department of Classical Philology and Italian Studies, University of Bologna,
Bologna, Italy
Abstract
Purpose: This article describes the interviews the authors conducted in late 2021 with 19 researchers at the
Department of Classical Philology and Italian Studies at the University of Bologna. The main purpose was to
shed light on the definition of the word “data” in the humanities domain, as far as FAIR data management
practices are concerned, and on what researchers think of the term.
Design/methodology/approach: The authors invited one researcher for each of the official disciplinary
areas represented within the department, and all 19 agreed to participate in the study. Participants were then
divided into five main research areas: philology and literary criticism, language and linguistics, history of art,
computer science and archival studies. The interviews were transcribed and analysed using a grounded theory
approach.
Findings: A list of 13 research data types has been compiled from the information collected from
participants. The term “data” does not emerge as especially problematic, although a good deal of confusion
remains. Looking at current research management practices, methodologies and teamwork appear more
central than previously reported.
Originality/value: Our findings confirm that “data” within the FAIR framework should include all types of
inputs and outputs that humanities research works with, including publications. Also, the participants of this study
appear ready for a discussion around making their research data FAIR: they do not find the terminology
particularly problematic, while they rely on precise and recognised methodologies, as well as on sharing and
collaboration with colleagues.
Keywords: Research data management, FAIR principles, Survey, Humanities, Grounded theory approach
Paper type: Research paper
Introduction
The start of the discussion around widening public access to research can be traced back to
the Budapest Open Access Initiative that, in a 2002 declaration, defined for the first time the
term open access (Chan et al., 2002). Other declarations followed suit (e.g. Brown et al., 2003;
Max Planck Society, 2003), broadening the definition to also include:
research results, raw data and metadata, source materials, digital representations of pictorial and
graphical materials and scholarly multimedia material (Max Planck Society, 2003).
© Bianca Gualandi, Luca Pareschi and Silvio Peroni. Published by Emerald Publishing Limited. This
article is published under the Creative Commons Attribution (CC BY 4.0) license. Anyone may
reproduce, distribute, translate and create derivative works of this article (for both commercial and non-
commercial purposes), subject to full attribution to the original publication and authors. The full terms of
this license may be seen at http://creativecommons.org/licences/by/4.0/legalcode
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/0022-0418.htm
Received 6 July 2022
Revised 4 November 2022
Accepted 7 November 2022
Journal of Documentation
Vol. 79 No. 7, 2023
pp. 51-71
Emerald Publishing Limited
0022-0418
DOI 10.1108/JD-07-2022-0146
With focus shifting away from publication and towards all research materials, the terms open
data and open science were born.
The European Union’s Research and Innovation policy has embraced these concepts as
pillars of the European Research Area (ERA) (European Commission, 2022). The new ERA
policy agenda for the period 2022–2024 sets out a number of actions to enable open science
and develop a “web of FAIR data and services for science in Europe” (European Commission,
2022; EOSC, 2022).
FAIR is an acronym for Findable, Accessible, Interoperable and Reusable (Wilkinson et al.,
2016) and indicates a set of data management practices centred on machine actionability. In this
scenario, the exact meaning of the word data is increasingly being discussed within the scholarly
community. The use of this term, and the application of FAIR principles, within the arts and
humanities domain is not without problems. As pointed out, among others, by Tóth-Czifra:
[...] by applying the FAIR data guiding principles to arts and humanities data curation workflows, it
will be uncovered that contrary to their general scope and deliberately domain-independent nature,
they have been implicitly designed according to underlying assumptions about how knowledge
creation operates and communicates (Tóth-Czifra, 2019, p. 3).
In the same work, the author calls for surveys to assess whether and to what extent the
term data is still a “dirty word” (Tóth-Czifra, 2019, p. 15), an idea first introduced by Hofelich
Mohr and colleagues in their article “When data is a dirty word: a survey to understand data
management needs across diverse research disciplines” (Hofelich Mohr et al., 2015a).
The present work is a contribution towards this goal and towards looking for a way
forward for FAIR data principles within the arts and humanities without relying on
assumptions drawn from other disciplines. This study addresses the following research
questions:
RQ1. How do we define “data” in the arts and humanities?
RQ2. What do these researchers think of the word “data” and how do they associate
“data” with research materials used in their discipline?
RQ3. What are their attitudes towards open science?
RQ4. What are their current data management practices?
Please note that the present study falls short of proposing solutions for the implementation of
FAIR data principles in the humanities. Rather, it concentrates primarily on the definition of
the term “data” and on the way it is used in humanities research communities and,
secondarily, on surveying some of the current data management practices.
Literature review
How do we define “data” in the humanities? An open problem
The European Federation of Academies of Sciences and Humanities [1] (ALLEA) Working
Group on E-Humanities, in its report Sustainable and FAIR Data Sharing in the Humanities,
states that the definition of data encompasses all inputs and outputs of research that are not
publications (Harrower et al., 2020, p. 6). It does recognise, however, that both texts and
documents are data (Harrower et al., 2020, pp. 8 and 14).
The CO-OPERAS Implementation Network (IN), which is part of GO FAIR [2], aims at
helping social sciences and humanities (SSH) communities:
build a bridge between SSH data and the EOSC, widening the concept of “research data” to include all
of the types of digital research output linked to scholarly communication that are part of the research
process (OPERAS, 2022).