The influences of social value orientation and domain knowledge on crowdsourcing manuscript transcription. An empirical investigation of the Transcribe-Sheng project

Date: 24 December 2019
Published date: 24 December 2019
Authors: Xuanhui Zhang, Si Chen, Yuxiang Chris Zhao, Shijie Song, Qinghua Zhu
Subject matter: Library & information science, Information behaviour & retrieval, Information & knowledge management, Information management & governance, Information management
Xuanhui Zhang and Si Chen
School of Information Management, Nanjing University, Nanjing, China
Yuxiang Chris Zhao
School of Economics and Management,
Nanjing University of Science and Technology, Nanjing, China, and
Shijie Song and Qinghua Zhu
School of Information Management, Nanjing University, Nanjing, China
Purpose: The purpose of this paper is to explore how social value orientation and domain knowledge affect
cooperation levels and transcription quality in crowdsourced manuscript transcription, and to contribute to the
recruitment of participants in such projects in practice.
Design/methodology/approach: The authors conducted a quasi-experiment using Transcribe-Sheng,
which is a well-known crowdsourced manuscript transcription project in China, to investigate the influences
of social value orientation and domain knowledge. The experiment lasted one month and involved
60 participants. ANOVA was used to test the research hypotheses. Moreover, interviews and thematic
analyses were conducted to analyze the qualitative data in order to provide additional insights.
Findings: The analysis confirmed that in crowdsourced manuscript transcription, social value orientation
has a significant effect on participants' cooperation level and transcription quality; domain knowledge has a
significant effect on participants' transcription quality, but not on their cooperation level. The results also
reveal the interactive effect of social value orientation and domain knowledge on cooperation levels and
quality of transcription. The analysis of the qualitative data illustrated the influences of social value
orientation and domain knowledge on crowdsourced manuscript transcription in detail.
Originality/value: Researchers have paid little attention to the impacts of psychological and
cognitive factors on crowdsourced manuscript transcription. This study investigated the effect of social value
orientation and the combined effect of social value orientation and domain knowledge in this context. The
findings shed light on crowdsourcing transcription initiatives in the cultural heritage domain and can be used
to facilitate participant selection in such projects.
Keywords: Digital humanities, Cultural heritage domain, Crowdsourced manuscript transcription,
Volunteer selection, Social value orientation, Domain knowledge
Paper type: Research paper
Received 28 August 2019
Revised 10 November 2019
Accepted 29 November 2019
1. Introduction
In recent years, the digitization of historical manuscripts has increased in galleries, libraries,
archives and museums (GLAMs) for the long-term preservation, wide dissemination and
effective use of cultural heritage (Terras, 2016; Stewart, 2018; Sturgeon, 2018). This has been
largely facilitated by a demand for digital resources on the part of cultural institutions,
researchers and the general public (Guarino et al., 2019; Trace and Karadkar, 2017; Lang and
Rio-Ross, 2011). Digitized historical manuscripts not only can provide valuable data for digital
curation (Dallas, 2016) and digital humanities research (Clement and Carter, 2017), but they can
also make collections more visible and accessible to the public (Bonacchi et al., 2019). Traditional
manuscript transcription relies on the resources of a single institution or a few professional
transcribers. However, with the increase in the amount of manuscript material, the digitization
process has encountered many challenges, for example a lack of financial and labor support
(Cohen, 2010; Spindler, 2014), and long duration times (Lang and Rio-Ross, 2011).
Advanced technologies such as optical character recognition (OCR) and deep learning
have brought breakthroughs to the digitization of manuscripts; however, in many cases, the
results are not satisfactory due to inevitable limitations such as paper oxidation, illegible
handwriting, the great variability in calligraphic style, etc. (Traub et al., 2018; Dang, 2018;
Smith, 2014). In these cases, many cultural heritage institutions use the power of the public
to transcribe historical manuscripts. Some of the most famous projects, such as Transcribe
Bentham, Old Weather, Transcribing the Past (Civil War Manuscripts) and What's on the
Menu?, have had remarkable research outcomes (Reese, 2016; Dimock, 2013).
Crowdsourcing manuscript transcription is a common practice in the digital
humanities (Oomen and Aroyo, 2011), a harnessing of the wisdom of crowds to perform
knowledge-intensive tasks. Previous studies of crowdsourced manuscript transcription
mainly focused on the utility and importance of engaging the crowd in project
implementation. For example, Moyle et al. (2011) tested the feasibility of outsourcing the
work of manuscript transcription to members of the public for Transcribe Bentham, the first
major crowdsourcing transcription project, during which they recognized much about the
nature of community engagement with digitized resources. Daniels et al. (2014) described
how the Louisville Leader Transcription Project was implemented at the University of
Louisville, including the tools adopted and the process used. Other research studies focused
on the ways to motivate volunteers in unpaid or low-paid crowdsourcing projects, especially
when the transcription is a relatively complex task. To encourage participants, the Old
Weather Project allows volunteers to move up the ranks from cadet to captain and provides
contextual information about tasks, so that volunteers can learn more about them (Eveleigh
et al., 2013; Tinati et al., 2015). Many transcription projects, such as the
What's on the Menu? Project and the Open Dinosaur Project, attract a large number of
volunteers through intrinsic motivation (e.g. enjoyment, curiosity and fun) (Lascarides and Vershbow,
2016; Lang and Rio-Ross, 2011). In addition, for complex transcription projects, community
engagement is a way to increase participation. For instance, the Transcribe Bentham
Project highlights the academic value of engagement and focuses on three broad priority
communities of participants: students, researchers and scholars (Causer and Terras, 2014).
In contrast, traditional online crowdsourcing projects for manuscript transcription rely
primarily on the individual efforts of volunteers, rather than on cooperative approaches. To
motivate more volunteers to participate, researchers have considered many aspects of the
situation, such as improving the usability of the platform (Newman et al., 2010), motivating
user participation through game design (Tang and Prestopnik, 2019) and granting external
rewards such as money or prizes (Koh, 2019). However, it can be asked: is it better to have as many
participants as possible? Prior work shows that the top 10 percent of volunteers contribute
significantly more than others (Holley, 2010). Moreover, the quality of the transcriptions submitted
by most volunteers is often unsatisfactory, so project sponsors have to spend even
more time and money correcting the errors (Cohen, 2010). We argue that individual efforts in
manuscript transcription are limited, even when there are a large number of participants;
however, cooperation among individuals could, to some extent, promote greater effort.
Manuscript transcription is not an easy job; it usually requires a lot of time, effort and
intellectual input from the participants (Causer and Terras, 2014), especially as the complexity
of transcribing tasks is increasing (Ambati et al., 2012). Hence, it is difficult to find qualified
volunteers to engage with cooperative cultural heritage projects. In this regard, there is a
pressing need to call for suitable volunteer engagement and further identify potential

