Understanding and detecting data fabrication in large-scale assessments

Published date: 3 April 2018
Pages: 196-212
DOI: https://doi.org/10.1108/QAE-07-2017-0038
Authors: Kentaro Yamamoto, Mary Louise Lennon
Subject matter: Education; Curriculum, instruction & assessment; Educational evaluation/assessment
Kentaro Yamamoto and Mary Louise Lennon
Educational Testing Service, Princeton, New Jersey, USA
Abstract
Purpose: Fabricated data jeopardize the reliability of large-scale population surveys and reduce the comparability of such efforts by destroying the linkage between data and measurement constructs. Such data result in the loss of comparability across participating countries and, in the case of cyclical surveys, between past and present surveys. This paper aims to describe how data fabrication can be understood in the context of the complex processes involved in the collection, handling, submission and analysis of large-scale assessment data. The actors involved in those processes, and their possible motivations for data fabrication, are also elaborated.
Design/methodology/approach: Computer-based assessments produce new types of information that enable us to detect the possibility of data fabrication, and therefore the need for further investigation and analysis. The paper presents three examples that illustrate how data fabrication was identified and documented in the Programme for the International Assessment of Adult Competencies (PIAAC) and the Programme for International Student Assessment (PISA) and discusses the resulting remediation efforts.
Findings: For two countries that participated in the first round of PIAAC, the data showed a subset of interviewers who handled many more cases than others. In Case 1, the average proficiency for respondents in those interviewers' caseloads was much higher than expected, and those caseloads included many duplicate response patterns. In Case 2, anomalous response patterns were identified. Case 3 presents findings based on data analyses for one PISA country, where results for human-coded responses were shown to be highly inflated compared to past results.
Originality/value: This paper shows how new sources of data, such as timing information collected in computer-based assessments, can be combined with other traditional sources to detect fabrication.
Keywords Literacy, Large-scale assessment, Interviewers, Data analysis, Data fabrication
Paper type Research paper
Introduction
The impetus for large-scale assessments has always been the desire to collect reliable, valid and comparable information about the skills possessed by a population to better understand how those skills are related to educational, economic and social outcomes. Fabricated data jeopardize the reliability of large-scale population surveys and reduce comparability by destroying the linkage between data and measurement constructs. Fabrication undermines the inferences that can be made about performance both over time and across countries and ultimately damages the quality and utility of a survey.
The issue of fabricated data is one that has been studied across a number of disciplines and contexts. Studies of the falsification of research data (Fanelli, 2009) have included highly publicized examples such as Hwang Woo-Suk's fake stem cell study (Saunders and Savulescu, 2008) and Jan Hendrik Schön's scientific misconduct in his work on field-effect transistors at Bell Labs (Beasley et al., 2002). Data falsification extends beyond the research community to government, industry, and faith groups (Agin, 2006). In the education arena,
cheating scandals associated with the collection of achievement data in schools have been uncovered in a number of USA school districts, where educators and administrators were found to have systematically inflated test scores (Blinder, 2015; Mezzacappa, 2014). At the college level, cheating on admissions tests has been well documented, resulting in countermeasures to increase security (Strauss, 2017).
This paper describes how data fabrication can be understood in the context of the complex processes involved in the collection, handling, submission and analysis of large-scale assessment data. The actors involved in those processes, and their possible motivations for data fabrication, are also elaborated upon. Detecting fabrication requires a thorough understanding of these processes and an analysis of the data at key steps along the way. Several types of red flags that signal the possibility of data fabrication, and therefore the need for further investigation and analysis, are discussed. Finally, the paper presents three examples of how data fabrication was identified and documented in the Programme for the International Assessment of Adult Competencies (PIAAC) and the Programme for International Student Assessment (PISA) and the remediation efforts required to address these issues.
The focus in this paper is on analysis methods that were used in the first cycle of PIAAC and in PISA 2015 to detect fabrication once data were collected. As noted in several papers included in this special issue, new methodologies are being used to detect anomalous patterns of data collection much earlier in the survey cycle. In the case of household surveys, one example is the use of performance dashboards that can provide real-time monitoring capabilities during data collection (Mohadjer and Edwards, this issue). Such tools can assist in the early identification of interviewer behaviors that may be indicative of falsification and contribute to improved data quality by prompting immediate intervention on the part of national centers and survey organizations.
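
To make the idea of such monitoring concrete, the following minimal sketch computes simple per-interviewer indicators during data collection, such as implausibly short average interview durations or unusually heavy single-day caseloads. It is a hypothetical illustration only: the field names (interviewer_id, interview_date, duration_minutes) and the thresholds are assumptions for this example, not variables or rules from PIAAC, PISA or the dashboards described by Mohadjer and Edwards.

    # Hypothetical dashboard-style check run during data collection
    # (not the actual PIAAC/PISA monitoring system).
    import pandas as pd

    def flag_interviewers(records: pd.DataFrame,
                          min_mean_minutes: float = 20.0,
                          max_daily_cases: int = 6) -> pd.DataFrame:
        # One row per completed case, with assumed columns:
        # interviewer_id, interview_date, duration_minutes.
        per_interviewer = records.groupby("interviewer_id").agg(
            cases=("duration_minutes", "size"),
            mean_duration=("duration_minutes", "mean"),
        )
        # Heaviest single-day caseload per interviewer.
        max_daily = (records.groupby(["interviewer_id", "interview_date"])
                            .size()
                            .groupby("interviewer_id")
                            .max()
                            .rename("max_cases_per_day"))
        summary = per_interviewer.join(max_daily)
        # Flag interviewers whose interviews are implausibly short on average
        # or who complete more cases in a day than is plausible (illustrative cutoffs).
        summary["flag"] = ((summary["mean_duration"] < min_mean_minutes) |
                           (summary["max_cases_per_day"] > max_daily_cases))
        return summary.sort_values("flag", ascending=False)

In practice, a flag from a check like this would prompt follow-up by the national center or survey organization rather than serve as evidence of falsification on its own.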
Understanding data fabrication
Large-scale population surveys are complex endeavors designed to meet the different goals
of multiple stakeholders and institutions. They involve detailed and interconnected
processes and procedures associated with sampling, data collection, data transmission,
coding, data analysis and reporting. The procedural tools and methods developed and used
in large-scale assessments are geared toward reducing the effort in data processing and
making it more standardized. But experience has shown that multiple points within these
processes are vulnerable to data fabrication.
At any point where a group or individual collects, examines or transmits data, the
opportunity for fabrication exists. This may include, for example, operationalizing the steps associated with sampling and data collection, meeting specified participation rates and ensuring coder reliability. All the data collected and reported by a country, or just a portion, can be affected when data fabrication occurs. The complexity of the processes, as well as the number of groups and individuals involved in large-scale surveys, requires a process of forensic data analysis to discover cases in which fabrication has occurred.
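
As a concrete illustration of such a forensic screen, the sketch below summarizes each interviewer's caseload and flags the kinds of incongruities noted in the Findings above: unusually large caseloads, duplicated response patterns and mean scores well above the overall mean. It is a hypothetical example; the column names (interviewer_id, responses, score) and the flagging thresholds are assumptions, not the procedures actually applied in PIAAC or PISA.

    # Hypothetical post-collection forensic screen
    # (not the actual PIAAC/PISA analysis procedure).
    import pandas as pd

    def screen_caseloads(data: pd.DataFrame) -> pd.DataFrame:
        # One row per respondent, with assumed columns:
        # interviewer_id, responses (concatenated item responses as a string), score.
        overall_mean = data["score"].mean()
        summary = data.groupby("interviewer_id").agg(
            cases=("responses", "size"),
            duplicate_patterns=("responses", lambda s: int(s.duplicated().sum())),
            mean_score=("score", "mean"),
        )
        summary["score_gap"] = summary["mean_score"] - overall_mean
        # Arbitrary illustrative thresholds: caseloads in the top 5 per cent by size,
        # any repeated response pattern within a caseload, or a mean score far above
        # the overall mean (the sensible cutoff depends on the reporting scale).
        summary["flag"] = ((summary["cases"] > summary["cases"].quantile(0.95)) |
                           (summary["duplicate_patterns"] > 0) |
                           (summary["score_gap"] > 2 * summary["score_gap"].std()))
        return summary.sort_values(["flag", "score_gap"], ascending=False)

A screen of this kind only identifies caseloads worth examining; confirming fabrication requires the more detailed follow-up investigation described next.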
Analyzing data to detect fabrication may not follow standard analysis plans due to the
unpredictability of how or where it may have occurred. As it is not practical to examine
every piece of information for data fabrication, we must look for incongruities in the data at the aggregated or summarized level and, if anything suspicious is found, follow up with more detailed investigation. Having to conduct any investigation into data fabrication adds considerably to the complexity of data collection and analysis efforts. Fabricated data from one country often require the reanalysis of aggregated data from all participating countries,