The problem of the missing dead

Sophia Dawkins
Department of Political Science, Yale University
Corresponding author: sophia.dawkins@yale.edu

Journal of Peace Research, 2021, Vol. 58(5), 1098–1116
DOI: 10.1177/0022343320962159
Published 1 September 2021
Abstract
This article examines what scholars can learn about civilian killings from newswire data in situations of non-random
missingness. It contributes to this understanding by offering a unique view of the data-generation process in the
South Sudanese civil war. Drawing on 40 hours of interviews with 32 human rights advocates, humanitarian
workers, and journalists who produce ACLED and UCDP-GED’s source data, the article illustrates how non-
random missingness leads to biases of inconsistent magnitude and direction. The article finds that newswire data
for contexts like South Sudan suffer from a self-fulfilling narrative bias, where journalists select stories and human
rights investigators target incidents that conform to international views of what a conflict is about. This is com-
pounded by the way agencies allocate resources to monitor specific locations and types of violence to fit strategic
priorities. These biases have two implications: first, in the most volatile conflicts, point estimates about violence using
newswire data may be impossible, and most claims of precision may be false; secondly, body counts reveal little if
divorced from circumstance. The article presents a challenge to political methodologists by asking whether social
scientists can build better cross-national fatality measures given the biases inherent in the data-generation process.
Keywords
civilian, civil war, conflict, fatality, mortality, newswire data, South Sudan
Introduction
Civil war datasets offer scholars an ‘international funeral
parlour’ to view the dead (I thank Jason Stearns for this
metaphor). This presentation of death
obscures a murky reality, fraught with uncertainty about
who has died and who is worth counting. Behind each
observation in a newswire dataset is the story of a person
who falls victim to an atrocity; a killer who disfigures,
buries or conceals a corpse; a human rights officer or a
journalist who arrives once the killing is done, and deci-
des whom to talk to and what to record; a report that
bureaucrats or newspaper editors make public; and a
researcher who locates the report, and codes it into a
dataset. This article is about what scholars can learn
about the causes and consequences of civilian killings
from newswire data in contexts fraught with non-
random missingness. It asks: How do technical proce-
dures, political agendas and normative judgements alter
inferences about violence drawn from the newswire? Can
social scientists build better cross-national fatality mea-
sures given those biases?
Scholars have long recognized that fatality data are
nasty, noisy, and muddled (Williams, 2016; Spagat
et al., 2009; Kalyvas, 2006). How researchers code deter-
mines what they see (Merry, 2016; Sambanis, 2004).
Governments fabricate data or obscure it to produce
‘official ignorance’ (Aronson, 2013: 30; Werth, 1997).
Epidemiologists and statisticians must adjudicate among
assumptions to model observations missing not-at-
random (Manrique-Vallier, Price & Gohdes, 2013;
Checchi, 2010; Dulic, 2004). These problems motivate
some to reject civil war research because the data pre-
clude design-based inference, while others proceed
believing that patterns in messy data reflect stronger
trends in reality (Gerber, Green & Kaplan, 2014).
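To make the stakes of those assumptions concrete, consider a minimal sketch in Python. All figures here are hypothetical, and the simple two-list Lincoln–Petersen estimator stands in for the richer multiple-systems models cited above. When two source lists record deaths independently, the estimator recovers the true toll; when both lists favour the same visible deaths (say, those near towns reporters can reach), it understates it:

import random

random.seed(1)

N_TRUE = 1000  # hypothetical true death toll

def simulate(correlated):
    # Count victims appearing on list A, list B, and both.
    a = b = ab = 0
    for _ in range(N_TRUE):
        visible = random.random() > 0.5  # e.g. death occurs near a town
        if correlated:
            p = 0.55 if visible else 0.05  # both lists favour visible deaths
        else:
            p = 0.30                       # inclusion unrelated to visibility
        in_a = random.random() < p
        in_b = random.random() < p
        a += in_a
        b += in_b
        ab += in_a and in_b
    # Lincoln-Petersen estimate: total = |A| * |B| / |A and B|
    return a * b / max(ab, 1)

print("independent lists:", round(simulate(False)))  # close to 1,000
print("correlated lists: ", round(simulate(True)))   # well below 1,000

Which correction to apply rests on an untestable assumption about how the lists overlap; that adjudication is exactly what the work cited above wrestles with.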
This article charts a course between nihilism and
blind faith. It starts from the premise that newswire data
remain invaluable to conflict scholarship in spite of non-
random missingness. Not all fatality count biases are
equal: some distort our inferences about civilian killings
in ways that preclude precision, while others prove tri-
vial. I argue that conflict scholars should train their
methodological eye on the first type. But to do so, they
need to understand what those biases are.
I contribute to this understanding by offering a
unique view ‘under the hood’ of newswire data for South
Sudan. Most quantitative datasets in the social sciences
stand on qualitative foundations, which demand careful
source interpretation (Kreuzer, 2010). I begin by
demonstrating how this feature of newswire data creates
confusions in defining civilians, biases in counting
methods, and estimation problems. I then unveil the
data-generation process by recording the experiences of
people who handle the dead and file newswire reports.
Drawing on 40 hours of interviews with 32 human rights
advocates, humanitarian workers, and journalists, I
investigate the biases we should expect to distort data
for South Sudan. I focus subnationally on killings in
Jonglei State to assess patterns of non-random missing-
ness. I conclude with lessons about which biases matter
most and assess the merits of using categories rather than
counts to analyse violent deaths in low-information
contexts.
Counting the dead
In war, ‘something is always wrong with the facts one is
given’ (Nordstrom, 1997: 43). Scholars, journalists, and
human rights advocates disagree about who dies in crises
for many reasons. They diverge on who counts as a
civilian, and face political and technical choices about
how they handle the dead. This leads to variations in
how researchers record killings and estimate what they
do not see. Newswire datasets offer starting points for
navigating these problems by offering systematic records
of the dead over time. The problem for civil war
researchers is that they can rely neither on a missing-
at-random assumption nor on a general pattern of miss-
ingness that applies across contexts.
Which civilians?
How researchers define dead civilians determines how
they code, and how they code determines what they see
(Sambanis, 2004). This confronts conflict scholars with
two challenges.
The first is to distinguish civilians from combatants –
an exercise that is difficult in irregular wars without
subjective judgements (Gade, 2010). Belligerents may remove
uniforms or put them on the dead, and people shape-shift
between civilian and combatant categories at unpre-
dictable times and in unpredictable ways. For example,
South Sudan’s ‘White Army’ is an ad-hoc mobilization
of cattle camp youth without clear central command,
which sometimes aligns with insurgents (Thomas,
2015). Whether observers count these youth as civilians
or combatants depends on what they use death counts
for (Ratnayake, Degomme & Guha-Sapir, 2009). For
example, human rights advocates may choose an expan-
sive definition to avoid excluding ‘victims’.
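A toy illustration of this definitional sensitivity, using invented incident records and category labels (none drawn from any real dataset):

# Invented incident records; categories and counts are illustrative only.
incidents = [
    {"deaths": 12, "category": "villager"},
    {"deaths": 30, "category": "cattle_camp_youth"},  # 'White Army'-style mobilization
    {"deaths": 7,  "category": "uniformed_fighter"},
]

NARROW = {"villager"}                          # mobilized youth coded as combatants
EXPANSIVE = {"villager", "cattle_camp_youth"}  # mobilized youth coded as civilians

def civilian_deaths(civilian_categories):
    return sum(i["deaths"] for i in incidents if i["category"] in civilian_categories)

print(civilian_deaths(NARROW))     # 12
print(civilian_deaths(EXPANSIVE))  # 42

The same records yield civilian tolls differing by a factor of three before any question of missingness even arises.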
The second challenge is that conflict researchers study
‘direct’ civilian deaths, requiring an ‘event grammar’ that
identifies a perpetrator, a victim, and an act (Landman &
Gohdes, 2013: 78; Ball, 1996). Even where the act is
observable, scholars must disaggregate whether a homi-
cide is conflict-related, criminal or both (Williams,
2016). This is difficult where criminal networks run
deep into the battlefield and conflict-related killings
shape criminal homicides (Kleinfeld, 2017).
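As a rough schematic (the field names are my own, not any dataset's codebook), the ‘event grammar’ amounts to a record like the following, where the conflict-related flag is precisely the judgement that becomes hard to make when criminal and conflict violence intertwine:

from dataclasses import dataclass
from typing import Optional

# Schematic record for the perpetrator-victim-act 'event grammar'
# (Landman & Gohdes, 2013). Field names are illustrative only.
@dataclass
class Event:
    perpetrator: str                  # e.g. a named armed group, or 'unknown'
    victim: str                       # e.g. 'civilian', 'combatant'
    act: str                          # e.g. 'shooting', 'abduction'
    conflict_related: Optional[bool]  # None when the motive cannot be disentangled

# A homicide by unidentified gunmen: conflict, crime, or both?
record = Event("unknown armed men", "civilian", "shooting", conflict_related=None)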
Political and technical bias
Once researchers settle on a civilian measure, they face
technical biases, which arise from counting methods, and
political biases, which arise when people’s agendas shape
what they report (Aronson, 2013). Two issues make this
distinction a useful but blurry framing device. The first is
that technical choices can motivate political biases that
would not otherwise pose a challenge. For example, wire
journalists may interview survivors at an atrocity site.
Due to political pressure, civilians might misreport what
they observed. Thus, the recording technique motivates
political bias.
The second issue is that everybody, from survivors to
researchers, has agendas that shape what they say, choose
methodologically, and judge credible. These agendas
may be technical. For example, epidemiologists and for-
ensic statisticians arrive at different counts because they
have different objectives that motivate different survey
questions (epidemiologists seek data to improve the
health of the living; forensic statisticians investigate
homicides) (Aronson, 2013). These agendas may
deprioritize accuracy to advance a normative cause
(De Waal, 2016). Political authorities may also distort
official records to propagate a particular social outlook
(Merry, 2016; Nelson, 2015; Tishkov, 1999).
Biases arise at four levels of data production and use,
when: (1) investigators count the dead; (2) researchers
code data; (3) scholars analyse the coded data; and (4)
people and institutions make secondary use of this data.