A study of the use of simulated work task situations in interactive information retrieval evaluations: a meta-evaluation

DOI: https://doi.org/10.1108/JD-06-2015-0068
Published date: 09 May 2016
Author: Pia Borlund
Subject Matter: Library & information science, Records management & preservation, Document management
1. Introduction
This paper examines how the test instrument of a simulated work task situation is used in the
research literature. The major challenge of its use lies in the design of authentic and applicable
simulated work task situations, which are relevant and realistic to the test participants who are to
apply the situations for reliable interactive information retrieval (IIR). In that light, it is interesting
to examine how the test instrument is used in previous IIR evaluation studies, and what we can
learn from those evaluations and that use. Hence, the paper reports a meta-evaluation of the use of
simulated work task situations. The present paper follows up on a paper by Borlund and Schneider
(2010), in which preliminary results were presented.
This type of meta-evaluation and further development of approaches for IIR evaluation is
motivated by Belkin in his 2008 ECIR Keynote (Belkin, 2008). In the keynote, he particularly
addresses the need for alternative evaluation approaches to the Cranfield model, or TREC style as
he formulates it (Belkin, 2008, p. 52). This need still stands. Belkin explicitly highlights the IIR
evaluation model by Borlund (e.g., Borlund, 2003) as such an attempt (Belkin, 2008, p. 52). He
further points out how the contradictions between the necessity for realism and the desire for
comparability and generalization have not yet been solved (Belkin, 2008, p. 52). The issues of
realism and comparability are dealt with by the test instrument of a simulated work task situation
inherent in the IIR evaluation model. In order to compare search behaviors and performance results
and thereby generate a reliable knowledge base, the employed simulated work task situations must
be realistic to the test participants.
In brief, the concept of a simulated work task situation was introduced in 1997 via a
feasibility study of the use of cover-stories functioning as scenarios for user-authentic evaluation of
IR effectiveness and user satisfaction with retrieved information (Borlund & Ingwersen, 1997). In
2000 the IIR evaluation model was developed (Borlund, 2000a; 2003); an evaluation model for the
evaluation of IR interaction that includes the application of simulated work task situations
according to a set of empirically based requirements of how to use this test instrument (Borlund,
2000a; 2000b; 2003).
The overall objectives of the present study are to identify how simulated work task situations
are used, and for what types of evaluations they are used. In particular, we want to learn about the
intentional and the unintentional use of simulated work task situations in order to clarify and
improve the requirements for the application of simulated work task situations. The idea is to learn
from previous research and improve future research. The overall long term ambition is to increase
the knowledge base of empirical IIR studies with refined requirements about how to use simulated
work task situations. The insight gained will also help to set out directions for future meta-
evaluations of the use of simulated work task situations.
The remainder of the paper is structured as follows: section 2 introduces the concept and test
instrument of a simulated work task situation, and summarises the most basic requirements of how
to use the instrument. Section 3 presents the methodological approach taken to examine the use of
simulated work task situations as reported in the research literature. Section 4 reports on the results
of the study concerning how the evaluation instrument is used in previous evaluation studies. On
this basis, directions for future empirical studies that validate and increase our understanding of
how to use simulated work task situations are outlined. The paper closes with concluding
statements in section 5.
2. The test instrument of a simulated work task situation
This section introduces the test instrument of a simulated work task situation and the corresponding
requirements regarding its use. The theoretical assumptions underlying this instrument are
described in Borlund and Ingwersen (1997) and Borlund (2000a; 2000b; 2003).
A simulated work task situation is a short textual description that presents a realistic
information-requiring situation that motivates the test participant to search the IR system (Borlund,
2003). A simulated work task situation serves two main functions: (1) it causes a ‘simulated
information need’ by allowing for user interpretations of the simulated work task situation, leading
to cognitively individual information need interpretations as in real life; and (2) it is the platform
against which situational relevance is judged by the test participant (Borlund & Ingwersen, 1997,
pp. 227-228). More specifically it helps to describe to the test participants:
The source of the information need;
The environment of the situation;
The problem which has to be solved; and also
Serves to make the test participants understand the objective of the search
(Borlund & Ingwersen, 1997, pp. 227-228).
As such the simulated work task situation is a stable concept, i.e., the given purpose and goal of the
IR system interaction. Figure 1 depicts a classic example of a simulated work task situation tailored
to university students.
[Insert Figure 1 about here]
Further, because the situation is the same for all the test participants, experimental control is provided, and the
search interactions are comparable across the group of test participants for the same simulated work
task situation. As such, the use of simulated work task situations ensures the IIR study both realism
and control.
The issue of realism of the descriptions of the simulated work task situations is essential in
order for the prompted search behaviour and relevance assessments of the test participants to be as
genuine as intended. The simulated work task situations create simulated information needs that are
to replicate genuine information needs. Therefore, realism is emphasised in the requirements of how
to employ simulated work task situations (Borlund, 2003). In brief, the requirements are as follows:
