Evaluating the effectiveness of thesauri in digital information retrieval systems

Date: 05 February 2018
Published date: 05 February 2018
Pages: 55-70
DOI: https://doi.org/10.1108/EL-02-2017-0033
Author: Sanjeev K. Sunny, Mallikarjun Angadi
Subject Matter: Information & knowledge management, Information & communications technology, Internet
Sanjeev K. Sunny and Mallikarjun Angadi
Centre for Library and Information Management Studies,
Tata Institute of Social Sciences, Mumbai, India
Abstract
Purpose: The purpose of this study is to carry out a systematic literature review for an evidence-based
assessment of the effectiveness of thesauri in digital information retrieval systems. It also aims to identify
the evaluation methods, evaluation measures and data collection tools that may be used in evaluating
digital information retrieval systems.
Design/methodology/approach: A systematic literature review (SLR) of 344 publications from LISA
and 238 from Scopus was carried out to identify the evaluation studies for analysis, and 15 evaluation
studies were analyzed.
Findings: This study presents evidence for the effectiveness of thesauri in digital information retrieval
systems. Various methods for evaluating digital information systems have been identified, along with a wide range
of evaluation measures and data collection tools.
Research limitations/implications: The study was limited to literature published in the English
language and indexed in LISA and Scopus. The evaluation methods, evaluation measures and data collection
tools identified in this study may be used to design more cognizant evaluation studies for digital information
retrieval systems.
Practical implications: The findings have significant implications for the administrators of any type of
digital information retrieval system in making more informed decisions on implementing a thesaurus
in resource description and access to digital collections.
Originality/value: This study extends our knowledge of the potential of thesauri in digital information
retrieval systems. It also provides cues for designing more cognizant evaluation studies for digital
information systems.
Keywords: Systematic literature review, Evaluation studies, Thesaurus,
Digital information retrieval systems, Information retrieval effectiveness, Usability studies
Paper type: Literature review
Introduction
The origin of information retrieval thesauri may be traced back to the proposal of
Mooers (1951), who apparently mentioned the use of a thesaurus in a mechanized
retrieval system in 1947 (Small, 1984). The concept evolved through the 1950s in
various forms, and it is now agreed that a thesaurus was first used in the context of
information retrieval systems by Luhn (1957) (Jean and Clarke, 2004). ISO 2788 (1986)
defines a thesaurus, in this context, as "the vocabulary of a controlled indexing language,
formally organized so that the a priori relationships between concepts (for example, as
'broader' and 'narrower') are made explicit". A thesaurus was intended as a tool for
searchers (Small, 1984) and it is still used today in various forms (Gilchrist, 2003). ISO
25964-1 (2011) recommends its application to:
Received 8 February 2017
Revised 8 February 2017
Accepted 2 May 2017
The Electronic Library
Vol. 36 No. 1, 2018
pp. 55-70
© Emerald Publishing Limited
0264-0473
DOI 10.1108/EL-02-2017-0033
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0264-0473.htm
vocabularies used for retrieving information about all types of information resources irrespective
of the media [...] including knowledge bases and portals, bibliographic databases, text, museum
or multimedia collections, and the items within them.
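The ISO 2788 definition quoted earlier, with its explicit broader and narrower relationships, can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class name `Thesaurus`, its methods and the sample terms are hypothetical illustrations and are not taken from either ISO standard or from any system reviewed here.

```python
from collections import defaultdict


class Thesaurus:
    """Toy controlled vocabulary with explicit broader/narrower links (illustrative only)."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader terms (BT)
        self.narrower = defaultdict(set)  # term -> its narrower terms (NT)

    def add_broader(self, term, broader_term):
        # Record the link in both directions so BT and NT stay reciprocal.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def ancestors(self, term):
        """Return every term reachable by following BT links upward."""
        seen = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self.broader.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical sample terms, chosen only for illustration.
thesaurus = Thesaurus()
thesaurus.add_broader("digital libraries", "information retrieval systems")
thesaurus.add_broader("information retrieval systems", "information systems")
```

Storing each relationship in both directions is one simple way to keep the a priori relationships explicit: a lookup from either term finds the other without scanning the whole vocabulary.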
The rapid spread of full-text searching and the advent of the Internet have contributed to the
origin of different types of digital information retrieval systems (hereafter referred to as
digital information systems), such as digital libraries, web portals, online databases, online
question-answer systems, online image libraries and so forth. This, in turn, has created a
greater need and opportunity than ever before to implement information retrieval thesauri to
exploit their potential in information retrieval (Jean and Clarke, 2004). A considerable
amount of literature has been published on the applications of thesauri in different types of
digital information retrieval systems.
In recent years, there has been extensive research on implementing thesauri in different
types of digital information systems (Alonso Gaona García et al., 2014; Dalmau et al., 2005;
Feki et al., 2014; Hienert et al., 2011; Lüke et al., 2012; Shiri et al., 2013). A search of the
literature revealed that the majority of researchers deployed a thesaurus in real digital
information systems (Alonso Gaona García et al., 2014; Dalmau et al., 2005; Feki et al., 2014;
Hienert et al., 2011; Lüke et al., 2012; Shiri et al., 2013); some reported its implementation in a
pilot study or in a prototype/model digital information system (Alani et al., 2000; Blocks
et al., 2006; Shiri et al., 2004), and a few proposed models for its deployment in digital
information systems (Maso-Maresma and Sebastia-Salat, 2013; Nakashima et al., 2003;
Petric et al., 2011). Thesauri have been used in digital information systems mainly for three
purposes: indexing, searching and browsing. The majority of the literature reported the use
of thesauri in both searching and browsing together (Alonso Gaona García et al., 2014; Lüke
et al., 2012; Maso-Maresma and Sebastia-Salat, 2013; Shiri and Revie, 2003, 2005; Shiri et al.,
2013; Wu and Witten, 2010). Moreover, many researchers reported its use for all three purposes
(Atherton, 2002; Binding and Tudhope, 2004; Blocks et al., 2006; Hienert et al., 2011; Petric
et al., 2011). Furthermore, some researchers used it in both indexing and searching
(Shiri et al., 2011; Soo et al., 2003; Torres and Reis, 2008) and a few reported its use only in
searching (Bakar and Rahman, 2003; Feki et al., 2014; Sarmento et al., 2008; Thangaraj and
Gayathri, 2013).
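The three purposes named above (indexing, searching and browsing) can be illustrated with a toy example. This is a minimal sketch, not the method of any study cited here: the dictionaries `PREFERRED` and `NARROWER`, the function names and the sample terms are all hypothetical, and expanding a query with narrower terms is only one possible way a thesaurus might support searching.

```python
# Hypothetical toy data: entry terms mapped to preferred terms, plus narrower-term links.
PREFERRED = {
    "e-library": "digital libraries",
    "digital library": "digital libraries",
}
NARROWER = {
    "information systems": ["digital libraries", "online databases"],
    "digital libraries": ["image libraries"],
}


def index_term(keyword):
    """Indexing: map a free keyword onto its preferred term, if one is known."""
    key = keyword.strip().lower()
    return PREFERRED.get(key, key)


def expand_query(term):
    """Searching: return the term followed by its direct narrower terms."""
    return [term] + NARROWER.get(term, [])


def browse(term, depth=0):
    """Browsing: return the hierarchy under a term as indented lines."""
    lines = ["  " * depth + term]
    for child in NARROWER.get(term, []):
        lines.extend(browse(child, depth + 1))
    return lines
```

For instance, `index_term("E-Library")` yields the preferred form `digital libraries`, and `browse("information systems")` returns an indented list that an interface could render as clickable nodes.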
In the context of indexing the resources in digital information systems, a thesaurus is
used for maintaining uniformity in metadata creation. While it is primarily used for
uniformity in subject/keyword descriptions (Petric et al., 2011; Shiri et al., 2004; Torres and
Reis, 2008), it is also used to create consistent place names (Dalmau et al., 2005; Soo et al.,
2003). Digital information systems generally provide their users with options to browse collections
in different ways. While browsing, a thesaurus can be used to present a domain-specific
hierarchy of concepts on the interface and to provide the facility to directly retrieve
documents from digital collections related to a selected concept. For this purpose, thesaurus
terms may be represented on the interface as hyperlinks or nodes; users can retrieve the
resources associated with a node by clicking on the term. Many researchers presented
thesaurus terms on the interface along with the facility to search the database (Alonso
Gaona García et al., 2014; Dalmau et al., 2005; Petric et al., 2011; Shiri et al., 2013; Stafford
et al., 2008). Also, many presented thesaurus terms on the interface, after searching the
database, as suggestions related to the query in context (Sihvonen and Vakkari, 2004; Wu
and Witten, 2010). In searching, users can apply a thesaurus to
formulate queries by selecting terms from it (Nakashima et al., 2003; Shiri and Revie, 2003;
Sihvonen and Vakkari, 2004). It may also be used to suggest an alternative if a search query
does not find an exact match in the database (Atherton, 2002). Moreover, thesauri were also
used to expand the query automatically by adding a broader term (BT), a narrower term