User search terms and controlled subject vocabularies in an institutional repository

DOI: https://doi.org/10.1108/LHT-11-2016-0133
Pages: 360-367
Published date: 18 September 2017
Authors: Scott Hanrath, Erik Radio
Subject Matter: Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Information user studies, Metadata, Information & knowledge management, Information & communications technology, Internet
Scott Hanrath
University of Kansas, Lawrence, Kansas, USA, and
Erik Radio
University of Arizona, Tucson, Arizona, USA
Abstract
Purpose: The purpose of this paper is to investigate the search behavior of institutional repository (IR)
users in regard to subjects as a means of estimating the potential impact of applying a controlled subject
vocabulary to an IR.
Design/methodology/approach: Google Analytics data were used to record cases where users arrived at
an IR item page from an external web search and subsequently downloaded content. Search queries were
compared against the Faceted Application of Subject Terminology (FAST) schema to determine the topical
nature of the queries. Queries were also compared against the items' metadata values for title and subject
using approximate string matching to determine the alignment of the queries with current metadata values.
Findings: A substantial portion of successful user search queries to an IR appear to be topical in nature.
User search queries matched values from FAST at a higher rate than existing subject metadata. Increased
attention to subject description in IR records may provide an opportunity to improve the search visibility of
the content.
Research limitations/implications: The study is limited to a particular IR. Data from Google Analytics
do not provide comprehensive search query data.
Originality/value: The study presents a novel method for analyzing user search behavior to assist IR
managers in determining whether to invest in applying controlled subject vocabularies to IR content.
Keywords: Information retrieval, Academic libraries, FAST subject headings, Metadata, Search engines,
Institutional repositories
Paper type: Research paper
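The approximate string matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which matching algorithm, normalization, or threshold the authors used, so everything here is an assumption made for illustration: the function names `normalize` and `best_match`, the 0.8 threshold, and the use of the `SequenceMatcher` similarity ratio from Python's standard library.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not affect the score."""
    return " ".join(text.lower().split())


def best_match(query, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate string.

    The candidate is None when no score reaches the threshold.
    The threshold of 0.8 is an illustrative assumption, not a value from the study.
    """
    normalized_query = normalize(query)
    best, best_score = None, 0.0
    for candidate in candidates:
        # ratio() is a similarity score in [0, 1]; 1.0 means identical strings.
        score = SequenceMatcher(None, normalized_query, normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical subject values standing in for an item's metadata or FAST headings.
subjects = ["Institutional repositories", "Metadata", "Information retrieval"]
print(best_match("metdata", subjects))
print(best_match("zebra", subjects))
```

Running each query once against an item's existing title and subject values and once against a list of controlled headings would give two match rates that can be compared, which is the kind of comparison the Findings describe. A production analysis would likely need a more careful choice of similarity measure and threshold.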
Introduction
The application of controlled subject vocabularies is a means of improving resource discovery.
By consistently applying subject terms across resources, similar resources can be more
effectively collocated when searching or browsing by subject. Controlled subject vocabularies
may appeal to managers of institutional repositories (IRs) because such vocabularies have the
potential to increase the visibility and interoperability of repository content by employing
standard vocabularies used in similar repositories or within similar disciplines.
However, applying controlled subject vocabularies to IR records can incur significant
costs. This is especially true in cases where the controlled vocabulary is to be applied
retroactively to repository content that has been submitted by a variety of users. Submitters
may include, for example, authors who are unfamiliar with principles of information
organization and cataloging. IR workflows may include little quality control of the
submitted metadata values. Compounding the problem, IRs often include a wide range
of content, from articles and gray literature to institutional records, in a wide range of
disciplines. Such scenarios can result in a great diversity of subject and keyword terms
applied unevenly across content and over a significant period of time. After-the-fact
metadata remediation and enhancement thus potentially requires a great deal of effort.
Repository managers are therefore faced with a value proposition: given the potential costs
of applying a controlled subject vocabulary, how likely is it to positively impact users'
success in discovering repository content?
Library Hi Tech
Vol. 35 No. 3, 2017
pp. 360-367
© Emerald Publishing Limited
0737-8831
DOI 10.1108/LHT-11-2016-0133
Received 23 November 2016
Revised 23 March 2017
Accepted 2 May 2017
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm