Integrating thesaurus relationships into search and browse in an online photograph collection

Date: 01 September 2005
Published date: 01 September 2005
Pages: 425-452
DOI: https://doi.org/10.1108/07378830510621829
Author: Michelle Dalmau, Randall Floyd, Dazhi Jiao, Jenn Riley
Subject Matter: Information & knowledge management, Library & information science
Michelle Dalmau, Randall Floyd, Dazhi Jiao and Jenn Riley
Indiana University Digital Library Program, Bloomington, Indiana, USA
Abstract
Purpose – Seeks to share with digital library practitioners the development process of an online
image collection that integrates the syndetic structure of a controlled vocabulary to improve end-user
search and browse functionality.
Design/methodology/approach – Surveys controlled vocabulary structures and their utility for
catalogers and end-users. Reviews research literature and usability findings that informed the
specifications for integration of the controlled vocabulary structure into search and browse
functionality. Discusses database functions facilitating query expansion using a controlled vocabulary
structure, and web application handling of user queries and results display. Concludes with a
discussion of open-source alternatives and reuse of database and application components in other
environments.
Findings – Affirms that structured forms of browse and search can be successfully integrated into
digital collections to significantly improve the user’s discovery experience. Establishes ways in which
the technologies used in implementing enhanced search and browse functionality can be abstracted to
work in other digital collection environments.
Originality/value – Significant amounts of research on integrating thesauri structures into search
and browse functionalities exist, but examples of online resources that have implemented this
approach are few in comparison. The online image collection surveyed in this paper can serve as a
model to other designers of digital library resources for integrating controlled vocabularies and
metadata structures into more dynamic search and browse functionality for end-users.
Keywords Controlled languages, Photography, Digital storage, Collections management
Paper type Technical
Received 31 August 2004
Revised 10 November 2004
Accepted 10 November 2004
Library Hi Tech
Vol. 23 No. 3, 2005
pp. 425-452
© Emerald Group Publishing Limited
0737-8831
DOI 10.1108/07378830510621829
The four authors of the paper all contributed to this project in significant and unique ways, as
did approximately 15 other members of the project team from the Indiana University Digital
Library Program and Indiana University Archives. Development of the Cushman collection web
site was funded through an Institute of Museum and Library Services National Leadership
Grant. The authors would like to thank everyone involved with the project for their hard work,
patience, and inspiration. Special thanks to Kristine Brancolini, Jon W. Dunn and John A. Walsh
for their helpful review of this paper.
Background
The Charles W. Cushman Photograph Collection (www.dlib.indiana.edu/collections/cushman/)
is an online resource providing access to nearly 15,000 color photographs
shot by Charles W. Cushman, an amateur photographer, from 1938 to 1969. The vast
majority of the photographs in the collection were taken on Kodachrome color slide
film, which was originally introduced in 1936. The slides were taken at hundreds of
locations all over the world, and there are particularly large quantities of images taken
in Chicago from 1941 to 1951 and in San Francisco from 1954 to 1969. The collection
shows a similar breadth of subject matter. Cushman was apparently fond of
photographing plants, but also seemed to favor shooting architecture and people, often
showing these in various states of decay or misfortune.
Cushman, an Indiana University alumnus, bequeathed his collection of slides, along
with notebooks documenting each slide and some additional related materials, to
Indiana University upon his death in 1972, where they were deposited in the Indiana
University Archives (http://www.indiana.edu/~libarch/). The collection was
rediscovered in the Archives in late 1999 and recognized as remarkable for its
breadth, level of documentation, and representation of color photography in a time we
today generally envision in black and white. The Indiana University Digital Library
Program (www.dlib.indiana.edu/) and Indiana University Archives collaborated on the
project to digitize and build a delivery system for the images, which was funded by an
Institute of Museum and Library Services (IMLS) National Leadership Grant (www.
imls.gov/). The large amount of description that came with the slides allowed us to
focus our development efforts on creating robust metadata for the collection and using
it in novel ways for searching and browsing.
Metadata issues
Metadata for image collections
Creators of image collections commonly follow the traditional metadata model of
providing a combination of free-text descriptions and access terms from controlled
vocabularies. Free-text descriptions serve many functions in metadata records. For
historical or archival collections, these descriptions can preserve the terminology used
to describe an item by its creator or an important collector. They provide
human-readable, in-depth details about an item, such as “pen and brown (iron-gall) ink
and wash, graphite, watercolor, gouache and opaque white, with gum arabic and
scraping out, on gray wove paper” for an art drawing (Visual Resources Association,
2004, p. 97). Descriptions of this sort can supply context and expert interpretation for
end-users.
Appropriate terms, called “authorized terms,” from controlled vocabularies are
commonly added to metadata records for personal and corporate names, geographical
locations, form and genre, and the topical nature of the item being described, whether
or not the concept represented is already present in the descriptive fields. This is done
for a number of reasons. First, controlled vocabularies increase the number of access
points available for an item. By specifying ahead of time a fixed set of information
fields – such as geographical location at country, state, county and city levels; names;
genre terms; and topical terms – records become more consistent and thus more useful
for searching. Second, the use of controlled vocabularies ensures that the same term is
used to describe the same concept, same person, or same place among all records, a
practice which results in the collocation of all relevant records under a single form of a
term. Similarly, controlled vocabularies provide disambiguation between different
meanings of the same term or different people and places with the same name by
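
The collocation role of authorized terms described above can be illustrated with a minimal Python sketch. It is not the Cushman project's implementation: the authority file, the subject terms, the record identifiers, and the function names are all hypothetical and chosen only for illustration. In practice a cataloger assigns the authorized term when the record is created; the sketch normalizes variant terms at indexing time to show the same effect.

```python
from collections import defaultdict

# Hypothetical authority file: every variant form points to one authorized term.
# These terms are illustrative and are not drawn from the collection's vocabulary.
AUTHORIZED_FORM = {
    "autos": "Automobiles",
    "cars": "Automobiles",
    "automobiles": "Automobiles",
    "trolleys": "Street-railroads",
    "streetcars": "Street-railroads",
    "street-railroads": "Street-railroads",
}


def authorize(term):
    """Return the authorized form of a term, or None if the vocabulary lacks it."""
    return AUTHORIZED_FORM.get(term.strip().lower())


def build_index(records):
    """Index record ids under authorized terms so that variant forms collocate."""
    index = defaultdict(set)
    for record_id, terms in records.items():
        for term in terms:
            heading = authorize(term)
            if heading is not None:
                index[heading].add(record_id)
    return index


def search(index, query):
    """Look up a query by its authorized form; unknown terms match nothing."""
    heading = authorize(query)
    return sorted(index.get(heading, set())) if heading else []


# Hypothetical records described with inconsistent subject terms.
records = {
    "P001": ["cars"],
    "P002": ["Autos", "streetcars"],
    "P003": ["Trolleys"],
}
index = build_index(records)
```

With this index, a search on any variant ("cars", "autos", or "automobiles") retrieves the same set of records, because all three resolve to the single authorized heading under which those records are filed.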