An investigation of benchmark image collections: how different from digital libraries?

Date: 03 June 2019
Pages: 401-418
DOI: https://doi.org/10.1108/EL-10-2018-0195
Published date: 03 June 2019
Author: Jingye Qu, Jiangping Chen
Subject Matter: Information & knowledge management, Information & communications technology, Internet
Jingye Qu
Beihua University, Jilin City, Jilin, China, and
Jiangping Chen
Department of Library and Information Sciences, University of North Texas,
Denton, Texas, USA
Abstract
Purpose: This paper aims to introduce the construction methods, image organization, collection use and
access of benchmark image collections to the digital library (DL) community. It aims to connect two distinct
communities: the DL community and image processing researchers, so that future image collections could be
better constructed, organized and managed for both human and computer use.
Design/methodology/approach: Image collections are first identified through an extensive literature
review of published journal articles and a web search. Then, a coding scheme focusing on image collection
creation, organization, access and use is developed. Next, three major benchmark image collections are
analysed based on the proposed coding scheme. Finally, the characteristics of benchmark image collections
are summarized and compared to DLs.
Findings: Although most of the image collections in DLs are carefully curated and organized using
various metadata schema based on an image's external features to facilitate human use, the benchmark image
collections created for promoting image processing algorithms are annotated on an image's content to the
pixel level, which makes each image collection a more fine-grained, organized database appropriate for
developing automatic techniques on classification, summarization, visualization and content-based retrieval.
Research limitations/implications: This paper overviews image collections by their application
fields. The three most representative natural image collections in general areas are analysed in detail based on
a homemade coding scheme, which could be further extended. Also, domain-specific image collections, such
as medical image collections or collections for scientific purposes, are not covered.
Practical implications: This paper helps DLs with image collections to understand how benchmark
image collections used by current image processing research are created, organized and managed. It informs
multiple parties pertinent to image collections to collaborate on building, sustaining, enriching and providing
access to image collections.
Originality/value: This paper is the first attempt to review and summarize benchmark image collections
for DL managers and developers. The collection creation process and image organization used in these
benchmark image collections open a new perspective to digital librarians for their future DL collection
development.
Keywords: Digital libraries, ImageNet, Image collections, Metadata standards, Image organization,
MS COCO, PASCAL VOC
Paper type: Research paper
Received 1 October 2018
Revised 27 December 2018
Accepted 12 January 2019
The Electronic Library
Vol. 37 No. 3, 2019
pp. 401-418
© Emerald Publishing Limited
0264-0473
DOI 10.1108/EL-10-2018-0195
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0264-0473.htm
Introduction
The extensive development of various digital libraries (DLs) has brought to the world rich
information resources that combine images and texts with well-annotated metadata records.
Images are important resources for learning and cultural communication. Libraries and
museums have curated large DLs, mostly containing image collections, for the purpose of
preserving world and local cultural heritage. A large number of information professionals
are working on building these digital collections by annotating them and managing them
(Berinstein, 1998; Burns, 2012; Ester, 1996; Galloway, 2004; Kenney and Rieger, 2000; Reilly
and De, 2001; Wallace and Van Fleet, 2004). These DLs and associated image collections are
accessed by web users to better understand the world and to generate new knowledge.
DLs are not the only resource for image collections. The evolving digital era has led to an
enormous explosion of image resources. Billions of personal photos are uploaded every day
on Flickr (an image and video hosting service created initially by Ludicorp in 2004).
Additionally, according to the Internet Trends Report 2018 (Meeker, 2018), more than 3
billion images are created on Facebook properties and Snapchat alone.
The knowledge and information contained in image data are also fundamental to art,
remote sensing, business, military, geography, plant sciences, medicine and society. To
understand and extract knowledge and information from images, researchers have been
working diligently on image processing algorithms and techniques. The availability of big
image data has greatly facilitated image processing-related research, including image
annotation (Hanbury, 2008; Tousch et al., 2012; Wang, 2011; Zhang et al., 2012), image
understanding (Bowyer et al., 2008; Crevier and Lepage, 1997), image retrieval (Feng et al.,
2002; Oussalah, 2008; Rui et al., 1999; Smeulders et al., 2000; Zhou and Huang, 2003), object
recognition (Bucak et al., 2014; Campbell and Flynn, 2001; Roth and Winter, 2008), image
summarization (Fan et al., 2008; Shi et al., 2009; Simon et al., 2007; Tan et al., 2012; Xu et al.,
2011; Yang et al., 2013), image clustering and classification (Lu and Weng, 2007) and image
segmentation (Fu and Mui, 1981; Haralick, 1983; Lucchese and Mitra, 2001; Pal and Pal,
1993; Zhang, 1996).
Specifically, these research areas are fostered by the availability of respective image
collections. These image collections enable researchers to explore and compare different
models and techniques. Many image data sets are created, and some of them have already
become the benchmark in specific image processing areas.
The DL community has good knowledge of how digital collections and associated image
collections are created and organized (Fox et al., 2012; Xie and Matusiak, 2016). However,
there is little knowledge or understanding of the image collections beyond DLs. The
following questions remain to be answered: Are there other ways to create and organize
image collections? How are some large image collections created, organized and used? What
are the restrictions on the access of these collections? And more importantly, what can we
learn from the creation, organization, access and use (COAU) of these image collections?
Answers to these questions may help us to design, develop and maintain DLs more
effectively and efficiently through learning from or collaborating with image processing
researchers. This study is designed to answer these questions.
The purposes of this study include:
(1) to analyse the construction methods, image organization, collection use and access
of benchmark image collections for the DL community; and
(2) to connect the DL community and image processing researchers so that future
image collections could be better constructed, organized and managed for both
human and computer use.
The rest of this paper is organized as follows. First, the methodology of this study is
reported, including the literature review of existing image collections, developing a coding
scheme for analysing image collections and the actual analysis of three benchmark image
collections. Then, the results of the analysis are presented on the COAU of three selected