Library Hi Tech, Vol. 26 No. 2, 2008, pp. 255-273
Published 13 June 2008
DOI: 10.1108/07378830810880351
Received 25 September 2007; revised 16 November 2007; accepted 27 January 2008
Sensitivity analysis of mapping local image features into conceptual categories
Chih-Fong Tsai
Department of Accounting and Information Technology,
National Chung Cheng University, Chia Yi, Taiwan, and
David C. Yen
Department of Decision Sciences and Management Information Systems,
Farmer School of Business, Miami University, Oxford, Ohio, USA
Abstract
Purpose – Image classification, or more specifically, annotating images with keywords, is one of the important steps in image database indexing. However, current image retrieval research concentrates on how conceptual categories can be well represented by extracted low-level features for effective classification. Consequently, image feature representation, including segmentation and low-level feature extraction schemes, must be genuinely effective to facilitate the classification process. The purpose of this paper is to examine the effect on annotation effectiveness of using different (local) feature representation methods to map into conceptual categories.
Design/methodology/approach – This paper compares tiling (five and nine tiles) and regioning (five and nine regions) segmentation schemes, and the extraction of combinations of color, texture, and edge features, in terms of their effectiveness in a particular benchmark automatic image annotation set-up (a sketch of the tiling scheme follows the abstract). Differences in effectiveness between concrete and abstract conceptual categories or keywords are further investigated, and progress towards establishing a particular benchmark approach is also reported.
Findings – In the context of local feature representation, the paper concludes that combined color and texture features are the best to use for both the five-tile tiling and five-region regioning schemes, and this evidence would form a good benchmark for future studies. Another interesting (though perhaps not surprising) finding is that when the number of concrete and abstract keywords increases or is large (e.g. 100), abstract keywords are more difficult to assign correctly than concrete ones.
Research limitations/implications – Future work could consider: conducting user-centered evaluation instead of evaluation only against a chosen ground-truth dataset, such as Corel, since this might impact effectiveness results; using different numbers of categories for scalability analysis of image annotation, as well as larger numbers of training and testing examples; using Principal Component Analysis, Independent Component Analysis, or indeed machine learning techniques for low-level feature selection; using other segmentation schemes, especially more complex tiling schemes and other regioning schemes; using different datasets, other low-level features, and/or combinations of them; and using other machine learning techniques.
Originality/value – This paper provides a good starting point for analyzing the mapping between feature representation methods and various conceptual categories for future image annotation research.
Keywords Statistical analysis, Sensitivity analysis
Paper type Research paper
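
To make the segmentation schemes compared in this paper concrete, the following is a minimal sketch, assuming Pillow and NumPy, of a tiling scheme: the image is split into a 3x3 grid (nine tiles) and a normalized color histogram is extracted per tile. The grid size, bin count, and function name are illustrative assumptions, not the exact configuration evaluated in the paper.

import numpy as np
from PIL import Image

def tile_color_features(path, grid=(3, 3), bins=8):
    # Load the image as RGB and view it as a (height, width, 3) array.
    img = np.asarray(Image.open(path).convert("RGB"))
    h, w, _ = img.shape
    rows, cols = grid
    features = []
    for r in range(rows):
        for c in range(cols):
            tile = img[r * h // rows:(r + 1) * h // rows,
                       c * w // cols:(c + 1) * w // cols]
            # One histogram per color channel, normalized so that tile
            # size does not dominate the feature values.
            hist = np.concatenate([
                np.histogram(tile[..., ch], bins=bins, range=(0, 256))[0]
                for ch in range(3)]).astype(float)
            features.append(hist / hist.sum())
    # Concatenate the per-tile histograms into one local feature vector.
    return np.concatenate(features)

A regioning scheme would replace the fixed grid with an image segmentation step, but the downstream feature extraction is analogous.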
1. Introduction
Digital libraries provide many information resources, for many people, in many ways that satisfy most information needs. However, it is a complex task for users to
retrieve, search, and filter the tens of thousands of matches returned from digital libraries (Borgman, 2001). In general, many digital libraries, such as those of museums and art galleries, contain large numbers of digital visual images (Tsai, 2007). As a result, how to effectively index and retrieve images is a major research problem in digital libraries.
A major concern of prior research on image indexing and retrieval over the past few years has been the need to produce image indexing schemes that can be effectively utilized to support the retrieval processes of searchers dealing mainly with large image databases. Over the past three or four years, the major focus of automatic image indexing research has moved from schemes dealing with lower-level features, such as color or texture, to schemes supporting most searchers' preferred method of formulating initial searches using keywords.
Content-Based Image Retrieval (CBIR) (Rui et al., 1999; Smeulders et al., 2000) automatically extracts visual (or low-level) features of images, such as color, texture, and shape. Searchers may either supply example images like the ones they require, or browse the image database by navigating some type of similarity space. The degree of similarity between images is usually based on the distance between the low-level features of images in the feature space. Puzicha et al. (1999) have undertaken a systematic comparison of similarity measures for image retrieval.
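
As a concrete illustration of such distance-based similarity, the sketch below, assuming NumPy, compares two feature vectors (e.g. normalized color histograms) with two of the many measures surveyed by Puzicha et al. (1999); the function names are hypothetical.

import numpy as np

def euclidean_distance(f1, f2):
    # Smaller distance means more visually similar under this measure.
    return float(np.linalg.norm(np.asarray(f1) - np.asarray(f2)))

def histogram_intersection(h1, h2):
    # For histograms normalized to sum to 1, this yields a similarity
    # in [0, 1], where 1 means identical distributions.
    return float(np.minimum(h1, h2).sum())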
A number of prior studies, e.g. Enser (2000), have shown that searchers usually formulate image queries directly in terms of high-level concepts and/or semantics rather than low-level features: for example, "find me the images about sunsets" rather than "find me the images which contain brown and/or yellow color at the top". In addition, current visual-based similarity measures do not correspond well to the similarity judgments of humans (Mojsilovic and Rogowitz, 2001). This is the so-called semantic gap problem in CBIR (Shi and Malik, 2000), in which it is difficult to translate low-level features into high-level concepts.
The semantic gap problem has recently attracted much work focusing on automatic concept-based image indexing using (supervised) machine learning techniques, such as naïve Bayes classifiers (Chang et al., 2003) and support vector machines (SVMs) (Chapelle et al., 1999; Wong and Hsu, 2006), to learn the correspondence between visual features and high-level concepts expressed as keywords. Given a set of training examples, each of which is a pair of low-level features and an associated class label (or category), a learning machine can learn the mapping between them and hence assign unlabelled images to the corresponding categories. In other words, the learning machine can automatically annotate images with the assigned keywords, thus allowing keyword-based queries for image retrieval users. From the discussion above, image annotation is, in fact, the process of labeling images with respect to one of a set of predefined (training) categories, and as a result it can be seen as an automatic classification problem.
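
As a minimal sketch of annotation-as-classification, the following trains an SVM to map low-level feature vectors to keyword labels and then assigns keywords to unlabelled images. scikit-learn, the toy random data, and the keyword names are assumptions for illustration; the paper does not prescribe a particular implementation.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-ins for real training data: 60 images, each represented by
# a 216-dimensional local feature vector (e.g. nine tiles x 24
# histogram bins), each paired with one keyword (class label).
X_train = rng.random((60, 216))
y_train = rng.choice(["sunset", "tiger", "beach"], size=60)

# Learn the mapping from low-level features to keyword categories.
clf = SVC(kernel="rbf").fit(X_train, y_train)

# Annotate unlabelled images by predicting a keyword for each one.
X_new = rng.random((3, 216))
print(clf.predict(X_new))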
While different classification techniques are applied and/or novel learning algorithms are proposed to enhance the accuracy of classification (or annotation) (Chang et al., 2003; Fan et al., 2004; Goh et al., 2005; Li et al., 2003), feature representation is one of the major factors affecting a system's indexing effectiveness. During image feature extraction, image visual content (represented as low-level features) can be the