Scenery image retrieval by meta-feature representation

Pages: 517-533
Published: 3 August 2012
DOI: https://doi.org/10.1108/14684521211254040
Authors: Chih-Fong Tsai, Wei-Chao Lin
Subject: Information & knowledge management; Library & information science
Chih-Fong Tsai
Department of Information Management, National Central University,
Jhongli, Taiwan, ROC, and
Wei-Chao Lin
Department of Computer Science and Information Engineering,
Hwa Hsia Institute of Technology, Taipei, Taiwan, ROC
Abstract
Purpose – Content-based image retrieval suffers from the semantic gap problem: images are
represented by low-level visual features, which are difficult to match directly to the high-level
concepts in the user's mind during retrieval. To date, visual feature representation remains limited
in its ability to represent semantic image content accurately. This paper seeks to address these issues.
Design/methodology/approach – In this paper the authors propose a novel meta-feature
representation method for scenery image retrieval. In particular, several class-specific distances
(namely meta-features) between low-level image features are measured, for example the distance
between an image and its class centre, and the distances between the image and its nearest and
farthest images in the same class.
Findings – Three experiments based on 190 concrete, 130 abstract, and 610 categories in the Corel
dataset show that the meta-features extracted from both global and local visual features significantly
outperform the original visual features in terms of mean average precision.
Originality/value – Compared with traditional local and global low-level features, the proposed
meta-features have higher discriminative power for distinguishing a large number of conceptual
categories in scenery image retrieval. In addition, the meta-features can be directly applied to other
image descriptors, such as bag-of-words and contextual features.
Keywords Image retrieval, Feature extraction, Feature representation, Class-specific distances,
Meta-features, Digital images, Image processing
Paper type Research paper
Introduction
As image collections (e.g. personal and/or stock photos, medical images) grow rapidly in size,
their effective management has become an important research problem in image retrieval. In
particular, image retrieval techniques have been actively developed to fulfil industrial demand,
i.e. to operate on large-scale image collections. A successful image retrieval system is capable
of effectively indexing image databases to retrieve images with high or satisfactory
precision and/or recall. In other words, given a query, the aim of an image retrieval system
is to retrieve as many similar (or relevant) images as possible.
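Since retrieval quality in this paper is later reported as mean average precision, a minimal sketch of average precision for a single query may be helpful; the function name and the toy ranking are illustrative, not from the paper.

```python
# Illustrative sketch: average precision for one ranked retrieval result.

def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant image appears."""
    relevant = set(relevant_ids)
    hits, precisions = 0, []
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

# A query returns five images; images 1 and 3 are the relevant ones.
print(average_precision([1, 9, 3, 7, 2], {1, 3}))  # (1/1 + 2/3) / 2 ≈ 0.833
```

Mean average precision is then simply this quantity averaged over all queries.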
This research was supported by the National Science Council of Taiwan under Grant NSC
99-2410-H-008-033-MY2.
Received 17 March 2011; accepted 27 August 2011. Online Information Review, Vol. 36 No. 4, 2012,
pp. 517-533. © Emerald Group Publishing Limited, 1468-4527. DOI 10.1108/14684521211254040.
However, a semantic gap problem occurs in image retrieval when low-level (visual)
features (such as colour, texture, etc.) are extracted by image processing algorithms
and used to represent or describe the image content. That is, images are represented as
points in a high-dimensional feature space. A similarity metric is then used to measure
dis/similarity between images in this space, so that images close to the query are deemed
similar to it and retrieved. Nevertheless, this kind of system-based similarity
measure does not necessarily correspond to human-based similarity judgements, since
the notion of similarity in the user's mind is typically based on high-level abstractions,
such as activities, entities/objects, events, or evoked emotions, among others
(Smeulders et al., 2000).
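The feature-space matching described above can be sketched in a few lines; the random vectors stand in for real colour/texture features and the dimensionality is an arbitrary choice, not a value from the paper.

```python
import numpy as np

# Illustrative sketch: nearest-neighbour retrieval in a low-level feature space.
rng = np.random.default_rng(0)
database = rng.random((100, 64))   # 100 images, each a 64-D visual feature vector
query = rng.random(64)             # the query image's feature vector

# Euclidean distance from the query to every database image
dists = np.linalg.norm(database - query, axis=1)

# Images closest to the query in feature space are returned as "similar"
top5 = np.argsort(dists)[:5]
print(top5)
```

The semantic gap arises precisely because small distances in this space need not correspond to semantic similarity in the user's mind.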
To this end, some studies focus on object-based retrieval in order to improve
retrieval effectiveness. In NeTra (Ma and Manjunath, 1999) and Blobworld (Carson
et al., 2002), two representative object-based retrieval systems, images are segmented
into a number of regions (i.e. blobs), and users can search for specific objects located
in images. These studies show that systems using region-based image representation
not only provide useful information about objects (even though segmentation may not
be perfect) but also outperform systems using global image representations, such as
colour histograms (Swain and Ballard, 1991).
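For contrast with region-based representations, a global colour histogram in the spirit of Swain and Ballard (1991) can be sketched as follows; the bin count and the toy image are illustrative assumptions, not details from the cited work.

```python
import numpy as np

# Illustrative sketch: a global colour histogram over an RGB image.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(32, 32, 3))  # toy 32x32 RGB image

bins = 8  # quantise each channel into 8 levels -> 8*8*8 = 512 joint bins
quantised = image // (256 // bins)

# Combine the three channel indices into a single joint bin index per pixel
idx = (quantised[..., 0] * bins + quantised[..., 1]) * bins + quantised[..., 2]
hist = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
hist /= hist.sum()  # normalise so images of different sizes are comparable

print(hist.shape)  # (512,)
```

Such a histogram describes the image as a whole and carries no information about where in the image each colour occurs, which is why region-based systems can outperform it.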
However, when the image database contains a large number of conceptual
categories (e.g. 500), low-level features are highly unlikely to accurately
represent the semantic content of images. This leads to poor performance, i.e. low
precision and recall. In other words, in the visual feature space
semantically similar images may have dissimilar low-level features, and semantically
different images may have similar low-level features.
In this paper we introduce a novel feature extraction and representation method,
namely meta-features, which aims at improving retrieval performance. In this method
nine meta-features of each image are extracted from low-level image features, comprising
two types of class-specific distance-based information. The first type is
based on the distances between an image, represented by low-level features, and its
nearest and farthest class centres. The second contains the distances between an
image and its nearest and farthest images in the same class (see the Meta-feature
representation section).
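A minimal sketch of such class-specific distances follows. The paper defines nine meta-features in a later section; this illustrative subset of five (own-class centre, nearest/farthest other class centre, nearest/farthest same-class image) is an assumption about their general form, not the authors' exact definition.

```python
import numpy as np

def meta_features(x, features, labels, own_label):
    """x: one image's low-level feature vector; features/labels: the dataset."""
    # Class centres: mean low-level feature vector of each class
    centres = {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}
    centre_dists = {c: np.linalg.norm(x - m) for c, m in centres.items()}
    other = [d for c, d in centre_dists.items() if c != own_label]

    # Distances to all other images in the same class (drop the image itself)
    same = features[labels == own_label]
    same_dists = np.linalg.norm(same - x, axis=1)
    same_dists = same_dists[same_dists > 0]

    return np.array([
        centre_dists[own_label],  # distance to own class centre
        min(other),               # distance to nearest other class centre
        max(other),               # distance to farthest class centre
        same_dists.min(),         # nearest image in the same class
        same_dists.max(),         # farthest image in the same class
    ])

rng = np.random.default_rng(2)
feats = rng.random((60, 16))          # 60 toy images, 16-D low-level features
labs = np.repeat([0, 1, 2], 20)       # three classes of 20 images each
print(meta_features(feats[0], feats, labs, 0).shape)  # (5,)
```

The resulting vector replaces (or augments) the raw low-level features as the image descriptor used for similarity measurement.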
The idea behind the meta-feature representation is based on several related studies
that use distances for effective pattern recognition: centroid-based classification,
in which class centres (centroids) over a given dataset are used to recognise
dis/similar classes (Han and Karypis, 2000); triangle-area features, i.e. the distances
between a data point and two cluster centres (Tsai and Lin, 2010); and a soft
assignment scheme that assigns weights to neighbouring image features depending
on the distance between the descriptor and cluster centres (Philbin et al., 2008).
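The first two of these ideas can be sketched briefly; both functions are minimal illustrations of the general techniques, not implementations of the cited methods, and the example centroids are invented.

```python
import numpy as np

def nearest_centroid(x, centroids):
    """Centroid-based classification: assign x to the class with the closest centre."""
    dists = np.linalg.norm(centroids - x, axis=1)
    return int(np.argmin(dists))

def triangle_area(x, c1, c2):
    """Area of the triangle formed by a data point and two cluster centres (2-D)."""
    u, v = c1 - x, c2 - x
    return 0.5 * abs(u[0] * v[1] - u[1] * v[0])

centroids = np.array([[0.0, 0.0], [4.0, 0.0]])
x = np.array([1.0, 1.0])
print(nearest_centroid(x, centroids))                 # 0
print(triangle_area(x, centroids[0], centroids[1]))   # 2.0
```

In both cases the position of a point relative to class or cluster centres, rather than its raw coordinates, carries the discriminative information.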
These distance-based features, i.e. meta-features, differ from those in studies
of distance metric learning, such as Wu et al. (2010). Our proposed meta-features are
used as image descriptors for later similarity measurement, recognition, and
classification. In contrast, supervised distance metric learning learns a
distance metric from side information that is typically presented as a set of pair-wise
constraints: the optimal distance metric is found by keeping objects in equivalence
constraints close while keeping objects in inequivalence constraints well separated.
