Semantic text-based image retrieval with multi-modality ontology and DBpedia

Pages: 1191-1214
DOI: https://doi.org/10.1108/EL-06-2016-0127
Published: 06 November 2017
Authors: Yanti Idaya Aspura M.K., Shahrul Azman Mohd Noah
Subject: Information & knowledge management, Information & communications technology, Internet
Yanti Idaya Aspura M.K.
Library and Information Science, Faculty of Computer Science & Information
Technology, University of Malaya, Kuala Lumpur, Malaysia, and
Shahrul Azman Mohd Noah
Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia,
Bangi, Malaysia
Abstract
Purpose: The purpose of this study is to reduce the semantic distance by proposing a model for integrating indexes of textual and visual features via a multi-modality ontology and the use of DBpedia to improve the comprehensiveness of the ontology and thereby enhance semantic retrieval.
Design/methodology/approach: A multi-modality ontology-based approach was developed to integrate high-level concepts and low-level features, and to integrate the ontology base with DBpedia to enrich the knowledge resource. A complete ontology model was also developed to represent the domain of sport news, with image caption keywords and image features. Precision and recall were used as metrics to evaluate the effectiveness of the multi-modality approach, and the outputs were compared with those obtained using a single-modality approach (i.e. textual ontology and visual ontology).
Findings: The results based on ten queries show a superior performance of the multi-modality ontology-based IMR system integrated with DBpedia in retrieving correct images in accordance with user queries. The system achieved 100 per cent precision for six of the queries and greater than 80 per cent precision for the other four queries. The text-based system achieved 100 per cent precision for only one query; all other queries yielded precision rates of less than 0.500.
Research limitations/implications: This study focused only on the BBC Sport News collection for the year 2009.
Practical implications: The paper includes implications for the development of ontology-based retrieval on image collections.
Originality/value: This study demonstrates the strength of using a multi-modality ontology integrated with DBpedia for image retrieval to overcome the deficiencies of text-based and ontology-based systems. The result validates semantic text-based retrieval with a multi-modality ontology and DBpedia as a useful model for reducing the semantic distance.
Keywords: DBpedia, Ontology, Image retrieval, Multi-modality ontology, Semantic indexing, Text-based retrieval
Paper type: Research paper
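The precision and recall figures reported above can be reproduced with a short calculation. The following sketch uses hypothetical placeholder result sets, not the paper's data, to show how the two metrics relate retrieved images to the relevant set.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved images that are relevant to the query."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

def recall(retrieved, relevant):
    """Fraction of relevant images that the system managed to retrieve."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

# Hypothetical single-query result: 5 images retrieved, 4 of them among
# the 6 relevant images in the collection (illustrative only).
retrieved = ["img01", "img02", "img03", "img04", "img05"]
relevant = ["img01", "img02", "img03", "img04", "img06", "img07"]

print(precision(retrieved, relevant))  # 0.8
print(recall(retrieved, relevant))
```

A query falling in the paper's "greater than 80 per cent" band would score like the example above, while a perfect query would have every retrieved image in the relevant set.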
1. Introduction
Images are now becoming a major source of content on the Web. The volume of image
information is increasing rapidly partly because of the use of various gadgets, such as
mobile phones, tablets and notebooks, which are equipped with image-capture devices.
Furthermore, various Web 2.0 and social media sites allow users to upload and share their
images in an effortless manner. As a result, the task of searching for and retrieving such
images is becoming non-trivial.

(Received 8 June 2016; revised 7 January 2017; accepted 12 April 2017. The Electronic Library, Vol. 35 No. 6, 2017, pp. 1191-1214. © Emerald Publishing Limited. ISSN 0264-0473. DOI 10.1108/EL-06-2016-0127.)

These images are usually annotated with EXIF metadata,
such as the date, time, location and resolution, but these data are considered insufficient even for supporting simple semantic querying, such as "find all pictures related to English Premier League football" or "find pictures of Wayne Rooney". Semantic information requires enhanced image retrieval processing and integration with external knowledge sources.
Current approaches to image retrieval mostly rely on content-based (Ha et al., 2008; Su et al., 2010; Xue and Wanjun, 2011) and text-based (Noah et al., 2008; Zhang et al., 2005) approaches, or a hybrid of both (He et al., 2011; Wang and Liu, 2008). However, these retrieval models overlook the authentic semantic information related to the image. Content-based methods extract and index the low-level features of images, such as colour, texture and shape. The extracted information is used to annotate and index the images automatically with content descriptors. However, the annotations are usually made in terms of a very small number of phrases or keywords, and semantically similar terms do not occur in the annotation of the same image. For example, the annotation of one image may contain "car" and that of another image may contain "motorcycle", but these two terms generally do not appear together in the annotation of one image, although they are semantically similar in some sense. Therefore, the traditional content-based approach cannot meet the requirements of users because it relies entirely on the extraction of basic features without considering the semantic content of images. By contrast, text-based approaches typically use the surrounding text to describe the content of an image, which may be affected by polysemy and ambiguity, or even by irrelevant image descriptions. To overcome these issues, there is great interest in representing or annotating images using knowledge representation methods, particularly in the form of ontologies. An ontology is defined as "a conceptual representation of a domain" (Gruber, 1993), and ontologies have potential applications in the area of query expansion for conventional information retrieval systems (Bhogal et al., 2007).
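The query-expansion role of ontologies mentioned above can be illustrated with a minimal sketch. The tiny sport-news fragment below is a hypothetical stand-in; in the paper the relations come from a multi-modality ontology enriched with DBpedia rather than a hand-written dictionary.

```python
# Hypothetical sport-news ontology fragment: each concept maps to the
# concepts it is related to (illustrative, not the authors' ontology).
ONTOLOGY = {
    "Wayne Rooney": {"type": "Footballer", "team": "Manchester United",
                     "competition": "English Premier League"},
    "Footballer": {"broader": "Athlete"},
}

def expand_query(term, ontology):
    """Expand a query term with the concepts linked to it in the ontology,
    so a search for a player can also match team or competition captions."""
    expanded = [term]
    for _property, value in ontology.get(term, {}).items():
        expanded.append(value)
    return expanded

print(expand_query("Wayne Rooney", ONTOLOGY))
# ['Wayne Rooney', 'Footballer', 'Manchester United', 'English Premier League']
```

A term absent from the ontology simply passes through unexpanded, which is the usual fallback behaviour for out-of-vocabulary queries.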
Content-based image retrieval (CBIR) is a research area that started to attract attention in 1994. Smeulders et al. (2000) discussed the progress made in this area by considering the crucial aspects of CBIR that require further detailed research. One of the main issues in advanced CBIR is the semantic gap, where the meaning of an image is rarely self-evident; they noted that the aim of an image retrieval system is to provide maximum support in bridging the semantic gap between the simple content of the available visual features and the richness of the user semantics. Furthermore, Datta et al. (2008) identified two main gaps in image retrieval research: the sensory gap and the semantic gap. The sensory gap refers to the difference between an object in the real world and the information in a description derived from a recording of the same scene. The semantic gap refers to the lack of agreement between the information that can be extracted from the visual data and the interpretation of the same data in a given situation. The present study addresses the semantic gap between the high-level concepts of human perception and the low-level features used to describe images.
In this study, the aim is to reduce the semantic gap by using an ontology-based model to integrate low-level visual features and high-level textual information, thereby representing the semantics of the image contents for image retrieval. The two kinds of features are represented by two different ontologies that are conceptually integrated; this is referred to as a multi-modality ontology-based approach. The major difficulty when developing an ontology-based approach is the extra work required to create the ontology and make detailed annotations. Thus, an information-extraction method was developed that automatically extracts the image content, as well as performing automatic image classification and
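The integration just described, conceptually linking high-level textual concepts with low-level visual features in one retrieval model, can be sketched as follows. All names, weights and data below are illustrative assumptions, not the authors' implementation.

```python
import math

# Toy multi-modality index: each image carries high-level textual concepts
# and a low-level colour histogram (both values are made up for illustration).
IMAGES = {
    "match01.jpg": {"concepts": {"football", "Wayne Rooney", "Premier League"},
                    "histogram": [0.6, 0.3, 0.1]},   # green-dominant pitch
    "court07.jpg": {"concepts": {"tennis", "Wimbledon"},
                    "histogram": [0.2, 0.7, 0.1]},
}

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def score(image, query_concepts, query_hist, w_text=0.7, w_visual=0.3):
    """Blend concept overlap (high level) with histogram similarity (low level).
    The 0.7/0.3 weighting is an arbitrary illustrative choice."""
    text = len(image["concepts"] & query_concepts) / max(len(query_concepts), 1)
    return w_text * text + w_visual * cosine(image["histogram"], query_hist)

query = {"football", "Premier League"}
ranked = sorted(IMAGES, key=lambda k: score(IMAGES[k], query, [0.6, 0.3, 0.1]),
                reverse=True)
print(ranked[0])  # match01.jpg ranks first for a football query
```

The point of the sketch is only that both modalities contribute to one ranking; in the paper this coupling happens at the ontology level rather than through an ad hoc weighted score.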
