ML2S-SVM: multi-label least-squares support vector machine classifiers
Date | 09 December 2019 |
Pages | 1040-1058 |
DOI | https://doi.org/10.1108/EL-09-2019-0207 |
Published date | 09 December 2019 |
Author | Shuo Xu,Xin An |
Subject Matter | Information & knowledge management,Information & communications technology,Internet |
Shuo Xu
Research Base of Beijing Modern Manufacturing Development,
College of Economics and Management, Beijing University of Technology,
Beijing, PR China, and
Xin An
School of Economics and Management, Beijing Forestry University,
Beijing, PR China
Abstract
Purpose – Image classification is becoming a supporting technology in several image-processing tasks. Due
to the rich semantic information contained in images, it is very common for an image to have several labels or
tags. This paper aims to develop a novel multi-label classification approach with superior performance.
Design/methodology/approach – Many multi-label classification problems share two main
characteristics: label correlations and label imbalance. However, most current methods are devoted either to
modeling label relationships or to dealing only with the unbalanced problem using traditional single-label methods. In this
paper, the multi-label classification problem is regarded as an unbalanced multi-task learning problem. The multi-task
least-squares support vector machine (MTLS-SVM) is generalized for this problem and renamed the multi-label
LS-SVM (ML2S-SVM).
Findings – Experimental results on the emotions, scene, yeast and bibtex data sets indicate that the ML2S-SVM
is competitive with respect to the state-of-the-art methods in terms of Hamming loss and instance-based
F1 score. The values of resulting parameters largely influence the performance of ML2S-SVM, so it is
necessary for users to identify proper parameters in advance.
Originality/value – On the basis of MTLS-SVM, a novel multi-label classification approach, ML2S-SVM,
is put forward. This method can not only overcome the unbalanced problem but also explicitly model arbitrary order
correlations among labels by allowing multiple labels to share a subspace. In addition, the multi-label
classification approach has a wider range of applications. That is to say, it is not limited to the field of image
classification.
Keywords Support vector machine, Multi-label learning, LS-SVM, Image classification
Paper type Research paper
This research received financial support from the Social Science Foundation of Beijing Municipality
under grant number 17GLB074. Our gratitude also goes to the anonymous reviewers for their
valuable comments.
Received 3 September 2019
Revised 11 September 2019
Accepted 7 October 2019
The Electronic Library
Vol. 37 No. 6, 2019
© Emerald Publishing Limited
0264-0473
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0264-0473.htm
1. Introduction
The prevalence of digital photography devices (e.g. digital cameras and mobile phones) has
led to 2.5 trillion images shared online in 2016 globally, and the number is continuously
growing according to Deloitte’s Technology, Media and Telecommunications Predictions
2016 report. Several large digital libraries or museums (Hung, 2018; Pentland et al., 1996;
Sclaroff et al., 1997) were built to collect images from different locations. At present, the
main challenge faced by both academia and business is how to effectively manage these
image resources to meet the needs of different users. There is growing interest in image-
related research areas, such as annotation (Zhang et al., 2012), retrieval (Datta et al., 2008;
Khalid and Noah, 2017), classification (Meng et al., 2017) and understanding (Torfason
et al., 2018).
Since users are often interested in not only individual images but also groups of
meaningful images, image classification becomes a central issue in the field of image
retrieval. To this end, meaningful image groupings and respective labels are
required (Hearst, 2006). Image classification is known as the task of assigning one or
more predefined categories to images. Instead of manually classifying images, many
machine learning algorithms are used to automatically assign a few relevant labels to an
unseen image on the basis of human-labeled or socially tagged images (Sun et al., 2011;
Westman et al., 2011).
There are two ways to automatically classify images: single-label classification and
multi-label classification. Traditionally, single-label classification is concerned with learning
from a set of instances $\{\vec{x}_n\}_{n=1}^{N}$, to each of which a single label $y_n$ from a set of disjoint
labels $L$ ($M = |L| > 1$) is attached; that is, $y_n \in L$. If $M = 2$, this problem is called a binary
classification problem (Shawe-Taylor and Cristianini, 2004); otherwise, it is called a multi-class classification
problem (Hsu and Lin, 2002). In multi-label classification (Tsoumakas and
Katakis, 2007), each instance $\vec{x}_n$ is associated with a set of labels $\vec{y}_n \subseteq L$. That is to say,
there is no constraint on how many labels each instance can be assigned. Table I
shows an example of a multi-label data set.
Though in the past multi-label classification was mainly motivated by the tasks of text
categorization and medical diagnosis, it is increasingly important in modern applications,
such as image classification (Boutell et al., 2004; Devkar and Shiravale, 2017; Shen
et al., 2004; Zhang et al., 2014), music categorization (Li and Ogihara, 2003) and protein function
classification (Kolesov et al., 2014; Luo and Zincir-Heywood, 2005). The naïve
method is to treat a multi-label problem as $M$ separate binary classification problems. More
specifically, the binary classification problem for each label $\ell \in L$ can be represented as
$\{(\vec{x}_n, y_n = +1) \mid \ell \in \vec{y}_n, n \in \mathbb{N}_N\} \cup \{(\vec{x}_n, y_n = -1) \mid \ell \notin \vec{y}_n, n \in \mathbb{N}_N\}$. To take the data
set in Table I as an example, one can transform it into $M = 4$ separate binary classification
problems, shown in Table II.
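The naïve decomposition described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `binary_relevance` and the placeholder instance names (`"x_1"`, `"x_N"`, and so on) are assumptions made here, since the feature vectors themselves are not given in the text. The label sets are the four rows listed in Table I.

```python
from typing import Dict, List, Set, Tuple

# The label set L from Table I (M = 4).
LABELS: List[str] = ["Urban", "Sunset", "Mountain", "Beach"]

# Label sets y_n for the instances listed in Table I. The strings "x_1", ...
# are hypothetical placeholders standing in for the feature vectors.
DATASET: List[Tuple[str, Set[str]]] = [
    ("x_1", {"Urban"}),
    ("x_2", {"Sunset", "Mountain", "Beach"}),
    ("x_3", {"Urban"}),
    ("x_N", {"Beach", "Mountain"}),
]


def binary_relevance(
    dataset: List[Tuple[str, Set[str]]], labels: List[str]
) -> Dict[str, List[Tuple[str, int]]]:
    """Build one binary data set per label.

    An instance gets target +1 if the label belongs to its label set
    and -1 otherwise.
    """
    problems: Dict[str, List[Tuple[str, int]]] = {}
    for label in labels:
        problems[label] = [(x, 1 if label in y else -1) for x, y in dataset]
    return problems


problems = binary_relevance(DATASET, LABELS)
for label, pairs in problems.items():
    print(label, pairs)
```

Each of the resulting M = 4 data sets can then be handed to any binary classifier. Because the problems are built and solved independently, this decomposition by itself does not use any relationship among the labels.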
On closer examination, one can see that many multi-label classification problems share two main
characteristics:
(1) There exist some underlying (potentially nonlinear) label correlations (Sun et al.,
2015; Zhang and Yeung, 2013; Zhang and Zhang, 2010).
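Table I. Examples of a multi-label data set
$\vec{x}_n$ | $\vec{y}_n$ | Urban | Sunset | Mountain | Beach |
$\vec{x}_1$ | $\vec{y}_1 = \{\text{Urban}\}$ | ✓ | | | |
$\vec{x}_2$ | $\vec{y}_2 = \{\text{Sunset}, \text{Mountain}, \text{Beach}\}$ | | ✓ | ✓ | ✓ |
$\vec{x}_3$ | $\vec{y}_3 = \{\text{Urban}\}$ | ✓ | | | |
$\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
$\vec{x}_N$ | $\vec{y}_N = \{\text{Beach}, \text{Mountain}\}$ | | | ✓ | ✓ |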