A high-dimensional classification approach based on class-dependent feature subspace

Date: 04 December 2017
Pages: 2325-2339
DOI: https://doi.org/10.1108/IMDS-11-2016-0491
Published date: 04 December 2017
Author: Fuzan Chen, Harris Wu, Runliang Dou, Minqiang Li
Subject Matter: Information & knowledge management, Information systems, Data management systems, Knowledge management, Knowledge sharing, Management science & operations, Supply chain management, Supply chain information systems, Logistics, Quality management/systems
Fuzan Chen
College of Management and Economics, Tianjin University, Tianjin, China
Harris Wu
Department of Information Technology and Decision Sciences,
Old Dominion University, Norfolk, Virginia, USA
Runliang Dou
College of Management and Economics, Tianjin University, Tianjin, China, and
Minqiang Li
College of Management and Economics, Tianjin University, Tianjin, China and
State Key Laboratory of Hydraulic Engineering Simulation and Safety,
Tianjin University, Tianjin, China
Abstract
Purpose: The purpose of this paper is to build a compact and accurate classifier for high-dimensional
classification.
Design/methodology/approach: A classification approach based on class-dependent feature subspace
(CFS) is proposed. CFS is a class-dependent integration of a support vector machine (SVM) classifier and
associated discriminative features. For each class, the genetic algorithm (GA)-based approach evolves the best
subset of discriminative features and the SVM classifier simultaneously. To guarantee convergence and efficiency,
the authors customize the GA in terms of encoding strategy, fitness evaluation, and genetic operators.
Findings: Experimental studies demonstrated that the proposed CFS-based approach is superior to other
state-of-the-art classification algorithms on UCI data sets in terms of both concise interpretation and
predictive power for high-dimensional data.
Research limitations/implications: UCI data sets rather than real industrial data are used to evaluate
the proposed approach. In addition, only single-label classification is addressed in the study.
Practical implications: The proposed method not only constructs an accurate classification model but
also obtains a compact combination of discriminative features. It helps business decision makers gain a
concise understanding of high-dimensional data.
Originality/value: The authors propose a compact and effective classification approach for
high-dimensional data. Instead of using the same feature subset for all classes, the proposed CFS-based approach
obtains the optimal subset of discriminative features and an SVM classifier for each class. The proposed approach
enhances both interpretability and predictive power for high-dimensional data.
Keywords: Classification, Feature selection, Genetic algorithm, Support vector machine,
Class-dependent feature subspace
Paper type: Research paper
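The idea in the abstract, one evolved feature subset per class, can be illustrated with a minimal sketch. It is not the authors' implementation. The toy data, the function names, the GA settings (tournament selection, uniform crossover, bit-flip mutation, elitism) and the size penalty are all assumptions made for illustration. A one-vs-rest nearest-centroid scorer stands in for the SVM so that the example runs on the Python standard library alone.

```python
import random

def make_data(rng, n_per_class=40, n_classes=3, n_features=8):
    """Toy data (assumption): feature c is shifted for class c, the rest is noise."""
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[c] += 3.0
            X.append(row)
            y.append(c)
    return X, y

def ovr_accuracy(mask, X_tr, y_tr, X_te, y_te, cls):
    """One-vs-rest accuracy of a nearest-centroid rule (stand-in for the SVM)."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    pos = [r for r, t in zip(X_tr, y_tr) if t == cls]
    neg = [r for r, t in zip(X_tr, y_tr) if t != cls]
    mu_pos = [sum(r[j] for r in pos) / len(pos) for j in idx]
    mu_neg = [sum(r[j] for r in neg) / len(neg) for j in idx]
    hits = 0
    for r, t in zip(X_te, y_te):
        d_pos = sum((r[j] - m) ** 2 for j, m in zip(idx, mu_pos))
        d_neg = sum((r[j] - m) ** 2 for j, m in zip(idx, mu_neg))
        hits += (d_pos < d_neg) == (t == cls)
    return hits / len(X_te)

def evolve_mask(fitness, n_features, rng, pop_size=20, generations=15, p_mut=0.1):
    """Simple GA over binary feature masks; returns (best mask, best-fitness history)."""
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(scored[0]))
        nxt = [scored[0][:]]  # elitism: the best mask survives unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(scored, 3), key=fitness)  # tournament selection
            b = max(rng.sample(scored, 3), key=fitness)
            child = [x if rng.random() < 0.5 else z for x, z in zip(a, b)]  # uniform crossover
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

def class_dependent_subspaces(X_tr, y_tr, X_te, y_te, rng, penalty=0.05):
    """Evolve a separate feature mask for every class (one-vs-rest)."""
    n_features = len(X_tr[0])
    result = {}
    for cls in sorted(set(y_tr)):
        def fitness(mask, cls=cls):
            acc = ovr_accuracy(mask, X_tr, y_tr, X_te, y_te, cls)
            return acc - penalty * sum(mask) / n_features  # favor compact subsets
        result[cls] = evolve_mask(fitness, n_features, rng)
    return result

rng = random.Random(0)
X, y = make_data(rng)
# Even rows train, odd rows validate, so every class appears in both halves.
X_tr, y_tr = X[0::2], y[0::2]
X_te, y_te = X[1::2], y[1::2]
subspaces = class_dependent_subspaces(X_tr, y_tr, X_te, y_te, rng)
for cls, (mask, history) in subspaces.items():
    print(cls, mask, round(history[-1], 3))
```

Each class ends up with its own bitmask, so different classes can rely on different features. The penalty term is one simple way to trade accuracy against subset size. The paper's customized encoding, fitness evaluation and genetic operators are not described in this excerpt and may differ.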
1. Introduction
Classification is one of the fundamental tasks in business intelligence, data mining, machine
learning, and pattern recognition (He et al., 2012; Chen et al., 2014). In the era of big data,
classification often needs to be carried out on high-dimensional data (Wu et al., 2014;
Industrial Management & Data Systems
Vol. 117 No. 10, 2017
pp. 2325-2339
© Emerald Publishing Limited
0263-5577
DOI 10.1108/IMDS-11-2016-0491
Received 14 November 2016
Revised 20 February 2017
Accepted 4 March 2017
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0263-5577.htm
The work was supported by the General Program of the National Natural Science Foundation of China
(Nos 71771169, 71101103 and 71201115), the State Key Program of the National Natural Science Foundation of China
(No. 71631003) and the exchange scholar program supported by the China Scholarship Council (CSC).
The authors sincerely thank Dongfang Chen, formerly a Master's student in the College of Management
and Economics at Tianjin University, for providing preliminary experimental validation for this study.