A new approach for histological classification of breast cancer using deep hybrid heterogenous ensemble

DOI	https://doi.org/10.1108/DTA-05-2022-0210
Published date	21 April 2023
Date	21 April 2023
Pages	245-278
Subject Matter	Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Metadata, Information & knowledge management, Information & communications technology, Internet
Author	Hasnae Zerouaoui, Ali Idri, Omar El Alaoui

A new approach for histological classification of breast cancer using deep hybrid heterogenous ensemble

Hasnae Zerouaoui

MSDA, Mohammed VI Polytechnic University, Ben Guerir, Morocco

Ali Idri

Modeling, Simulation and Data Analysis, Mohammed VI Polytechnic University,

Ben Guerir, Morocco and

Software Project Management Research Team, ENSIAS, Mohammed V

University, Rabat, Morocco, and

Omar El Alaoui

UM5R ENSIAS, Rabat, Morocco

Abstract

Purpose – Breast cancer (BC) causes hundreds of thousands of deaths worldwide each year.

An early-stage diagnosis of this disease can reduce the morbidity and mortality rates by

helping to select the most appropriate treatment options, especially when histological BC images are used for

the diagnosis.

Design/methodology/approach – The present study proposes and evaluates a novel approach which

consists of 24 deep hybrid heterogenous ensembles that combine the strength of seven deep learning

techniques (DenseNet 201, Inception V3, VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and

ResNet 50) for feature extraction and four well-known classifiers (multi-layer perceptron, support

vector machines, K-nearest neighbors and decision tree) by means of hard and weighted voting

combination methods for histological classification of BC medical images. Furthermore, the best deep

hybrid heterogenous ensembles were compared to the deep stacked ensembles to determine the best

strategy to design the deep ensemble methods. The empirical evaluations used four classification

performance criteria (accuracy, sensitivity, precision and F1-score) and fivefold cross-validation over the

histological BreakHis public dataset with four magnification factors (40×, 100×, 200× and 400×). The

Scott–Knott (SK) statistical test and the Borda count voting method were used to cluster the designed

techniques and to rank the techniques belonging to the best SK cluster, respectively.

Findings – Results showed that the deep hybrid heterogenous ensembles outperformed both their singles

and the deep stacked ensembles, reaching accuracy values of 96.3, 95.6, 96.3 and 94 per cent across the

four magniﬁcation factors 40×, 100×, 200× and 400×, respectively.

Originality/value – The proposed deep hybrid heterogenous ensembles can be applied to BC diagnosis

to assist pathologists in reducing missed diagnoses and proposing adequate treatments for the patients.

Keywords Breast cancer, Computer-aided diagnosis, Digital pathology, Deep convolutional neural

networks, Image processing, Histological images, Ensemble methods

Paper type Research paper

This work was conducted under the research project “Machine Learning based Breast Cancer

Diagnosis and Treatment”, 2020–2023. The authors would like to thank the Moroccan Ministry of

Higher Education and Scientiﬁc Research, Digital Development Agency (ADD), CNRST and UM6P for

their support.

Funding: This study was funded by Mohammed VI Polytechnic University, Ben Guerir, Morocco.

Conﬂicts of interest/competing interests: The authors report no conﬂicts of interest.

The current issue and full text archive of this journal is available on Emerald Insight at:

https://www.emerald.com/insight/2514-9288.htm

Received 20 May 2022

Revised 24 June 2022

29 August 2022

Accepted 14 September 2022

Data Technologies and

Applications

Vol. 57 No. 2, 2023

pp. 245-278

2514-9288

DOI 10.1108/DTA-05-2022-0210

Histological

classiﬁcation

of BC using

DHHtE

1. Introduction

Breast cancer (BC) is one of the most prevalent diseases in women worldwide. In 2020,

2.3 million cases of BC were reported, which makes this cancer the most common cancer in

women and a critical public health problem in both middle-income and developed countries

(Ferlay et al., 2021). Thus, an early diagnosis of the BC disease is a decisive factor to prevent

its progression and reduce the morbidity rate for women (Hameed et al., 2020). BC occurs

from the growth of abnormal breast cells that may invade healthy tissues (Sung et al., 2021).

The analysis of clinical radiology images such as ultrasound imaging, magnetic resonance

imaging (MRI) and mammography is initially used by radiologists for BC screening

(Hameed et al., 2020); however, histological images remain the best and the most widely

used pathological method for a more precise BC diagnosis to eﬀectively determine the

cancerous areas and therefore propose the eﬀective treatment for the patients (Yan et al.,

2020). The diagnosis of histological images is usually conducted by manual qualitative

analysis by pathologists, which can face many challenges such as (1) lack of pathologists in

small hospitals, (2) dependency on the pathologist’s professional experience and knowledge

on the diagnosis, (3) complexity of the histological BC subtype diagnosis and (4) marginal

errors due to the large number of diagnoses (Fernandes et al., 2019; Gandomkar et al., 2018;

Stoffel et al., 2018). To this end, and given the emergence of whole slide imaging

and digital histological slides, many studies investigated the use of new computer-aided

approaches and frameworks for a BC histological binary classification (Gandomkar et al.,

2018; Hameed et al., 2020; Yan et al., 2018, 2020) in order to help the pathologists to confirm

or refute their diagnoses.

Nowadays, machine learning (ML) classiﬁers have been successfully used in many

application domains especially in the BC classiﬁcation using medical images and image

processing (Zerouaoui and Idri, 2021). ML techniques have helped to increase the survival

rate by offering new automatic approaches that facilitate the BC diagnosis, improve the

accuracy and reduce the eﬀort and time of the diagnosis (Zerouaoui and Idri, 2021).

Moreover, deep learning (DL) techniques proved their strengths for feature extraction

(FE) due to their notable progress in computer vision when using medical images

(Zerouaoui and Idri, 2021), especially the deep convolutional neural network architectures

for histological BC diagnosis (Beevi et al., 2016; Berisha et al., 2019; Carvalho et al., 2020;

Gandomkar et al., 2018; Saikia et al., 2019; Valkonen et al., 2017; Wang et al., 2021; Zerouaoui

and Idri, 2022; Ghiasi and Zendehboudi, 2021; Kadam et al., 2019; Senan et al., 2021;

Boumaraf et al., 2021; Houssein et al., 2021). Despite the encouraging classification results

provided by ML and DL models, the use of one classifier does not always yield the

best-performing model or the highest accuracy in all circumstances since (1) it highly

depends on the type of the problem dealt with and (2) each single classiﬁer has its

advantages and weaknesses (Hosni et al., 2019). To deal with this limitation, researchers

investigated the ensemble learning approach (Ahmed and El Sadig, 2019; Idri et al., 2020;

Kassani et al., 2020; Nakach et al., 2022; El Ouassif et al., 2021) which consists of combining

single learners that are accurate and diverse in order to consolidate their advantages and

overcome their weaknesses using a combination rule such as simple majority voting or

weighted voting (Kuncheva, 2003). In the medical ﬁeld and more precisely for BC diagnosis

classiﬁcation, many studies proposed the use of ensemble learning to improve the

classiﬁcation performances. For instance, the study by Hameed et al. (2020) proposed two

homogenous ensemble methods by combining end-to-end architectures of DL models (ﬁne-

tuned VGG16 and VGG19 and fully trained VGG16 and VGG19) by means of average

predicted probabilities for histological image classiﬁcation; the results proved the power of

the ensemble models compared to their singles. In the study by Ahmed and El Sadig (2019),

the authors proposed and evaluated a heterogenous ensemble method for BC detection

using mammograms by combining ﬁve classical architectures using multi-layer perceptron

(MLP), support vector machines (SVMs), decision tree (DT), K-nearest neighbors (KNNs)

and Naive Bayes (NB) for classiﬁcation and traditional techniques for FE using the

weighted voting combination method; results showed that the proposed ensemble

technique outperformed its singles and improved performance. However, some

limitations have been detected in the studies (Ahmed and El Sadig, 2019; Hameed et al.,

2020; Idri et al., 2020; Kassani et al., 2020; El Ouassif et al., 2021; Xiao et al., 2017): (1) the

design of heterogenous or homogenous ensembles using only one combination method, (2)

the use of end-to-end or classical architectures for the design of the ensemble methods and

(3) except for the studies (Idri et al., 2020; El Ouassif et al., 2021), a lack of

statistical tests to evaluate the performances of the proposed ensemble

approaches.
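
The two combination rules mentioned above, simple majority (hard) voting and weighted voting, can be illustrated with a minimal sketch for binary labels. This is not the implementation used in the study: the function names, the tie-breaking rule and the example weights are assumptions made for illustration only, and the weighting scheme actually used by the authors is not specified in this section.

```python
import numpy as np

def hard_vote(predictions):
    """Majority vote over binary labels; predictions has shape (n_models, n_samples)."""
    predictions = np.asarray(predictions)
    votes_for_one = predictions.sum(axis=0)
    # Assumption: ties (possible with an even number of models) go to class 1.
    return (2 * votes_for_one >= predictions.shape[0]).astype(int)

def weighted_vote(predictions, weights):
    """Weighted vote: each model's vote counts in proportion to its weight."""
    predictions = np.asarray(predictions)
    weights = np.asarray(weights, dtype=float)
    score_for_one = weights @ predictions  # total weight voting for class 1
    return (2 * score_for_one >= weights.sum()).astype(int)

# Hypothetical predictions of three base learners on four images (1 = malignant).
preds = [[1, 0, 1, 0],
         [1, 1, 0, 0],
         [0, 1, 1, 0]]
# Hypothetical weights, e.g. derived from validation accuracy.
weights = [0.96, 0.40, 0.40]

print(hard_vote(preds))               # [1 1 1 0]
print(weighted_vote(preds, weights))  # [1 0 1 0]
```

In this toy example the two rules disagree on the second image: two weak learners outvote the strong one under hard voting, whereas weighted voting follows the learner with the largest weight.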

In a previous work (El Alaoui et al., 2022), deep stacked ensembles were developed using

seven pretrained DL models: VGG16, VGG19 (Simonyan and Zisserman, 2015), ResNet 50

(He et al., 2016), Inception V3 (Szegedy et al., 2016), Inception-ResNet-V2 (Szegedy et al.,

2017), Xception (Chollet, 2017) and MobileNet (Sandler et al., 2018); then a logistic regression

was used as a meta learner that learns how to best combine the predictions of the DL

models. The results showed that the proposed deep stacking ensemble achieved an overall

accuracy of 93.8, 93.0, 93.3 and 91.8 per cent over the four magnification factor (MF) values

of the BreakHis dataset: 40×, 100×, 200× and 400×, respectively. In order to compare with the

results of the study (El Alaoui et al., 2022), and to alleviate the limitations of the previous related

works, this paper develops and evaluates the performances of 24 deep hybrid heterogenous

ensembles (DHHtEs) using DL models (DenseNet 201 (Huang et al., 2017), Inception V3,

VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and ResNet 50) for FE and ML models

(MLP, SVM, DT and KNN) for classiﬁcation over the BreakHis histological images dataset.

The choice of the base learners for the DHHtEs is based on the findings of the

previous studies (Zerouaoui et al., 2021; Zerouaoui and Idri, 2022) which designed 28 hybrid

architectures using seven DL techniques for FE including DenseNet 201, Inception V3,

VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and ResNet 50 and four ML classiﬁers

(MLP, SVM, DT and KNN). Results showed that for all the four MF values 40×, 100×, 200×

and 400× of the BreakHis dataset, the hybrid architecture MDEN, constructed using DenseNet 201 for FE and

MLP for classification, outperformed the others. Furthermore, to design the proposed new

approach of DHHtE, we selected the best-performing hybrid architectures of the study

(Zerouaoui and Idri, 2022) for each classiﬁer, combining top 2, top 3 and top 4 by means of

hard and weighted voting. We therefore obtain 24 ensembles: (3 ensembles with hard voting

+ 3 ensembles with weighted voting for each MF) × 4 MF values. To the best of

our knowledge, this study is the ﬁrst to propose, design and evaluate DHHtEs using deep

hybrid architectures as base learners, built using DL techniques as feature extractors and

four ML classiﬁers, tested on the histological BreakHis dataset for a binary BC diagnosis

classification. To this end, the present study discusses four research questions (RQs):

RQ1. What is the overall performance of the designed DHHtEs?

RQ2. Do the DHHtE methods outperform their singles?

RQ3. Among the two combination methods (hard voting and weighted voting), the

number and the type of singles, which of them provides a better accuracy for the

DHHtEs?

RQ4. Do the best DHHtEs outperform the deep stacked ensembles?
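
The count of 24 ensembles stated before the research questions (top 2, top 3 and top 4 base learners, two voting rules, four MF values) can be reproduced with a short enumeration. This is only a sketch under stated assumptions: apart from MDEN, the base-learner labels and their ranking order are hypothetical placeholders, since this section does not name the best hybrid architecture for the other classifiers.

```python
from itertools import product

MAGNIFICATIONS = ("40x", "100x", "200x", "400x")  # BreakHis MF values
TOP_K = (2, 3, 4)                                 # ensemble sizes: top 2, top 3, top 4
VOTING = ("hard", "weighted")                     # combination rules

# Hypothetical ranking of the best hybrid architecture per classifier.
# Only MDEN (MLP + DenseNet 201) is named in the text; the other labels
# and the order are placeholders for illustration.
RANKED_HYBRIDS = ("MDEN", "best_SVM_hybrid", "best_KNN_hybrid", "best_DT_hybrid")

# One configuration per (MF value, ensemble size, voting rule) combination.
ensembles = [
    {"mf": mf, "voting": voting, "members": RANKED_HYBRIDS[:k]}
    for mf, k, voting in product(MAGNIFICATIONS, TOP_K, VOTING)
]

print(len(ensembles))  # 24 = 4 MF values x 3 sizes x 2 voting rules
```

Each MF value contributes six configurations (three sizes with hard voting and three with weighted voting), which matches the (3 + 3) × 4 decomposition given in the text.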

The main contributions of this empirical study are the following:
