Big Data analytics for prediction: parallel processing of the big learning base with the possibility of improving the final result of the prediction

Pages	147-160
DOI	https://doi.org/10.1108/IDD-02-2018-0002
Published date	20 August 2018
Author	Laouni Djafri, Djamel Amar Bensaber, Reda Adjoudj
Subject Matter	Library & information science, Library & information services, Lending, Document delivery, Collection building & management, Stock revision, Consortia

Laouni Djafri

Department of Computer Science, Djillali Liabes University, EEDIS laboratory -univ-SBA-Algeria, Sidi-Bel-Abbes, Algeria

Djamel Amar Bensaber

Superior School of Computer Science, LabRI laboratory -ESI-SBA-Algeria, Sidi Bel-Abbes, Algeria, and

Reda Adjoudj

Department of Computer Science, Djillali Liabes University, EEDIS laboratory -univ-SBA-Algeria, Sidi Bel-Abbes, Algeria

Abstract

Purpose – This paper aims to solve the problems of big data analytics for prediction, including volume, veracity and velocity, by improving the prediction result to an acceptable level in the shortest possible time.

Design/methodology/approach – This paper is divided into two parts. The first part aims to improve the result of the prediction. In this part, two ideas are proposed: the double pruning enhanced random forest algorithm, and the extraction of a shared learning base by stratified random sampling to obtain a learning base that is representative of all the original data. The second part proposes a distributed architecture supported by new technology solutions, which works coherently and efficiently with the sampling strategy under the supervision of the Map-Reduce algorithm.
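As an illustration of the sampling idea only, the following minimal sketch draws a stratified random sample with proportional allocation, so that each class keeps roughly the share it has in the full data. It is not the authors' implementation: the names `stratified_sample`, `label_of`, `fraction` and `shared_base`, the toy data and the 10 per cent sampling rate are all assumptions made for this example, and neither the double pruning random forest nor the Map-Reduce orchestration is shown.

```python
import random
from collections import defaultdict


def stratified_sample(records, label_of, fraction, seed=0):
    """Draw about `fraction` of the records from every class (stratum).

    Hypothetical helper: `label_of` maps a record to its class label.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Group the records by class label.
    strata = defaultdict(list)
    for record in records:
        strata[label_of(record)].append(record)

    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Proportional allocation; keep at least one record per stratum
        # so that rare classes are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data (assumed): 90 records of class "a" and 10 of class "b".
data = [(i, "a") for i in range(90)] + [(i, "b") for i in range(90, 100)]
shared_base = stratified_sample(data, label_of=lambda r: r[1], fraction=0.1)
```

With this toy input the sample keeps the 9-to-1 class ratio of the full data, which is the property a representative learning base needs.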

Findings – The representative learning base, obtained by integrating two learning bases (the partial base and the shared base), represents the original data set very well and gives very good results for Big Data predictive analytics. Furthermore, these results were supported by the improved random forests supervised learning method, which played a key role in this context.

Originality/value – All companies are concerned, especially those that hold large amounts of information and want to screen it to improve their knowledge of the customer and optimize their campaigns.

Keywords Big Data analytics, Sampling, Random forests, Apache Spark, Apache ZooKeeper, Parallel processing

Paper type Research paper

1. Introduction

The computer science world is abuzz over an explosion of new sources of diverse data with fine granularity and low latency, known as “Big Data.” It is data that exceeds the typical storage, processing, analysis and computing capacity of traditional databases. Big Data requires advanced methods and powerful technologies that can be applied to analyze and extract predictive models from heterogeneous and complex data. It is also characterized mainly by the three Vs: volume, variety and velocity (Furht and Villanustre, 2016; Laney, 2001).

Moreover, there is an important factor in the analysis of massive and complex data: the visualization of data. It allows managers to quickly understand relationships and results that are otherwise not easily visible. It also makes it possible to create an infographic, interactive or not, or a representation in the form of data mapping (Brail and Klosterman, 2001; Kohavi et al., 2002).

Global economic enterprises seek to exploit the Big Data available on the internet in order to explore the open data of social media, such as logs, tweets and social networks at 34 per cent, as well as Web logs and click streams at 31 per cent (Russom, 2011). The amount of this data will have reached 35 zettabytes by the year 2020; Twitter alone generates more than 7 terabytes a day and Facebook 10 terabytes a day (Zikopoulos and Eaton, 2011). In a study conducted in 2012 by IBM, 2.5 billion bytes of data are produced daily via the Internet, where Facebook has more than 2.5 billion likes and 300 million downloads of photos (He et al., 2015). Big Data analysis seeks to develop products, create new services and improve business. In a study conducted by He et al. on a wide range of data, the authors used tweets about the world’s two largest retail chains (Costco and Walmart), where they compared

The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/2398-6247.htm

Information Discovery and Delivery, 46/3 (2018) 147–160 [DOI 10.1108/IDD-02-2018-0002]

Received 4 February 2018; revised 6 March 2018, 3 April 2018, 10 May 2018 and 2 June 2018; accepted 3 June 2018
