Predictive analytic models of student success in higher education. A review of methodology

Pages: 208-227
Published date: 11 March 2019
DOI: https://doi.org/10.1108/ILS-10-2018-0104
Author: Ying Cui, Fu Chen, Ali Shiri, Yaqin Fan
Subject Matter: Library & information science
Ying Cui and Fu Chen
Department of Educational Psychology, University of Alberta, Edmonton,
Alberta, Canada
Ali Shiri
Department of Library and Information Studies, University of Alberta,
Edmonton, Alberta, Canada, and
Yaqin Fan
Department of Educational Technology, Northeast Normal University, Changchun,
Jilin, China
Abstract
Purpose: Many higher education institutions are investigating the possibility of developing predictive
student success models that use different sources of available data to identify students who might be at risk of
failing a course or program. The purpose of this paper is to review the methodological components related to
the predictive models that have been developed or are currently implemented in learning analytics applications in
higher education.
Design/methodology/approach: The literature review was completed in three stages. First, the authors
conducted searches and collected related full-text documents using various search terms and keywords.
Second, they developed inclusion and exclusion criteria to identify the most relevant citations for the purpose
of the current review. Third, they reviewed each document from the final compiled bibliography and focused
on identifying information that was needed to answer the research questions.
Findings: In this review, the authors identify methodological strengths and weaknesses of current
predictive learning analytics applications and provide the most up-to-date recommendations on predictive
model development, use and evaluation. The review results can inform important future areas of research that
could strengthen the development of predictive learning analytics for the purpose of generating valuable
feedback to students to help them succeed in higher education.
Originality/value: This review provides an overview of the methodological considerations for
researchers and practitioners who are planning to develop or are currently in the process of developing predictive
student success models in the context of higher education.
Keywords: Higher education, Machine learning, Student success, Learning analytics,
Educational data mining, Methodology review, Predictive models
Paper type: Literature review
Received 11 October 2018
Revised 24 January 2019, 11 February 2019
Accepted 19 February 2019
Information and Learning Sciences, Vol. 120 No. 3/4, 2019, pp. 208-227
© Emerald Publishing Limited, 2398-5348
The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/2398-5348.htm
Introduction
The 2016 Horizon Report Higher Education Edition (Johnson et al., 2016) predicts that
learning analytics will be increasingly adopted by higher education institutions across the
globe in the near future to make use of student data gathered through online learning
environments to improve, support and extend teaching and learning. The 2016 Horizon
Report defines learning analytics as "an educational application of web analytics aimed at
learner profiling, a process of gathering and analyzing details of individual student
interactions in online learning activities" (p. 38). It can help to "build better pedagogies,
empower active learning, target at-risk student populations, and assess factors affecting
completion and student success" (p. 38). Terms such as "educational data mining",
"academic analytics" and the more commonly adopted "learning analytics" have been used
in the literature to refer to the methods, tools and techniques for gathering very large
volumes of online data about learners and their activities and contexts. The advantages of
learning analytics have been enumerated by Siemens et al. (2011) and Siemens and Long
(2011), and some of the important ones include: early detection of at-risk students and
generating alerts for learners and educators; personalization and adaption of learning
process and content; extension and enhancement of learner achievement, motivation and
confidence by providing learners with timely information about their performance and that
of their peers; higher quality learning design and improved curriculum development;
interactive visualizations of complex information that give learners and educators the
ability to "zoom in" or "zoom out" on data sets; and more rapid achievement of learning
goals by giving learners access to tools that help them to evaluate their progress.
Many higher education institutions are beginning to explore the use of learning analytics
for improving student learning experiences (Sclater et al., 2016). According to a recent
literature review on learning analytics in higher education (Leitner et al., 2017), the most
popular strand of research in the field is to use student data to make predictions of their
performance (36 citations out of the total of 102 found in the literature review). The primary
goal of this area of research is to develop predictive student success models that make use of
different sources of data available within a higher education institution to identify students
who might be at risk of failing a course or program and could benefit from additional help.
This type of learning analytics research and application is important because it generates
actionable information that allows students to monitor and self-regulate their own learning,
and allows instructors to develop and implement effective learning interventions that
ultimately help students succeed.
The purpose of the present paper is to systematically review the methodological
components of the predictive models that have been developed or are currently implemented in
learning analytics applications in higher education. Student learning is a complex
phenomenon because cognitive, social and emotional factors, together with prior experience, all
influence how students learn and perform (Illeris, 2006). As a result, to predict student
performance in a course or a program, many variables need to be considered, such as
cognitive variables associated with targeted knowledge and skills in the domain and socio-
emotional variables, such as engagement, motivation and anxiety. Student demographic
characteristics and past academic history are also often used in model building to reflect
information related to students' prior experiences. Supervised machine learning techniques
such as logistic regression and neural networks are then applied to these student variables
to train and test the predictive models so as to estimate the likelihood that a student
will pass a course. Kotsiantis (2007) specified several key issues that are
consequential to the success of supervised machine learning applications, including variable
(i.e. attribute or feature) selection, data preprocessing, the choice of specific learning algorithms
and model validation. These issues are directly related to the steps of the typical process of
statistical modeling in quantitative research, which have guided us in identifying
our research questions, as outlined below:
RQ1. What data sources and student variables were used to predict student
performance in higher education?
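The supervised learning workflow described in the introduction (variable selection, data preprocessing, choice of algorithm and model validation) can be illustrated with a minimal sketch. It is not taken from any study covered by the review. The two predictors (prior GPA and weekly logins), the synthetic data generator and its coefficients, and all function names are hypothetical assumptions chosen only for illustration. The sketch fits a logistic regression by batch gradient descent and checks it on a held-out set.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_students(n, seed=0):
    """Hypothetical records: (prior_gpa, weekly_logins) -> passed (0 or 1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gpa = rng.uniform(1.0, 4.0)
        logins = rng.uniform(0.0, 20.0)
        # Assumed relationship, used only to generate illustrative labels.
        p = sigmoid(1.5 * (gpa - 2.5) + 0.2 * (logins - 10.0))
        rows.append(((gpa, logins), 1 if rng.random() < p else 0))
    return rows


def standardize(train, test):
    """Preprocessing: scale features using training-set statistics only."""
    k = len(train[0][0])
    n = len(train)
    means = [sum(x[j] for x, _ in train) / n for j in range(k)]
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in train) / n)
            for j in range(k)]

    def scale(rows):
        return [([(x[j] - means[j]) / stds[j] for j in range(k)], y)
                for x, y in rows]

    return scale(train), scale(test)


def fit_logistic(train, lr=0.1, epochs=500):
    """Batch gradient descent on the mean log-loss."""
    k = len(train[0][0])
    n = len(train)
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in train:
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    # Estimated probability that a student passes.
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def accuracy(model, rows):
    hits = sum((predict_proba(model, x) >= 0.5) == (y == 1) for x, y in rows)
    return hits / len(rows)


# Hold-out validation: train on 300 synthetic students, test on 100.
data = make_synthetic_students(400, seed=42)
train, test = standardize(data[:300], data[300:])
model = fit_logistic(train)
test_accuracy = accuracy(model, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

A single hold-out split is used here only for brevity. Scaling with training-set statistics alone keeps information from the test set out of model fitting, which is one reason preprocessing and validation have to be planned together.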