Scale up predictive models for early detection of at-risk students: a feasibility study

DOI: https://doi.org/10.1108/ILS-05-2019-0041
Pages: 97-116
Date: 8 February 2020
Published: 8 February 2020
Authors: Ying Cui, Fu Chen, Ali Shiri
Subject matter: Library & information science, Librarianship/library management, Library & information services
Ying Cui and Fu Chen
Department of Educational Psychology, University of Alberta,
Edmonton, Canada, and
Ali Shiri
Department of Library and Information Studies,
University of Alberta, Edmonton, Canada
Abstract
Purpose – This study aims to investigate the feasibility of developing general predictive models that use learning management system (LMS) data to predict student performance in various courses. The authors focused on examining three practical but important questions: is there a common set of student activity variables that predicts student performance in different courses? Which machine-learning classifiers tend to perform consistently well across different courses? Can the authors develop a general model for use in multiple courses to predict student performance based on LMS data?
Design/methodology/approach – Three mandatory undergraduate courses with large class sizes were selected from three different faculties at a large Western Canadian university, namely, the faculties of science, engineering and education. Course-specific models for these three courses were built and compared using data from two semesters, one for model building and the other for generalizability testing.
Findings – The investigation led the authors to conclude that it is not desirable to develop a general model for predicting course failure across different courses. However, for the science course, the predictive model, which was built on data from one semester, was able to identify about 70% of students who failed the course and 70% of students who passed the course in another semester, using only LMS data extracted from the first four weeks.
Originality/value – The results of this study are promising as they show the usability of LMS data for early prediction of student course failure, which has the potential to provide students with timely feedback and support in higher education institutions.
Keywords Higher education, Learning management system, Predictive models, Learning analytics, Student activity data, Student course success
Paper type Research paper
The recent decade has witnessed an increasingly common use of learning management systems (LMSs) as a valuable tool to facilitate teaching and learning in higher education (Becker et al., 2017). LMSs are web-based applications with a variety of functions that allow instructors to share course materials, create online forums, develop online quizzes, collect and evaluate assignments and record grades. The adoption of LMSs for teaching and learning in higher education not only enhances the efficiency of course administration but also has the potential to promote active learning and student-student and student-instructor interaction/collaboration (Azevedo and Aleven, 2013; Cerezo et al., 2016).
Funding: This research was supported by the University of Alberta Teaching and Learning Research
Fund (RES0035131).
Received 10 May 2019
Revised 21 November 2019; 15 January 2020
Accepted 15 January 2020
Information and Learning Sciences
Vol. 121 No. 3/4, 2020
pp. 97-116
© Emerald Publishing Limited
ISSN 2398-5348
DOI 10.1108/ILS-05-2019-0041
Moreover, LMSs have the capacity to record and store detailed activity data of students while interacting with the system, such as logins, assignment submissions, resources accessed and frequency of interaction with discussion forums. A growing body of research has focused on examining the use of these student activity data to monitor student progress and identify students who might be at risk of course failure (Cerezo et al., 2016). The rationale of this strand of research is that student activity data serve as indicators of students' engagement and effort in the course, which have been shown to be positively related to student academic performance (Chen and Jang, 2010; Davies and Graff, 2005; de Barba et al., 2016; Kizilcec et al., 2013; Morris et al., 2005; Tempelaar et al., 2015). Traditional summative assessments of student learning, such as midterm and final examinations, typically measure the amount of knowledge that students have acquired and evaluate the overall sufficiency of student learning. However, these assessments provide little information regarding student learning strategies, degree of engagement and interactions with peers (Coates, 2005; Richardson, 2005). Therefore, the benefit of analyzing student activity data in LMSs is to identify, at an early stage, students who fail to engage adequately in coursework and therefore might be at risk of course failure. Early identification of potentially at-risk students is important because instructors can provide them with timely feedback and suggest that they change their behaviors (e.g. participate in group discussions or submit assignments on time) to increase their chance of success in the course. This type of personalized learning advice is realized and supported by data-informed solutions and was typically not feasible with large class sizes. With this information, students can reflect on their learning process and plan their future learning in a more proactive and structured manner.
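To make the idea concrete, activity traces of the kind described above (logins, submissions, forum interactions) must first be aggregated into per-student features over an early-term window before any prediction is possible. The following is a minimal sketch in Python; the event records and field names ("student", "action", "when") are hypothetical illustrations, as the paper does not specify its LMS export schema:

```python
from collections import defaultdict
from datetime import date

# Hypothetical LMS event log; field names and values are illustrative only.
EVENTS = [
    {"student": "s1", "action": "login",      "when": date(2019, 9, 3)},
    {"student": "s1", "action": "forum_post", "when": date(2019, 9, 4)},
    {"student": "s1", "action": "submit",     "when": date(2019, 9, 20)},
    {"student": "s2", "action": "login",      "when": date(2019, 9, 30)},
]

def weekly_features(events, term_start, n_weeks=4):
    """Count each activity type per student within the first n_weeks of term."""
    cutoff_days = n_weeks * 7
    features = defaultdict(lambda: defaultdict(int))
    for e in events:
        offset = (e["when"] - term_start).days
        if 0 <= offset < cutoff_days:  # keep only early-term activity
            features[e["student"]][e["action"]] += 1
    return {s: dict(counts) for s, counts in features.items()}

feats = weekly_features(EVENTS, term_start=date(2019, 9, 2))
# s1's three events fall inside the four-week window; s2's login does not.
```

Restricting the window to the first four weeks mirrors the early-detection goal: a feature table like this is what a classifier would consume while there is still time to intervene.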
The research project we report here is a university-level proof-of-concept study to explore the usability of LMS data for the early detection of students who are at risk of failing a course at a Western Canadian university. Predictive model building involves variable (i.e. attribute and predictor) selection, data processing, choosing specific machine-learning classifiers and model validation (Kotsiantis et al., 2007). Building a predictive model requires extensive quantitative skills and effort in data preparation, analysis and interpretation. The research team had numerous discussions on who should be responsible for developing, maintaining and updating predictive models. Not all individual instructors possess the required quantitative skills, nor do they have the time to develop such models. It would be more efficient to develop a general predictive model that can be trained once and applied directly to various courses. There have been a number of research studies on developing general models for predicting student course performance (Arnold and Pistilli, 2012; Essa and Ayad, 2012; Jayaprakash et al., 2014). For example, Course Signals (Arnold and Pistilli, 2012) is a system based on general predictive models that can be used in different courses. Predictors include:
- percentage of points earned in the course to date;
- LMS activity as compared to students' peers;
- prior academic history, including academic preparation, high school grade point average (GPA) and standardized test scores; and
- student characteristics such as residency, age or credits attempted.
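Predictors of this kind can be derived from straightforward course records. The sketch below computes two of them, a grade-to-date percentage and a peer-relative activity score (here a z-score), and passes prior GPA through unchanged; all record fields and values are hypothetical illustrations, not the actual Course Signals implementation:

```python
from statistics import mean, pstdev

# Toy per-student records; field names and values are illustrative only.
STUDENTS = {
    "s1": {"points_earned": 45, "points_possible": 60, "lms_events": 120, "hs_gpa": 3.4},
    "s2": {"points_earned": 20, "points_possible": 60, "lms_events": 40,  "hs_gpa": 2.9},
    "s3": {"points_earned": 55, "points_possible": 60, "lms_events": 80,  "hs_gpa": 3.8},
}

def signals_predictors(students):
    """Derive predictors per student: percentage of points earned to date,
    LMS activity relative to peers (z-score) and prior academic history."""
    counts = [r["lms_events"] for r in students.values()]
    mu, sigma = mean(counts), pstdev(counts)
    out = {}
    for sid, r in students.items():
        out[sid] = {
            "pct_points": r["points_earned"] / r["points_possible"],
            "activity_z": (r["lms_events"] - mu) / sigma if sigma else 0.0,
            "hs_gpa": r["hs_gpa"],
        }
    return out

preds = signals_predictors(STUDENTS)
# s2 earns the fewest points and sits below the peer mean in activity,
# so it would surface as the at-risk case under a simple threshold rule.
```

Expressing activity relative to peers, rather than as a raw count, is what lets such a predictor transfer across courses with very different baseline activity levels, which is the core of the generalizability question this study examines.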
The Open Academic Analytics Initiative program (Jayaprakash et al., 2014) aimed at scaling up predictive models across different higher education institutions by applying models trained with Marist College data to data from several other institutions. However, the variables chosen in their predictive models were mostly related to student demographics (e.g. age and gender) and past academic history (e.g. scholastic assessment test scores and cumulative
