A Three-Layered Student Learning Model for Prediction of Failure Risk in Online Learning
Danial Hooshyar1,2, Yueh-Min Huang3, and Yeongwook Yang4,*

Human-centric Computing and Information Sciences volume 12, Article number: 28 (2022)
https://doi.org/10.22967/HCIS.2022.12.028

Abstract

Students’ learning behavior has proven to be a fundamental indicator of their success or failure in online courses. However, many studies do not properly model this behavior when predicting students’ risk of failure. This study proposes a new educational data mining approach, called StudModel, that automatically models students based on their learning behavior and accordingly predicts their risk of failure in courses. Briefly, a three-layered student learning model covering content access, engagement, and assessment behavior in a course is developed, and then clustering and classification methods are employed to place students into low- and high-risk-of-failure categories. To evaluate the approach, three courses with different numbers of students from the Moodle system of the University of Tartu were used. Our findings showed that StudModel achieved accuracies higher than 90%, outperforming many state-of-the-art approaches in predicting students’ risk of failure in courses with different numbers of students (with the deep neural network being among the best classifiers). Furthermore, using a local interpretable model-agnostic explanations approach, StudModel provides explanations of its decisions, which can nurture the trust of educators, practitioners, and learners in such predictions. These results reveal that it is feasible to accurately and transparently predict students’ risk of failure in online courses by using their current activity data, which is available in most online learning environments, rather than their past performance or demographic data, which either cannot be controlled by them or might be unavailable.

Keywords

Prediction of Risk of Failure, Higher Education, Educational Data Mining, Student Modeling, Explaining the Prediction, Online Learning

Introduction

Educators would have the opportunity to give students specific or additional instruction, or to adjust their pedagogical strategies, if they knew in advance whether their students were at risk of failing a course. Student failure in courses is among the most important challenges faced by educators during the teaching-learning process. According to numerous researchers, one of the vital steps to boosting the likelihood of students’ academic success is detecting students at risk of failure in a course and then notifying them and their teachers so that essential actions can be taken (e.g., [1]). Predicting students’ risk of failure in courses is one prospective way toward this step [2]. Nowadays, many institutions and instructors employ learning management systems (LMSs) for teaching and supporting online learning due to their potential for managing learning resources and monitoring students’ learning progress. These LMSs produce a vast amount of educational data that could be used for multiple purposes, such as optimizing learning environments and adjusting teaching strategies to enhance students’ learning. Educational data mining (EDM) is a developing field concerned with designing and advancing new approaches for exploring data from educational settings, aimed at discovering new insights about students’ learning processes and the environments used for their learning. Through the application of EDM approaches, educators and educational institutes have the chance to take essential measures or make necessary decisions during the learning process. Providing timely feedback and recommending individualized learning paths, courses, and learning materials are among the many ways that EDM approaches can support both learning and teaching [3]. Another example of EDM approaches is prediction systems for detecting students at risk of failure in courses.
To make a highly accurate prediction, such systems usually require different amounts of data, depending on the approach in use and the parameter being predicted. To achieve high predictive performance, many existing EDM methods suggest using a large volume of data for training their models. Nonetheless, a large amount of data does not ensure a highly accurate prediction. Therefore, identifying and properly modeling students’ relevant dynamic characteristics and variables is the most challenging task in achieving a decent prediction system [4, 5]. Students’ learning behavior regarding accessing learning resources, engaging with peers, and taking assessment tests is among the most essential dynamic characteristics affecting their success or failure in online courses (e.g., [6]). Many studies have highlighted students’ activity data logged in online learning environments (like Moodle) as a way to represent their learning behavior in online courses (e.g., [7]). Moreover, they report that different types of student activity can potentially be considered predictors (or indicators) of success or failure. Even though students’ learning behavior has proven to be a fundamental indicator of their success or failure in online courses, many studies fail to properly model it (using students’ activity level) when predicting their risk of failure (e.g., [8]). Furthermore, while the related research has shown good performance, it often ignores some underlying factors. These comprise building student models that take into account different aspects of learning behavior during a course (e.g., content access, engagement, and assessment behavior), applying advanced yet simple EDM approaches for predicting risk of failure, and predicting risk of failure in a course by modelling students’ learning behavior derived from their activity data.
Moreover, their models typically neither explain their predictions (lack of interpretability) nor are tested on new, unseen courses. With regard to the latter, it is worth noting that some EDM studies have only been validated and tested on one course (or dataset) using cross-validation techniques, and the practicality of their prediction models has not been investigated on further unseen courses. This research aims to address some of these gaps by proposing a simple, generalizable, and interpretable EDM approach that automatically models students based on their content access, engagement, and assessment behavior in online courses and accordingly predicts their risk of failure. To this end, the following research questions have been set:

Can modelling students’ learning with respect to content access, engagement, and assessment during courses be used to accurately predict their risk of failure in online courses?

Can a robust, interpretable, and generalizable educational data mining approach be developed that uses students’ activity data to transparently detect students at risk of failure in future courses?

The contributions of this research can be summarized as follows. First, the proposed approach identifies low-performing students who are at risk of failure, providing educators with opportunities to take necessary actions. Second, it considers comprehensive patterns of students’ learning behavior with respect to content access, engagement, and assessment in a course to develop learning behavior models, and accordingly predicts their risk of failure in online courses. Third, it develops a generalizable student modelling approach that can accurately detect students at risk of failure in different unseen courses. Fourth, by offering explanations of its predictions, it nurtures educators’ and learners’ trust in such decisions, leading to more meaningful use of machine learning-based models in supporting learning and teaching. The following sections present related research, the proposed EDM approach, experimental results, and discussion and conclusions, respectively.

Related Work

Educational data mining has shown great potential for analyzing large volumes of data in the field of distance education [9]. It employs almost the full spectrum of data mining methods and has addressed a wide variety of data mining tasks (e.g., [10-16]). Numerous recent studies have highlighted the rapid growth of the EDM field, e.g., for learning management systems [17]. These learning environments usually store different types of data, such as participation rate and regularity in courses, and views or downloads of course activities such as quizzes and of resources such as videos, URLs, and e-books. EDM approaches are often employed to mine such data so as to provide educators and students with various types of support that eventually improve academic and learning achievement. Some examples of such services are timely feedback and recommendations, prediction of success or failure of students, early identification of students at risk (with learning difficulties), and personalized learning [18]. Prediction of students’ outcomes and learning behaviors in online courses is among the key areas of EDM and can be grouped into three types of predictive problems: prediction of students’ performance (e.g., [19]), prediction of students’ failure or dropout (e.g., [20]), and prediction of students’ achievement or grade (e.g., [21]). Because students’ achievement or grade in a course is considered among the most crucial factors for their future academic success, a number of EDM studies have addressed the prediction of students’ grade or achievement in a course. For instance, in their predictive models for course achievement or students’ performance, [22] included students’ class attendance, [23] used students’ grades from previous semesters, [24] included demographic data of students such as gender, age, and family background, [25] included prior grade point average of students, and [26] utilized students’ performance in entrance exams.
As is apparent, many of these works do not consider students’ activity data during a course and instead rely on past performance or non-academic-related data in their models. Such predictive models overlook that some of these predictor variables (e.g., demographics, psychometric factors of students, or their past performance) cannot be controlled by teachers or students and might not be available (due to reasons like data privacy). Therefore, more EDM studies concerning predictive models should consider including students’ activity data during a course, as these data are logically among the finest predictors of students’ performance. Students’ failure or poor academic achievement in a course, which is usually measured with course grade or achievement test scores, is among the most important individual predictors of academic failure [27]. Several EDM approaches have been applied to educational data for predicting achievement or grades of students in courses. For instance, [28] employed matrix factorization and regression-based approaches to predict the grades of students in future courses. Their findings show that matrix factorization outperforms other methods in predicting students’ future grades. In another study, [29] proposed an algorithm to predict students’ grades in a class. Their proposed method learns the optimal prediction for each student in a timely manner and could predict with 75% accuracy whether students will pass the course. In a different attempt, [30] put forward a new approach to predict students’ risk of failure in online courses by modelling their engagement rates. Their predictive model achieved an accuracy of 89% and revealed that while a higher engagement rate is important, consistency in interaction over time is the most crucial factor in predicting students’ risk of failure.
Although different studies have included multiple variables related to students’ activities during courses and accordingly achieved good results with their models, they often overlook some important factors. For instance, some ignore the simplicity of EDM approaches for practitioners or the interpretability of the predictive model, while others do not work properly with less data and are unsuitable for courses with small amounts of data. Moreover, most existing works on predicting students’ achievement or risk of failure in courses do not properly model student learning behavior or preferences during the courses. To build on the existing works, this work proposes a simple, generalizable, and interpretable EDM approach to develop student models of learning behavior during a course, including students’ behavior regarding accessing learning resources, engaging with peers, and taking assessment tests.

Method

Problem Description
We assume that there exists a set of students’ actions logged to the system (or features), referred to as $F = \{f_1, f_2, \ldots, f_{|F|}\}$, wherein each feature includes activity data of a set of students, referred to as $U = \{u_1, u_2, \ldots, u_{|U|}\}$, that are aggregated in the system during a course. A set of student features is further used to model their learning behavior in each course, represented by $B = (b_1, b_2, b_3)$. More specifically, $b_1$ denotes students’ behavior with regard to accessing online learning materials (like course view, resource view, resource download, URL view, etc.), $b_2$ models students’ engagement behavior with other class members (e.g., forum view discussion, add discussion, add comments, and so forth), while $b_3$ represents students’ assessment behavior during a course (such as quiz view, quiz attempt, assignment view, etc.). Each student $u$ is associated with $B_u$. Given the aforementioned information, we seek to predict students’ risk of failure, i.e., low or high. In doing so, we take into account whether students have carried out the necessary learning activities or behaviors regarding course content access, engagement with other class members, and assessment during a course. Notations used in this research are listed in Table 1.

Table 1. Notations

| Notation | Explanation |
|---|---|
| $U, u$ | A set of students and a specific student, $U = \{u_1, u_2, \ldots, u_{\lvert U \rvert}\}$ |
| $F, f$ | A set of features and a specific feature, $F = \{f_1, f_2, \ldots, f_{\lvert F \rvert}\}$ |
| $B, b$ | A set of models of learning behavior and a specific model of learning behavior, $B = (b_1, b_2, b_3)$ |
| $b_1^{ca}, b_2^{en}, b_3^{as}$ | The predefined list of features associated with the model of each learning behavior (content access, engagement, and assessment), $b_1^{ca} \subset F, b_2^{en} \subset F, b_3^{as} \subset F$ and $b_1^{ca} \cap F, b_2^{en} \cap F, b_3^{as} \cap F = \emptyset$ |
| $P$ | The set of performance metrics |
| $C$ | The optimal classification method |

StudModel: The Proposed EDM Approach
Fig. 1 shows the overall architecture of the proposed EDM approach, and the following sections elaborate on the approach. To start, various groups of students’ activity data logged in the online learning system are used to develop feature vectors and feature vector spaces, which are logical representations of students’ learning behavior in online courses, for each student in each course. These feature vectors, which signify students’ level of activity and are continuous with multiple values to represent each student, are then used to model their learning behavior (learner model) during online courses. More explicitly, as shown in Equation (1), the aggregation of students’ action logs is used to establish the student’s feature vector. Then, to model each student’s learning behavior, we predefined the lists of features related to each learning behavior pattern, namely $b_1^{ca}$, $b_2^{en}$, and $b_3^{as}$ for content access, engagement, and assessment, respectively. These feature vectors are then used to generate three numeric values that model students’ learning behavior (or identify patterns of their online learning behavior), namely accessing learning resources, engaging with peers, and taking assessment tests; see Equation (2) and Algorithm 1. Basically, the three-layered student model forms a comprehensive learning behavioral pattern for a course. More explicitly, students’ actions (or activity data) help to infer whether the student’s preference, which is correlated with their course achievement (see the statistical analysis in Table 2), is to study by accessing learning materials (e.g., course view, resource view, URL view), simply by taking assessment tests (e.g., quiz view, assignment view, quiz attempt, assignment submission), by engaging with peers and/or the instructor (e.g., forum add discussion, forum view discussion), or perhaps all of these. 
This could further help the learning system and educators to take necessary remedial actions, according to the student’s model, during the semester.
Fig. 1. Overall architecture of the StudModel approach.

Algorithm 1 presents the StudModel approach for predicting students’ risk of failure through modelling their learning behavior. In the first step, a three-layered student model ($B_u$), including content access ($b_1$), engagement ($b_2$), and assessment ($b_3$) behavior, is developed for each course. More explicitly, the students’ feature vector ($F$) is first read from our dataset. The students’ features differ according to the course. For example, the first course has features such as course resource viewed, course modules viewed, feedback viewed, feedback received, assignment submitted, assignment viewed, and posts created and updated in the forum. In addition to some of these features, the second course includes features like course materials downloaded, URL viewed, quiz view, quiz attempt, forum discussion viewed, discussion created in forum, and discussion viewed in forum. As the feature vectors mostly deal with attributes on different scales, min-max normalization was performed at this stage, mapping the minimum and maximum values in $F$ to 0 and 1, respectively. This feature vector is used to create three lists representing students’ content access, engagement, and assessment behavior in a course. In the second step, the student models are input to a clustering method (i.e., X-means) to group students who exhibit similar categories of behavior into the number of clusters found by the clustering method. Unlike many other clustering methods, this method does not require the number of clusters to be predefined; it determines the optimal number of clusters automatically. To ensure that students with similar behavioral models are clustered together, a further cluster analysis is performed, (semantically) assessing the clusters to confirm that they are appropriate for mapping student risk of failure. In the third step, a predictor is trained and used to classify students into different classes (i.e., low- and high-risk). To do so, each student model, which consists of three different layers, is used as input to several classification methods, and three performance measures are used to identify the best-performing classifier for the prediction task. The classification methods include k-nearest neighbors (KNN), support vector machine (LibSVM), deep learning, rule induction, decision tree, random forest, neural network, logistic regression, and naïve Bayes. To avoid overfitting, we used an evolutionary optimization method (i.e., a genetic algorithm) to fine-tune the hyperparameters of our algorithms. For instance, for deep learning, we considered regularization and reducing the number of hidden layers and their corresponding nodes; for the decision tree, we employed both pre- and post-pruning techniques; for naïve Bayes, we used Laplace correction; and so on. The evaluation metrics employed to measure the performance of each method are accuracy, precision, and recall. Note that the classification methods and performance measures used are among the most widely cited and used in EDM research. In the final step, the local interpretable model-agnostic explanations approach is employed to explain the predictions made by the classifiers.
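The first step (min-max normalization followed by building the three-layered model) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors’ implementation: the feature names, the feature-to-layer mapping in `LAYERS`, the per-feature scaling, and the use of the mean as the aggregation function are all hypothetical choices made for the example.

```python
from statistics import mean

# Hypothetical feature-to-layer mapping; the actual lists b1^ca, b2^en, b3^as
# are predefined per course by the authors and are not reproduced here.
LAYERS = {
    "content_access": ["course_view", "resource_view"],
    "engagement": ["forum_view", "forum_post"],
    "assessment": ["quiz_attempt", "assignment_submit"],
}

def min_max_normalize(rows):
    """Scale every feature to [0, 1] across all students (assumed per-feature scaling)."""
    features = list(rows[0])
    lo = {f: min(r[f] for r in rows) for f in features}
    hi = {f: max(r[f] for r in rows) for f in features}
    return [
        {f: (r[f] - lo[f]) / (hi[f] - lo[f]) if hi[f] > lo[f] else 0.0
         for f in features}
        for r in rows
    ]

def build_student_model(normalized_row):
    """Collapse one normalized feature vector into (b1, b2, b3) by averaging each layer."""
    return tuple(
        mean(normalized_row[f] for f in LAYERS[layer])
        for layer in ("content_access", "engagement", "assessment")
    )

# Made-up activity counts for three students.
raw_counts = [
    {"course_view": 40, "resource_view": 10, "forum_view": 5,
     "forum_post": 2, "quiz_attempt": 4, "assignment_submit": 3},
    {"course_view": 10, "resource_view": 0, "forum_view": 0,
     "forum_post": 0, "quiz_attempt": 1, "assignment_submit": 1},
    {"course_view": 25, "resource_view": 5, "forum_view": 5,
     "forum_post": 1, "quiz_attempt": 4, "assignment_submit": 2},
]
models = [build_student_model(r) for r in min_max_normalize(raw_counts)]
print(models)
```

Each tuple in `models` plays the role of $B_u = (b_1, b_2, b_3)$ for one student and is what the later clustering and classification steps would consume.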

(1)

(2)

Algorithm 1. Generalizable approach for prediction of students’ risk of failure through modelling their learning behavior

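Table 2. Statistical analysis (correlations between model layers and final grade, followed by descriptive statistics, per course)

| Course | | content access (b1) | engagement (b2) | assessment (b3) | final grade |
|---|---|---|---|---|---|
| Teaching and reflection I | content access (b1) | 1.00 | 0.22 | 0.32 | 0.09 |
| | engagement (b2) | 0.22 | 1.00 | 0.17 | 0.20 |
| | assessment (b3) | 0.32 | 0.17 | 1.00 | 0.43 |
| | final grade | 0.09 | 0.20 | 0.43 | 1.00 |
| | count | 242.00 | 242.00 | 242.00 | 242.00 |
| | mean | 0.08 | 0.09 | 0.21 | 75.12 |
| | std | 0.10 | 0.15 | 0.12 | 12.66 |
| | min | 0.00 | 0.00 | 0.00 | 49.00 |
| | max | 0.44 | 1.00 | 0.60 | 94.00 |
| Digital literacy | content access (b1) | 1.00 | 0.77 | 0.66 | 0.61 |
| | engagement (b2) | 0.77 | 1.00 | 0.69 | 0.69 |
| | assessment (b3) | 0.66 | 0.69 | 1.00 | 0.72 |
| | final grade | 0.61 | 0.69 | 0.72 | 1.00 |
| | count | 92.00 | 92.00 | 92.00 | 92.00 |
| | mean | 0.22 | 0.21 | 0.49 | 82.40 |
| | std | 0.14 | 0.13 | 0.20 | 17.41 |
| | min | 0.00 | 0.00 | 0.00 | 49.00 |
| | max | 0.64 | 0.80 | 0.88 | 100.00 |
| Teaching and reflection II | content access (b1) | 1.00 | 0.33 | 0.41 | 0.32 |
| | engagement (b2) | 0.33 | 1.00 | 0.23 | 0.22 |
| | assessment (b3) | 0.41 | 0.23 | 1.00 | 0.40 |
| | final grade | 0.32 | 0.22 | 0.40 | 1.00 |
| | count | 224.00 | 224.00 | 224.00 | 224.00 |
| | mean | 0.16 | 0.18 | 0.46 | 90.51 |
| | std | 0.10 | 0.15 | 0.12 | 11.66 |
| | min | 0.00 | 0.00 | 0.00 | 49.00 |
| | max | 0.61 | 0.95 | 0.75 | 100.00 |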

Experimental Results

Datasets and Statistical Analysis
To properly evaluate the StudModel approach, data from three courses with different numbers of students from the Moodle system of the University of Tartu in Estonia were used. The blended courses used were entitled “Teaching and Reflection I” with 242 students, “Teaching and Reflection II” with 224 students, and “Digital Literacy” with 92 students. All courses belong to the compulsory basic module of teacher education studies at both bachelor and master level at the Institute of Education. The aim of the first course is to give an introduction to child development and main learning principles, while the second course focuses more on student diversity, assessment, and interventions. The third course revolves around the components of digital literacy, as well as its evaluation, application, and possibilities for improvement in the educational context. To create our datasets, different types of student action data (activities or features, F) were extracted, including the number of times each of the following occurred: (1) course resource viewed, (2) course modules viewed, (3) course materials downloaded, (4) feedback viewed, (5) feedback received, (6) forum discussion viewed, (7) discussion created in forum, (8) book chapters viewed, (9) book list viewed, (10) assignment submitted, (11) assignment viewed, (12) discussion viewed in forum, (13) post created in forum, (14) comments viewed, (15) posts updated in forum, and (16) posts created. In addition, the number of attempts at quizzes and assignments, and grades, were considered. Before proceeding with the analysis, it is essential to investigate how the different parts of the student model for each course correlate with course grades. As Table 2 shows, content access, engagement, and assessment each correlate positively with the course grade in all three courses, with coefficients ranging from 0.09 to 0.72. This shows that the different layers of the students’ learning behavior model are related to the final grade of the course.
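As an illustration of the kind of check summarized in Table 2, the sketch below computes a correlation coefficient between one model layer and the final grade. The text does not state which coefficient was used, so the Pearson coefficient is an assumption here, and the five data points are invented for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up values: normalized assessment layer (b3) and final grades of five students.
assessment = [0.10, 0.20, 0.30, 0.40, 0.50]
final_grade = [55, 60, 72, 70, 90]

r = pearson(assessment, final_grade)
print(f"correlation between assessment and final grade: {r:.2f}")
```

Repeating the call for every pair of columns would yield a symmetric matrix with ones on the diagonal, which is the shape of the correlation rows in Table 2.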
Fig. 2. Cluster analysis of (a) teaching and reflection I, (b) digital literacy, and (c) teaching and reflection II. The y-axis uses the average aggregation function.

Categorizing Students with Similar Behavioral Model
To categorize students with similar behavioral models, the X-means clustering method is employed with further statistical cluster analysis (k min = 2, k max = 60, numerical measure = EuclideanDistance, clustering algorithm = K-means). Briefly, this method uses a heuristic to identify the correct number of centroids. Fig. 2 illustrates the results of X-means clustering for the courses (where the y-axis uses the average aggregation function), along with the mean and standard deviation of the clusters. According to the results, cluster A, with average grades of 81.00, 90.92, and 93.27, assessment 10.47%, 18.45%, and 13.88% larger, content access 7.98%, 22.50%, and 52.24% larger, and engagement rate 12.10%, 22.16%, and 83.45% larger than the overall population (datasets), could be considered a group of students with low risk of failure in the first, second, and third course, respectively. On the other hand, cluster B, with average grades of 57.32, 53.58, and 54.62, assessment 31.75%, 58.70%, and 7.86% smaller, content access 19.15%, 71.60%, and 29.59% smaller, and engagement rate 36.72%, 76.88%, and 47.27% smaller than the overall population, can be considered a group of students with high risk of failure in the first, second, and third course, respectively. The values of the cluster centroids for the three courses are shown in Table 3. It is worth noting that in other courses more or fewer clusters could be established; to obtain more or less personalization for each student, one could form more or fewer clusters on the data, which leads to more or less detailed clusters.

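Table 3. Cluster centroids

| Course | content access (Cluster A) | content access (Cluster B) | engagement (Cluster A) | engagement (Cluster B) | assessment (Cluster A) | assessment (Cluster B) |
|---|---|---|---|---|---|---|
| Teaching reflection I | 0.124 | 0.071 | 0.416 | 0.033 | 0.264 | 0.204 |
| Digital literacy | 0.271 | 0.063 | 0.261 | 0.049 | 0.584 | 0.204 |
| Teaching reflection II | 0.238 | 0.11 | 0.338 | 0.097 | 0.522 | 0.422 |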
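The grouping step can be illustrated with a small sketch. It does not implement X-means, which selects the number of clusters automatically; instead it uses plain k-means with k fixed at 2 as a simplified stand-in, and then labels the cluster with the higher overall activity as low risk. The data points, the deterministic initialization, and the labeling rule are assumptions made for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+

def k_means(points, k=2, iterations=50):
    """Plain k-means; the first k points serve as deterministic initial centroids."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        new_centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    assignments = [min(range(k), key=lambda c: dist(p, centroids[c]))
                   for p in points]
    return centroids, assignments

# Made-up three-layer models (content access, engagement, assessment).
students = [
    (0.30, 0.40, 0.60), (0.02, 0.00, 0.10), (0.25, 0.35, 0.55),
    (0.05, 0.03, 0.20), (0.28, 0.30, 0.65), (0.04, 0.01, 0.15),
]
centroids, assignments = k_means(students, k=2)

# Assumed labeling rule: the cluster whose centroid shows more activity is low risk.
low_risk = max(range(2), key=lambda c: sum(centroids[c]))
labels = ["low" if a == low_risk else "high" for a in assignments]
print(labels)
```

The resulting centroids correspond in form to the rows of Table 3, with one value per layer and cluster.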

Prediction of Students’ Risk of Failure
After clustering students with similar learning models, a classifier is trained to classify students into two classes, namely high and low risk of failure. Nine different classification methods were applied to all three courses using three performance metrics (accuracy, precision, and recall, referred to as $P$).
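For reference, the three metrics can be computed from true and predicted labels as in the following sketch. Treating the high-risk class as the positive class is an assumption made for this example, and the label lists are invented.

```python
def classification_metrics(y_true, y_pred, positive="high"):
    """Return (accuracy, precision, recall) with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Made-up labels for eight students.
y_true = ["high", "high", "high", "low", "low", "low", "low", "low"]
y_pred = ["high", "high", "low", "high", "low", "low", "low", "low"]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three students predicted as high risk really are high risk (precision), and two of the three truly high-risk students are detected (recall).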

4.3.1 Validation-test phase: cross-validation
Table 4 shows the performance measures of the classification methods, using 5-fold cross-validation, for the three courses. Cross-validation was employed as it is considered among the best and most reliable validation methods for estimating the future accuracy of a predictive model. Simply put, it enables us to test how well our models can be trained on some data and then predict data they have not seen (within the same course or dataset). The choice of k involves a tradeoff between the efficiency and the accuracy of the error estimate, and k = 5 or k = 10 is usually preferred, as these values have been shown empirically to produce test error rate estimates that suffer neither from excessively high variance nor from very high bias. According to the results, for the teaching and reflection II course, while most classifiers predicted the labels with accuracy (the percentage of correct predictions) higher than 95%, deep learning outperformed the other algorithms with an accuracy of 99.56%. The “±” denotes one standard deviation (computed from the five model accuracies), where a smaller value indicates a more stable model. Precision can be considered a measure of quality and recall a measure of quantity. For the teaching and reflection II course, the high values of precision and recall show, respectively, that when the model predicted high risk of failure it was usually correct, and that when an example was actually at high risk of failure the classifier usually detected it. For the digital literacy course, deep learning and logistic regression appeared to be the best-performing methods, while SVM had the lowest performance. Lastly, logistic regression and KNN showed the best performance for the teaching and reflection I course. Overall, considering the accuracy and its standard deviation, precision, and recall, it can be concluded that deep learning and logistic regression are among the best-performing algorithms regardless of the size of the courses.

Table 4. Performance measures of classification methods (using cross-validation)

| Course | Classifier | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| Teaching and reflection II | KNN | 96.00 (±6.74) | 95.75 (±6.72) | 92.57 (±13.56) |
| | LibSVM | 96.88 (±3.37) | 100.00 (±0.00) | 91.61 (±8.88) |
| | Deep learning | 99.56 (±0.99) | 100.00 (±0.00) | 98.75 (±2.80) |
| | Rule induction | 93.29 (±2.76) | 90.34 (±2.76) | 91.15 (±7.64) |
| | Decision tree | 94.66 (±2.52) | 95.14 (±4.98) | 90.14 (±3.30) |
| | Random forest | 95.99 (±4.27) | 94.54 (±9.13) | 95.22 (±4.95) |
| | Neural network | 98.66 (±1.71) | 98.81 (±2.38) | 97.56 (±2.82) |
| | Logistic regression | 99.11 (±1.22) | 98.82 (±2.63) | 98.82 (±2.63) |
| | Naive Bayes | 95.97 (±3.68) | 94.06 (±6.84) | 95.07 (±8.14) |
| Digital literacy | KNN | 97.78 (±3.04) | 98.57 (±3.19) | 98.57 (±3.19) |
| | LibSVM | 93.57 (±5.88) | 92.48 (±6.89) | 100.00 (±0.00) |
| | Deep learning | 98.95 (±2.35) | 100.00 (±0.00) | 98.67 (±2.98) |
| | Rule induction | 97.78 (±3.04) | 98.67 (±2.98) | 98.57 (±3.19) |
| | Decision tree | 97.89 (±4.71) | 98.57 (±3.19) | 98.57 (±3.19) |
| | Random forest | 97.84 (±2.96) | 98.67 (±2.98) | 98.46 (±3.44) |
| | Neural network | 98.91 (±2.17) | 100.00 (±0.00) | 98.61 (±2.78) |
| | Logistic regression | 98.95 (±2.35) | 98.75 (±2.80) | 100.00 (±0.00) |
| | Naive Bayes | 93.63 (±6.58) | 86.43 (±19.63) | 90.00 (±13.69) |
| Teaching and reflection I | KNN | 99.59 (±0.91) | 97.50 (±5.59) | 100.00 (±0.00) |
| | LibSVM | 99.58 (±0.93) | 100.00 (±0.00) | 96.67 (±7.45) |
| | Deep learning | 99.58 (±0.93) | 100.00 (±0.00) | 96.67 (±7.45) |
| | Rule induction | 99.18 (±1.12) | 97.50 (±5.59) | 97.14 (±6.39) |
| | Decision tree | 99.18 (±1.13) | 97.50 (±5.59) | 96.67 (±7.45) |
| | Random forest | 99.18 (±1.13) | 100.00 (±0.00) | 93.81 (±8.52) |
| | Neural network | 99.17 (±0.96) | 97.22 (±5.56) | 96.88 (±6.25) |
| | Logistic regression | 99.59 (±0.91) | 100.00 (±0.00) | 97.14 (±6.39) |
| | Naive Bayes | 97.09 (±3.16) | 84.14 (±14.66) | 100.00 (±0.00) |
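The “mean (±standard deviation)” entries of Table 4 come from k-fold cross-validation. The sketch below shows how such a figure can be produced with k = 5. It is a simplified illustration: a nearest-centroid classifier stands in for the nine classifiers of the study, folds are formed by taking every fifth student, the sample standard deviation is assumed, and the ten data points (including one deliberately borderline student) are invented.

```python
from math import dist
from statistics import mean, stdev

def fit_centroids(X, y):
    """Nearest-centroid classifier (a stand-in model): one mean vector per class."""
    return {
        label: tuple(mean(col) for col in zip(*[x for x, t in zip(X, y) if t == label]))
        for label in set(y)
    }

def predict(centroids, x):
    return min(centroids, key=lambda label: dist(x, centroids[label]))

def cross_validate(X, y, k=5):
    """Return one accuracy per fold; fold i holds samples i, i + k, i + 2k, ..."""
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))
        train = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i not in test_idx]
        test = [(x, t) for i, (x, t) in enumerate(zip(X, y)) if i in test_idx]
        model = fit_centroids(*zip(*train))
        accuracies.append(sum(predict(model, x) == t for x, t in test) / len(test))
    return accuracies

# Made-up (b1, b2, b3) models; the last student is a borderline high-risk case.
X = [(0.30, 0.40, 0.60), (0.02, 0.00, 0.10), (0.25, 0.35, 0.55), (0.05, 0.03, 0.20),
     (0.28, 0.30, 0.65), (0.04, 0.01, 0.15), (0.35, 0.45, 0.70), (0.08, 0.05, 0.12),
     (0.22, 0.25, 0.50), (0.20, 0.25, 0.45)]
y = ["low", "high"] * 5

accuracies = cross_validate(X, y, k=5)
print(f"{100 * mean(accuracies):.2f} (±{100 * stdev(accuracies):.2f})")
```

With this toy data, four folds are classified perfectly and the fold containing the borderline student reaches 50%, which shows how a single unstable fold widens the “±” term.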

4.3.2 Application phase: testing on independent courses
To further test the practicality and performance of the prediction models on real-world future cases, the model trained on each course was tested on the two other, independent courses. For instance, the model trained on the dataset of the teaching and reflection I course was tested on the disjoint digital literacy course in order to determine the classification error. The same testing process was carried out for the other courses. Figs. 3–5 illustrate the performance measures of the classification methods on independent courses. According to these results, in terms of accuracy, precision, and recall, for the model trained on the teaching and reflection II course, neural network followed by deep learning and decision tree showed the highest accuracies when tested on the digital literacy course (with accuracies around 90%). Compared with the cross-validation results, there is only a drop of around 9% in accuracy, showing the stability, generalizability, and practicality of the classifiers. Surprisingly, the model trained on the teaching and reflection II course could predict the labels of the independent teaching and reflection I course with the same accuracy obtained by cross-validation (using the decision tree and rule induction algorithms). When trained on the digital literacy course, which is rather smaller than the other two courses, the prediction error slightly increased, and the best-performing classifiers achieved 78% for the teaching and reflection II course and 93% for the teaching and reflection I course. As in the previous cases, deep learning showed the highest performance. Lastly, training on the teaching and reflection I course led to correct label predictions with accuracies of around 94% for the other two independent courses. While KNN and SVM were the best-performing algorithms, deep learning outperformed many other algorithms. Overall, regardless of the testing method (cross-validation or independent datasets), our proposed method appears to be robust, generalizable, and practical for unseen courses.
Fig. 3. Performance measures of classification methods for the teaching and reflection II course using independent courses of (a) digital literacy and (b) teaching and reflection I.

Fig. 4. Performance measures of classification methods for the digital literacy course using independent courses of (a) teaching and reflection II and (b) teaching and reflection I.

Fig. 5. Performance measures of classification methods for the teaching and reflection I course using independent courses of (a) teaching and reflection II and (b) digital literacy.
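
The independent-course testing protocol described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three courses are synthetic, the names course_a, course_b, and course_c are hypothetical, and a nearest-centroid classifier stands in for the classifiers compared in the study. Only the protocol itself (train on one course, test on each of the other two, report accuracy, precision, and recall) follows the text.

```python
import random

FEATURES = ("content_access", "engagement", "assessment")


def make_course(n_students, seed):
    """Build a synthetic course as (feature_vector, label) pairs; label 1 = high risk."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_students):
        label = i % 2  # alternate labels so both classes are always present
        centre = 0.15 if label == 1 else 0.55  # assumption: high-risk students are less active
        x = [min(1.0, max(0.0, rng.gauss(centre, 0.1))) for _ in FEATURES]
        rows.append((x, label))
    return rows


def fit_centroids(rows):
    """Stand-in 'training': the mean feature vector of each class."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in rows if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def evaluate(centroids, rows):
    """Accuracy, precision, and recall with high risk (1) as the positive class."""
    tp = fp = fn = tn = 0
    for x, y in rows:
        p = predict(centroids, x)
        if p == 1 and y == 1:
            tp += 1
        elif p == 1 and y == 0:
            fp += 1
        elif p == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(rows),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def cross_course_evaluation(courses):
    """Train on each course and test on every other (independent) course."""
    results = {}
    for train_name, train_rows in courses.items():
        model = fit_centroids(train_rows)
        for test_name, test_rows in courses.items():
            if test_name != train_name:
                results[(train_name, test_name)] = evaluate(model, test_rows)
    return results


# Hypothetical course sizes; they are not the sizes of the courses in the study.
COURSES = {
    "course_a": make_course(200, seed=1),
    "course_b": make_course(150, seed=2),
    "course_c": make_course(90, seed=3),
}

if __name__ == "__main__":
    for (train, test), metrics in cross_course_evaluation(COURSES).items():
        print(train, "->", test, metrics)
```

With three courses this yields six train/test pairs, which mirrors the six panels of Figs. 3–5. Any classifier with a fit and predict step could replace the nearest-centroid stand-in.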

Prediction Explanation and Prescriptive Analytics
To better understand the predictions, this study generated prediction explanations using the local interpretable model-agnostic explanations approach. In essence, this approach produces a set of neighboring data points around each example and then determines the local attribute weights in that neighborhood using correlation. Table 5 shows the explanations of predictions made by the deep learning algorithm on the digital literacy course (the model was trained on teaching and reflection II). The table lists 24 of the 92 examples in the independent dataset (digital literacy course), along with the predictions and color highlighting of the attributes, where red and green indicate attribute values that contradict or support the prediction, respectively. For instance, in the first row, the attribute assessment supports the prediction of cluster_0 (i.e., cluster A = low risk), whereas cluster_1 (cluster B = high risk) is the correct label. In other words, the supporting predictor leads the model to a wrong prediction for the first example. For example 20, both assessment and engagement support the predicted label (cluster_0), whereas content access contradicts it. Finally, in example 8, all three attributes contradict the correctly predicted label, with assessment being the strongest contradictor.

Table 5. Explaining the prediction made by deep learning algorithm on the independent dataset (the digital literacy course)

| Label | Prediction (label) | Confidence (Cluster B) | Confidence (Cluster A) | content access | engagement | assessment |
|---|---|---|---|---|---|---|
| Cluster B | Cluster A | 0.076 | 0.924 | 0.078 | 0 | 0.391 |
| Cluster B | Cluster B | 0.943 | 0.057 | 0.020 | 0.000 | 0.232 |
| Cluster B | Cluster B | 0.82 | 0.18 | 0.027 | 0.001 | 0.271 |
| Cluster B | Cluster B | 0.999 | 0.001 | 0.079 | 0 | 0.109 |
| Cluster B | Cluster B | 1 | 0 | 0.003 | 0.000 | 0 |
| Cluster B | Cluster B | 0.999 | 0.001 | 0.071 | 0 | 0.106 |
| Cluster B | Cluster B | 0.924 | 0.076 | 0.061 | 0 | 0.24 |
| Cluster B | Cluster B | 0.988 | 0.012 | 0.015 | 0 | 0.184 |
| Cluster B | Cluster B | 1 | 0 | 0.002 | 0 | 0 |
| Cluster B | Cluster B | 1 | 0 | 0.026 | 0 | 0.011 |
| Cluster B | Cluster B | 1 | 0 | 0 | 0 | 0.004 |
| Cluster B | Cluster A | 0.001 | 0.999 | 0.046 | 0.001 | 0.535 |
| Cluster B | Cluster B | 1 | 0 | 0 | 0 | 0 |
| Cluster B | Cluster B | 1 | 0 | 0.05 | 0.012 | 0.046 |
| Cluster B | Cluster B | 1 | 0 | 0.141 | 0.007 | 0.042 |
| Cluster B | Cluster A | 0.01 | 0.99 | 0.136 | 0.112 | 0.410 |
| Cluster B | Cluster A | 0.071 | 0.929 | 0.066 | 0.129 | 0.347 |
| Cluster A | Cluster A | 0 | 1 | 0.125 | 0.005 | 0.639 |
| Cluster A | Cluster A | 0 | 1 | 0.218 | 0.256 | 0.766 |
| Cluster A | Cluster A | 0 | 1 | 0.339 | 0.265 | 0.523 |
| Cluster A | Cluster A | 0 | 1 | 0.381 | 0.278 | 0.586 |
| Cluster A | Cluster A | 0.001 | 0.999 | 0.13 | 0.114 | 0.502 |
| Cluster A | Cluster A | 0 | 1 | 0.233 | 0.237 | 0.547 |
| Cluster B | Cluster A | 0.049 | 0.951 | 0.093 | 0.118 | 0.361 |
Red and green indicate attribute values that contradict or support the prediction, respectively.
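
The explanation step (sample a neighborhood around one example, then weight each attribute by its local correlation with the model's confidence) can be sketched as follows. This is a simplified illustration and not the implementation used in the study: toy_model is a hypothetical stand-in for the trained deep learning classifier, and the neighborhood radius, sample count, and use of plain Pearson correlation are assumptions. The reference LIME method fits a weighted local surrogate model instead of computing correlations directly.

```python
import math
import random

ATTRIBUTES = ("content access", "engagement", "assessment")


def toy_model(x):
    """Hypothetical stand-in classifier: confidence that a student is low risk (cluster A)."""
    content, engagement, assessment = x
    score = 2.0 * content + 1.0 * engagement + 6.0 * assessment - 2.5
    return 1.0 / (1.0 + math.exp(-score))


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def explain(model, x, n_neighbors=500, radius=0.1, seed=0):
    """Local attribute weights: correlation of each attribute with the model
    confidence over random neighbors of x (attribute values clipped to [0, 1])."""
    rng = random.Random(seed)
    neighbors = [
        [min(1.0, max(0.0, v + rng.uniform(-radius, radius))) for v in x]
        for _ in range(n_neighbors)
    ]
    confidences = [model(p) for p in neighbors]
    return {
        name: pearson([p[i] for p in neighbors], confidences)
        for i, name in enumerate(ATTRIBUTES)
    }


if __name__ == "__main__":
    # Attribute values taken from one row of Table 5; the weights come from
    # the toy model only and say nothing about the model in the study.
    example = [0.339, 0.265, 0.523]
    for name, weight in explain(toy_model, example).items():
        print(f"{name}: {weight:+.3f}")
```

In this sketch, a positive weight means that increasing the attribute locally raises the confidence in the low-risk class, and a negative weight means the opposite. Whether an attribute counts as supporting or contradicting then depends on which class was predicted.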

The prescriptive analytics uses the deep learning model trained on the digital literacy course. To this end, ten constant values were assigned to the attribute assessment (on the assumption that every student must take the assessments in order to pass a course). As can be seen in the table, optimal settings (that is, values for engagement and content access, together with the likelihood of low and high risk) are prescribed for each of these values. For instance, according to the trained model, if a student’s engagement and content access are 0.475 and 0.331 in a course (with assessment held constant at 0.5), the student has a 99% likelihood of being classified as low risk of failure. However, if assessment is set to 0.05, the prescribed optimized inputs are 0.473 for engagement and 0.498 for content access, which yield a 55% likelihood of being categorized as low risk of failure.

| Example | Prediction label | Confidence (Cluster A) | Confidence (Cluster B) | content access | assessment | engagement |
|---|---|---|---|---|---|---|
| 1 | Cluster A | 0.546 | 0.454 | 0.498 | 0.05 | 0.473 |
| 2 | Cluster A | 0.986 | 0.014 | 0.507 | 0.1 | 0.474 |
| 3 | Cluster A | 0.656 | 0.344 | 0.001 | 0.15 | 0.474 |
| 4 | Cluster A | 0.992 | 0.008 | 0.499 | 0.2 | 0.474 |
| 5 | Cluster A | 0.978 | 0.022 | 0.429 | 0.25 | 0.475 |
| 6 | Cluster A | 0.989 | 0.011 | 0.507 | 0.3 | 0.472 |
| 7 | Cluster A | 0.841 | 0.159 | 0.186 | 0.35 | 0.474 |
| 8 | Cluster A | 1 | 0 | 0.507 | 0.4 | 0.472 |
| 9 | Cluster A | 0.952 | 0.048 | 0.035 | 0.45 | 0.463 |
| 10 | Cluster A | 0.999 | 0.001 | 0.331 | 0.5 | 0.475 |
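
The prescriptive step (hold assessment at a constant value, then search for the engagement and content access values that maximize the likelihood of low risk) can be sketched as a small grid search. The model, its coefficients, the search bounds, and the grid resolution are all hypothetical; the study does not state which optimizer was used, and the numbers produced here are not those of the table above.

```python
import math

# The ten constant assessment values used in the table: 0.05, 0.10, ..., 0.50.
ASSESSMENT_LEVELS = [round(0.05 * k, 2) for k in range(1, 11)]


def toy_low_risk_confidence(content, engagement, assessment):
    """Hypothetical stand-in for the trained model: likelihood of low risk (cluster A)."""
    score = 3.0 * content + 4.0 * engagement + 8.0 * assessment - 3.5
    return 1.0 / (1.0 + math.exp(-score))


def prescribe(model, assessment, bounds=(0.0, 0.5), steps=50):
    """Grid-search content access and engagement within the assumed bounds,
    with assessment held constant, and return the best setting found."""
    lo, hi = bounds
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    best = None
    for content in grid:
        for engagement in grid:
            confidence = model(content, engagement, assessment)
            if best is None or confidence > best["low_risk_confidence"]:
                best = {
                    "assessment": assessment,
                    "content_access": content,
                    "engagement": engagement,
                    "low_risk_confidence": confidence,
                }
    return best


if __name__ == "__main__":
    for level in ASSESSMENT_LEVELS:
        plan = prescribe(toy_low_risk_confidence, level)
        print(plan)
```

Because the toy model is monotone in every attribute, the search always ends at the upper bound of both free attributes. A trained neural network need not be monotone, which would explain why the prescribed content access values in the table vary from row to row.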

Conclusion

This study has investigated the effectiveness of a student modelling-based predictive model for estimating students’ risk of failure in a course. The results reveal that our proposed EDM approach, called StudModel, is capable of automatically modelling students according to their learning behavior and preferences and of predicting their risk of failure in different courses with high accuracy. This could provide educators with opportunities to identify students who are not performing well and to adjust their pedagogical strategies accordingly. More specifically, by employing StudModel, educators can potentially provide students with timely feedback or support based on their risk category. In this regard, [31] found that, compared to summative feedback at the end of the semester, regular and timely feedback (derived by different means, such as prediction of students’ risk of failure) holds great potential to motivate students, which in turn enhances their academic achievement. The StudModel approach is robust, generalizable, and easy to use and interpret, and it could be used in various online and blended courses to offer personalized learning to different groups of students. In particular, any online learning environment that logs students’ actions (or activity data) during a course could benefit from it. The results reported in this research differ from several existing EDM approaches for predicting students’ failure in online courses. For instance, on the one hand, while there are studies that develop student models for predicting achievement or performance in online courses (e.g., [4,32,33]), they mostly rely on particular types of data related either to students’ learning styles, such as the Felder Silverman Learning Style Model (e.g., [34]), or to students’ personality traits, such as the Big Five Model [34].
One challenge with such attempts is that these student models often fail to support educators in identifying weak students or reducing the academic failure rate, as they are sometimes built upon students’ static characteristics (rather than the dynamic characteristics that reflect their actual learning behavior during a course). Furthermore, while some of these EDM-based approaches take into account both dynamic and static characteristics of students, they require particular types of data, or their development is complex. For example, to infer students’ personality traits (based on the Big Five Model), [33] collected self-reported data on students’ openness to experience and extraversion and, on that basis, developed a model to predict students’ achievements. Similarly, to model students according to their learning style (e.g., the Felder Silverman Learning Style Model), [35] used specific types of student data that may not exist in all online learning environments and ignored some informative student activity data that can easily be found in all online learning environments. The argument is that a generalizable EDM approach should consider the data types and requirements of different online educational systems and be applicable to various learning systems on different platforms. On the other hand, there are several EDM-based approaches for predicting students’ risk of failure or course achievement in online courses that, without limiting themselves to specific types of data, have shown good potential (e.g., [27, 29]). Nonetheless, these studies either overlook modelling different aspects of students’ learning behavior or include students’ past performance or demographic data.
The StudModel approach benefits from a simple yet accurate algorithm that considers various types of students’ activity data during a course to build a multi-layered student model, yielding a comprehensive pattern of students’ learning behavior that can later be used to predict their failure risk. Unlike many of the aforementioned works, which require specific types or a large amount of data to build their predictive models, the StudModel approach: (1) focuses on the types of student activity data that are available in most online learning systems and are logically among the best predictors of students’ failure risk in online courses (i.e., access to learning resource content, engagement with peers, and taking assessment tests); (2) is robust and generalizable, as it can detect students’ risk of failure in different courses with different numbers of students; (3) is interpretable, because it explains its decisions by highlighting both the supporting and the contradicting predictors for each decision; and (4) outperforms many existing state-of-the-art approaches by achieving accuracies over 90% in courses of different sizes. Aside from the high performance, applicability, and generalizability of StudModel, its interpretability could nurture the trust of educators, practitioners, and learners in such decisions, leading to more meaningful use of machine learning-based models in supporting learning and teaching. Consequently, our findings provide evidence that, using the various types of student activity data available in most online learning systems, it is feasible to predict students’ risk of failure in online courses by modelling their learning behavior with a robust, interpretable, and generalizable educational data mining approach. A few limitations of this study could be regarded as directions for future research.
For example, even though we have effectively modelled students’ learning behavior and accurately predicted their risk of failure in courses, the StudModel approach does not consider students with a medium risk of failure and only detects those at low or high risk. In future work, it would be useful for the StudModel approach to adopt multi-class classification so that it can also identify students with a medium risk of failure. Another limitation of this work is that the StudModel approach has not yet been put into practice, so its effectiveness and generalizability in practice are unknown. Moreover, considering students’ time series behavior would enable the proposed approach to help educators identify students with learning difficulties in a timely manner. Therefore, future work could consider students’ time series behavior, investigate the generalizability of the approach using data from different platforms, and integrate the proposed approach into an online learning environment to provide educators and learners with individualized feedback and instructions.

Author’s Contributions

Conceptualization, DH, YMH, YY. Funding acquisition, YY. Investigation and methodology, DH, YY. Supervision, DH, YMH, YY. Writing of the review and editing, DH, YMH, YY. Data curation, YY. Visualization, DH, YY.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1C1C2004868), and the University of Tartu ASTRA Project PER ASPERA, financed by the European Regional Development Fund.

Competing Interests

The authors declare that they have no competing interests.

Author Biography

Name : Danial Hooshyar
Affiliation : Institute of Education, University of Tartu, Tartu, Estonia; School of Digital Technologies, Tallinn University, Tallinn, Estonia
Biography : He received his Ph.D. degree from the artificial intelligence department at the University of Malaya, Malaysia, in 2016. He worked as a research professor in the department of computer science and engineering at Korea University for nearly four years. He is currently an associate professor of learning analytics at the University of Tartu and also works as a researcher at Tallinn University. His research interests include artificial intelligence in education, adaptive educational systems, and educational technology.

Name : Yueh-Min Huang
Affiliation : Department of Engineering Science and Institute of Education, National Cheng Kung University, Taiwan
Biography : He received the M.S. and Ph.D. degrees in electrical engineering from The University of Arizona, in 1988 and 1991, respectively. He is currently a Chair Professor with the Department of Engineering Science and Institute of Education, National Cheng Kung University, Taiwan. He has completed over 60 Ph.D. and 300 MS thesis students. He has coauthored three books and has published more than 280 refereed journal research articles. His research interests include e-learning, multimedia communications, and artificial intelligence.

Name : Yeongwook Yang
Affiliation : Division of Computer Engineering, Hanshin University, Osan, South Korea
Biography : He received his master's degree in computer science education and his Ph.D. from the department of computer science and engineering at Korea University, Seoul, South Korea. He worked as a research professor in the department of computer science and engineering at Korea University for one year. He was a Senior Researcher at the University of Tartu, Tartu, Estonia. He is currently an assistant professor in the division of computer engineering at Hanshin University. His research interests include information filtering, recommendation systems, educational data mining, and deep learning.

References

[1] A. M. Darensbourg and J. J. Blake, “Predictors of achievement in African American students at risk for academic failure: the roles of achievement values and behavioral engagement,” Psychology in the Schools, vol. 50, no. 10, pp. 1044-1059, 2013.
[2] E. B. Costa, B. Fonseca, M. A. Santana, F. F. de Araujo, and J. Rego, “Evaluating the effectiveness of educational data mining techniques for early prediction of students' academic failure in introductory programming courses,” Computers in Human Behavior, vol. 73, pp. 247-256, 2017.
[3] D. Hooshyar, M. Pedaste, and Y. Yang, “Mining educational data to predict students’ performance through procrastination behavior,” Entropy, vol. 22, no. 1, article no. 12, 2020. https://doi.org/10.3390/e22010012
[4] K. Chrysafiadi and M. Virvou, “Student modeling approaches: a literature review for the last decade,” Expert Systems with Applications, vol. 40, no. 11, pp. 4715-4729, 2013.
[5] D. Hooshyar, M. Yousefi, and H. Lim, “Data-driven approaches to game player modeling: a systematic literature review,” ACM Computing Surveys (CSUR), vol. 50, no. 6, pp. 1-19, 2018.
[6] L. A. Buschetto Macarini, C. Cechinel, M. F. Batista Machado, V. Faria Culmant Ramos, and R. Munoz, “Predicting students success in blended learning–evaluating different interactions inside learning management systems,” Applied Sciences, vol. 9, no. 24, article no. 5523, 2019. https://doi.org/10.3390/app9245523
[7] G. Daghan and B. Akkoyunlu, “Modeling the continuance usage intention of online learning environments,” Computers in Human Behavior, vol. 60, pp. 198-211, 2016.
[8] A. S. Sunar, R. A. Abbasi, H. C. Davis, S. White, and N. R. Aljohani, “Modelling MOOC learners' social behaviours,” Computers in Human Behavior, vol. 107, article no. 105835, 2020. https://doi.org/10.1016/j.chb.2018.12.013
[9] N. Iam-On and T. Boongoen, “Generating descriptive model for student dropout: a review of clustering approach,” Human-centric Computing and Information Sciences, vol. 7, article no. 1, 2017. https://doi.org/10.1186/s13673-016-0083-0
[10] D. Hooshyar, Y. Yang, M. Pedaste, and Y. M. Huang, “Clustering algorithms in an educational context: an automatic comparative approach,” IEEE Access, vol. 8, pp. 146994-147014, 2020.
[11] L. Chen, H. N. Liang, F. Lu, K. Papangelis, K. L. Man, and Y. Yue, “Collaborative behavior, performance and engagement with visual analytics tasks using mobile devices,” Human-centric Computing and Information Sciences, vol. 10, article no. 47, 2020. https://doi.org/10.1186/s13673-020-00253-7
[12] W. Li, Y. Ding, Y. Yang, R. S. Sherratt, J. H. Park, and J. Wang, “Parameterized algorithms of fundamental NP-hard problems: a survey,” Human-centric Computing and Information Sciences, vol. 10, article no. 29, 2020. https://doi.org/10.1186/s13673-020-00226-w
[13] H. J. Jang and B. Kim, “KM-DBSCAN: density-based clustering of massive spatial data with keywords,” Human-centric Computing and Information Sciences, vol. 11, article no. 43, 2021. https://doi.org/10.22967/HCIS.2021.11.043
[14] F. Guo and Q. Lu, “Contextual collaborative filtering recommendation model integrated with drift characteristics of user interest,” Human-centric Computing and Information Sciences, vol. 11, article no. 8, 2021. https://doi.org/10.22967/HCIS.2021.11.007
[15] A. S. Albahri, R. A. Hamid, Z. T. Al-qays, A. A. Zaidan, B. B. Zaidan, A. O. Albahri, et al., “Role of biological data mining and machine learning techniques in detecting and diagnosing the novel coronavirus (COVID-19): a systematic review,” Journal of Medical Systems, vol. 44, article no. 122, 2020. https://doi.org/10.1007/s10916-020-01582-x
[16] A. Dridi, M. M. Gaber, R. M. A. Azad, and J. Bhogal, “Scholarly data mining: a systematic review of its applications,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 11, no. 2, article no. e1395, 2021. https://doi.org/10.1002/widm.1395
[17] T. Lerche and E. Kiel, “Predicting student achievement in learning management systems by log data analysis,” Computers in Human Behavior, vol. 89, pp. 367-372, 2018.
[18] D. Shin and J. Shim, “A systematic review on data mining for mathematics and science education,” International Journal of Science and Mathematics Education, vol. 19, no. 4, pp. 639-659, 2021.
[19] A. Hellas, P. Ihantola, A. Petersen, V. V. Ajanovski, M. Gutica, T. Hynninen, et al., “Predicting academic performance: a systematic literature review,” in Proceedings Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education, Larnaca, Cyprus, 2018, pp. 175-199.
[20] C. Marquez-Vera, A. Cano, C. Romero, and S. Ventura, “Predicting student failure at school using genetic programming and different data mining approaches with high dimensional and imbalanced data,” Applied Intelligence, vol. 38, no. 3, pp. 315-330, 2013.
[21] D. Hooshyar and Y. Yang, “Predicting course grade through comprehensive modelling of students’ learning behavioral pattern,” Complexity, vol. 2021, article no. 7463631, 2021. https://doi.org/10.1155/2021/7463631
[22] D. A. Kumar, M. Vijayalakshmi, and D. A. Kumar, “Appraising the significance of self-regulated learning in higher education using neural networks,” International Journal of Engineering Research and Development, vol. 1, no. 1, pp. 9-15, 2012.
[23] S. Parack, Z. Zahid, and F. Merchant, “Application of data mining in educational databases for predicting academic trends and patterns,” in Proceedings of 2012 IEEE International Conference on Technology Enhanced Education (ICTEE), Amritapuri, India, 2012, pp. 1-4.
[24] S. Natek and M. Zwilling, “Student data mining solution–knowledge management system related to higher education institutions,” Expert Systems with Applications, vol. 41, no. 14, pp. 6400-6407, 2014.
[25] T. M. Christian and M. Ayub, “Exploration of classification using NBTree for predicting students' performance,” in Proceedings of 2014 International Conference on Data and Software Engineering (ICODSE), Bandung, Indonesia, 2014, pp. 1-6.
[26] K. F. Li, D. Rusk, and F. Song, “Predicting student academic performance,” in Proceedings of 2013 7th International Conference on Complex, Intelligent, and Software Intensive Systems, Taichung, Taiwan, 2013, pp. 27-33.
[27] S. Suh, J. Suh, and I. Houston, “Predictors of categorical at‐risk high school dropouts,” Journal of Counseling & Development, vol. 85, no. 2, pp. 196-203, 2007.
[28] A. Elbadrawy, A. Polyzou, Z. Ren, M. Sweeney, G. Karypis, and H. Rangwala, “Predicting student performance using personalized analytics,” Computer, vol. 49, no. 4, pp. 61-69, 2016.
[29] Y. Meier, J. Xu, O. Atan, and M. Van der Schaar, “Predicting grades,” IEEE Transactions on Signal Processing, vol. 64, no. 4, pp. 959-972, 2015.
[30] B. E. Shelton, J. L. Hung, and P. R. Lowenthal, “Predicting student success by modeling student interaction in asynchronous online courses,” Distance Education, vol. 38, no. 1, pp. 9, 2017.
[31] J. Lu and N. Law, “Online peer assessment: effects of cognitive and affective feedback,” Instructional Science, vol. 40, no. 2, pp. 257-275, 2012.
[32] T. Sheeba and R. Krishnan, “Automatic detection of students learning style in learning management system,” in Smart Technologies and Innovation for a Sustainable Future. Cham, Switzerland: Springer, 2019, pp. 45-53.
[33] F. Wu and S. Lai, “Linking prediction with personality traits: a learning analytics approach,” Distance Education, vol. 40, no. 3, pp. 330-349, 2019.
[34] M. A. Abdullah, “Learning style classification based on student's behavior in moodle learning management system,” Transactions on Machine Learning and Artificial Intelligence, vol. 3, no. 1, article no. 28, 2015. https://doi.org/10.14738/tmlai.31.868
[35] H. A. Fasihuddin, G. D. Skinner, and R. I. Athauda, “Personalizing open learning environments through the adaptation to learning styles,” in Proceedings of the 9th International Conference on Information Technology and Applications (ICITA), Sydney, Australia, 2014.

About this article
• Received: 19 July 2021
• Accepted: 21 January 2022
• Published: 30 June 2022

Keywords