dc.description.abstract | Graduating on schedule is a critical milestone for students in higher education institutions, reflecting both institutional effectiveness and student success. However, identifying and addressing the factors that may delay timely completion poses significant challenges. This educational data mining study investigates the factors behind on-time graduation, covering both students’ demographics and their learning performance, using a machine learning approach. The study also develops a learning analytics dashboard that forecasts on-time graduation and presents data visualizations useful to various educational stakeholders, including school officials, teachers, and students.
The dataset was collected from the academic system of an engineering department at a higher education institution in Indonesia. After data cleaning, it comprised the records of 133 students covering four academic years, from 2019 to 2023. The educational data mining process followed CRISP-DM (Cross-Industry Standard Process for Data Mining), while the learning analytics dashboard system was developed using the waterfall model. The data mining process used both supervised and unsupervised machine learning. For supervised learning, the researchers built several models to predict on-time graduation, including Decision Tree, kNN, SVM, Naïve Bayes, Random Forest, Logistic Regression, Gradient Boosting, Stochastic Gradient Descent, and Neural Network. For unsupervised learning, the K-Means algorithm was used to divide students into three clusters. Finally, the developed system, deployed as a website, was assessed against the ISO/IEC 25010 standard in accordance with WebQEM quality factors such as usability, functionality, efficiency, and reliability.
The results showed that, among the learning performance variables, CGPA, fourth-semester GPA, Programming, Social Science, and English proficiency score were the most important for on-time graduation. Among the demographic variables, gender, parents’ occupation, high school major, and extracurricular involvement had the strongest influence on on-time graduation. In the modeling process, Random Forest outperformed the other models on the evaluation metrics, with 85% classification accuracy and 88% AUC (Area Under the ROC Curve). For system performance, the efficiency test yielded an average Performance Score of 82.2% and an average Structure Score of 87.6%, giving an overall GTmetrix Grade B. The reliability test, which applied stress testing to the deployed website, delivered a 100% success rate across various scenarios. Functionality testing, conducted as black-box testing by experienced software engineers, produced a 99.4% success rate. The usability evaluation, administered through a usability questionnaire, indicated that both educators and students consider the developed system beneficial, user-friendly, straightforward to learn, and satisfying. Overall, this study contributes findings on the factors of on-time graduation and a developed system, along with conclusions and suggestions for future research. | en_US |