使用機器學習識別課堂中的風險學生：12門課程的實證研究;Applying Machine Learning for Identifying Risk Student in Classroom: An Empirical Study of Twelve Courses

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/81069

Title:	使用機器學習識別課堂中的風險學生：12門課程的實證研究;Applying Machine Learning for Identifying Risk Student in Classroom: An Empirical Study of Twelve Courses
Author:	呂欣澤;Lu, Hsin-Tse
Contributor:	資訊工程學系 (Department of Computer Science and Information Engineering)
Keywords:	Learning analytics;Educational data mining;Machine learning;Feature engineering;Grading policy
Date:	2019-07-10
Upload time:	2019-09-03 15:32:40 (UTC+8)
Publisher:	國立中央大學 (National Central University)
Abstract:	Learning analytics aims to help students succeed in the classroom by drawing on the digital footprints they generate in the learning environment. In practice, many researchers advocate identifying at-risk students early and providing timely learning interventions as the main approach. As a result, using machine learning to train models that identify at-risk students has become a trend in the field. However, after systematically reviewing the literature, we found that many studies overlook details of model training, including the feasibility of early prediction of learning risk, reducing data dimensionality to improve model accuracy and to identify the key factors that affect students' learning performance, and the impact of risk-class imbalance, that is, of the number of failing students that results from teachers' grading policies. This thesis therefore first examines how learning-risk prediction is currently defined, then collects data from 12 courses on online learning platforms and explores it with supervised and unsupervised learning (regression, classification, and clustering) through data slicing, feature engineering, and resampling. Cross-validation and comparison with the actual situation confirm that students' key learning behaviors on the platform make it possible to identify their risk level within the first third of the semester. We also summarize three grading policies commonly used by teachers (discrimination, stringency, and leniency), together with each policy's effect on machine learning performance and ways to improve it. In particular, a resampling process is necessary to avoid issues caused by teachers' grading policies.
Furthermore, we propose two limitations on applying machine learning in actual courses. First, for a risk identification model to be applied across courses of the same type, the course duration, learning materials, homework, quizzes, learning activities, and grading policy must be consistent. Second, machines cannot identify risk among students with a low digital footprint, so teachers still need to apply suitable interventions for that group; if quiz results are used as a supplement, particular attention must be paid to the discrimination of the quiz questions in order to improve identification accuracy.
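Appears in collections:	[資訊工程研究所 (Graduate Institute of Computer Science and Information Engineering)] Theses and dissertations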

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	157

All items in NCUIR are protected by original copyright.
