Thesis/Dissertation 104522044 Detailed Record




Author 林俊慶 (LIN, JUN-QING)   Department Computer Science and Information Engineering
Thesis Title Improving Performance of Prediction Model with Outlier Detection and Feature Selection
(結合離群值偵測與特徵選取改善預測模型性能)
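Related Theses
★ Applying Intelligent Classification to Improve Article Publishing Efficiency on an Enterprise Knowledge-Sharing Platform
★ Research and Implementation of Smart Home Management and Control
★ Design and Verification of a Search Mechanism for an Open Surveillance Video Management System
★ Applying Data Mining to Establish an Early-Warning Mechanism for Slow-Moving Inventory
★ Analysis of Learning Behavior under a Problem-Solving Model
★ A General Management Information System for Information Systems and Electronic Approval Workflows
★ Applying a Manufacturing Execution System to the Analysis and Handling of Semiconductor Equipment Downtime Notifications
★ Research and Implementation of Apple Pay Payment on the iOS Platform
★ Using Cluster Analysis to Investigate the Effect of Learning Patterns on Learning Performance
★ Using Sequence Mining to Analyze the Effect of Video-Viewing Patterns on Learning Performance
★ A QoS-Based Optimization Method for Web Service Selection
★ A Satisfaction Survey of Learners Using e-Portfolio with a Wikipedia Knowledge Recommendation System
★ A Study of Students' Learning Motivation, Internet Self-Efficacy, and System Satisfaction: The Case of e-Portfolio
★ Improving English Learning Performance by Using Automated Conversational Agents in Second Life
★ The Effect of Collaborative Information Searching on Students' Individual Web Search Abilities and Strategies
★ The Effect of Digital Annotation on Learners' Reflection Levels in an Online Learning Environment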
Files Full text is not available for viewing (permanently restricted)
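Abstract (Chinese, translated) To improve students' learning outcomes, many studies have focused on identifying high-risk students early and accurately so that teachers can step in with guidance at an early stage.
A blended course combines online and offline learning; in contrast to traditional offline learning, students can also study in various ways through an online learning platform. In the course of learning, students leave behind many records, such as assignment grades, video-viewing behavior, frequency of online activity, and online quiz scores. This thesis therefore applies data mining and machine learning techniques to the learning-activity data collected from students in a blended calculus course, and uses multiple linear regression to predict their final grades.
Prior studies indicate that the accuracy of a prediction model is easily affected by outliers. This thesis therefore uses the RANSAC algorithm for outlier detection and removes the detected outliers from the data. To improve accuracy further after outlier removal, it uses the t-test for feature selection, retaining the key features that significantly affect the final grade.
The results show that the proposed outlier detection and feature selection procedure reduces the prediction error from 15.516 points to 4.571 points, an improvement of about 70%.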
Abstract (English)
Identifying at-risk students early and accurately, so that teachers can intervene in time and improve students' learning performance, is the focus of much related research.
A blended course combines online and offline learning. Unlike traditional offline learning, it also lets students learn through an online learning platform. In the process, students leave many records, such as homework grades, video-viewing behavior, online activity frequency, and online test grades. This thesis therefore applies data mining and machine learning techniques to students' learning-activity data collected from a blended calculus course, and uses multiple linear regression to predict students' final grades.
Related research points out that the accuracy of a prediction model is easily affected by outliers. This thesis therefore uses the RANSAC algorithm as the outlier detection method and removes the detected outliers from the data. To further improve the accuracy of the prediction model after outlier removal, it uses the t-test as the feature selection method, retaining the key features that have a significant impact on the final grade.
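The outlier-removal and feature-selection steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the data are synthetic, the RANSAC loop is a minimal hand-written version in NumPy, the t-test is assumed to be the usual test on regression coefficients, and the cut-off |t| > 2 is a rough stand-in for the 5% two-sided critical value. All function names (`fit_ols`, `ransac_inliers`, `t_statistics`) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def ransac_inliers(X, y, threshold, n_trials=300, seed=0):
    """Minimal RANSAC: fit on small random subsets, keep the largest consensus set."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best = np.zeros(n, dtype=bool)
    for _ in range(n_trials):
        idx = rng.choice(n, size=p + 2, replace=False)
        beta = fit_ols(X[idx], y[idx])
        mask = np.abs(y - predict(beta, X)) < threshold
        if mask.sum() > best.sum():
            best = mask
    return best

def t_statistics(X, y):
    """t statistic of each slope coefficient in an ordinary least-squares fit."""
    A = np.column_stack([np.ones(len(X)), X])
    beta = fit_ols(X, y)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return beta[1:] / np.sqrt(np.diag(cov)[1:])

# Synthetic stand-in for learning-activity features and a final score.
rng = np.random.default_rng(42)
n, n_out = 120, 15
X = rng.normal(size=(n, 3))
y = 60 + 8 * X[:, 0] + 5 * X[:, 1] + rng.normal(scale=1.0, size=n)  # feature 2 is irrelevant
y[:n_out] -= 40  # the first 15 samples are made outliers

# Baseline: regression on all samples.
mae_all = np.mean(np.abs(y - predict(fit_ols(X, y), X)))

# Step 1: drop samples outside the RANSAC consensus set.
mask = ransac_inliers(X, y, threshold=6.0)
X_in, y_in = X[mask], y[mask]

# Step 2: keep features whose coefficient looks significant (assumed rule: |t| > 2).
keep = np.abs(t_statistics(X_in, y_in)) > 2.0

# Step 3: refit on the cleaned samples and the selected features.
beta = fit_ols(X_in[:, keep], y_in)
mae_clean = np.mean(np.abs(y_in - predict(beta, X_in[:, keep])))

print("samples kept:", int(mask.sum()), "features kept:", keep)
print("MAE, all data:", round(mae_all, 3), "MAE, cleaned:", round(mae_clean, 3))
```

Both errors here are measured in-sample, and the second one is computed only on the samples that survived outlier removal, so the comparison is not an evaluation protocol; a study would normally report cross-validated error.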
The results show that, with the outlier detection and feature selection process proposed in this thesis, the prediction error fell from 15.516 points to 4.571 points, a reduction of about 70%.
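As a quick check on the reported figure, the relative reduction follows directly from the two error values stated in the abstract:

```latex
\[
\frac{15.516 - 4.571}{15.516} = \frac{10.945}{15.516} \approx 0.705
\]
```

This is consistent with the stated improvement of about 70%.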
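Keywords (Chinese) ★ 離群值偵測 (outlier detection)
★ 特徵選取 (feature selection)
★ 多元線性迴歸 (multiple linear regression)
★ 學習成效預測 (learning performance prediction)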
Keywords (English) ★ Outlier Detection
★ Feature Selection
★ Multiple Linear Regression
★ Learning Performance Prediction
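Table of Contents
Abstract (Chinese)
ABSTRACT
List of Figures
List of Tables
1. Introduction
2. Literature Review
2.1 Random Sample Consensus (RANSAC)
2.2 Feature Selection
2.3 Learning Risk Prediction
2.4 Summary
3. Blended Calculus Course
4. Method
4.1 Data Collection
4.2 Data Preprocessing
4.2.1 Filling Missing Values
4.2.2 Data Integration
4.2.3 Data Aggregation
4.3 Outlier Detection
4.4 Feature Selection
4.5 Residual Analysis
4.6 Regression Analysis
4.7 Cross-Validation
4.8 Data Standardization
5. Results and Discussion
5.1 Research Question 1
5.1.1 Pipeline without Outlier Removal: Results
5.1.2 Pipeline with Outlier Removal: Results
5.1.3 Summary of Results
5.2 Research Question 2
5.2.1 Pipeline with Feature Selection Added: Results
5.2.2 Summary of Results
5.3 Research Question 3
5.3.1 "week 1 ~ week 3" Dataset Pipeline: Results
5.3.2 Summary of Results
6. Conclusion
References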
Advisor 楊鎮華   Approval Date 2017-7-19