Applying Big Data Analytics to Improve Digital Teaching Videos Classification and Recommendation Mechanism

Name

蘇可耘 (Ko-Yun Su)

Department

Department of Computer Science and Information Engineering (資訊工程學系)

Thesis title

Applying Big Data Analytics to Improve Digital Teaching Videos Classification and Recommendation Mechanism
(應用大數據分析提供自動化數位教學影片分類並改善推薦機制)

Files

Full text in the system: never open to the public

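Abstract (Chinese, translated)

With the arrival of the big data era, the Internet is full of free and open information, so people can obtain resources online anytime and anywhere. Open educational resources (OER) in particular offer a wealth of teaching material and serve as a channel for lifelong learning. However, the sheer volume of educational resources also makes platform quality hard to maintain, and education experts are often needed to organize the materials. This system therefore classifies videos automatically from their descriptive information, with the aim of reducing the burden on education experts. It also combines the videos with the curriculum guidelines set by the Ministry of Education to produce a recommendation list for each video, and offers the list as a service through an API, with the aim of improving the platform's recommendation mechanism.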
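This study applies currently popular big data analytics techniques to automate the classification of digital teaching videos and to improve the recommendation mechanism, using the Python machine learning package Scikit-learn to classify the videos. The video information was provided by LearnMode (學習吧). In addition, a knowledge structure and a mathematics lexicon were collected with a web crawler, and the lexicon was used to extract features from the teaching videos. After comparing the accuracy of several classifiers, the four most accurate classification algorithms were selected, and their accuracy was compared with and without LDA (linear discriminant analysis) feature extraction. The three most accurate of these four classifiers were then combined through a voting mechanism to form the system's final classification method.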
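The voting step described above can be illustrated with a minimal sketch of hard (majority) voting. This is not the thesis code: the function name `majority_vote`, the three prediction lists, and the topic labels are hypothetical, and the record does not state which three classifiers were kept or how ties were resolved (here a tie goes to the earliest classifier in the list).

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label predictions from several classifiers by hard voting.

    predictions_per_classifier: list of equal-length label lists, one list
    per classifier. For each sample the most frequent label wins; on a tie,
    the label seen first (the earliest classifier's vote) is kept.
    """
    if not predictions_per_classifier:
        raise ValueError("need at least one classifier")
    n_samples = len(predictions_per_classifier[0])
    if any(len(p) != n_samples for p in predictions_per_classifier):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    # zip(*...) regroups the predictions sample by sample.
    for votes in zip(*predictions_per_classifier):
        # Counter.most_common orders equal counts by first occurrence.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions for four videos from three classifiers.
classifier_a = ["algebra", "geometry", "algebra", "statistics"]
classifier_b = ["algebra", "algebra", "geometry", "statistics"]
classifier_c = ["geometry", "geometry", "algebra", "probability"]

final_labels = majority_vote([classifier_a, classifier_b, classifier_c])
print(final_labels)
```

In Scikit-learn, the same idea is available as `VotingClassifier` with `voting="hard"`, which wraps fitted estimators rather than precomputed label lists.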
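After classification, the videos are combined with the knowledge structure so that each teaching video has a place in a viewing order. The recommendation list for each video is stored in a database, and an API is provided: sending a video's ID returns that video's recommendation list from the database. The list contains the two most-viewed teaching videos at the video's own node and at each of the two nodes before and after it, 10 videos in total. It gives students material for review and for continued learning and saves them time searching for material. It also shows how the teaching videos relate to one another and integrates the teaching videos on the learning platform.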
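A minimal sketch of how such a recommendation list could be assembled is shown below. It is an illustration under stated assumptions, not the thesis implementation: the node names, video IDs, view counts, and the functions `build_recommendations` and `recommend` are hypothetical, the knowledge structure is simplified to a single ordered list of nodes, and a plain dictionary stands in for the database behind the API.

```python
def build_recommendations(node_order, videos_by_node, node, window=2, per_node=2):
    """Pick the most-viewed videos around `node` in the viewing order.

    node_order: list of node names in learning sequence.
    videos_by_node: dict mapping node name -> list of (video_id, views).
    Returns up to (2 * window + 1) * per_node video IDs, i.e. up to 10
    with the defaults; fewer near the ends or when a node has few videos.
    """
    idx = node_order.index(node)
    start = max(0, idx - window)
    end = min(len(node_order), idx + window + 1)
    picks = []
    for name in node_order[start:end]:
        ranked = sorted(videos_by_node.get(name, []),
                        key=lambda item: item[1], reverse=True)
        picks.extend(video_id for video_id, _views in ranked[:per_node])
    return picks


# Hypothetical knowledge nodes and view counts.
node_order = ["N1", "N2", "N3", "N4", "N5", "N6"]
videos_by_node = {
    "N1": [("v11", 50), ("v12", 70), ("v13", 10)],
    "N2": [("v21", 5), ("v22", 9)],
    "N3": [("v31", 100), ("v32", 80), ("v33", 90)],
    "N4": [("v41", 1)],
    "N5": [("v51", 30), ("v52", 40)],
    "N6": [("v61", 7)],
}

# Precompute one list per video; this dict stands in for the database table.
recommendation_table = {
    video_id: build_recommendations(node_order, videos_by_node, node)
    for node, videos in videos_by_node.items()
    for video_id, _views in videos
}


def recommend(video_id):
    """Stand-in for the API lookup: video ID in, stored list out."""
    return recommendation_table.get(video_id, [])


print(recommend("v32"))
```

In this sketch a video can appear in its own list; whether the original system filters it out is not stated in the record.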

Abstract (English)

With the arrival of the big data era, there are many free and open resources on the Internet, so people can access resources online anytime and anywhere. Open Educational Resources (OER) provide a wealth of teaching resources that people can use for lifelong learning. However, it is very difficult to maintain a platform that stores a large number of OER, so experts are needed to organize the teaching resources on the platform. In this research, I use machine learning to classify teaching videos automatically from each video's information, in the hope that machine classification can reduce the burden on education experts. The system produces a video recommendation list by combining the classified videos with the learning sequence designed by the Ministry of Education, and offers the recommendation service through an API. I expect that platforms can improve their recommendation mechanism by using the API.

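Keywords (Chinese, translated)

★ Big data analytics
★ Multiclass classification
★ Linear discriminant analysis
★ Support vector machine
★ Random forest
★ Neural network
★ Combining multiple classifiers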

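Table of contents

Contents
Abstract (Chinese)
ABSTRACT
List of figures
List of tables
1. Introduction
1.1 Open Educational Resources (OER)
1.2 LearnMode (學習吧)
1.3 Research questions
2. Literature review
2.1 Feature extraction
2.2 Multiclass classification
2.3.1 Support Vector Machine (SVM)
2.3.2 Random Forests
2.3.3 Neural Network
2.3.4 Logistic Regression
3. System design
3.1 System environment
3.2 System architecture
3.3 Data collection
3.3.1 Teaching videos
3.3.2 Junior high school knowledge structure
3.3.3 Mathematics lexicon
3.3.4 Data preprocessing
3.4 Data storage
3.5 Information extraction and analysis
3.5.1 Multiclass classification
3.5.2 Voting Classifier
3.6 Information application
4. Experimental design
5. Results and discussion
Experiment 1: Does LDA dimensionality reduction improve the accuracy of teaching material classification?
Experiment 2: Can the ensemble method's classification results approach those of experts?
Experiment 3: Can the video recommendation service show how videos are related?
6. Conclusions and future research
7. References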

Advisor

楊鎮華

Approval date

2017-7-19
