Automated Learning Content Classification for Open Education Repositories

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/83989

Title:	Automated Learning Content Classification for Open Education Repositories
Author:	王馬赫; Gunarathne, W K Tharanga Mahesh
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Open Education Resources (OER); learning objects; search engine; data visualization; data extraction; data transformation; clustering; probabilistic topic models; multi-label classification; automatic learning object classification
Date:	2020-07-24
Upload time:	2020-09-02 17:51:11 (UTC+8)
Publisher:	National Central University
Abstract:	Open Educational Resources (OER) offer a strategic opportunity to improve the quality of education. At present, OER users such as students, instructors, and scholars can find OERs through general search engines by means of metadata enrichment and logic extrapolation. Yet most users of Web search engines still have difficulty finding suitable learning materials. The main goal of this study is to propose an automated learning content classification method for Open Education Repositories. Because MERLOT II (www.merlot.org) is used by a large number of users to obtain and submit learning resources, the MERLOT II repository was chosen as the experimental domain.
In the initial phase, we proposed an enhanced learning object (LO) search engine with a data visualization feature that lets users of a learning object repository (LOR) navigate LOs through a hierarchical knowledge graph built from a single search keyword. A Web-based solution was implemented in which users execute a single-keyword search and then view the results on a hierarchical knowledge graph. The back end of the system provides data extraction, data transformation, data clustering, and data visualization. The search and visualization results indicate that the proposed approach helps users get a clear overview of the LOs from a single-keyword search.
In the next phase, we returned to our original plan of proposing an automated learning content classification method for Open Education Repositories. The value of OERs depends mainly on how easily they can be searched for or located through a web search engine. Currently, the MERLOT II metadata repository asks resource providers to choose the relevant discipline category manually when adding material. This practice is time-consuming and prone to human error. If a member picks an incorrect discipline category, the learning resource may not be categorized correctly in the repository, and it may then fail to appear in the results of a keyword search in "MERLOT Smart Search" or "Advanced search." These observations motivated us to develop an automated learning content classification solution for OER repositories. We assembled a dataset from the MERLOT data collection and ran initial experiments with well-known classifiers (Logistic Regression, Multinomial Naive Bayes, Linear Support Vector Machine, and Random Forest) to test their accuracy. We then proposed an automated learning content classification model (LCCM) that classifies learning resources into relevant discipline categories as they are added to the MERLOT repository. The work in this phase includes dataset preparation, data preprocessing, feature extraction using the LDA topic model, and calculation of semantic similarity using a pre-trained word embedding matrix. These methods serve as a basis for classifying learning resources more effectively and in less time.
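The abstract does not say how the semantic similarity scores are turned into a discipline category. As a minimal sketch of one plausible reading, the Python below averages the embedding vectors of a resource's topic words (assumed to come from an LDA step that is not shown) and ranks candidate disciplines by cosine similarity. The 3-dimensional vectors, the discipline names, and the function names `mean_vector`, `cosine`, and `rank_disciplines` are all hypothetical and are not taken from the thesis.

```python
import math

# Toy 3-dimensional "embeddings"; every value here is made up for illustration.
# A real setup would load a pre-trained word embedding matrix instead.
EMBEDDINGS = {
    "algebra":     [0.9, 0.1, 0.0],
    "equation":    [0.8, 0.2, 0.1],
    "cell":        [0.1, 0.9, 0.0],
    "protein":     [0.0, 0.8, 0.2],
    "market":      [0.1, 0.0, 0.9],
    # Hypothetical discipline labels, embedded in the same space.
    "mathematics": [1.0, 0.0, 0.0],
    "biology":     [0.0, 1.0, 0.0],
    "business":    [0.0, 0.0, 1.0],
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_disciplines(topic_words, disciplines):
    """Return (discipline, similarity) pairs sorted from most to least similar."""
    doc_vec = mean_vector(topic_words)
    if doc_vec is None:
        return []
    scored = [
        (d, cosine(doc_vec, EMBEDDINGS[d]))
        for d in disciplines
        if d in EMBEDDINGS
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Topic words are assumed to be the top terms of an LDA topic for one resource;
# words missing from the embedding table are skipped.
ranking = rank_disciplines(
    ["algebra", "equation", "unknownword"],
    ["mathematics", "biology", "business"],
)
print(ranking[0][0])
```

In this toy setup the top-ranked label could be suggested to the resource provider in place of a manual choice. A multi-label variant might keep every discipline whose similarity exceeds a threshold, which is again an assumption rather than something the abstract specifies.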
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	132

All items in NCUIR are protected by copyright.
