Name Shao-Yu Wang (王紹宇) Department Department of Computer Science and Information Engineering
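Thesis title Applying a Domain-Adversarial Neural Network to Predict Students' Code Defects and the Relationship Between Code Defects and Coding Style
(Apply DANN to identify students' code defects and their relationship with coding style)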
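Abstract (Chinese) In recent years, programming ability has gradually become a focus of development in education. However, programming languages combine computational thinking with hands-on coding, and the high entry barrier often produces large gaps in learning progress among students, which makes teaching difficult. Assessing how students are learning has therefore become an important issue in programming education. This work combines an online programming system with system-log filtering and integration techniques to extract learning engagement, presents learning-behavior engagement in a real-time visualization, and provides a teacher-facing dashboard for monitoring learning behavior.
For the source code submitted as assignments, code quality is analyzed with a coding-style assessment and a code defect prediction model, providing a basis for grading and information for self-checking. To reduce the teaching burden, datasets from the past NASA MDP project are used as training data, replacing manual labeling in which teachers inspect each piece of code and mark whether it contains defects. A code defect model is built with the Domain-Adversarial Neural Network (DANN) technique from transfer learning. The results show that DANN's domain adaptability is better than that of traditional machine learning: the proxy A-distance (PAD) of the DANN model is 0.9, whereas the PAD of the traditional machine learning methods is greater than 1.8 in every case.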
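The PAD figures quoted above can be read against the usual definition of the proxy A-distance, PAD = 2(1 - 2ε), where ε is the held-out error of a classifier trained to separate source-domain samples from target-domain samples. A lower value means the two domains are harder to tell apart. The sketch below is a minimal illustration under stated assumptions: the nearest-centroid classifier, the helper names, and the synthetic data are all hypothetical and are not taken from the thesis.

```python
# Sketch: proxy A-distance (PAD) from a domain classifier's held-out error.
# Assumption: a nearest-centroid classifier stands in for whatever domain
# classifier the thesis used; all data below is synthetic.
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def proxy_a_distance(source, target):
    """Estimate PAD = 2 * (1 - 2 * err), where err is the held-out error of a
    classifier trained to tell source rows (label 0) from target rows (label 1)."""
    # Even-indexed rows train the domain classifier, odd-indexed rows test it.
    src_train, src_test = source[0::2], source[1::2]
    tgt_train, tgt_test = target[0::2], target[1::2]
    c_src, c_tgt = centroid(src_train), centroid(tgt_train)

    def predict(x):
        return 0 if sq_dist(x, c_src) <= sq_dist(x, c_tgt) else 1

    errors = sum(predict(x) != 0 for x in src_test)
    errors += sum(predict(x) != 1 for x in tgt_test)
    err = errors / (len(src_test) + len(tgt_test))
    return 2.0 * (1.0 - 2.0 * err)


def sample(mean, n, rng, dim=4):
    """Draw n synthetic feature vectors with unit-variance Gaussian features."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Well-separated domains: the classifier rarely errs, so PAD approaches 2.
pad_far = proxy_a_distance(sample(0.0, 200, rng), sample(6.0, 200, rng))
# Same distribution: the classifier is near chance, so PAD is near 0.
pad_near = proxy_a_distance(sample(0.0, 200, rng), sample(0.0, 200, rng))
print(pad_far, pad_near)
```

Under this definition, a PAD near 2 means a domain classifier separates the two domains almost perfectly, so a value of 0.9 indicates features that are considerably harder to separate by domain than a value above 1.8.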
Abstract (English) In recent years, programming skills have gradually become a focus of development in education. However, programming combines computational thinking with hands-on coding, and its high entry barrier often leads to large differences in students' learning progress, which makes teaching difficult. Measuring how students learn has therefore become an important issue in programming education. We combine an online programming system with system-log filtering and integration techniques to extract learning engagement, and provide a real-time visualization of learning-behavior engagement that helps teachers monitor students' learning behavior.
For the source code submitted as assignments, code quality is analyzed with a coding-style measure and a code defect prediction model, which provide a basis for evaluation and for self-examination. To reduce the burden on teachers, we use datasets from the NASA MDP project as training data, instead of having teachers inspect many pieces of code and manually label whether each one is defective. We then use a Domain-Adversarial Neural Network (DANN) to build the code defect prediction model. The results show that DANN's domain adaptability is better than that of traditional machine learning: the DANN model's proxy A-distance (PAD) is 0.9, whereas the traditional machine learning methods all have a PAD greater than 1.8.
Finally, we study the relationship between coding style and code defects. We use the Pearson correlation coefficient to measure the relationship between the two feature sets, and find that nearly 70% of the feature pairs are weakly correlated (coefficient smaller than 0.25). We then build multiple linear regression models on the different feature sets to predict the coding-style score. The model built on coding-style features has a pMSE of 9.349, and the model built on code-defect features has a pMSE of 13.686. With a difference of only 0.7 in average prediction error, we find that although the coding-style features and the code-defect features are not directly correlated, the two datasets have a similar ability to predict the coding-style score.
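The pairwise correlation analysis described above can be sketched as follows: compute the Pearson coefficient for every (coding-style feature, code-defect feature) pair and report the share of pairs below the 0.25 cutoff. This is only an illustration under stated assumptions: the feature names and values are hypothetical, and comparing the absolute value of the coefficient against the cutoff is an assumption, since the abstract does not say how negative coefficients were treated.

```python
# Sketch: cross-correlate two feature sets and count low-correlation pairs.
# Assumptions: feature names and values are hypothetical, and |r| is compared
# against the 0.25 cutoff quoted in the abstract.
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def low_correlation_share(style, defect, cutoff=0.25):
    """Fraction of (style feature, defect feature) pairs with |r| < cutoff."""
    pairs = [(s, d) for s in style for d in defect]
    low = sum(abs(pearson(style[s], defect[d])) < cutoff for s, d in pairs)
    return low / len(pairs)


# Hypothetical measurements, one value per student submission.
style_features = {
    "naming_score": [1, 2, 3, 4, 5, 6],
    "indent_score": [2, 1, 2, 1, 2, 1],
}
defect_features = {
    "loc": [10, 20, 30, 40, 50, 60],
    "cyclomatic": [3, 1, 2, 2, 1, 3],
}
share = low_correlation_share(style_features, defect_features)
print(share)
```

On the regression result, the square roots of the two reported pMSE values are about 3.06 and 3.70, which differ by roughly 0.64. This may be what the 0.7 difference in average prediction error refers to, but the abstract does not say so, and that reading is an assumption.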
Keywords (Chinese) ★ Learning analytics
★ Code defects
★ Coding style
★ Domain-adversarial neural network
★ Machine learning
Keywords (English) ★ Learning Analytics
★ Code defects
★ Coding style
★ Domain Adversarial Neural Network
★ Machine learning
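Table of contents Abstract
List of Figures
List of Tables
1. Introduction
1.1 Assessing learning ability in programming courses
1.2 Code quality inspection
2. Literature Review
2.1 Learning analysis of students in programming courses
2.2 Learning analytics
2.3 Machine learning
2.4 NASA MDP datasets
2.5 Transfer learning
2.6 Domain-Adversarial Neural Network
3. Research Questions
4. System Architecture
5. Engagement Indicators for Online Programming-Language Learning
5.1. Online programming platform
5.2. Semi-structured log filtering and extraction
5.3. Data warehousing and management
5.4. Information visualization and dashboard creation
5.5. Learning engagement dashboard results
6. Coding Style Inspection
7. Code Defect Assessment Model
7.1. NASA MDP training dataset
7.1.1. NASA MDP data preprocessing
7.2. Domain-Adversarial Neural Network
7.3. DANN parameter settings
7.4. Model performance metric (Proxy A-distance, PAD)
7.5. Comparing the domain adaptability of DANN and other machine learning methods via PAD
8. Relationship Between Code Defects and Coding Style
8.1. Pearson correlation test between code-defect and coding-style metrics
8.2. Predicting the coding-style score from code-defect metrics with a regression model
9. Conclusion and Future Work
References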
Advisor Stephen J.H. Yang (楊鎮華) Approval date 2018-7-16
