An Application of Hybrid-Sampling and Iterative-Partitioning Filters for Ensemble Learning in Software Defect Prediction

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93155

Title:	應用混合式前處理與 IPF 過濾器之集成式學習於軟體缺陷預測;An Application of Hybrid-Sampling and Iterative-Partitioning Filters for Ensemble Learning in Software Defect Prediction
Author:	林庭伊;Lin, Ting-Yi
Contributor:	Department of Information Management
Keywords:	Software Defect Prediction;Synthetic Sampling;Ensemble Learning;Iterative Partitioning Filter;Under-sampling;Over-sampling
Date:	2023-07-11
Upload time:	2024-09-19 16:44:46 (UTC+8)
Publisher:	National Central University
Abstract:	As software grows in scale, testing costs rise with it. To avoid the serious consequences of defects that go undetected during the testing phase, machine learning has been applied to software defect prediction (SDP), with attempts to integrate it with current automated testing tools, so that defect-prone modules can be located early. Testing resources can then be concentrated on specific project modules, allowing enterprises to deliver higher-quality products at lower cost. This study combines the EE-IPF architecture (EasyEnsemble + Iterative-Partitioning Filter, IPF) with three oversampling methods, Polynom-fit-SMOTE, ProWsyn, and SMOTEIPF, to form the Hybrid-EE-IPF architecture for SDP. The aim is to alleviate the information loss and the insufficient learning of minority-class features that the single random under-sampling step in EasyEnsemble can cause, as well as the effect of noisy data points in SDP datasets. Unlike previous SDP studies, which used a single IPF filter to remove noisy data points, this study integrates multiple filters with the ensemble model to increase the diversity of the base classifiers and thereby improve defect prediction performance.
Appears in collections:	[Graduate Institute of Information Management] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	14

All items in NCUIR are protected by copyright.
