Thesis/Dissertation 110423060: Full Metadata Record

DC Field | Value | Language
DC.contributor | 資訊管理學系 | zh_TW
DC.creator | 吳冠諭 | zh_TW
DC.creator | Kuan-Yu Wu | en_US
dc.date.accessioned | 2023-7-14T07:39:07Z
dc.date.available | 2023-7-14T07:39:07Z
dc.date.issued | 2023
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110423060
dc.contributor.department | 資訊管理學系 | zh_TW
DC.description | 國立中央大學 | zh_TW
DC.description | National Central University | en_US
dc.description.abstract | 軟體測試是軟體開發生命週期中一項重要的工作,其在整個週期中佔了大量的時間,如果能針對容易出現缺陷的模組進行有效預測並事先修復,將可節省許多成本並交付更高品質的產品,因此軟體缺陷預測技術被應用於幫助開發人員降低其測試成本,其中,軟體度量是一種獲得原始碼客觀特徵描述的方法,所產生的指標也常被用於軟體偵錯。本研究使用NASA MDP與PROMISE的軟體缺陷預測資料集,這些資料集透過提取原始碼的多項靜態軟體度量指標作為機器學習模型的輸入特徵,然而因資料集屬於高維度資料,容易導致訓練上的複雜性及過擬合(Overfitting)問題。為解決此問題,本研究採用集成式特徵選擇,降低資料集維度再進行訓練,且不同於過往軟體缺陷預測領域的研究,本研究結合三種不同類型的特徵選擇技術,分別為過濾法(Filter)、包裝法(Wrapper)和內嵌法(Embedded),並搭配三種聚合方法來產生特徵子集,包括交集(Intersection)、聯集(Union)和多重交集(Multi-intersection),希望藉此克服單一特徵選擇方法的局限性,進而提升軟體缺陷預測的性能表現。研究結果顯示,基於聯集的集成式特徵選擇方法相較於單一特徵選擇擁有更高的預測準確率,同時也維持了良好的特徵縮減率。 | zh_TW
dc.description.abstract | Software testing is an important stage of the software development life cycle and consumes a significant share of its time. If defect-prone modules can be predicted and fixed in advance, considerable cost can be saved and higher-quality products delivered. Software defect prediction techniques are therefore applied to help developers reduce testing costs. Software metrics are one way to obtain an objective description of source code, and the resulting metrics are often used in software debugging. This study uses the NASA MDP and PROMISE software defect prediction datasets, which provide multiple static software metrics extracted from source code as input features for machine learning models. However, the datasets' high dimensionality can lead to training complexity and overfitting. To address this, the study adopts ensemble feature selection to reduce the dimensionality of the datasets before training. Unlike previous studies in software defect prediction, it combines three types of feature selection techniques (filter, wrapper, and embedded methods) with three aggregation methods for generating feature subsets: intersection, union, and multi-intersection. This combination aims to overcome the limitations of any single feature selection method and thereby improve software defect prediction performance. The results indicate that union-based ensemble feature selection achieves higher prediction accuracy than single feature selection methods while maintaining a good feature reduction rate. | en_US
DC.subject | 軟體缺陷預測 | zh_TW
DC.subject | 機器學習 | zh_TW
DC.subject | 特徵選擇 | zh_TW
DC.subject | 集成式特徵選擇 | zh_TW
DC.subject | 高維度資料 | zh_TW
DC.subject | Software Defect Prediction | en_US
DC.subject | Machine Learning | en_US
DC.subject | Feature Selection | en_US
DC.subject | Ensemble Feature Selection | en_US
DC.subject | High-dimensional Data | en_US
DC.title | 結合過濾法、包裝法及嵌入法之集成式特徵選擇於軟體缺陷預測中之應用 | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Integrating Filter, Wrapper, and Embedded Methods for Ensemble Feature Selection in Software Defect Prediction | en_US
DC.type | 博碩士論文 | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US
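
Illustrative note on the aggregation step named in the abstract: the following is a minimal sketch of how feature subsets from three selectors might be combined by union, intersection, and multi-intersection. It is not the thesis's implementation. The feature names, the three example subsets, and the total of 10 features are hypothetical, and "multi-intersection" is assumed here to mean features chosen by at least two of the selectors (the union of all pairwise intersections); the record itself does not define it.

```python
from itertools import combinations


def union(subsets):
    """Keep a feature if at least one selector chose it."""
    return set().union(*subsets)


def intersection(subsets):
    """Keep a feature only if every selector chose it."""
    return set.intersection(*(set(s) for s in subsets))


def multi_intersection(subsets):
    """Keep a feature if at least two selectors chose it.

    Assumption: implemented as the union of all pairwise intersections.
    """
    kept = set()
    for a, b in combinations(subsets, 2):
        kept |= set(a) & set(b)
    return kept


def reduction_rate(selected, n_features):
    """Fraction of the original features that were discarded."""
    return 1 - len(selected) / n_features


# Hypothetical outputs of a filter, a wrapper, and an embedded selector.
filter_sel = {"loc", "cyclomatic_complexity", "halstead_volume"}
wrapper_sel = {"loc", "cyclomatic_complexity", "branch_count"}
embedded_sel = {"loc", "halstead_volume", "comment_lines"}
subsets = [filter_sel, wrapper_sel, embedded_sel]

N_FEATURES = 10  # hypothetical size of the full feature set

for name, aggregate in [("union", union),
                        ("intersection", intersection),
                        ("multi-intersection", multi_intersection)]:
    chosen = aggregate(subsets)
    print(name, sorted(chosen), round(reduction_rate(chosen, N_FEATURES), 2))
```

Under these assumptions, union keeps the most features and intersection the fewest, which is the trade-off between accuracy and feature reduction rate that the abstract refers to.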
