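| Abstract: | With the rapid growth of networks and the continuing evolution of attack techniques, the Intrusion Detection System (IDS) has become a key technology in cybersecurity defense. To improve IDS performance in identifying malicious traffic, deep learning is increasingly applied in this field. However, existing public network traffic datasets commonly suffer from class imbalance, so classifiers perform poorly on minority attack classes. To address this problem, existing work applies the Synthetic Minority Over-sampling Technique (SMOTE) and Generative Adversarial Networks (GAN) to augment minority-class samples. SMOTE, however, generates samples only by linear interpolation and cannot effectively capture complex data distributions, while GAN-based models often suffer from mode collapse and unstable training. This thesis proposes a malicious network traffic generation method based on Denoising Diffusion Probabilistic Models (DDPM), named Malicious Network Traffic Generation DDPM (MNTG-DDPM). The method uses a stepwise denoising mechanism and integrates a feature transformation module to strengthen the distributional consistency between generated and original data. Experimental results show that integrating the feature transformation module reduces the average Wasserstein distance between generated and original data by 57.94%, reduces the average difference in feature standard deviation by 46.63%, and reduces the feature correlation deviation by 45.93%. When applied to intrusion detection models, data augmented with MNTG-DDPM improves the F1-score of Brute Force and Web-based attacks, two minority classes that each account for less than 0.1% of samples, by 19.25% and 8.92%, respectively. In experiments on overall IDS performance with the CICIoT2023 and UNSW-NB15 datasets, MNTG-DDPM improves on DDPM by up to 6.16% in Macro average, 7.90% in Recall, and 7.27% in Precision, and improves on GW-GRU by up to 4.67% in Macro average, 8.54% in Recall, and 4.68% in Precision. In the diversity evaluation MNTG-DDPM reaches 51.31% Recall, and in the reliability evaluation it reaches an F1-score of 60.89%, indicating good class coverage and data consistency and showing that it effectively overcomes the mode collapse problem common in traditional generative methods.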
 ;With the rapid advancement of network technologies and the continuous evolution of cyber-attack strategies, Intrusion Detection Systems (IDS) have become a critical component in cybersecurity defense. To improve the detection of malicious traffic, deep learning techniques have been increasingly applied. However, most publicly available network traffic datasets suffer from severe class imbalance, resulting in poor performance when identifying minority attack classes. Existing augmentation techniques, such as the Synthetic Minority Over-sampling Technique (SMOTE) and Generative Adversarial Networks (GAN), have been widely explored. Nevertheless, SMOTE is constrained by linear interpolation, and GAN-based approaches often face mode collapse and unstable training behavior.
 This study proposes a malicious network traffic generation method based on Denoising Diffusion Probabilistic Models (DDPM), named Malicious Network Traffic Generation DDPM (MNTG-DDPM). The method uses a stepwise denoising mechanism and integrates a Feature Transformation (FT) module to enhance the distributional consistency between generated and original data. Experimental results show that the FT module reduces the average Wasserstein distance between generated and original data by 57.94%, decreases the average feature-wise standard deviation difference by 46.63%, and lowers the feature correlation deviation by 45.93%. When applied to an IDS, MNTG-DDPM improves the F1-scores of Brute Force and Web-based attacks, each of which accounts for less than 0.1% of total samples, by 19.25% and 8.92%, respectively. It also increases the average precision (AP) of the precision-recall curve for these two classes by 0.20 and 0.17, respectively. In the diversity evaluation, MNTG-DDPM achieves 51.31% recall, and in the reliability evaluation it reaches an F1-score of 60.89%, demonstrating strong class coverage and data consistency while effectively mitigating the mode collapse problem commonly observed in traditional generative methods.
 |