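A Study of Malicious Network Traffic Detection Based on Graph Neural Network and Using eXplainable Artificial Intelligence to Optimize Model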

Name

吳晨緯 (Chen-Wei Wu)

Department

Department of Computer Science and Information Engineering

Thesis Title

使用XAI優化圖神經網路模型於網路惡意流量偵測之研究
(A Study of Malicious Network Traffic Detection Based on Graph Neural Network and Using eXplainable Artificial Intelligence to Optimize Model)

Files

Full text: never open to public access.

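Abstract (Chinese, translated)

The diversification of network technologies, the rapid growth of 5G networks, and the emergence of a wide range of cloud services have driven a sharp increase in the number of smartphones, smart wearable devices, and Internet of Things (IoT) devices, and the importance of protection against cyberattacks has risen accordingly. In a conventional intrusion detection system (IDS), traffic inspection uses only the traffic information exchanged between two nodes, even though a large volume of traffic traverses the whole network at the same time; conventional methods therefore cannot draw on traffic information from multiple nodes. Converting network traffic into a graph makes more of the network-wide traffic information available, but using and computing over such a large amount of data efficiently becomes a challenge.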
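The idea of turning traffic into a graph can be illustrated with a minimal sketch. It assumes, purely for illustration, that each host is a node and each flow is a directed edge carrying a feature vector; the tuple layout, the function names, and the sample addresses are hypothetical, and the conversion actually used in the thesis is not described in this record.

```python
from collections import defaultdict


def flows_to_graph(flows):
    """Build a directed multigraph: hosts are nodes, each flow is one edge.

    `flows` is a list of (src, dst, feature_vector) tuples. This layout is an
    assumption made for illustration, not the schema used in the thesis.
    """
    edges = []                    # edge id -> (src, dst, features)
    incident = defaultdict(list)  # host -> ids of the edges touching that host
    for edge_id, (src, dst, feats) in enumerate(flows):
        edges.append((src, dst, list(feats)))
        incident[src].append(edge_id)
        if dst != src:
            incident[dst].append(edge_id)
    return edges, dict(incident)


def neighbouring_flows(edge_id, edges, incident):
    """Return the ids of the other flows that share a host with this flow."""
    src, dst, _ = edges[edge_id]
    shared = set(incident[src]) | set(incident[dst])
    shared.discard(edge_id)
    return sorted(shared)


# Hypothetical flows: (source host, destination host, [bytes, packets])
flows = [
    ("10.0.0.1", "10.0.0.9", [120.0, 3.0]),
    ("10.0.0.2", "10.0.0.9", [80.0, 1.0]),
    ("10.0.0.3", "10.0.0.4", [15.0, 7.0]),
]
edges, incident = flows_to_graph(flows)
print(neighbouring_flows(0, edges, incident))  # flows sharing a host with flow 0
```

Once flows are edges, a classifier looking at one flow can also see the other flows that touch the same hosts, which is the extra context a two-node view lacks.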
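To address the limitation that conventional intrusion detection systems use only the traffic between two nodes, this thesis proposes EdgeSAGE (Edge SAmple and aggreGatE), a graph neural network algorithm designed specifically for edge features, and uses it to build a malicious traffic classification model. Traffic is converted into a graph, and traffic features are propagated and aggregated along the graph structure, so that the prediction for each flow can draw on information from neighboring flows, making attack traffic classification more accurate. In addition, XAI (eXplainable Artificial Intelligence) techniques are used to analyze the model's input features and compute the importance of each feature to the model; the results are used to reduce the input dimensionality and the number of model parameters, lowering the model's computational cost. Compared with a DNN model of the same architecture, EdgeSAGE improves the F1-Score by 15.48%. The optimized EdgeSAGE model reduces prediction time by 12.9% and raises throughput by 14.8% with almost no effect on accuracy; with a 36.1% reduction in prediction time and a 57.2% increase in throughput, it still retains an F1-Score of 92.4%.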
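The propagate-and-aggregate step on edge features can be sketched as follows. This is a GraphSAGE-style mean aggregator applied to edges, written under stated assumptions: the record does not give the EdgeSAGE update rule, neighbor sampling is omitted, and the function names and the toy weight matrix are hypothetical.

```python
def mean_vector(vectors, dim):
    """Element-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def edge_aggregate_layer(edge_feats, endpoints, weight):
    """One aggregation step over edges (an illustrative stand-in, not EdgeSAGE itself).

    edge_feats: one feature vector per flow
    endpoints:  one (src, dst) pair per flow
    weight:     matrix of shape (out_dim, 2 * in_dim)
    """
    dim = len(edge_feats[0])
    out = []
    for i, (src, dst) in enumerate(endpoints):
        # Neighbouring flows are the other edges that share a host with edge i.
        neigh = [edge_feats[j] for j, (s, d) in enumerate(endpoints)
                 if j != i and {s, d} & {src, dst}]
        # Concatenate the flow's own features with the mean of its neighbours.
        combined = edge_feats[i] + mean_vector(neigh, dim)
        # Linear map followed by ReLU.
        out.append([max(0.0, sum(w * x for w, x in zip(row, combined)))
                    for row in weight])
    return out


endpoints = [("a", "c"), ("b", "c"), ("d", "e")]
feats = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]]
# Toy weights that simply add the neighbour mean to the flow's own features.
weight = [[1, 0, 1, 0],
          [0, 1, 0, 1]]
embeddings = edge_aggregate_layer(feats, endpoints, weight)
print(embeddings)  # [[4.0, 2.0], [4.0, 2.0], [5.0, 5.0]]
```

The first two flows share host "c", so each embedding mixes in the other's features; the third flow has no neighbors and keeps its own values. In a trained model the weights would be learned, and stacking such layers would widen the neighborhood by one hop per layer.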

Abstract (English)

The diversification of today's network technologies, the vigorous development of 5G networks, and the emergence of various cloud services have led to massive growth in the number of smartphones, smart wearable devices, and Internet of Things (IoT) devices, and the importance of cyberattack protection has increased accordingly. In general intrusion detection systems, traffic detection uses only the traffic information between two nodes in the network, although a large amount of traffic is transmitted simultaneously across the entire network; general methods cannot make use of traffic information from multiple nodes. Converting network traffic into a graph makes more of the traffic information in the entire network usable, but using and computing over such a large amount of data effectively then becomes a challenge.
To solve the problem that general intrusion detection systems use only the traffic between two nodes, this thesis proposes EdgeSAGE (Edge SAmple and aggreGatE), a graph neural network algorithm designed specifically for edge features, and uses it to build a malicious traffic classification model. By converting traffic into a graph and using the graph structure to propagate and aggregate traffic features, the model can use information from adjacent traffic when making a prediction, which makes the classification of attack traffic more accurate. In addition, XAI (eXplainable Artificial Intelligence) techniques are used to analyze the input features of the model and calculate the importance of each feature; the analysis results are used to reduce the dimension of the model input and the number of model parameters, which optimizes the model by lowering its computational cost. Compared with a DNN model of the same architecture, EdgeSAGE improves the F1-Score by 15.48%. The optimized EdgeSAGE model reduces prediction time by 12.9% and improves throughput by 14.8% with almost no change in accuracy. With a 36.1% reduction in prediction time and a 57.2% improvement in throughput, the EdgeSAGE model still retains a 92.4% F1-Score.
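The feature-analysis step (score each input feature, then keep only the important ones) can be illustrated with a small sketch. The abstract does not name the XAI method used, so this example swaps in permutation importance, with a deterministic one-position rotation in place of the usual random shuffles so that the result is reproducible. The toy classifier, data, and function names are all hypothetical.

```python
def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(labels)


def permutation_importance(model, rows, labels):
    """Accuracy drop when one feature column is permuted (here: rotated by one)."""
    base = accuracy(model, rows, labels)
    scores = []
    for f in range(len(rows[0])):
        column = [row[f] for row in rows]
        shifted = column[-1:] + column[:-1]  # deterministic permutation
        permuted = [row[:f] + [value] + row[f + 1:]
                    for row, value in zip(rows, shifted)]
        scores.append(base - accuracy(model, permuted, labels))
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:k])


def toy_model(row):
    # Hypothetical classifier that only looks at feature 0.
    return 1 if row[0] > 0.5 else 0


rows = [[0.1, 5.0], [0.9, 5.1], [0.2, 4.9], [0.8, 5.0]]
labels = [0, 1, 0, 1]
scores = permutation_importance(toy_model, rows, labels)
keep = top_k_features(scores, 1)
print(scores, keep)  # [1.0, 0.0] [0]
```

Feature 1 has no effect on the predictions, so it scores zero and can be dropped. A model retrained on the kept columns has a smaller input layer and fewer parameters, which is the kind of cost reduction the abstract reports.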

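Keywords (Chinese, translated)

★ Intrusion Detection System
★ Traffic Classification
★ Graph Neural Network
★ Explainable Artificial Intelligence
★ Feature Analysis
★ Model Optimization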

Keywords (English)

★ Intrusion Detection System
★ Traffic Classification
★ Graph Neural Network
★ Explainable Artificial Intelligence
★ Feature Analysis
★ Optimize Model

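Table of Contents

Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1. Overview
1.2. Research Motivation
1.3. Research Objectives
1.4. Thesis Organization
Chapter 2 Background and Related Work
2.1. Intrusion Detection System
2.1.1. Network-based Intrusion Detection System
2.1.2. Graph-based Intrusion Detection System
2.2. Graph Neural Network
2.3. eXplainable Artificial Intelligence
2.4. Related Work
Chapter 3 Methodology
3.1. Data Preprocessing and Traffic-to-Graph Conversion
3.1.1. Malicious Network Traffic Collection
3.1.2. Data Preprocessing
3.1.3. Traffic-to-Graph Conversion
3.2. EdgeSAGE and Model Training
3.2.1. EdgeSAGE Algorithm
3.2.2. EdgeSAGE Model
3.3. Model Feature Analysis
3.4. System Architecture and Workflow
3.5. System Implementation
Chapter 4 Experiments and Discussion
4.1. Scenario 1: EdgeSAGE Traffic Classification Performance and Parameter Comparison
4.1.1. Experiment 1: Classification Performance on Attack and Benign Traffic
4.1.2. Experiment 2: Effect of Temporal Locality on Classification Performance
4.1.3. Experiment 3: Effect of Graph Directionality on Classification Performance
4.1.4. Experiment 4: Effect of the Number of Edge-Convolution Hops on Classification Performance
4.2. Scenario 2: Comparison of EdgeSAGE and DNN Models
4.2.1. Experiment 5: Comparison of Traffic Classification Performance
4.2.2. Experiment 6: Comparison of Model Prediction Time
4.3. Scenario 3: Feature Analysis and Model Optimization
4.3.1. Experiment 7: Feature Importance Analysis
4.3.2. Experiment 8: Accuracy Comparison of Optimized Models
4.3.3. Experiment 9: Performance Comparison of Optimized Models
Chapter 5 Conclusion and Future Work
5.1. Conclusion
5.2. Future Work
References
Appendix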

Advisor

周立德 (Li-Der Chou)

Date of Approval

2022-8-11