Non-attributed Graph Classification Model Using GNN - A Case Study of Android Malware Detection

Name

Yu-Lun Chang (張祐綸)

Department

Graduate Institute of Software Engineering

Thesis title

Non-attributed Graph Classification Model Using GNN - A Case Study of Android Malware Detection
(Original title: 無特徵圖神經網路分類模型以偵測Android惡意軟體為例)

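Abstract (translated from Chinese)

Android holds the largest market share among the operating systems used in mobile phones, tablets, and a wide range of Internet of Things (IoT) devices. Compared with iOS, Android lets users install software more freely: an APK file obtained over the network can be downloaded and installed directly. This convenience also brings considerable risk. In response, many Android malware detection methods have been developed, including static analysis, dynamic analysis, hybrid methods, and network analysis, all intended to ensure that the APKs users install are safe and harmless. Among static analysis methods, analyzing source code is common. In code analysis, a function call graph (FCG) can be extracted from the APK file. The FCG shows the calling relationships among functions, that is, their call order, as well as how many times and how often particular functions are used, so the graph built from the functions can serve as the basis for malware detection. However, publishing the names of the called functions directly could give malicious parties an opening, so removing those names helps prevent data leakage. In addition, an FCG contains tens of thousands of nodes, which makes it hard to inspect and interpret by eye; a graph neural network can instead classify the malware quickly and automatically.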
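As a rough illustration of the name-removal idea described above, the following Python sketch replaces the function names in a call graph with integer ids so that only the call structure remains, then counts in-degrees and out-degrees. The helper names `anonymize_call_graph` and `degree_counts` and the sample call edges are hypothetical and do not come from the thesis; extracting real call edges from an APK requires a separate analysis tool that is not shown here.

```python
def anonymize_call_graph(call_edges):
    """Map each function name to an integer id and return (node count, id edges)."""
    ids = {}
    edges = []
    for caller, callee in call_edges:
        for name in (caller, callee):
            if name not in ids:
                ids[name] = len(ids)  # ids are assigned in order of first appearance
        edges.append((ids[caller], ids[callee]))
    return len(ids), edges


def degree_counts(num_nodes, edges):
    """Count incoming and outgoing calls per node in a directed call graph."""
    in_deg = [0] * num_nodes
    out_deg = [0] * num_nodes
    for caller, callee in edges:
        out_deg[caller] += 1
        in_deg[callee] += 1
    return in_deg, out_deg


# Hypothetical call edges (caller, callee); real ones would come from an APK analyzer.
calls = [
    ("MainActivity.onCreate", "Net.connect"),
    ("MainActivity.onCreate", "Crypto.encrypt"),
    ("Net.connect", "Crypto.encrypt"),
    ("Service.run", "Net.connect"),
]

num_nodes, edges = anonymize_call_graph(calls)
in_deg, out_deg = degree_counts(num_nodes, edges)
print(num_nodes, edges)
print(in_deg, out_deg)
```

After this step the graph carries no function names, only topology, which is the situation a non-attributed graph classifier has to work with.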
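This thesis addresses the non-attributed graph classification problem and proposes the GNeP (GIN with ENhance Android DEgree Profile) framework. GNeP is based on graph neural networks (GNNs) and incorporates Enhance Android Degree Profile (EADP), a method for handling non-attributed graphs. The Graph Isomorphism Network (GIN) serves as the GNN model. Experiments on the MalNet dataset show that GNeP reaches 93.12% accuracy in FCG classification, outperforming the Graph Convolutional Network (GCN) at 80.12%. The proposed classification method applies not only to Android malware detection but also to other graph classification problems.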
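This record does not define EADP, so the sketch below does not attempt to reproduce it. It shows instead the local degree profile (LDP) baseline from Cai and Wang, "A simple yet effective baseline for non-attributed graph classification", as one known way to give a structure-only graph node features: each node gets its own degree plus the minimum, maximum, mean, and standard deviation of its neighbors' degrees. The function name `local_degree_profile`, the zero fill for isolated nodes, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def local_degree_profile(num_nodes, edges):
    """Return one 5-value feature row per node of an undirected graph."""
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:  # ignore self-loops in this sketch
            neighbors[u].add(v)
            neighbors[v].add(u)
    degree = [len(n) for n in neighbors]

    features = []
    for v in range(num_nodes):
        nbr_degrees = [degree[u] for u in neighbors[v]]
        if nbr_degrees:
            features.append([
                degree[v],
                min(nbr_degrees),
                max(nbr_degrees),
                mean(nbr_degrees),
                pstdev(nbr_degrees),  # population standard deviation
            ])
        else:
            # Assumption: isolated nodes get zeros for the neighbor statistics.
            features.append([degree[v], 0, 0, 0.0, 0.0])
    return features


# Small anonymized graph; the edges are treated as undirected here.
edges = [(0, 1), (0, 2), (1, 2), (3, 1)]
ldp = local_degree_profile(4, edges)
for row in ldp:
    print(row)
```

Features of this kind can be computed from topology alone, so they remain available after function names have been removed.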

Abstract (English)

Among mobile phones, tablets, and various Internet of Things (IoT) devices, Android holds the largest market share. Compared with iOS, Android allows software to be installed more freely: an APK file can be downloaded from the Internet and installed directly. However, this convenience also brings considerable risk. To cope with these risks, many Android malware detection methods have been developed, such as static analysis, dynamic analysis, hybrid methods, and network analysis, all intended to ensure that the APKs users install are safe and harmless. In static analysis, analyzing source code is a common approach. In code analysis, a function call graph (FCG) can be obtained from the APK file with a code analysis tool. Each calling relationship between functions is represented as an edge. The number of times and the frequency with which a specific function is used are difficult for a person to observe by hand. The entire graph constructed from the functions can be used to detect malware. However, if the names of the called functions are exposed directly, malicious parties may take advantage of them, so removing the function names helps prevent such data leakage. In addition, an FCG has tens of thousands of nodes, which are difficult to inspect and identify by eye. A graph neural network can therefore classify the malware quickly and automatically.
To solve the featureless graph classification problem, this thesis proposes GNeP as its main mechanism. GNeP is based on the graph neural network (GNN), which has developed rapidly in recent years, and combines it with Enhance Android Degree Profile (EADP), a method for handling featureless graphs. For graph classification, this thesis uses the Graph Isomorphism Network (GIN) as the GNN model. GNeP achieves an accuracy of 93.12% in function call graph classification, better than the highest accuracy of 80.02% achieved by the Graph Convolutional Network (GCN). The proposed classification method is suitable not only for Android malware detection but also for other graph classification problems.
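For readers unfamiliar with GIN, the following pure-Python sketch shows the general shape of one GIN layer followed by a sum readout: each node adds its neighbors' features to (1 + eps) times its own, the result passes through a small network, and the node vectors are summed into one graph vector. This is a simplified illustration and not the thesis's GNeP implementation. A single linear layer with ReLU stands in for the multi-layer perceptron, the weights are hand-picked rather than trained, and all function names are hypothetical.

```python
def gin_aggregate(features, neighbors, eps=0.0):
    """Compute (1 + eps) * h_v + sum of neighbor features for every node v."""
    dim = len(features[0])
    out = []
    for v, h_v in enumerate(features):
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbors[v]:
            for i in range(dim):
                agg[i] += features[u][i]
        out.append(agg)
    return out


def linear_relu(rows, weight, bias):
    """Apply relu(x @ weight + bias) to each row; a stand-in for GIN's MLP."""
    out = []
    for x in rows:
        y = [
            sum(x[i] * weight[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))
        ]
        out.append([max(0.0, value) for value in y])
    return out


def sum_readout(rows):
    """Sum node vectors into one graph-level vector."""
    return [sum(column) for column in zip(*rows)]


# Undirected toy graph with node degree as the only input feature.
neighbors = [[1, 2], [0, 2, 3], [0, 1], [1]]
features = [[2.0], [3.0], [2.0], [1.0]]

# Hand-picked weights (1 input dimension -> 2 hidden dimensions), not trained.
weight = [[1.0, -1.0]]
bias = [0.0, 5.0]

aggregated = gin_aggregate(features, neighbors)
hidden = linear_relu(aggregated, weight, bias)
graph_vec = sum_readout(hidden)
print(aggregated)
print(graph_vec)
```

In a trained model the graph vector would feed a classifier that outputs the malware class; the sum aggregation is what distinguishes GIN from the degree-normalized averaging used in GCN.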

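Keywords (Chinese)

★ Android malware detection (Android惡意軟體偵測)
★ Graph neural network (圖神經網路)
★ Graph classification (圖分類)
★ Graph convolutional network (圖卷積網路)
★ Graph isomorphism network (圖同構網路)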

Keywords (English)

★ Android Malware Detection
★ Graph Classification
★ Graph Convolutional Networks
★ Graph Isomorphic Networks
★ Graph Neural Networks

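Table of contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of contents
List of figures
List of tables
Chapter 1 Introduction
1.1. Overview
1.2. Research motivation
1.3. Research objectives
1.4. Chapter organization
Chapter 2 Background and related work
2.1. Android malware detection
2.1.1. Android malware detection methods
2.1.2. Function call graph
2.2. Neural network operation in deep learning
2.2.1. Deep neural network (DNN)
2.2.2. Convolutional neural network (CNN)
2.3. Graph neural network (GNN)
2.3.1. Graph convolutional network (GCN)
2.3.2. Graph isomorphism network (GIN)
2.3.3. Performance evaluation metrics
2.4. Related work
Chapter 3 Methodology
3.1. Graph data processing and the GNeP framework
3.1.1. Function call graph builder
3.1.2. Graph features extractor
3.1.3. Data preprocessing
3.1.4. GNeP framework
3.2. Android malware detection data pipeline and workflow
3.2.1. Function call graph builder
3.2.2. Graph features extractor
3.2.3. Data preprocessing module
3.2.4. Graph learning module
Chapter 4 Experiments and discussion
4.1. System implementation environment
4.2. Scenario A: classification performance on the MalNet dataset
4.2.1. Experiment 1: results of classifying MalNet with GIN
4.2.2. Experiment 2: GIN hyperparameters for number of neurons and number of layers
4.2.3. Experiment 3: effect of the drop rate on the GIN model
4.2.4. Experiment 4: comparison of results for directed and undirected graphs
4.2.5. Experiment 5: classification performance before and after data preprocessing
4.2.6. Experiment 6: model evaluation with cross-validation
4.2.7. Experiment 7: model evaluation with ROC
4.2.8. Experiment 8: comparison of optimizers for model training
4.2.9. Experiment 9: effect of sample count on model training
4.3. Scenario B: comparison of GIN and GCN
4.3.1. Experiment 10: accuracy of GIN and GCN
4.3.2. Experiment 11: F1-score of GIN and GCN
4.3.3. Experiment 12: inference time of GIN and GCN
4.3.4. Experiment 13: training time of GIN and GCN
4.4. Scenario C: the REDDIT-BINARY dataset
4.4.1. Experiment 14: results of classifying REDDIT-BINARY with GIN
4.4.2. Experiment 15: GIN hyperparameters for number of neurons and number of layers
Chapter 5 Conclusions and future research directions
5.1. Conclusions
5.2. Research limitations
5.3. Future work
References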

References

[1] Statcounter, "Mobile Operating System Market Share Worldwide", Accessed on: July 25, 2022. [Online]. Available: https://gs.statcounter.com/os-market-share/mobile/worldwide
[2] AppBrain, "Android and Google Play statistics", Accessed on: July 4, 2022. [Online]. Available: https://www.appbrain.com/stats
[3] Google Play Help, "Use Google Play Protect to help keep your apps safe and your data private", Accessed on: July 25, 2022. [Online]. Available: https://support.google.com/googleplay/answer/2812853?hl=en
[4] Y. Yang, X. Du, Z. Yang, et al. "Android malware detection based on structural features of the function call graph." Electronics 10.2 (2021): 186.
[5] S. Ruder. "An overview of gradient descent optimization algorithms." arXiv preprint arXiv:1609.04747 (2016).
[6] S. Patro, and K. Sahu. "Normalization: A preprocessing stage." arXiv preprint arXiv:1503.06462 (2015).
[7] W. Yang, Y. Zhang, J. Li, et al. "Appspear: Bytecode decrypting and dex reassembling for packed android malware." International Symposium on Recent Advances in Intrusion Detection. Springer, Cham, 2015.
[8] N. McLaughlin, J. Martinez del Rincon, B. Kang, et al. "Deep android malware detection." In Proceedings of the seventh ACM on conference on data and application security and privacy. 2017. p. 301-308.
[9] M.T. Hagan, H.B. Demuth, and M. Beale. Neural network design. PWS Publishing Co., 1997.
[10] W. Zhang, N. Luktarhan, C. Ding, et al. "Android malware detection using tcn with bytecode image." Symmetry 13.7 (2021): 1107.
[11] A. Ghasempour, N.F.M. Sani, and J.A. Ovye. "Permission extraction framework for android malware detection." International Journal of Advanced Computer Science and Applications 11.11 (2020).
[12] Wikipedia, "Control-flow graph", Accessed on: July 25, 2022. [Online]. Available: https://en.wikipedia.org/wiki/Control-flow_graph
[13] A. Mahindru, and P. Singh. "Dynamic permissions based android malware detection using machine learning techniques." Proceedings of the 10th innovations in software engineering conference. 2017.
[14] T. Bhatia, and R. Kaushal. "Malware detection in android based on dynamic analysis." 2017 International Conference on Cyber Security And Protection Of Digital Services (Cyber Security). IEEE, 2017.
[15] Q. Wu, X. Zhu, and B. Liu. "A survey of android malware static detection technology based on machine learning." Mobile Information Systems, 2021.
[16] M.N.U.R. Chowdhury, Q.E. Alahy, and H. Soliman. "Advanced Android Malware Detection Utilizing API Calls and Permissions." IT Convergence and Security. Springer, Singapore, 2021. 123-134.
[17] V. Sihag, G. Choudhary, M. Vardhan, et al. "PICAndro: Packet InspeCtion-Based Android Malware Detection." Security and Communication Networks, 2021.
[18] M.M. Alani, and A.I. Awad. "AdStop: Efficient flow-based mobile adware detection using machine learning." Computers & Security 117 (2022): 102718.
[19] P. Agrawal, and B. Trivedi. "Machine learning classifiers for Android malware detection." Data Management, Analytics and Innovation. Springer, Singapore, 2021. 311-322.
[20] P. Yadav, N. Menon, V. Ravi, et al. "EfficientNet convolutional neural networks-based Android malware detection." Computers & Security 115 (2022): 102622.
[21] K. Nivedha, I. Gandhi, S. Shibi, et al. "Deep Learning Based Static Analysis of Malwares in Android Applications." Advances in Parallel Computing Technologies and Applications 40 (2021): 133.
[22] S. Dong, P. Wang, and K. Abbas. "A survey on deep learning and its applications." Computer Science Review 40 (2021): 100379.
[23] M.C. Su, and X.D. Chang. "Machine Learning: Neural Networks, Fuzzy Systems, and Genetic Algorithms." CHWA Publication, Taipei City (2004).
[24] S.H. Haji, and A.M. Abdulazeez. "Comparison of optimization techniques based on gradient descent algorithm: A review." PalArch's Journal of Archaeology of Egypt/Egyptology 18.4 (2021): 2715-2743.
[25] D.R. Sarvamangala, and R.V. Kulkarni. "Convolutional neural networks in medical image understanding: a survey." Evolutionary intelligence (2021): 1-22.
[26] A. Krizhevsky, I. Sutskever, and G.E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems 25 (2012).
[27] Y. Liu, Y. Ma, Z. Mao, et al. "TD-GAT: Graph Neural Network for Fault Diagnosis Knowledge Graph." 2021 China Automation Congress (CAC). IEEE, 2021.
[28] Y.H. Feng, and S.W. Zhang. "Prediction of Drug-Drug Interaction Using an Attention-Based Graph Neural Network on Drug Molecular Graphs." Molecules 27.9 (2022): 3004.
[29] Wikipedia, "Acetic acid", Accessed on: July 25, 2022. [Online]. Available: https://en.wikipedia.org/wiki/Acetic_acid
[30] S. Min, Z. Gao, J. Peng, et al. "STGSN—A Spatial–Temporal Graph Neural Network framework for time-evolving social networks." Knowledge-Based Systems 214 (2021): 106746.
[31] Stanford, "CS224W: Machine Learning with Graphs", Accessed on: July 25, 2022. [Online]. Available: http://web.stanford.edu/class/cs224w/
[32] A. Derrow-Pinion, J. She, D. Wong, et al. "Eta prediction with graph neural networks in google maps." Proceedings of the 30th ACM International Conference on Information & Knowledge Management. 2021. (pp. 3767-3776).
[33] E. Alsentzer, S. Finlayson, M. Li, et al. "Subgraph neural networks." Advances in Neural Information Processing Systems 33 (2020): 8017-8029.
[34] T. Kasanishi, X. Wang, and T. Yamasaki. "Edge-Level Explanations for Graph Neural Networks by Extending Explainability Methods for Convolutional Neural Networks." 2021 IEEE International Symposium on Multimedia (ISM). IEEE, 2021.
[35] M. Xu, H. Wang, B. Ni, et al. "Self-supervised graph-level representation learning with local and global structure." International Conference on Machine Learning. PMLR, 2021, pp. 11548-11558.
[36] Z. Hu, Y. Dong, K. Wang, et al. "Gpt-gnn: Generative pre-training of graph neural networks." Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020, pp. 1857-1867.
[37] X. Zeng, X. Tu, Y. Liu, et al. "Toward better drug discovery with knowledge graph." Current opinion in structural biology 72 (2022): 114-126.
[38] T.N. Kipf, and M. Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
[39] C. Cai, and Y. Wang. "A simple yet effective baseline for non-attributed graph classification." arXiv preprint arXiv:1811.03508 (2018).
[40] M. Cai, Y. Jiang, C. Gao, et al. "Learning features from enhanced function call graphs for Android malware detection." Neurocomputing 423 (2021): 301-307.
[41] K. Xu, Y. Li, R.H. Deng, et al. "Deeprefiner: Multi-layer android malware detection system applying deep neural networks." 2018 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2018.
[42] Y. Yang, X. Du, Z. Yang, et al. "Android malware detection based on structural features of the function call graph." Electronics 10.2 (2021): 186.
[43] J. Fairbanks, A. Orbe, C. Patterson, et al. "Identifying ATT&CK Tactics in Android Malware Control Flow Graph Through Graph Representation Learning and Interpretability." 2021 IEEE International Conference on Big Data (Big Data). IEEE, 2021.
[44] S.H. Haji, and A.M. Abdulazeez. "Comparison of optimization techniques based on gradient descent algorithm: A review." PalArch's Journal of Archaeology of Egypt/Egyptology 18.4 (2021): 2715-2743.
[45] R. Kohavi. "A study of cross-validation and bootstrap for accuracy estimation and model selection." Ijcai. Vol. 14. No. 2. 1995.
[46] J.D. Rodriguez, A. Perez, and J.A. Lozano. "Sensitivity analysis of k-fold cross validation in prediction error estimation." IEEE transactions on pattern analysis and machine intelligence 32.3 (2009): 569-575.
[47] J.A. Hanley. "Receiver operating characteristic (ROC) methodology: the state of the art." Crit Rev Diagn Imaging 29.3 (1989): 307-335.
[48] S. Freitas, Y. Dong, J. Neil, et al. "A large-scale database for graph representation learning." arXiv preprint arXiv:2011.07682 (2020).

Advisor

Li-Der Chou (周立德)

Approval date

2022-9-22
