Abstract (English) |
Android holds the largest market share among operating systems for mobile phones, tablets, and various Internet of Things (IoT) devices. Compared with iOS, Android lets users install software more freely, and APK files can be downloaded directly from the Internet. This convenience, however, also brings considerable risk. To cope with these risks, many approaches to Android malware detection have been developed, including static analysis, dynamic analysis, hybrid methods, and network analysis, all of which help ensure that the APKs a user installs are safe. In static analysis, analyzing the source code is a common approach. A function call graph (FCG) can be obtained from the APK file with a code analysis tool, and each calling relationship between functions is represented as an edge. The graph as a whole can then be analyzed to detect malware. However, if the names of the called functions are exposed directly, malicious actors may take advantage of them, so removing the function names prevents this information from leaking. In addition, an FCG can contain tens of thousands of nodes, which makes it difficult for a human to inspect the graph or to observe how often a specific function is used. A graph neural network can therefore be used to classify malware quickly and automatically.
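The name-removal step described above can be illustrated with a minimal sketch. It assumes the call relationships have already been extracted from an APK as (caller, callee) name pairs; the function `anonymize_call_graph` and the sample names are hypothetical and are not taken from the thesis.

```python
def anonymize_call_graph(call_pairs):
    """Replace function names with integer ids; return (num_nodes, edges).

    Only the graph structure survives: the names themselves are discarded,
    and repeated calls between the same two functions collapse to one edge.
    """
    ids = {}
    edges = set()
    for caller, callee in call_pairs:
        for name in (caller, callee):
            if name not in ids:
                ids[name] = len(ids)  # ids assigned in order of first appearance
        edges.add((ids[caller], ids[callee]))
    return len(ids), sorted(edges)


# Hypothetical call pairs, for illustration only.
calls = [
    ("Lcom/example/Main;->onCreate", "Lcom/example/Net;->send"),
    ("Lcom/example/Main;->onCreate", "Lcom/example/Util;->log"),
    ("Lcom/example/Net;->send", "Lcom/example/Util;->log"),
    ("Lcom/example/Main;->onCreate", "Lcom/example/Net;->send"),  # repeated call
]
num_nodes, edges = anonymize_call_graph(calls)
print(num_nodes, edges)
```

The result is a featureless graph: nodes carry no attributes, so any node features must be derived from the structure alone.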
To solve the problem of classifying featureless graphs, this paper proposes GNeP, a mechanism based on graph neural networks (GNNs), which have developed rapidly in recent years, combined with a method for handling featureless graphs, the Enhance Android Degree Profile (EADP). For graph classification, this paper uses the Graph Isomorphism Network (GIN) as the GNN model. GNeP achieves an accuracy of 93.12% in classifying function call graphs, compared with the best accuracy of 80.02% obtained with a Graph Convolutional Network. The proposed classification method is suitable not only for Android malware detection but also for other graph classification problems. |
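The abstract does not define EADP, so the following is only a sketch of the general idea behind degree-based features for featureless graphs, not the thesis's method. It assumes an undirected view of the graph and a generic five-value local degree profile per node (own degree, then the minimum, maximum, mean, and standard deviation of the neighbors' degrees); the function name `degree_profile` is hypothetical.

```python
from statistics import mean, pstdev


def degree_profile(num_nodes, edges):
    """Return one 5-value feature vector per node, computed from degrees only."""
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:  # ignore self-loops in this sketch
            neighbors[u].add(v)
            neighbors[v].add(u)
    degree = [len(n) for n in neighbors]

    features = []
    for v in range(num_nodes):
        nd = [degree[u] for u in neighbors[v]]  # degrees of v's neighbors
        if nd:
            features.append([degree[v], min(nd), max(nd), mean(nd), pstdev(nd)])
        else:
            features.append([0, 0, 0, 0.0, 0.0])  # isolated node
    return features


# Small example graph with integer node ids and no node attributes.
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
feats = degree_profile(4, example_edges)
for node, row in enumerate(feats):
    print(node, row)
```

Vectors like these could serve as the initial node features that a GNN such as GIN aggregates over, which is what makes classification possible when the graph carries no attributes of its own.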