RGB-based Hybrid Augmentation for Android Minor Malware Family Classification

Name

丁翊軒 (Yi-Hsuan Ting)

Department

Department of Information Management

Thesis Title

RGB-based Hybrid Augmentation for Android Minor Malware Family Classification
(使用混合RGB圖像擴增技術提升Android小樣本惡意家族分類能力)

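This electronic thesis is authorized for immediate open access.
Full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.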

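Abstract (Chinese)

As computing speed has increased, many studies have used deep learning to detect Android malware. Beyond binary malware detection, however, malware family classification better helps malware researchers understand the behavior of each family, so that they can refine detection methods and guard against variants. Newly emerging malware families have few samples, which tends to produce poor classification results. Augmentation based on generative adversarial networks (GANs) can improve classification, but a small amount of data still makes the quality of the GAN-generated samples unstable, which limits the gain. This study therefore proposes a hybrid augmentation method. It first extracts malware features and converts them into RGB images. Families with too few samples are then augmented with Gaussian noise, and this step is combined with a deep convolutional generative adversarial network (DCGAN), which works better for image augmentation, to augment the minority malware families. Finally, the images are fed into a convolutional neural network (CNN) for family classification. Experimental results show that, compared with no augmentation and with DCGAN-only augmentation, the proposed hybrid method improves the F1-score by 7%~34% and 2%~7%, respectively.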
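The "features to RGB image" step can be illustrated with a minimal sketch. This record does not state which features the thesis extracts or how they are mapped to pixels, so the scheme below is an assumption: consecutive bytes fill the R, G, and B channels in order, and the last row is zero-padded. The function name `bytes_to_rgb` and the `width` parameter are hypothetical.

```python
import numpy as np

def bytes_to_rgb(data: bytes, width: int = 64) -> np.ndarray:
    """Pack a byte string into an (height, width, 3) uint8 array.

    Assumed mapping (not taken from the thesis): byte 0 -> R, byte 1 -> G,
    byte 2 -> B of the first pixel, and so on, row by row.
    """
    arr = np.frombuffer(data, dtype=np.uint8)
    row_bytes = width * 3
    height = max(1, -(-arr.size // row_bytes))  # ceiling division
    padded = np.zeros(height * row_bytes, dtype=np.uint8)
    padded[:arr.size] = arr  # trailing pixels stay zero
    return padded.reshape(height, width, 3)

# Stand-in payload; a real pipeline would read bytes from a decompiled APK.
image = bytes_to_rgb(bytes(range(256)) * 10, width=16)
print(image.shape, image.dtype)
```

A fixed width keeps every sample the same number of columns, so images can later be resized or cropped to a common shape before training.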

Abstract (English)

As computing power has grown, many studies have applied deep learning to Android malware detection. Beyond detection, malware family classification helps researchers understand the behavior of each malware family, optimize detection, and prevent variants. New malware families, however, have few samples, which leads to poor classification results. GAN-based deep learning augmentation can improve classification, but with so little data the quality of the generated samples is unstable, which limits the improvement. This study proposes a hybrid augmentation method. Malware features are first extracted and converted into RGB images. The minor families are then augmented with Gaussian noise, and the result is combined with a deep convolutional generative adversarial network (DCGAN), which is better suited to image augmentation. Finally, the images are input to a CNN for family classification. The experimental results show that, compared with no augmentation and with DCGAN-only augmentation, the proposed hybrid method increases the F1-score by 7%~34% and 2%~7%, respectively.
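The Gaussian noise step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the noise standard deviation `sigma`, the number of `copies`, and the function name `gaussian_noise_augment` are hypothetical, since this record does not give the thesis's parameters. In the hybrid method the noisy copies would enlarge a minor family before DCGAN training; the DCGAN and CNN stages are not shown.

```python
import numpy as np

def gaussian_noise_augment(image, copies=4, sigma=8.0, seed=0):
    """Return `copies` noisy variants of a uint8 RGB image.

    Zero-mean Gaussian noise with standard deviation `sigma` (assumed
    value) is added per pixel and channel, then clipped back to [0, 255].
    """
    rng = np.random.default_rng(seed)
    base = image.astype(np.float32)
    variants = []
    for _ in range(copies):
        noise = rng.normal(0.0, sigma, size=base.shape)
        noisy = np.clip(np.rint(base + noise), 0, 255).astype(np.uint8)
        variants.append(noisy)
    return np.stack(variants)

# A flat gray image stands in for one malware sample rendered as RGB.
sample = np.full((8, 8, 3), 128, dtype=np.uint8)
augmented = gaussian_noise_augment(sample, copies=4)
print(augmented.shape)
```

Clipping and rounding keep the output a valid 8-bit image, so augmented samples can be stored and loaded exactly like the originals.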

Keywords

★ Android
★ Malware detection
★ Malware family classification
★ Data augmentation
★ Hybrid augmentation
★ Deep learning

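Table of Contents

1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives and Contributions
1-4 Thesis Organization
2. Related Work
2-1 Android Malware Analysis Methods
2-2 Code-to-Image Conversion
2-3 Augmentation Techniques
3. Methodology
3-1 Data Preprocessing
3-1-1 Decompilation Module
3-1-2 RGB Image Conversion Module
3-2 Hybrid Augmentation
3-2-1 Basic Augmentation Module
3-2-2 Deep Learning Augmentation Module
3-3 Family Classification
3-3-1 Malware Family Classification Module
3-4 Evaluation Metrics
3-5 System Workflow
4. Experimental Results
4-1 Experimental Environment and Datasets
4-1-1 Experimental Environment
4-1-2 Datasets
4-2 Experiment 1 (Hybrid Augmentation)
4-2-1 Hybrid Augmentation on the Drebin Dataset
4-2-2 Hybrid Augmentation on the AMD Dataset
4-2-3 Hybrid Augmentation on the CICMalDroid2020 Dataset
4-3 Experiment 2 (Comparison of Noise-Adding Methods)
4-3-1 Comparison of Noise-Adding Methods on the Drebin Dataset
4-3-2 Comparison of Noise-Adding Methods on the AMD Dataset
4-4 Experiment 3 (Different Basic Augmentation Methods Combined with DCGAN)
4-4-1 Different Basic Augmentation Methods Combined with DCGAN on the Drebin Dataset
4-4-2 Different Basic Augmentation Methods Combined with DCGAN on the AMD Dataset
4-5 Experiment 4 (Extremely Small Sample Sizes)
5. Conclusion
5-1 Conclusions and Contributions
5-2 Research Limitations
5-3 Future Work
References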

Advisor

陳奕明 (Yi-Ming Chen)

Approval Date

2022-7-29
