Using Generative Adversarial Networks for Data Augmentation in Android Malware Detection


Name

Chun-Hsien Yang (楊竣憲)

Department

Department of Information Management

Thesis Title

Using Generative Adversarial Networks for Data Augmentation in Android Malware Detection
(應用生成對抗網路於資料擴增之Android惡意程式分析研究)

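This electronic thesis is authorized for immediate open access.
Full texts that have reached their open-access date are licensed to users only for searching, reading, and printing for personal, non-profit academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.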

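Abstract (Chinese, translated)

As malicious attack techniques keep evolving and novel malware keeps emerging, datasets often suffer from class imbalance, so a classifier cannot learn the latent malicious features of some classes from enough data during training. This study applies generative adversarial networks (GANs) to Android malware analysis. A GAN is a deep learning architecture that is trained on images and generates data, and it has been widely used for data augmentation in other computer vision and image recognition research. This thesis converts Android program features into an image representation and uses a GAN to generate data for the malware families with few samples, thereby balancing and augmenting the original dataset. The study also compares other traditional data augmentation techniques to examine whether they help identify the minority malicious classes. Tests confirm that both traditional image augmentation methods and GANs improve classification accuracy, but GANs more effectively improve the classification model's detection of malware families whose recognition accuracy was originally low because of their small sample counts. Experimental results show that, on two different datasets (Drebin with 4,000 samples and AMD with 20,000 samples), the accuracy for classes with few samples differs by 5% to 20% after GAN augmentation compared with before augmentation.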

Abstract (English)

As malicious attack techniques continue to evolve and new malicious programs keep appearing, data sets often suffer from sample imbalance, leaving the classifier without enough data to learn the latent malicious features of certain categories during training. This study applies the Generative Adversarial Network (GAN), a deep learning architecture that trains on and generates image data, to the field of Android malware analysis. GAN has been widely used for data augmentation in other machine vision and image recognition research. In this thesis, the features of Android programs are converted into image representations, and samples for the malicious families with few samples are generated by this method to balance and expand the original data set. This study also compares other traditional data augmentation techniques to explore whether they help identify malicious categories with few samples. Tests confirm that both traditional image augmentation methods and GAN improve classification accuracy, but GAN more effectively improves the classification model's detection of malicious families whose accuracy was originally low because of their small sample counts. Experimental results show that, on the two data sets used (4,000 samples from Drebin and 20,000 samples from AMD), the accuracy for categories with relatively few samples differs by 5% to 20% after GAN augmentation compared with before augmentation.
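The abstract does not say which program features are turned into images or how the number of generated samples is chosen, so the following is only a minimal sketch under stated assumptions. It assumes a raw byte sequence is reshaped into rows of grayscale pixel values, and that each minority family is topped up to the size of the largest family. The function names, the image width, and the family labels are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def bytes_to_grayscale(data, width=16):
    """Reshape a byte sequence into rows of pixel intensities (0-255).

    Assumption: one byte maps to one grayscale pixel. The last row is
    zero-padded so that every row has exactly `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def augmentation_plan(labels):
    """Return how many synthetic samples each family needs.

    Assumption: the goal is to bring every family up to the size of the
    largest family. Raises ValueError if `labels` is empty.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {family: target - n for family, n in counts.items()}


# Hypothetical toy data; the byte values and family names are illustrative only.
image = bytes_to_grayscale(bytes([0x12, 0x6E, 0x0A, 0x54, 0x70]), width=2)
plan = augmentation_plan(["FamilyA"] * 5 + ["FamilyB"] * 2 + ["FamilyC"])
print(image)  # [[18, 110], [10, 84], [112, 0]]
print(plan)   # {'FamilyA': 0, 'FamilyB': 3, 'FamilyC': 4}
```

In a pipeline of this kind, the per-family deficits would tell a generator (a GAN, or a traditional image transform such as flipping or shifting) how many extra images to produce for each underrepresented family before the classifier is trained.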

Keywords (Chinese)

★ 生成對抗網路 (generative adversarial network)
★ 資料擴增 (data augmentation)
★ 深度學習 (deep learning)
★ Android

Keywords (English)

★ GAN
★ Data augmentation
★ Deep learning
★ Android

Table of Contents

Chapter 1 Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Contributions
1-4 Thesis Organization
Chapter 2 Related Work
2-1 Research on Representing Code as Images
2-1-1 In Windows Malware Analysis
2-1-2 In Android Malware Analysis
2-2 Research on Malware Detection Based on Convolutional Neural Networks
2-2-1 Using Different Features
2-2-2 Using Different Convolutional Models
2-3 Generative Adversarial Networks
2-4 Research on Data Augmentation in Malware Analysis
2-5 Summary
Chapter 3 System Design
3-1 System Architecture
3-1-1 Data Preprocessing
3-1-2 Data Augmentation Module
3-1-3 Classification Module
3-1-4 Evaluation Metrics
3-2 System Training and Usage Workflow
Chapter 4 Experimental Results
4-1 Experimental Environment and Datasets
4-1-1 Experimental Environment
4-1-2 Datasets
4-2 Experimental Design
4-2-1 Experiment 1
4-2-2 Experiment 2
4-2-3 Experiment 3
4-2-4 Experiment 4
4-2-5 Experiment 5
4-2-6 Experiment 6
4-2-7 Experiment 7
4-3 Experimental Results and Discussion
Chapter 5 Conclusions and Future Work
5-1 Conclusions and Contributions
5-2 Research Limitations
5-3 Future Work
References

Advisor

Yi-Ming Chen (陳奕明)

Approval Date

2020-7-29
