A Research of Android Anti-Obfuscated Malware Detection Combined with Function Call Graph Semantic Feature and Domain Adaptation

DC Field	Value	Language
DC.contributor	資訊管理學系	zh_TW
DC.creator	楊蕙瑄	zh_TW
DC.creator	Hui-Hsuan Yang	en_US
dc.date.accessioned	2023-07-28T07:39:07Z
dc.date.available	2023-07-28T07:39:07Z
dc.date.issued	2023
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110423001
dc.contributor.department	資訊管理學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	近年來人工智慧技術被廣泛應用在Android惡意程式檢測研究中。但是惡意軟體開發人員也會透過不同方式逃避檢測，一種常見的方式叫做混淆攻擊，透過這種攻擊方式可以改變APK結構，使得檢測系統提取之特徵改變，導致模型判斷錯誤。根據先前的研究，一個原本可以達到97.7%的惡意軟體檢測模型，在接收到經過API Call Obfuscation技術混淆之資料後準確率會只剩下50.3%。本研究從特徵面與模型面思考如何防禦混淆問題，從特徵面來看，APK經過混淆後特徵雖然被改變，但還是要能夠表現出混淆前的行為，所以如果在特徵前處理的過程中可以表達軟體的行為將降低混淆對檢測系統的影響。本研究選擇函式呼叫圖（Function Call Graph）做為特徵基礎，並利用節點嵌入（Node Embedding）技術學習節點之間表達的語意訊息，以建模軟體的行為特徵。而從模型面思考，儘管Node Embedding可以學習到APK的語意訊息，一些進階的混淆技術會透過修改程式碼的方式使得不同語意可以表達出相同行為。所以在模型面，本研究將使用遷移學習（Transfer Learning）中的域適應（Domain Adaptation）技術訓練模型，讓模型可以拉近混淆前後資料集在特徵空間中的距離，使得模型能夠判斷經過混淆之資料集，以達到抗混淆之目的。本研究所提出的檢測系統在未經混淆的情況下可以達到0.9888的檢測準確率，而在受到多種混淆技術的情況下可以維持平均0.9672的檢測準確率。其中Domain Adaptation技術將經過CallIndirection混淆影響的檢測準確率從87%提升到95%。	zh_TW
dc.description.abstract	Artificial intelligence (AI) is widely used in Android malware detection. However, malware developers use various methods to evade detection. A common one is the obfuscation attack, which changes the structure of an APK so that the features extracted by the detection system change, causing the model to misclassify. According to prior research, a malware detection model that reaches 97.7% accuracy drops to 51.3% accuracy when given APKs obfuscated with API Call Obfuscation. This research addresses obfuscation from two aspects: the features and the model. From the feature perspective, although the features of an APK change after obfuscation, the APK must still exhibit its pre-obfuscation behavior. Therefore, if the behavior of an APK can be captured during feature preprocessing, the impact of obfuscation on the detection system is reduced. This study therefore uses the Function Call Graph (FCG) as the feature basis and applies Node Embedding to learn the semantic information between functions. From the model perspective, some advanced obfuscation attacks modify the code so that different semantics express the same behavior. This study therefore trains the model with Domain Adaptation so that it reduces the distance in feature space between the data before and after obfuscation, allowing it to classify obfuscated data and thereby resist obfuscation. The proposed detection system achieves 98.88% accuracy without obfuscation and maintains an average accuracy of 96.72% under multiple types of obfuscation attacks. In addition, Domain Adaptation improves the detection accuracy under CallIndirection obfuscation from 87% to 95%.	en_US
DC.subject	混淆攻擊	zh_TW
DC.subject	深度學習	zh_TW
DC.subject	遷移學習	zh_TW
DC.subject	Android惡意軟體檢測	zh_TW
DC.subject	靜態分析	zh_TW
DC.subject	obfuscation attack	en_US
DC.subject	deep learning	en_US
DC.subject	transfer learning	en_US
DC.subject	Android malware detection	en_US
DC.subject	static analysis	en_US
DC.title	結合函式呼叫圖語意特徵及域適應技術之Android 抗混淆惡意軟體檢測模型研究	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	A Research of Android Anti-Obfuscated Malware Detection Combined with Function Call Graph Semantic Feature and Domain Adaptation	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Full metadata record for thesis 110423001