dc.description.abstract | In recent years, with the growing popularity of smartphones, Android malware has become an increasingly serious problem, leaking many users' private information and in turn causing real financial loss. To address this problem, researchers have adopted various methods to identify and classify malware, including static analysis, dynamic analysis, and machine learning techniques. However, many obfuscated malware samples in circulation can bypass existing detection methods, lowering detection rates. In this context, many researchers have turned to dynamic analysis to handle obfuscated malware. Dynamic analysis requires actually executing the application to capture dynamic features, so preprocessing can take a very long time when the dataset is relatively large. In contrast, static analysis does not require executing the application and its preprocessing is much faster, but common features such as API_CALL are susceptible to obfuscation techniques, which reduces model accuracy. To overcome this problem, this study proposes a preprocessing method that performs a vector transformation on static features, thereby minimizing the effect of obfuscation techniques on these features. This study also incorporates taint analysis to improve the accuracy and efficiency of Android malware detection. The method achieves an accuracy of 99% on the unobfuscated dataset and 97.8% on the obfuscated dataset, and its preprocessing is nearly 20 times faster than that of dynamic analysis. | en_US |