Abstract: | WebShell attacks have long been a significant challenge for website administrators. The scalability and distributed nature of cloud services can amplify the potential risk and impact of such attacks, making them one of the main security threats in cloud environments. Consequently, various strategies have been proposed in recent years to guard against them. This paper presents two deep-learning-based methods for detecting WebShells. Both methods use Byte Pair Encoding (BPE) to split the WebShell source code into tokens. To generate word embedding vectors, Method 1 uses CodeBERT, while Method 2 uses GraphCodeBERT. The two pre-trained models share the same base architecture and both model source code effectively, but GraphCodeBERT further improves code comprehension by taking into account the relationships within the code and its internal structure. In both methods, a gated recurrent unit (GRU) or a bidirectional GRU then classifies whether the code contains a WebShell. 
In the experiments, both methods were trained with a range of hyperparameters, and K-fold cross-validation was used to identify the best results and the corresponding models. The selected models for Methods 1 and 2 were then evaluated on a test dataset, and the results were compared with related work. Method 1 achieved an accuracy of 99.54%, a precision of 98.42%, a recall of 99.29%, and an F1 score of 98.85%. Method 2 performed better still, with an accuracy of 99.65%, a precision of 99.29%, a recall of 99.29%, and an F1 score of 99.29%. These results represent a significant improvement over previous methods. The proposed methods also outperform other open-source and commercial tools on all metrics. Notably, they maintain high accuracy and precision on unseen data and obfuscated code, which demonstrates their detection capability and practical value. |
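As a quick consistency check on the reported figures, the F1 score can be recomputed from precision and recall as their harmonic mean. The sketch below is illustrative only and is not taken from the paper: the function name `f1_score` and the dictionary layout are assumptions, and the only inputs are the precision and recall percentages quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall, in percent, as reported in the abstract.
# The dictionary keys are labels chosen here for readability.
reported = {
    "Method 1 (CodeBERT)": (98.42, 99.29),
    "Method 2 (GraphCodeBERT)": (99.29, 99.29),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.2f}%")
```

Rounded to two decimals, the recomputed values agree with the F1 scores stated in the abstract (98.85% and 99.29%).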