利用BERT語言模型辨識社群媒體資源之資安威脅預警系統;To identify cybersecurity threat of social media and notification solution by BERT

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/89858

Title:	利用BERT語言模型辨識社群媒體資源之資安威脅預警系統;To identify cybersecurity threat of social media and notification solution by BERT
Author:	童張銓;Tung, Chang-Chuan
Contributor:	Department of Information Management, Executive Master Program
Keywords:	BERT; Deep learning; Cybersecurity; Natural Language Processing; Named Entity Recognition
Date:	2022-07-29
Upload time:	2022-10-04 12:02:32 (UTC+8)
Publisher:	National Central University
Abstract:	Continuously raising awareness of cyber threats and establishing preventive mechanisms is an important task for ensuring enterprise information security. An enterprise's network and security experts must be able to obtain, in real time, the latest information on security incidents and threats that affect the enterprise's software and hardware, so that they can take countermeasures before an incident occurs. This process depends on the range of sources available to those experts and on their work efficiency, and it leaves them receiving information passively. With the development of social media and the widespread use of open-source intelligence, platforms such as Twitter also carry the latest information on security threats; their immediacy and the volume of posts are expected to compensate for the scarcity of sources and the limits of manual effort. This research collects the latest security threat events from Twitter using a large set of threat-related keywords, applies natural language processing, uses named entity recognition to label entities related to computer software and hardware, and compares those entities with the software and hardware environment currently in use within the enterprise, so that users or administrators are notified of relevant threats and can respond early.
Specifically, Twitter content collected with cybersecurity keywords is processed with a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model, and named entity tags identify the vendor, system name, version, and threat terms for computer software and hardware. These entities are compared with the existing computing environment, and warning messages are sent to users or administrators to achieve real-time detection and alerting. The approach is also compared with methods proposed by other researchers. The experimental results show that BERT outperforms the CNN+BiLSTM approach proposed by several researchers: Precision, Recall, and F1 score all reach 96% or higher, and the model can correctly identify words that are not in the training set from their context, which supports correct labeling and timely warning.
Appears in collections:	[Executive Master of Information Management] Electronic Thesis & Dissertation
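
The matching and alerting step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the fine-tuned BERT model has already produced one BIO tag per token, and the tag names (VENDOR, PRODUCT, VERSION, THREAT), the Asset record, the matching rule, and the sample tweet and inventory are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans.

    Assumes one tag per token, e.g. the output of a token-classification
    model. Tag names are hypothetical, not taken from the thesis.
    """
    entities = []
    label, parts = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity still open.
            if label is not None:
                entities.append((label, " ".join(parts)))
            label, parts = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open entity.
            parts.append(token)
        else:
            # "O" or an inconsistent I- tag: close the open entity, skip token.
            if label is not None:
                entities.append((label, " ".join(parts)))
            label, parts = None, []
    if label is not None:
        entities.append((label, " ".join(parts)))
    return entities


@dataclass(frozen=True)
class Asset:
    """One entry of a hypothetical in-house software/hardware inventory."""
    vendor: str
    product: str
    version: str


def match_assets(entities, inventory):
    """Return assets whose vendor and product both appear among the entities.

    Illustrative rule: if the text names any version, the asset's version
    must be one of them; if no version is named, every version matches.
    """
    found = {}
    for label, text in entities:
        found.setdefault(label, set()).add(text.lower())
    vendors = found.get("VENDOR", set())
    products = found.get("PRODUCT", set())
    versions = found.get("VERSION", set())
    hits = []
    for asset in inventory:
        if asset.vendor.lower() not in vendors:
            continue
        if asset.product.lower() not in products:
            continue
        if versions and asset.version.lower() not in versions:
            continue
        hits.append(asset)
    return hits


def build_alert(asset, entities):
    """Format a warning message for one matched asset."""
    threats = [text for label, text in entities if label == "THREAT"]
    threat_text = ", ".join(threats) if threats else "unspecified threat"
    return f"[ALERT] {asset.vendor} {asset.product} {asset.version}: {threat_text}"


# Made-up sample input standing in for a tagged tweet and an inventory.
tokens = ["Apache", "Log4j", "2.14.1", "is", "affected", "by",
          "remote", "code", "execution"]
tags = ["B-VENDOR", "B-PRODUCT", "B-VERSION", "O", "O", "O",
        "B-THREAT", "I-THREAT", "I-THREAT"]
inventory = [Asset("Apache", "Log4j", "2.14.1"),
             Asset("Microsoft", "Exchange", "2019")]

entities = decode_bio(tokens, tags)
alerts = [build_alert(asset, entities)
          for asset in match_assets(entities, inventory)]
for alert in alerts:
    print(alert)
```

Exact, case-insensitive string comparison keeps the sketch short; a real deployment would likely need name normalization and version-range handling, which this record does not describe.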

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	156

All items in NCUIR are protected by their original copyright.