Name |
陳柊瑄 (Chung-Hsuan Chen) |
Department |
Department of Computer Science and Information Engineering |
Thesis Title |
MITREtrieval: Retrieving MITRE Techniques from Unstructured Threat Reports by Fusion of Deep Learning and Ontology
|
Files |
Full text not available through the system (never open access)
|
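Abstract (Chinese, translated) |
As security incidents continue to multiply, cyber threat intelligence (CTI) has become a widely used means of understanding and defending against threats. This intelligence is often shared as unstructured articles, known as CTI reports, which typically contain important information about threat groups, such as attack behaviors and attack patterns. MITRE ATT&CK provides a framework that defines tactics, techniques, and procedures (TTPs); defenders can use TTPs to understand the goals and methods of a threat and then run penetration tests or simulations on their own systems. However, TTPs are almost always buried in CTI reports, and manually reading an ever-growing number of reports is very time-consuming and labor-intensive. Existing studies underperform because they neglect semantic dependencies and the scarcity of labeled data. In this study, we therefore propose MITREtrieval, an automated CTI report classification system that extracts the techniques described in CTI reports. The system classifies by fusing deep learning with an ontology: we propose a sentence-based BERT to bring semantic relationships into MITRE technique classification, and we fuse in an ontology to help with MITRE techniques that have few training samples. For evaluation, we split the classification problem into 113-class, 46-class, and 23-class settings to show that the system outperforms existing work regardless of the number of classes. It achieves an F2 score of 58% with 113 classes, 62% with 46 classes, and 69% with 23 classes, each more than 15% above existing work. This automatic classification system can help security personnel classify and analyze CTI reports, and security experts working with MITREtrieval can produce accurate intelligence more quickly. |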
Abstract (English) |
Cyber threat intelligence (CTI) has been widely used to understand and proactively defend against incoming threats. CTI is usually shared as unstructured reports, which implicitly carry significant information about threat actors, such as threat actions and attack patterns. TTPs (tactics, techniques, and procedures) represent an attacker's goals and the ways of achieving them. Defenders can use TTPs to understand attackers and to perform penetration tests and simulations on their own systems. However, TTPs are often described only within CTI reports, so reading and analyzing them manually is inefficient when the documents are numerous and lengthy. Therefore, in this paper, we propose an automatic retrieval system, MITREtrieval, which retrieves MITRE techniques from unstructured CTI reports by fusing an ontology with deep learning. We evaluate performance at different technique thresholds to show that our system performs well not only on techniques with sufficient samples but also on techniques with few. The results show that MITREtrieval achieves a 58% F2 score on the 113-label multi-label classification task, 62% on the 46-label task, and 69% on the 23-label task, outperforming state-of-the-art work. MITREtrieval can reduce the time spent manually analyzing CTI reports and ultimately provide high-quality threat intelligence to cybersecurity companies. |
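Both abstracts report results as F2 scores. As a hedged illustration of that metric (not the thesis's evaluation code), the sketch below computes a micro-averaged F-beta score with beta = 2, which weights recall more heavily than precision. The function name `fbeta_micro`, the choice of micro-averaging, and the sample technique labels are assumptions made for this example; the record does not state which averaging the thesis uses.

```python
from typing import Iterable, Set


def fbeta_micro(true_sets: Iterable[Set[str]],
                pred_sets: Iterable[Set[str]],
                beta: float = 2.0) -> float:
    """Micro-averaged F-beta over multi-label predictions (illustrative helper)."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical reports labelled with ATT&CK technique IDs (illustrative data only).
truth = [{"T1566", "T1059"}, {"T1027"}]
pred = [{"T1566"}, {"T1027"}]

f2 = fbeta_micro(truth, pred)            # beta = 2, recall-weighted
f1 = fbeta_micro(truth, pred, beta=1.0)  # balanced, for comparison
```

With these illustrative labels, precision is 1 and recall is 2/3, so F2 (10/14, about 0.714) falls below F1 (0.8), reflecting the heavier penalty F2 places on missed techniques.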
Keywords |
★ Cybersecurity ★ Threat Intelligence ★ MITRE ATT&CK ★ Natural Language Processing ★ Deep Learning ★ Ontology |
Table of Contents |
Abstract (Chinese)
Abstract
Acknowledgements
1 Introduction
2 Background
2.1 CTI Reports
2.2 MITRE ATT&CK
2.3 Related Work
3 Problem Formulation
4 Methodology
4.1 Overall Workflow
4.2 Topic Classifier
4.3 Knowledge Expansion
4.3.1 Query Node Extraction
4.3.2 COMAT Inference
4.4 TTP Identification
4.5 Knowledge Fusion
4.6 Data Collection
5 Evaluation
5.1 Evaluation Metric
5.2 Overall Performance
5.3 The effect of sentence-based BERT model
5.4 The effect of Fusion
5.5 The effect of Topic Classifier
6 Conclusion
References |
Advisor |
吳曉光 (Eric Hsiao-Kuang Wu)
|
Approval Date |
2022-08-03 |