The Application of Common-Sense Knowledge-based BERT on Domain-Specific Tasks


Name

葉詠心 (Ip Weng Sam)

Department

Department of Information Management

Thesis title

針對特定領域任務—基於常識的BERT模型之應用
(The Application of Common-Sense Knowledge-based BERT on Domain-Specific Tasks)



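This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.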

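Abstract (Chinese)

In today's highly competitive business environment, organizations can benefit greatly from topic analysis carried out through text classification. Although many methods are available, BERT is one of the most effective techniques in natural language processing. BERT is commonly used as a domain-specific classification model, but the model usually has no knowledge beyond its training data, such as human-like common sense about things and an awareness of how things relate to one another, which limits how closely it resembles human intelligence.
To address this limitation, this study explores combining BERT with another valuable tool, the knowledge graph, to extend the capabilities of the classification model. By incorporating a knowledge graph, a BERT model can acquire general knowledge as humans do, improving its classification ability. The combination of BERT and knowledge graphs has the potential to significantly improve an organization's ability to extract valuable insights from large volumes of text. In experiments, this study found that the effect of adding a knowledge graph to BERT varies with the type of knowledge graph and with the classification task. The study also found that a knowledge-graph-enhanced BERT model faces several challenges: increased training complexity, difficulties in applying it to long and short texts, and ensuring that the injected knowledge triples (the knowledge representation used) are relevant to the sentence.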

Abstract (English)

In today's highly competitive business environment, organizations can benefit greatly from topic analysis through text classification. While several methods are available, BERT is one of the most effective techniques in natural language processing. However, BERT is typically used as a domain-specific classification model and may not possess knowledge beyond its training data, which limits its similarity to human intelligence.
To address this limitation, this research explores combining BERT with another valuable instrument, the knowledge graph. By incorporating a knowledge graph, the BERT model can acquire general knowledge as humans do, enhancing its classification capabilities.
The study found that the BERT model performs differently on classification tasks depending on the type of knowledge graph added. The study also found that the model faces several challenges: increased complexity of model training, difficulties in applying the model to long and short texts, and ensuring that sentences are relevant to the knowledge triples used as the knowledge representation model.
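
The abstract describes attaching knowledge triples to sentences before they reach BERT. As a rough illustration only, the sketch below shows one K-BERT-style way this can be done: triples are spliced in after matching entity tokens, each token gets a "soft" position, and a visibility matrix keeps injected tokens from influencing unrelated words. The function name `inject_triples`, the toy knowledge graph, and the single-token entity matching are assumptions made for this example, not the thesis's implementation.

```python
from typing import Dict, List, Tuple

# Toy knowledge graph: entity -> list of (relation, object) pairs.
# All entries are made up for illustration.
ToyKG = Dict[str, List[Tuple[str, str]]]


def inject_triples(tokens: List[str], kg: ToyKG, max_triples: int = 2):
    """Splice matching triples after each entity token and build a visibility matrix."""
    expanded: List[str] = []  # sentence tokens with triples spliced in
    soft_pos: List[int] = []  # position index assigned to each token
    branch: List[int] = []    # -1 for sentence tokens, else an id per injected triple
    anchor: List[int] = []    # index of the entity a triple hangs from, -1 otherwise

    next_branch = 0
    for pos, tok in enumerate(tokens):
        entity_idx = len(expanded)
        expanded.append(tok)
        soft_pos.append(pos)
        branch.append(-1)
        anchor.append(-1)
        for rel, obj in kg.get(tok, [])[:max_triples]:
            # Triple tokens continue counting from the entity's position.
            for offset, word in enumerate((rel, obj), start=1):
                expanded.append(word)
                soft_pos.append(pos + offset)
                branch.append(next_branch)
                anchor.append(entity_idx)
            next_branch += 1

    n = len(expanded)
    visible = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            both_in_sentence = branch[i] == -1 and branch[j] == -1
            same_triple = branch[i] != -1 and branch[i] == branch[j]
            entity_and_own_triple = anchor[j] == i or anchor[i] == j
            visible[i][j] = both_in_sentence or same_triple or entity_and_own_triple
    return expanded, soft_pos, visible


tokens = ["cook", "visits", "beijing"]
kg = {"cook": [("is_a", "ceo")], "beijing": [("capital_of", "china")]}
expanded, soft_pos, visible = inject_triples(tokens, kg)
print(expanded)  # ['cook', 'is_a', 'ceo', 'visits', 'beijing', 'capital_of', 'china']
print(soft_pos)  # [0, 1, 2, 1, 2, 3, 4]
```

In a full model, the visibility matrix would typically be turned into an attention mask so that, for instance, "visits" cannot attend to "is_a" or "ceo", while "cook" can. How triples are selected and how relevant they are to the sentence is exactly the kind of challenge the abstract points to.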

Keywords

★ text classification (文本分類)
★ knowledge graph (知識圖譜)
★ sentiment analysis (情感分析)
★ topic categorization (主題分類)

Table of contents

Chinese Abstract
Abstract
Acknowledgments
Table of Contents
Figures
Tables
1. INTRODUCTION
1.1 Background
1.2 Motivation
1.3 Objective
1.4 Structure
2. LITERATURE REVIEW
2.1 Text Classification
2.1.1 Text Preprocessing
2.1.2 Feature Extraction
2.2 Method of Text Classification
2.2.1 The Machine Learning Approaches
2.2.2 Lexicon-based Approach
2.3 Knowledge Graph
2.3.1 Knowledge Graph Representation
2.3.2 Knowledge Graph Reasoning Methods
2.3.3 The Integration of the Knowledge Graphs
2.4 Works of the Combination of Knowledge Graph and Text Classification
2.4.1 Previous Works of the Combination
2.4.2 The K-BERT Model
3. RESEARCH METHOD
3.1 Overview
3.2 Knowledge Graph Layer
3.3 The Knowledge Layer
3.3.1 Embedding Layer
3.3.2 Seeing Layer
3.4 Mask-Transformer Encoder
4. EXPERIMENT
4.1 Datasets
4.1.1 Sentiment-Related
4.1.2 Topic-Related
4.1.3 Question-Related
4.2 Knowledge Graph
4.2.1 The Common-Sense Knowledge Graph (CSKG)
4.2.2 Each Knowledge Graphs Used in the CSKG
4.3 Baseline
4.4 Parameters
5. RESULTS
5.1 BERT-CSKG
5.2 BERT-AT
5.3 BERT-CN
5.4 BERT-FN
5.5 BERT-RG
5.6 BERT-VG
5.7 BERT-WD
5.8 BERT-WN
5.9 Inspiration of the Experiment
5.9.1 Important Findings of the Experiment
5.9.2 Answers to the Research Questions
6. CONCLUSION AND FUTURE PERSPECTIVES
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
APPENDIX E
APPENDIX F
APPENDIX G
APPENDIX H
APPENDIX I
APPENDIX J
APPENDIX K


Advisors

周惠文 (Huey-Wen Chou), 柯士文 (Shih-Wen Ke)

Date of approval

2023-6-30
