Master's/Doctoral Thesis Record 111423048: Detailed Information




Name: Hsiu-Chun Tsai (蔡琇鈞)   Graduate Department: Information Management
Thesis Title: A Novel Reinforced Contrastive Learning on Sentence Semantic Matching
Related Theses
★ Trend Analysis of the Taiwan 50 Index: Prediction Based on a Multiple LSTM Model Architecture
★ Gold Price Prediction Analysis Based on Multiple Recurrent Neural Network Models
★ Incremental Learning for Defect Detection in Industry 4.0
★ A Study on Recurrent Neural Networks for Computer Component Sales Price Prediction
★ A Study on LSTM Neural Networks for Phishing Website Prediction
★ A Study on Frequency-Hopping Signal Recognition Based on Deep Learning
★ Opinion Leader Discovery in Dynamic Social Networks
★ Deep Learning Models for Virtual Metrology of Machine Tools in Industry 4.0
★ A Novel NMF-Based Movie Recommendation with Time Decay
★ Category-Based Sequence-to-Sequence Model for POI Travel Itinerary Recommendation
★ A DQN-Based Reinforcement Learning Model for Neural Network Architecture Search
★ Neural Network Architecture Optimization Based on Virtual Reward Reinforcement Learning
★ Generative Adversarial Network Architecture Search
★ Neural Architecture Search Optimization via a Progressive Genetic Algorithm
★ Enhanced Model Agnostic Meta Learning with Meta Gradient Memory
★ Stock Price Prediction Using Recurrent Neural Networks Combined with Leading Industrial Wastewater Indicators
Files: Full text available for online browsing in the system after 2029-07-01 (embargoed)
Abstract (Chinese) Sentence semantic matching is one of the core tasks in natural language processing. It is widely applied to compare the semantics of multiple sentences and score their similarity for filtering or ranking, and is commonly used in search engines and question-answering systems to find the most suitable response. Previous studies usually consider different text-feature extraction methods but overlook that sentences with different semantics provide different interaction knowledge. For sentence semantic matching, similar sentences can still differ along multiple facets; traditional methods only indicate their relatedness and cannot single out the more appropriate candidates, which limits system performance. To address this challenge, we developed a novel Reinforced Contrastive Learning (RCL) model to generate semantic features, combining a cross-attention mechanism with contrastive learning to help judge adjacent features. We also applied RCL to real-world datasets and verified that it outperforms the baseline models.
Abstract (English) Sentence Semantic Matching (SSM) is a crucial component of natural language processing (NLP) tasks. It involves comparing the semantics of multiple sentences and ranking their similarities to identify the most similar one. Recently, contrastive learning has proven beneficial for generating complex semantic features and improving performance. Early research usually considers different data-construction strategies but ignores that sentences with different semantics contribute different interaction knowledge to the sentence anchor, which may fail to capture a comprehensive view of the semantic features and leads to limited performance. We developed a new Reinforced Contrastive Learning (RCL) model to generate contextual features, combining a cross-attention mechanism with contrastive learning to refine adjacent features. Applied to numerous real-world datasets, RCL demonstrated state-of-the-art experimental results on the SSM task.
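To make the abstract's architecture concrete, the following is a minimal, hypothetical sketch of how a cross-attention module can feed an in-batch contrastive (InfoNCE-style) objective for sentence-pair matching. It assumes PyTorch; every module, dimension, and hyperparameter here is illustrative and does not reproduce the thesis's actual RCL implementation.

# Illustrative sketch only (not the thesis's code): cross-attention over
# token embeddings of a sentence pair, followed by an in-batch InfoNCE
# contrastive loss. All names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossAttentionMatcher(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        # Cross-attention: tokens of sentence A attend to tokens of sentence B.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)  # projection head for the contrastive space

    def forward(self, a_tokens, b_tokens):
        # a_tokens, b_tokens: (batch, seq_len, dim) contextual token embeddings,
        # e.g. produced by a pretrained encoder such as BERT (not included here).
        attended, _ = self.cross_attn(a_tokens, b_tokens, b_tokens)
        pooled = attended.mean(dim=1)              # mean-pool to a sentence vector
        return F.normalize(self.proj(pooled), dim=-1)

def info_nce(anchor, positive, temperature=0.05):
    # In-batch InfoNCE: each anchor's positive is the matching row of the
    # similarity matrix; every other row in the batch acts as a negative.
    logits = anchor @ positive.t() / temperature   # (batch, batch) similarities
    labels = torch.arange(anchor.size(0))
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    torch.manual_seed(0)
    model = CrossAttentionMatcher()
    a = torch.randn(8, 16, 256)   # 8 sentence pairs, 16 tokens each, dim 256
    b = torch.randn(8, 16, 256)
    za = model(a, b)              # sentence A attends to sentence B
    zb = model(b, a)              # and vice versa
    print(f"contrastive loss: {info_nce(za, zb).item():.4f}")

In this sketch the two attention directions share weights and the matched pair within each batch row serves as the positive; a supervised variant would instead group positives by label, as in supervised contrastive learning.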
Keywords (Chinese) ★ 自然語言處理
★ 句子語義配對
★ 對比學習
Keywords (English) ★ Natural Language Processing
★ Sentence Semantic Matching
★ Contrastive Learning
Table of Contents  Abstract (Chinese) i
Abstract ii
Table of Contents iii
List of Figures iv
List of Tables v
I. Introduction 1
II. Related Work 6
2.1 Sentence Semantic Matching 6
2.2 Contrastive Representation Learning 8
III. Proposed Model: RCL 12
3.1 RCL Encoder 12
3.2 RCL Classifier 15
IV. Experiments 18
4.1 Environment Settings & Datasets 18
4.2 Evaluation Metrics & Baselines 19
4.3 Performance Comparison 22
4.4 Effectiveness of Cross-Attention Module on CL 25
4.5 Pretrained Language Model on CL 26
4.6 Ablation Study 27
4.7 The Discussion on Delta Threshold Setting 29
4.8 The Sensitivity of Parameter Setting 30
4.9 Case Study 35
V. Conclusions 38
References 39
Advisor: Yi-Cheng Chen (陳以錚)   Date of Approval: 2024-07-22
