Master's/Doctoral Thesis 109525007: Detailed Record




Name: Ching-Tai Chang (張景泰)    Department: Graduate Institute of Software Engineering (軟體工程研究所)
Thesis Title: 利用與摘要相關的文章重點句結合對比學習改進摘要模型的事實一致性
(Combining Key Sentences Related to the Abstract with Contrastive Learning to Improve Summarization Factual Inconsistency)
Related Theses
★ A Real-time Embedding Increasing for Session-based Recommendation with Graph Neural Networks
★ 基於主診斷的訓練目標修改用於出院病摘之十代國際疾病分類任務
★ 混合式心臟疾病危險因子與其病程辨識於電子病歷之研究
★ 基於 PowerDesigner 規範需求分析產出之快速導入方法
★ 社群論壇之問題檢索
★ 非監督式歷史文本事件類型識別──以《明實錄》中之衛所事件為例
★ 應用自然語言處理技術分析文學小說角色之關係:以互動視覺化呈現
★ 基於生醫文本擷取功能性層級之生物學表徵語言敘述:由主成分分析發想之K近鄰算法
★ 基於分類系統建立文章表示向量應用於跨語言線上百科連結
★ Code-Mixing Language Model for Sentiment Analysis in Code-Mixing Data
★ 應用角色感知於深度神經網路架構之對話行為分類
★ 藉由加入多重語音辨識結果來改善對話狀態追蹤
★ 主動式學習之古漢語斷詞
★ 對話系統應用於中文線上客服助理:以電信領域為例
★ 應用遞歸神經網路於適當的時機回答問題
★ 使用多任務學習改善使用者意圖分類
Files: Full text viewable in the thesis system after 2026-1-18.
Abstract (Chinese) Factual inconsistency in a summary means that the information in the summary cannot be verified against the source document. It is a thorny problem in abstractive summarization: research shows that roughly 30% of model-generated summaries contain factual inconsistencies, which makes abstractive summarization hard to apply in practice. In recent years, researchers have begun to pay serious attention to this problem.

Previous approaches tend either to supply additional background knowledge and incorporate it into the model, or to check and correct the generated output after the model decodes.

Contrastive learning is a model-training method that has emerged in recent years and has achieved excellent results in the image domain. Its idea is to exploit the contrast between positive and negative samples so that the vectors the model learns cluster by similarity: vectors produced from positive samples end up closer to one another, while vectors produced from negative samples end up farther apart. In this way, the model gains, to some extent, the ability to distinguish different kinds of input.

In our study, we first identify, for each sentence of the summary, the most relevant sentence in the source document. We then apply contrastive learning to the encoder so that the encoded vectors capture the parts of the source document most relevant to the summary, which in turn helps the decoder produce summaries that are more factually consistent.
Abstract (English) Hallucination, also known as factual inconsistency, occurs when a model generates a summary that contains incorrect information or information not mentioned in the source text.

It is a critical problem in abstractive summarization and makes summaries generated by models hard to use in practice.
Previous works tend to add extra information, such as background knowledge, into the model, or to apply post-hoc correction and re-ranking methods after decoding to mitigate this problem.

Contrastive learning is a relatively new model-training method that has achieved excellent results in the field of image processing. The idea is to use the contrast between positive and negative samples so that the vectors learned by the model cluster together: given an anchor point, its distance to the positive samples becomes smaller, while its distance to the negative samples becomes larger. In this way, the model gains the ability to distinguish positive examples from negative examples to a certain extent.
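To make the contrastive objective concrete, the following is a minimal sketch of an InfoNCE-style loss over pooled encoder vectors, written in PyTorch. The function name, tensor shapes, and temperature value are illustrative assumptions, not code from the thesis.

import torch
import torch.nn.functional as F

def contrastive_loss(anchor: torch.Tensor,      # (d,)   pooled anchor vector
                     positives: torch.Tensor,   # (P, d) vectors to pull toward the anchor
                     negatives: torch.Tensor,   # (N, d) vectors to push away from the anchor
                     temperature: float = 0.1) -> torch.Tensor:
    # Cosine similarity via L2 normalization followed by dot products.
    anchor = F.normalize(anchor, dim=-1)
    positives = F.normalize(positives, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = positives @ anchor / temperature   # (P,)
    neg_sim = negatives @ anchor / temperature   # (N,)
    # For each positive, build logits [positive, all negatives]; target class 0
    # says the positive should score highest against the anchor.
    logits = torch.cat(
        [pos_sim.unsqueeze(1), neg_sim.unsqueeze(0).expand(pos_sim.size(0), -1)], dim=1)
    labels = torch.zeros(pos_sim.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Toy usage with random 16-dimensional vectors, 3 positives, 5 negatives:
loss = contrastive_loss(torch.randn(16), torch.randn(3, 16), torch.randn(5, 16))
print(loss.item())

Minimizing this loss pulls the anchor toward the positive representations and away from the negative ones, which is exactly the clustering behavior described above.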

We propose a new method to improve factual consistency: during training, contrastive learning separates the representations of the most relevant sentences from those of the least relevant sentences in the source document, so that the model learns to generate summaries that stay closer to the main points of the source documents.
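As a rough illustration of the sentence-selection step, the sketch below pairs each summary sentence with the most lexically similar source sentence using an overlap F1 over distinct unigrams; the relevance measure actually used in the thesis may differ, so the scoring function here is only an assumed stand-in.

def unigram_f1(candidate: str, reference: str) -> float:
    # F1 over the sets of distinct lowercased unigrams in the two sentences.
    cand, ref = set(candidate.lower().split()), set(reference.lower().split())
    if not cand or not ref:
        return 0.0
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def most_relevant_sentences(source_sentences, summary_sentences):
    # For each summary sentence, pick the source sentence with the highest score.
    return [max(source_sentences, key=lambda src: unigram_f1(src, summ))
            for summ in summary_sentences]

# Toy usage:
source = ["The cat sat on the red mat.", "It rained all day in Paris.", "The mat was old."]
print(most_relevant_sentences(source, ["A cat sat on a red mat."]))

The least relevant sentences, which serve as negative samples for the contrastive objective, can be selected the same way by taking the lowest-scoring source sentences instead.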
Keywords (Chinese) ★ Abstractive Summarization (抽象式摘要)
★ Pre-trained Model (預訓練模型)
★ Contrastive Learning (對比學習)
★ Factual Consistency (事實一致性)
Keywords (English) ★ Abstractive Summarization
★ Pre-trained Model
★ Factual Inconsistency
★ Hallucination
★ Contrastive Learning
Table of Contents
Abstract (Chinese) i
Abstract ii
Acknowledgements iv
Contents v
List of Figures vii
List of Tables viii
1 Introduction 1
2 Related work 4
2.1 Pre-trained language model . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 BART . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Automatic text summarization . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Extractive summarization . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 Abstractive summarization . . . . . . . . . . . . . . . . . . . . . 8
2.3 Factuality improvement . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4 Factuality evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Method 11
3.1 Abstractive text summarization . . . . . . . . . . . . . . . . . . . . . . . 11
3.2 Sentence extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.1 Relevant sentences extraction . . . . . . . . . . . . . . . . . . . 14
3.2.2 Less relevant sentences extraction . . . . . . . . . . . . . . . . . 15
3.3 Contrastive learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4 Final training objective . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4 Experiments 18
4.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.1.1 CNN Dailymail . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.1.2 Xsum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2 Models to compare . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Evaluation metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3.1 ROUGE-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.2 ROUGE-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.3 ROUGE-L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.4 QuestEval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.5 FactCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Implementation details . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5 Results and analysis 22
5.1 CNN Dailymail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2 Xsum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.3 Study on contrastive encoder . . . . . . . . . . . . . . . . . . . . . . . . 24
5.4 Case study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.4.1 Embedding visualization . . . . . . . . . . . . . . . . . . . . . . 25
5.5 Hyperparameter combination in final loss . . . . . . . . . . . . . . . . . 27
6 Conclusion 28
Bibliography 28
List of Figures
1.1 Example of factual inconsistency . . . . . . . . . . . . . . . . . . . . . . 2
2.1 Architecture of the Transformer . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Five data corruption methods in BART . . . . . . . . . . . . . . . . . . . . 7
3.1 Example of sentence extraction on CNN Dailymail dataset . . . . . . . . 13
3.2 Example of sentence extraction on XSum dataset . . . . . . . . . . . . . 14
3.3 Our model architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5.1 Case study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.2 Case visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
List of Tables
5.1 The CNN Dailymail result scores . . . . . . . . . . . . . . . . . . . . . . 23
5.2 The Xsum result scores . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.3 Encoder study result . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.4 The hyperparameter combination result . . . . . . . . . . . . . . . . . . 27
Advisor: Tzong-Han Tsai (蔡宗翰)    Review Date: 2023-2-2