References
[1] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep
bidirectional transformers for language understanding,” in Proceedings of the 2019
Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
Minneapolis, Minnesota: Association for Computational Linguistics, Jun. 2019, pp.
4171–4186. [Online]. Available: https://www.aclweb.org/anthology/N19-1423
[2] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A robustly optimized BERT pretraining approach,”
arXiv preprint arXiv:1907.11692, 2019.
[3] G. Lample and A. Conneau, “Cross-lingual language model pretraining,” arXiv
preprint arXiv:1901.07291, 2019.
[4] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov,
and L. Zettlemoyer, “BART: Denoising sequence-to-sequence pre-training for
natural language generation, translation, and comprehension,” in Proceedings of the
58th Annual Meeting of the Association for Computational Linguistics. Online:
Association for Computational Linguistics, Jul. 2020, pp. 7871–7880. [Online].
Available: https://www.aclweb.org/anthology/2020.acl-main.703
[5] C.-Y. Lin, “ROUGE: A package for automatic evaluation of summaries,”
in Text Summarization Branches Out. Barcelona, Spain: Association for
Computational Linguistics, Jul. 2004, pp. 74–81. [Online]. Available:
https://www.aclweb.org/anthology/W04-1013
[6] K. M. Hermann, T. Kocisky, E. Grefenstette, L. Espeholt, W. Kay, M. Suleyman,
and P. Blunsom, “Teaching machines to read and comprehend,” Advances in neural
information processing systems, vol. 28, pp. 1693–1701, 2015.
[7] R. Nallapati, B. Zhou, C. dos Santos, Ç. Gulçehre, and B. Xiang, “Abstractive
text summarization using sequence-to-sequence RNNs and beyond,” CoNLL 2016, p.
280, 2016.
[8] S. Narayan, S. B. Cohen, and M. Lapata, “Don't give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization,” in
Proceedings of the 2018 Conference on Empirical Methods in Natural Language
Processing, 2018, pp. 1797–1807.
[9] B. Hu, Q. Chen, and F. Zhu, “LCSTS: A large scale Chinese short text
summarization dataset,” in Proceedings of the 2015 Conference on Empirical
Methods in Natural Language Processing. Lisbon, Portugal: Association
for Computational Linguistics, Sep. 2015, pp. 1967–1972. [Online]. Available:
https://www.aclweb.org/anthology/D15-1229
[10] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser,
and I. Polosukhin, “Attention is all you need,” in Advances in neural information
processing systems, 2017, pp. 5998–6008.
[11] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE transactions on neural networks, vol. 5, no. 2, pp.
157–166, 1994.
[12] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer,
“Deep contextualized word representations,” in Proceedings of the 2018 Conference
of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies, Volume 1 (Long Papers). New Orleans, Louisiana:
Association for Computational Linguistics, Jun. 2018, pp. 2227–2237. [Online].
Available: https://www.aclweb.org/anthology/N18-1202
[13] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for language understanding,” Advances
in neural information processing systems, vol. 32, 2019.
[14] R. Sennrich, B. Haddow, and A. Birch, “Neural machine translation of rare words
with subword units,” arXiv preprint arXiv:1508.07909, 2015.
[15] M. Schuster and K. Nakajima, “Japanese and Korean voice search,” in 2012 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP).
IEEE, 2012, pp. 5149–5152.
[16] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever et al., “Language
models are unsupervised multitask learners,” OpenAI blog, vol. 1, no. 8, p. 9, 2019.
[17] X. Li, Y. Meng, X. Sun, Q. Han, A. Yuan, and J. Li, “Is word segmentation
necessary for deep learning of Chinese representations?” in Proceedings of the 57th
Annual Meeting of the Association for Computational Linguistics. Florence, Italy:
Association for Computational Linguistics, Jul. 2019, pp. 3242–3252. [Online].
Available: https://www.aclweb.org/anthology/P19-1314
[18] N. Moratanch and S. Chitrakala, “A survey on extractive text summarization,” in
2017 International Conference on Computer, Communication and Signal Processing
(ICCCSP). IEEE, 2017, pp. 1–6.
[19] Q. Zhou, N. Yang, F. Wei, S. Huang, M. Zhou, and T. Zhao, “Neural
document summarization by jointly learning to score and select sentences,” in
Proceedings of the 56th Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers). Melbourne, Australia: Association
for Computational Linguistics, Jul. 2018, pp. 654–663. [Online]. Available:
https://www.aclweb.org/anthology/P18-1061
[20] Y. Liu and M. Lapata, “Text summarization with pretrained encoders,” arXiv preprint
arXiv:1908.08345, 2019.
[21] X. Zhang, F. Wei, and M. Zhou, “HIBERT: Document level pre-training
of hierarchical bidirectional transformers for document summarization,” in
Proceedings of the 57th Annual Meeting of the Association for Computational
Linguistics. Florence, Italy: Association for Computational Linguistics, Jul. 2019,
pp. 5059–5069. [Online]. Available: https://www.aclweb.org/anthology/P19-1499
[22] J. Gu, Z. Lu, H. Li, and V. O. Li, “Incorporating copying mechanism in
sequence-to-sequence learning,” in Proceedings of the 54th Annual Meeting of
the Association for Computational Linguistics (Volume 1: Long Papers). Berlin,
Germany: Association for Computational Linguistics, Aug. 2016, pp. 1631–1640.
[Online]. Available: https://www.aclweb.org/anthology/P16-1154
[23] A. See, P. J. Liu, and C. D. Manning, “Get to the point: Summarization with
pointer-generator networks,” in Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers). Vancouver,
Canada: Association for Computational Linguistics, Jul. 2017, pp. 1073–1083.
[Online]. Available: https://www.aclweb.org/anthology/P17-1099
[24] J. Zhang, Y. Zhao, M. Saleh, and P. Liu, “PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization,” in International Conference on Machine
Learning. PMLR, 2020, pp. 11328–11339.
[25] W. Qi, Y. Yan, Y. Gong, D. Liu, N. Duan, J. Chen, R. Zhang, and M. Zhou,
“ProphetNet: Predicting future n-gram for sequence-to-sequence pre-training,” in
Findings of the Association for Computational Linguistics: EMNLP 2020. Online:
Association for Computational Linguistics, Nov. 2020, pp. 2401–2410. [Online].
Available: https://www.aclweb.org/anthology/2020.findings-emnlp.217
[26] J. Maynez, S. Narayan, B. Bohnet, and R. McDonald, “On faithfulness and
factuality in abstractive summarization,” in Proceedings of the 58th Annual
Meeting of the Association for Computational Linguistics. Online: Association
for Computational Linguistics, Jul. 2020, pp. 1906–1919. [Online]. Available:
https://www.aclweb.org/anthology/2020.acl-main.173
[27] E. Durmus, H. He, and M. Diab, “FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization,” in Proceedings of
the 58th Annual Meeting of the Association for Computational Linguistics. Online:
Association for Computational Linguistics, Jul. 2020, pp. 5055–5070. [Online].
Available: https://www.aclweb.org/anthology/2020.acl-main.454
[28] A. Wang, K. Cho, and M. Lewis, “Asking and answering questions to evaluate
the factual consistency of summaries,” in Proceedings of the 58th Annual
Meeting of the Association for Computational Linguistics. Online: Association
for Computational Linguistics, Jul. 2020, pp. 5008–5020. [Online]. Available:
https://www.aclweb.org/anthology/2020.acl-main.450
[29] F. Nan, C. Nogueira dos Santos, H. Zhu, P. Ng, K. McKeown, R. Nallapati,
D. Zhang, Z. Wang, A. O. Arnold, and B. Xiang, “Improving factual consistency
of abstractive summarization via question answering,” in Proceedings of the 59th
Annual Meeting of the Association for Computational Linguistics and the 11th
International Joint Conference on Natural Language Processing (Volume 1: Long
Papers). Online: Association for Computational Linguistics, Aug. 2021, pp.
6881–6894. [Online]. Available: https://aclanthology.org/2021.acl-long.536
[30] R. Pasunuru and M. Bansal, “Multi-reward reinforced summarization with saliency
and entailment,” in Proceedings of the 2018 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language
Technologies, Volume 2 (Short Papers). New Orleans, Louisiana: Association
for Computational Linguistics, Jun. 2018, pp. 646–653. [Online]. Available:
https://www.aclweb.org/anthology/N18-2102
[31] S. Chen, F. Zhang, K. Sone, and D. Roth, “Improving faithfulness in abstractive
summarization with contrast candidate generation and selection,” in Proceedings
of the 2021 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies. Online: Association
for Computational Linguistics, Jun. 2021, pp. 5935–5941. [Online]. Available:
https://aclanthology.org/2021.naacl-main.475
[32] F. Nan, R. Nallapati, Z. Wang, C. Nogueira dos Santos, H. Zhu, D. Zhang,
K. McKeown, and B. Xiang, “Entity-level factual consistency of abstractive text
summarization,” in Proceedings of the 16th Conference of the European Chapter of
the Association for Computational Linguistics: Main Volume. Online: Association
for Computational Linguistics, Apr. 2021, pp. 2727–2733. [Online]. Available:
https://aclanthology.org/2021.eacl-main.235
[33] M. Ott, S. Edunov, A. Baevski, A. Fan, S. Gross, N. Ng, D. Grangier, and M. Auli,
“fairseq: A fast, extensible toolkit for sequence modeling,” in Proceedings of the
2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), 2019, pp. 48–53.
[34] P.-H. Li, T.-J. Fu, and W.-Y. Ma, “Why attention? Analyze BiLSTM deficiency and
its remedies in the case of NER,” Proceedings of the AAAI Conference on Artificial
Intelligence, vol. 34, no. 05, pp. 8236–8244, Apr. 2020. [Online]. Available:
https://ojs.aaai.org/index.php/AAAI/article/view/6338