Electronic Theses and Dissertations: Detailed Record for Thesis 107423027




Name: Mei-Lin Wang (王美淋)    Department: Information Management
Thesis Title: A Study on Combining Extractive and Abstractive Models in a Two-Stage Architecture to Improve Summarization Performance
Related Theses
★ A Web-Based Collaborative Teaching Design Platform: The Case of the Grade 1-9 Junior High School Curriculum
★ Applying Content Management Mechanisms to Frequently Asked Questions (FAQ)
★ Applying Mobile Multi-Agent Technology to Course Scheduling Systems
★ A Study of Access Control Mechanisms and Domestic Information Security Regulations
★ An Investigation of Introducing NFC Mobile Transaction Mechanisms into Credit Card Systems
★ App-Based Recommendation Services in E-Commerce: The Case of Company P
★ Building a Service-Oriented System to Improve Production Processes: The Case of Company W's PMS System
★ Planning and Deployment of a TSM Platform for NFC Mobile Payment
★ Keyword Marketing for Semiconductor Distributors: The Case of Company G
★ A Study of Domestic Track-and-Field Competition Information Systems: The Case of the 2014 National Intercollegiate Track and Field Open
★ Evaluating the Deployment of a Pallet and Container Tracking Management System for Airline Ramp Operations: The Case of Company F
★ A Study of Information Security Management Maturity after Adopting an Information Security Management System: The Case of Company B
★ Applying Data Mining Techniques to Movie Recommendation: The Case of Online Video Platform F
★ Using BI Visualization Tools for Security Log Analysis: The Case of Company S
★ An Empirical Study of a Real-Time Analysis System for Privileged Account Login Behavior
★ Detecting and Handling Abnormal Mail System Usage Behavior: The Case of Company T
  1. The access permission for this electronic thesis is: consent to immediate open access.
  2. The open-access electronic full text is licensed only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese) Given the abundance of online resources today, people cannot process such enormous volumes of data in a short time, and the task of automatic text summarization has arisen in response. This study uses a two-stage model that combines an extractive model with an abstractive model, and employs additional inputs and pre-trained models so that the abstractive model produces fluent Chinese summaries, thereby improving the performance of automatic text summarization. Within this framework, we examine how well a word-frequency model and deep learning models perform as the extractive model, as well as how word-level versus sentence-level extractive output affects the abstractive model.
According to the experimental results, the two-stage model proposed in this study outperforms a plain Transformer, and also achieves better ROUGE-1 and ROUGE-2 scores than models proposed by other researchers. The best-performing configuration uses TF-IDF with word-level output in the first-stage extractive model, reaching ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.447, 0.268, and 0.407 respectively. Deep learning also performs well in the extractive stage: the best such result, a three-layer BiLSTM with an attention mechanism, reaches ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.4435, 0.2669, and 0.400, close to the TF-IDF configuration. These results show that the proposed system effectively improves Chinese abstractive summarization in terms of both fluency and information content, and the reported experimental data are provided for follow-up research.
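As background for the first stage described above, the following is a minimal Python sketch of what word-level TF-IDF extraction could look like: it scores the words of each already-segmented document and keeps the highest-scoring ones in their original order, the kind of condensed input that, per the abstract, is handed to the second-stage abstractive model. The function names, the smoothed IDF formula, the top-k cutoff, and the toy sentences are assumptions made for this sketch, not details taken from the thesis.

```python
import math
from collections import Counter

def tfidf_scores(segmented_docs):
    """segmented_docs: list of documents, each a list of already-segmented words.
    Returns one {word: tf-idf} dict per document (simple smoothed IDF; hypothetical variant)."""
    n_docs = len(segmented_docs)
    doc_freq = Counter()
    for words in segmented_docs:
        doc_freq.update(set(words))          # document frequency of each word
    all_scores = []
    for words in segmented_docs:
        tf = Counter(words)
        total = len(words)
        all_scores.append({
            w: (count / total) * math.log((1 + n_docs) / (1 + doc_freq[w]))
            for w, count in tf.items()
        })
    return all_scores

def word_level_extract(words, scores, top_k=30):
    """Keep the top_k highest-scoring words, preserving their original order,
    as a condensed word-level input for a second-stage abstractive model."""
    keep = set(sorted(scores, key=scores.get, reverse=True)[:top_k])
    return [w for w in words if w in keep]

# Toy usage with hypothetical, already-segmented sentences.
docs = [["央行", "今天", "宣布", "降息", "市場", "反應", "熱烈"],
        ["市場", "普遍", "預期", "央行", "維持", "利率", "不變"]]
scores = tfidf_scores(docs)
print(word_level_extract(docs[0], scores[0], top_k=4))
```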
Abstract (English) In an era of information expansion, the internet provides abundant resources, but it is difficult for people to process such huge volumes of data in a short time, so the automatic text summarization task has been proposed. This research applies a two-stage model to the abstractive summarization task, combining an extractive model with an abstractive model, and uses additional inputs and pre-trained models so that the abstractive model produces logical and semantically fluent Chinese summaries, improving the performance of automatic text summarization. We also compare a word-frequency model with deep learning models as the extractive stage, and examine the impact of word-level versus sentence-level extractive output on the abstractive model.
According to the experimental results, the two-stage model proposed in this research outperforms the Transformer baseline, and achieves better ROUGE-1 and ROUGE-2 scores than models proposed by other scholars.
The best model uses TF-IDF with word-level output in the first-stage extractive model; its ROUGE-1, ROUGE-2, and ROUGE-L scores are 0.447, 0.268, and 0.407.
Deep learning models also perform well in the extractive stage: the best such result, a three-layer BiLSTM with an attention mechanism, reaches ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.4435, 0.2669, and 0.400, not far from the TF-IDF configuration.
We therefore conclude that the proposed system effectively improves the performance of Chinese abstractive summarization, and the reported results provide a basis for follow-up research.
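The scores reported above are ROUGE metrics. As a rough illustration of what ROUGE-1 and ROUGE-2 measure, the sketch below computes recall-oriented n-gram overlap between a candidate summary and a single reference; the character-level tokenization, the example strings, and the recall-only formulation are assumptions for this sketch, and published figures such as those above are normally produced with an established ROUGE toolkit (Lin, 2004).

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams over a token list (characters or words)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of the reference's n-grams that also appear in the candidate."""
    ref = ngram_counts(reference, n)
    cand = ngram_counts(candidate, n)
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())

# Hypothetical reference and system summaries, split into characters,
# a common granularity for Chinese ROUGE evaluation.
reference = list("央行宣布降息,市場反應熱烈")
candidate = list("央行降息,市場熱烈")
print(round(rouge_n_recall(candidate, reference, n=1), 3))  # ROUGE-1 recall
print(round(rouge_n_recall(candidate, reference, n=2), 3))  # ROUGE-2 recall
```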
Keywords (Chinese) ★ Chinese abstractive summarization (中文萃取式摘要)
★ Natural language processing (自然語言處理)
★ Recurrent neural network (遞迴神經網路)
★ Word embedding (詞向量)
★ Transformer
Keywords (English) ★ Chinese Abstractive Summarization
★ NLP
★ Recurrent neural network
★ word embedding
★ Transformer
Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Thesis Organization
2. Literature Review
2-1 Automatic Text Summarization
2-2 Word Frequency Model (TF-IDF)
2-3 Word Embedding Language Models
2-3-1 FastText
2-4 Sequence-to-Sequence Models
2-4-1 Recurrent Neural Networks
2-4-2 Transformer
3. Research Method
3-1 Research Process
3-2 Data Preprocessing
3-3 Word Embedding Conversion
3-4 Model Construction
3-4-1 Extractive Model
3-4-2 Abstractive Model
3-5 Result Evaluation
4. Experimental Framework
4-1 Experimental Environment
4-2 Experimental Dataset
4-3 Experimental Design and Results
4-3-1 Experiment 1: Effect of Word-Level vs. Sentence-Level Extractive Output
4-3-2 Experiment 2: Effect of the Number of Layers in the Deep Learning Model
4-3-3 Experiment 3: Effect of Transformer Parameter Settings
4-3-4 Experiment 4: Ablation Test of the FastText Word Embedding Model
4-3-5 Experiment 5: Ablation Test of the Beam Search Module
4-4 Comparison of Experimental Results with Other Studies
4-5 Case Analysis
4-5-1 Examples with Higher ROUGE Scores
4-5-2 Examples with Lower ROUGE Scores
4-5-3 Examples of System Improvement
5. Conclusions and Future Research Directions
5-1 Conclusions
5-2 Research Limitations
5-3 Future Directions
References  Chinese-language sources:
蔡汶霖 (2018) 以詞向量模型增進基於遞歸神經網路之中文文字摘要系統效能 (Improving the Performance of an RNN-Based Chinese Text Summarization System with Word Embedding Models) [In Chinese].
麥嘉芳 (2019) 基於注意力機制之詞向量中文萃取式摘要研究 (A Study of Word-Embedding-Based Chinese Abstractive Summarization with Attention Mechanisms) [In Chinese].

English-language sources:
Bahdanau, D., Cho, K. & Bengio, Y. (2016) Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473 [cs, stat]. [Online] Available from: http://arxiv.org/abs/1409.0473 [Accessed: 26 March 2020].
Cao, Z., Li, W., Li, S., Wei, F., et al. (2016) AttSum: Joint Learning of Focusing and Summarization with Neural Attention. arXiv:1604.00125 [cs]. [Online] Available from: http://arxiv.org/abs/1604.00125 [Accessed: 25 January 2020].
Chen, Y.-C. & Bansal, M. (2018) Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting. arXiv:1805.11080 [cs]. [Online] Available from: http://arxiv.org/abs/1805.11080 [Accessed: 9 December 2019].
Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. (2014) Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv:1412.3555 [cs]. [Online] Available from: http://arxiv.org/abs/1412.3555 [Accessed: 20 February 2020].
Elman, J.L. (1990) Finding Structure in Time. Cognitive Science. [Online] 14 (2), 179–211. Available from: doi:10.1207/s15516709cog1402_1.
Hochreiter, S. & Schmidhuber, J. (1997) Long Short-Term Memory. Neural Computation. 9 (8), 1735–1780.
Hsieh, Y.-L., Liu, S.-H., Chen, K.-Y., Wang, H.-M., et al. (2016) 運用序列到序列生成架構於重寫式自動摘要(Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization)[In Chinese]. In: Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). [Online]. October 2016 Tainan, Taiwan, The Association for Computational Linguistics and Chinese Language Processing (ACLCLP). pp. 115–128. Available from: https://www.aclweb.org/anthology/O16-1012 [Accessed: 26 March 2020].
Hu, B., Chen, Q. & Zhu, F. (2016) LCSTS: A Large Scale Chinese Short Text Summarization Dataset. arXiv:1506.05865 [cs]. [Online] Available from: http://arxiv.org/abs/1506.05865 [Accessed: 26 March 2020].
Joulin, A., Grave, E., Bojanowski, P., Nickel, M., et al. (2017) Fast Linear Model for Knowledge Graph Embeddings. arXiv:1710.10881 [cs, stat]. [Online] Available from: http://arxiv.org/abs/1710.10881 [Accessed: 6 January 2020].
LeCun, Y., Bengio, Y. & Hinton, G. (2015) Deep learning. Nature. [Online] 521 (7553), 436–444. Available from: doi:10.1038/nature14539.
Lin, C.-Y. (2004) ROUGE: A Package for Automatic Evaluation of Summaries. In: Text Summarization Branches Out. [Online]. July 2004 Barcelona, Spain, Association for Computational Linguistics. pp. 74–81. Available from: https://www.aclweb.org/anthology/W04-1013 [Accessed: 5 March 2020].
Liu, P.J., Saleh, M., Pot, E., Goodrich, B., et al. (2018) Generating Wikipedia by Summarizing Long Sequences. arXiv:1801.10198 [cs]. [Online] Available from: http://arxiv.org/abs/1801.10198 [Accessed: 26 March 2020].
Liu, Y. (2019) Fine-tune BERT for Extractive Summarization. arXiv:1903.10318 [cs]. [Online] Available from: http://arxiv.org/abs/1903.10318 [Accessed: 26 March 2020].
Liu, Y. & Lapata, M. (2019) Text Summarization with Pretrained Encoders. arXiv:1908.08345 [cs]. [Online] Available from: http://arxiv.org/abs/1908.08345 [Accessed: 21 May 2020].
McDonald, D.M. & Chen, H. (2006) Summary in context: Searching versus browsing. ACM Transactions on Information Systems (TOIS). [Online] 24 (1), 111–141. Available from: doi:10.1145/1125857.1125861.
McKeown, K. & Radev, D.R. (1995) Generating summaries of multiple news articles. In: Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’95. [Online]. 1995 Seattle, Washington, United States, ACM Press. pp. 74–82. Available from: doi:10.1145/215206.215334 [Accessed: 22 May 2020].
Mikolov, T., Chen, K., Corrado, G. & Dean, J. (2013) Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs]. [Online] Available from: http://arxiv.org/abs/1301.3781 [Accessed: 6 January 2020].
Page, L., Brin, S., Motwani, R. & Winograd, T. (1999) The PageRank Citation Ranking: Bringing Order to the Web. [Online]. 11 November 1999. Available from: http://ilpubs.stanford.edu:8090/422/ [Accessed: 25 January 2020].
Radev, D.R., Hovy, E. & McKeown, K. (2002) Introduction to the Special Issue on Summarization. Computational Linguistics. [Online] 28 (4), 399–408. Available from: doi:10.1162/089120102762671927.
Ramos, J. (2003) Using TF-IDF to Determine Word Relevance in Document Queries. In: Proceedings of the First Instructional Conference on Machine Learning. 242, pp. 133–142.
Sanh, V., Wolf, T. & Ruder, S. (2019) A Hierarchical Multi-Task Approach for Learning Embeddings from Semantic Tasks. Proceedings of the AAAI Conference on Artificial Intelligence. [Online] 33 (01), 6949–6956. Available from: doi:10.1609/aaai.v33i01.33016949.
Schuster, M. & Paliwal, K.K. (1997) Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing. [Online] 45 (11), 2673–2681. Available from: doi:10.1109/78.650093.
Stribling, J., Aguayo, D. & Krohn, M. (2005) Rooter: A Methodology for the Typical Unification of Access Points and Redundancy. 4.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., et al. (2017) Attention is All you Need. In: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, et al. (eds.). Advances in Neural Information Processing Systems 30. [Online]. Curran Associates, Inc. pp. 5998–6008. Available from: http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf [Accessed: 25 January 2020].
Yan, Y., Qi, W., Gong, Y., Liu, D., et al. (2020) ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. arXiv:2001.04063 [cs]. [Online] Available from: http://arxiv.org/abs/2001.04063 [Accessed: 30 March 2020].

Online sources:
Olah, C. (2015) Understanding LSTM Networks -- colah’s blog. [Online]. 27 August 2015. Available from: https://colah.github.io/posts/2015-08-Understanding-LSTMs/ [Accessed: 27 March 2020].
Advisor: Shi-Jen Lin (林熙禎)    Date of Approval: 2020-07-20
