Master's/Doctoral Thesis 108423033: Detailed Record




Author: Tzu-Hsuan Tung (董子瑄)    Department: Information Management
Thesis Title: Combining the Selective Mechanism with Multi-Head Attention for Automatic Text Summarization
Related Theses
★ A Web-Based Cooperative Instructional Design Platform: The Case of the Junior High Grade 1-9 Curriculum
★ Applying Content Management Mechanisms to Frequently Asked Questions (FAQ)
★ Applying Mobile Multi-Agent Technology to Course Scheduling Systems
★ A Study of Access Control Mechanisms and Domestic Information Security Regulations
★ Introducing NFC Mobile Transaction Mechanisms into Credit Card Systems
★ App-Based Recommendation Services in E-Commerce: The Case of Company P
★ Building a Service-Oriented System to Improve Production Processes: The Case of Company W's PMS System
★ Planning and Deploying a TSM Platform for NFC Mobile Payment
★ Keyword Marketing at a Semiconductor Distributor: The Case of Company G
★ A Study of Domestic Track-and-Field Competition Information Systems: The Case of the 2014 National Intercollegiate Track and Field Open
★ Evaluating the Deployment of a Ramp Operations Container Tracking System for Airline Ground Handling: The Case of Company F
★ A Study of Information Security Management Maturity After ISMS Adoption: The Case of Company B
★ Applying Data Mining Techniques to Movie Recommendation: The Case of Online Video Platform F
★ Using BI Visualization Tools for Security Log Analysis: The Case of Company S
★ An Empirical Study of a Real-Time Analysis System for Privileged Account Login Behavior
★ Detecting and Handling Anomalous Email System Usage: The Case of Company T
Access Rights
  1. This electronic thesis is licensed for immediate open access.
  2. The open-access full text is licensed only for personal, non-commercial searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese) The goal of the text summarization task is to re-present a source text in condensed form while preserving its key points and original meaning. This study combines a selective mechanism with the multi-head attention of the Transformer to improve the quality of summaries generated by an abstractive summarization model. A trainable selective gate network filters the multi-head attention outputs of the Transformer encoder to produce a refined second-level semantic representation: the filtering removes secondary information and extracts the key information that should be kept in the summary, and the decoder then generates a better summary from this second-level representation.
This study applies the model to Chinese summary generation, with ROUGE as the evaluation metric. Experimental results show that the model surpasses the baseline on ROUGE-1, ROUGE-2, and ROUGE-L, improving word-based ROUGE by about 7.3-12.7% and character-based ROUGE by about 4.9-7.9%. Moreover, combining word-to-character tokenization with an enlarged encoder vocabulary raises all ROUGE metrics substantially: word-based ROUGE gains a further 20.4-41.8% and character-based ROUGE a further 21.5-31.1%.
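The ROUGE figures above come in two variants: word-based (tokens come from a word segmenter) and character-based (every Chinese character is a token). The following minimal sketch illustrates the difference using plain ROUGE-1 recall and a hand-written segmentation; it is an illustration only, not the thesis's actual tokenizer or scoring tool.

from collections import Counter

def rouge1_recall(cand_tokens, ref_tokens):
    # ROUGE-1 recall: fraction of reference unigrams covered by the candidate.
    cand, ref = Counter(cand_tokens), Counter(ref_tokens)
    overlap = sum(min(cand[t], ref[t]) for t in ref)
    return overlap / max(len(ref_tokens), 1)

candidate, reference = "模型生成摘要", "模型產生摘要"

# Character-based ROUGE: split into individual characters.
char_score = rouge1_recall(list(candidate), list(reference))   # 5/6

# Word-based ROUGE: tokens from a word segmenter (segmentation written
# by hand here; the thesis's segmenter may differ).
word_score = rouge1_recall(["模型", "生成", "摘要"], ["模型", "產生", "摘要"])  # 2/3

print(round(char_score, 3), round(word_score, 3))  # 0.833 0.667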
Abstract (English) The text summarization task aims to re-present the original article in condensed text while retaining its key points and original semantics. This research combines a selective mechanism with multi-head attention to improve the quality of summaries generated by an abstractive summarization model. A trainable selective gate network filters the multi-head attention outputs in the Transformer encoder, selecting important information and discarding unimportant information to construct a second-level representation. This second-level representation is a tailored sentence representation that can be decoded into a better summary.
The model is applied to the Chinese text summarization task, with the ROUGE score as the evaluation metric. Experimental results show that the model exceeds the baseline by 7.3 to 12.7% on word-based ROUGE and by 4.9 to 7.9% on character-based ROUGE. Moreover, word-to-character tokenization combined with a larger vocabulary significantly improves performance: word-based ROUGE increases by a further 20.4 to 41.8%, and character-based ROUGE by a further 21.5 to 31.1%.
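The second-level representation described above lends itself to a compact illustration. The PyTorch sketch below is a hypothetical reading of the thesis's selective gate network, patterned on the selective encoding of Zhou et al. (2017) from the reference list: a sigmoid gate, conditioned on each token state and a pooled sentence vector, rescales the encoder's multi-head attention outputs. The mean pooling, layer names, and dimensions are illustrative assumptions, not the thesis's exact implementation.

import torch
import torch.nn as nn

class SelectiveGate(nn.Module):
    """Sigmoid gate that filters encoder outputs into a second-level
    representation (illustrative sketch, not the thesis's code)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.token_proj = nn.Linear(d_model, d_model, bias=False)
        self.sent_proj = nn.Linear(d_model, d_model)

    def forward(self, enc_out: torch.Tensor) -> torch.Tensor:
        # enc_out: (batch, seq_len, d_model), the multi-head attention
        # output of the Transformer encoder.
        sent = enc_out.mean(dim=1, keepdim=True)  # pooled sentence vector (assumed)
        gate = torch.sigmoid(self.token_proj(enc_out) + self.sent_proj(sent))
        return enc_out * gate  # second-level representation fed to the decoder

# Usage: same shape in, same shape out; the gate learns what to keep.
enc_out = torch.randn(2, 10, 512)
second_level = SelectiveGate(d_model=512)(enc_out)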
Keywords (Chinese) ★ Transformer
★ Selective Mechanism
★ Self-attention mechanism
★ Abstractive summarization
★ Chinese text summarization
Keywords (English) ★ Transformer
★ Selective mechanism
★ Self-attention
★ Abstractive summarization
★ Chinese summarization
Thesis Outline
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Thesis Organization
2. Literature Review
2-1 Automatic Text Summarization
2-2 Encoder-Decoder Architecture
2-3 RNN
2-3-1 RNN + Attention Mechanism
2-3-2 RNN + Selective Mechanism
2-4 Transformer
2-5 BERT
3. Research Method
3-1 Research Process
3-2 Data Preprocessing
3-3 Summarization Model Architecture
3-3-1 Pre-trained Word Embeddings
3-3-2 Selective Gate Multi-Head Attention
3-4 Evaluation
4. Experiments
4-1 Experimental Setup
4-2 Datasets
4-3 Experimental Design and Results
4-3-1 Experiment 1: Transformer vs. Transformer + Selective Mechanism
4-3-2 Experiment 2: The Selective Mechanism in Different Architectures
4-3-3 Experiment 3: Effects of Tokenization Mode and Vocabulary Size
4-3-4 Experiment 4: Effects of Vocabulary Size on Training Time and Evaluation Metrics
4-4 Comparison with Results from Other Researchers
5. Conclusions and Future Directions
5-1 Conclusions
5-2 Research Limitations
5-3 Future Directions
References
References
Bahdanau, D., Cho, K., & Bengio, Y. (2016). Neural Machine Translation by Jointly Learning to Align and Translate. ArXiv:1409.0473 [Cs]. http://arxiv.org/abs/1409.0473
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. ArXiv:1607.04606 [Cs]. http://arxiv.org/abs/1607.04606
Chang, C.-T., Huang, C.-C., Yang, C.-Y., & Hsu, J. Y.-J. (2018). A Hybrid Word-Character Approach to Abstractive Summarization. ArXiv:1802.09968 [Cs]. http://arxiv.org/abs/1802.09968
Chen, Q., Zhu, X., Ling, Z., Wei, S., & Jiang, H. (2016). Distraction-Based Neural Networks for Document Summarization. ArXiv:1610.08462 [Cs]. http://arxiv.org/abs/1610.08462
Chen, X., Xu, L., Liu, Z., Sun, M., & Luan, H. (2015). Joint learning of character and word embeddings. Proceedings of the 24th International Conference on Artificial Intelligence, 1236–1242.
Christian, H., Agus, M. P., & Suhartono, D. (2016). Single Document Automatic Text Summarization using Term Frequency-Inverse Document Frequency (TF-IDF). ComTech: Computer, Mathematics and Engineering Applications, 7(4), 285–294. https://doi.org/10.21512/comtech.v7i4.3746
Chuang, W. T., & Yang, J. (2000). Extracting sentence segments for text summarization: A machine learning approach. Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 152–159. https://doi.org/10.1145/345508.345566
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv:1810.04805 [Cs]. http://arxiv.org/abs/1810.04805
Duan, X., Yu, H., Yin, M., Zhang, M., Luo, W., & Zhang, Y. (2019). Contrastive Attention Mechanism for Abstractive Sentence Summarization. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3044–3053. https://doi.org/10.18653/v1/D19-1301
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211. https://doi.org/10.1016/0364-0213(90)90002-E
Gu, J., Lu, Z., Li, H., & Li, V. O. K. (2016). Incorporating Copying Mechanism in Sequence-to-Sequence Learning. ArXiv:1603.06393 [Cs]. http://arxiv.org/abs/1603.06393
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-term Memory. Neural Computation, 9, 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Hu, B., Chen, Q., & Zhu, F. (2016). LCSTS: A Large Scale Chinese Short Text Summarization Dataset. ArXiv:1506.05865 [Cs]. http://arxiv.org/abs/1506.05865
Kaibi, I., Nfaoui, E. H., & Satori, H. (2019). A Comparative Evaluation of Word Embeddings Techniques for Twitter Sentiment Analysis. 2019 International Conference on Wireless Technologies, Embedded and Intelligent Systems (WITS), 1–4. https://doi.org/10.1109/WITS.2019.8723864
Kedzie, C., McKeown, K., & Daume III, H. (2019). Content Selection in Deep Learning Models of Summarization. ArXiv:1810.12343 [Cs]. http://arxiv.org/abs/1810.12343
Kilimci, Z. H., & Akyokuş, S. (2019). The Evaluation of Word Embedding Models and Deep Learning Algorithms for Turkish Text Classification. 2019 4th International Conference on Computer Science and Engineering (UBMK), 548–553. https://doi.org/10.1109/UBMK.2019.8907027
Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. ArXiv:1408.5882 [Cs]. http://arxiv.org/abs/1408.5882
Klein, G., Kim, Y., Deng, Y., Senellart, J., & Rush, A. M. (2017). OpenNMT: Open-Source Toolkit for Neural Machine Translation. ArXiv:1701.02810 [Cs]. http://arxiv.org/abs/1701.02810
Lin, C.-Y. (2004). ROUGE: A Package for Automatic Evaluation of Summaries. Text Summarization Branches Out, 74–81. https://www.aclweb.org/anthology/W04-1013
Lin, J., Sun, X., Ma, S., & Su, Q. (2018). Global Encoding for Abstractive Summarization. ArXiv:1805.03989 [Cs]. http://arxiv.org/abs/1805.03989
Liu, Y. (2019). Fine-tune BERT for Extractive Summarization. ArXiv:1903.10318 [Cs]. http://arxiv.org/abs/1903.10318
Liu, Y., & Lapata, M. (2019). Text Summarization with Pretrained Encoders. ArXiv:1908.08345 [Cs]. http://arxiv.org/abs/1908.08345
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. ArXiv:1508.04025 [Cs]. http://arxiv.org/abs/1508.04025
Ma, S., Sun, X., Lin, J., & Wang, H. (2018). Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 725–731. https://doi.org/10.18653/v1/P18-2115
Mihalcea, R., & Tarau, P. (2004). TextRank: Bringing Order into Text. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 404–411. https://www.aclweb.org/anthology/W04-3252
Nallapati, R., Zhou, B., dos Santos, C. N., Gulcehre, C., & Xiang, B. (2016). Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond. ArXiv:1602.06023 [Cs]. http://arxiv.org/abs/1602.06023
Nenkova, A., & Vanderwende, L. (2005). The impact of frequency on summarization (Tech. Rep. MSR-TR-2005-101). Microsoft Research.
Rush, A. M., Chopra, S., & Weston, J. (2015). A Neural Attention Model for Abstractive Sentence Summarization. ArXiv:1509.00685 [Cs]. http://arxiv.org/abs/1509.00685
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. ArXiv:1409.3215 [Cs]. http://arxiv.org/abs/1409.3215
Tas, O., & Kiyani, F. (2017). A Survey Automatic Text Summarization. PressAcademia Procedia, 5(1), 205–213. https://doi.org/10.17261/Pressacademia.2017.591
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. ArXiv:1706.03762 [Cs]. http://arxiv.org/abs/1706.03762
Wang, L., Yao, J., Tao, Y., Zhong, L., Liu, W., & Du, Q. (2018). A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 4453–4460. https://doi.org/10.24963/ijcai.2018/619
Wei, B., Ren, X., Sun, X., Zhang, Y., Cai, X., & Su, Q. (2018). Regularizing Output Distribution of Abstractive Chinese Social Media Text Summarization for Improved Semantic Consistency. ArXiv:1805.04033 [Cs]. http://arxiv.org/abs/1805.04033
Zhou, Q., Yang, N., Wei, F., & Zhou, M. (2017). Selective Encoding for Abstractive Sentence Summarization. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1095–1104. https://doi.org/10.18653/v1/P17-1101
張昇暉. (2017). A study on summary extraction from Chinese document streams (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
楊佩臻. (2013). A study on automatic document summarization using sentence relation networks (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
王美淋. (2020). Combining extractive and abstractive models in a two-stage approach to improve summarization performance (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
王蓮淨. (2015). Summary extraction based on topic event tracking (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
蔡汶霖. (2018). Improving the performance of an RNN-based Chinese text summarization system with word embedding models (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
陳俞琇. (2019). A summarization model with both extractive and abstractive capabilities (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
麥嘉芳. (2019). A study of attention-based Chinese abstractive summarization with word embeddings (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
黃嘉偉. (2014). Extracting multi-document summaries with a sentence-network clustering architecture (Master's thesis). Institute of Information Management, National Central University, Taoyuan City.
Advisor: Shi-Jen Lin (林熙禎)    Date of Approval: 2021-08-02
