Full metadata record for thesis 106423014

DC Field | Value | Language
DC.contributor | Department of Information Management | zh_TW
DC.creator | 麥嘉芳 | zh_TW
DC.creator | Chia-Fang Mai | en_US
dc.date.accessioned | 2019-7-19T07:39:07Z
dc.date.available | 2019-7-19T07:39:07Z
dc.date.issued | 2019
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=106423014
dc.contributor.department | Department of Information Management | zh_TW
DC.description | National Central University | zh_TW
DC.description | National Central University | en_US
dc.description.abstract | With the development of technology, abundant resources have become available, but without proper management this leads to information explosion, which makes automatic text summarization an important research topic today. Given recent breakthroughs in hardware limitations and computing resources, deep learning has been revisited by researchers and is now widely applied in natural language processing. This study therefore adopts the attention-based Transformer neural network as the main system architecture to generate logical and fluent abstractive summaries, and investigates how pre-trained word embedding models can effectively improve summary quality, thereby substantially reducing time and labor costs. The largest and most widely used Chinese dataset, LCSTS, is used for evaluation, and the shallow pre-trained embedding models (Word2vec, FastText) are compared with the deep pre-trained model (ELMo); the results can inform subsequent related research. The experimental results show that, overall, apart from the CBOW variants of Word2vec and FastText, which do not improve the summarization results, the other pre-trained embedding models perform better on ROUGE-1, ROUGE-2, and ROUGE-L. Taking the best configuration in our experiments, the FastText Skip-gram pre-trained embeddings combined with the Transformer model, ROUGE-1, ROUGE-2, and ROUGE-L reach 0.391, 0.247, and 0.43 respectively; the ROUGE-L score of 0.43 in particular indicates that the automatically generated summaries cover the original texts well. Compared with the experimental baseline, ROUGE-1, ROUGE-2, and ROUGE-L improve by 9%, 16%, and 9% respectively, and compared with the 8-layer Transformer model, training time is reduced by 5.5 hours while producing better summaries. We therefore infer that combining pre-trained word embeddings as in this study improves system performance with less training time. | zh_TW
dc.description.abstract | With the development of science and technology, abundant resources have become available, but without proper management this leads to information explosion. Automatic text summarization is therefore an important research topic today. In view of recent breakthroughs in hardware limitations and computing resources, deep learning has been revisited by scholars and is now widely used in natural language processing (NLP). This research uses the attention-based Transformer as the main system architecture to generate logical and fluent abstractive summaries, and explores how pre-trained word embedding models can effectively improve summary quality, greatly reducing time and labor costs. The currently largest Chinese dataset, LCSTS, is used for evaluation, and the shallow pre-trained word embedding models (Word2vec, FastText) are compared with the deep pre-trained model (ELMo); the results can support follow-up research. The experimental results show that, apart from the CBOW variants of Word2vec and FastText, which do not improve the summarization results, the other pre-trained word embedding models perform better on ROUGE-1, ROUGE-2, and ROUGE-L. The best configuration, the FastText Skip-gram pre-trained embeddings combined with the Transformer model, achieves ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.391, 0.247, and 0.43 respectively; the ROUGE-L score of 0.43 in particular indicates that the generated summaries have high coverage of the original texts. Compared with the experimental baseline, ROUGE-1, ROUGE-2, and ROUGE-L improve by 9%, 16%, and 9%. Moreover, compared with the 8-layer Transformer model, the proposed model needs about 5.5 hours less training time while producing better summaries. We can therefore infer that combining the model with pre-trained word embeddings improves system performance in less time. | en_US
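The abstract describes combining pre-trained word embeddings (with FastText Skip-gram giving the best result) with a Transformer encoder-decoder. The record itself contains no code, so the following Python sketch only illustrates that general setup, assuming gensim for FastText training and PyTorch for the Transformer; the toy corpus, the 256-dimension vectors, and the 4-layer/8-head configuration are hypothetical placeholders, not the hyperparameters used in the thesis.

import numpy as np
import torch
import torch.nn as nn
from gensim.models import FastText

# Toy tokenized corpus standing in for LCSTS (hypothetical; the thesis's preprocessing is not given here).
corpus = [["自動", "文件", "摘要", "研究"], ["注意力", "機制", "模型"]]

# Skip-gram FastText embeddings (sg=1); vector size chosen to match the Transformer's d_model.
ft = FastText(sentences=corpus, vector_size=256, window=5, min_count=1, sg=1, epochs=10)

# Build an embedding matrix aligned with the FastText vocabulary.
vocab = list(ft.wv.index_to_key)
weights = torch.from_numpy(np.stack([ft.wv[w] for w in vocab]))

# Initialize the summarizer's input embeddings from the pre-trained vectors and allow fine-tuning.
embedding = nn.Embedding.from_pretrained(weights, freeze=False)
transformer = nn.Transformer(d_model=256, nhead=8,
                             num_encoder_layers=4, num_decoder_layers=4)

# During training, token-id tensors would be mapped through `embedding` before entering `transformer`.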
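The reported numbers (ROUGE-1 = 0.391, ROUGE-2 = 0.247, ROUGE-L = 0.43) are overlap-based metrics. For readers unfamiliar with them, below is a minimal, dependency-free Python sketch of ROUGE-N recall and the LCS-based ROUGE-L F-measure; the character-level tokenization, the example sentences, and the beta value are illustrative choices, not details taken from the thesis.

from collections import Counter

def rouge_n(candidate, reference, n=1):
    # Recall-oriented n-gram overlap, with counts clipped by the reference counts.
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(ref.values()), 1)

def lcs_len(a, b):
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference, beta=1.2):
    # LCS-based F-measure (Lin, 2004); beta = 1.2 is a common default, not the thesis's setting.
    lcs = lcs_len(candidate, reference)
    recall, precision = lcs / len(reference), lcs / len(candidate)
    if recall == 0 or precision == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)

# Character-level example (hypothetical sentences, not data from LCSTS).
ref = list("今天天氣很好")
cand = list("今天氣候很好")
print(rouge_n(cand, ref, 1), rouge_n(cand, ref, 2), rouge_l(cand, ref))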
DC.subject | natural language processing | zh_TW
DC.subject | Chinese abstractive summarization | zh_TW
DC.subject | attention mechanism | zh_TW
DC.subject | Transformer | zh_TW
DC.subject | word embedding | zh_TW
DC.title | 基於注意力機制之詞向量中文萃取式摘要研究 (A Study of Chinese Abstractive Summarization Using Attention-Based Word Embeddings) | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.type | Master's/doctoral thesis | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US
