

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81289


    Title: 基於注意力機制之詞向量中文萃取式摘要研究 (A Study of Chinese Abstractive Summarization with Attention-Based Word Embeddings)
    Author: 麥嘉芳; Mai, Chia-Fang
    Contributors: Department of Information Management
    Keywords: natural language processing; Chinese abstractive summarization; attention mechanism; Transformer; word embeddings
    Date: 2019-07-19
    Uploaded: 2019-09-03 15:42:47 (UTC+8)
    Publisher: National Central University
    Abstract: The development of information technology has brought abundant resources, but without proper management it leads to information overload, which makes automatic text summarization an important research topic. With recent breakthroughs in hardware and computing power, deep learning has been revisited by researchers and is now widely applied to natural language processing (NLP). This study therefore adopts the attention-based Transformer neural network as its main system architecture to generate logical and fluent abstractive summaries, and investigates how pretrained word embedding models can effectively improve summary quality, thereby substantially reducing time and labor costs. The experiments are conducted on LCSTS, currently the largest and most widely used Chinese summarization dataset, comparing shallow pretrained embedding models (Word2vec, FastText) with a deep pretrained model (ELMo); the results can serve as a reference for follow-up research.
    The experimental results show that, except for the CBOW variants of Word2vec and FastText, which do not improve the summaries, the other pretrained embedding models yield better ROUGE-1, ROUGE-2, and ROUGE-L scores. The best configuration in our experiments, FastText Skip-gram pretrained embeddings combined with the Transformer model, achieves ROUGE-1, ROUGE-2, and ROUGE-L of 0.391, 0.247, and 0.43 respectively; the ROUGE-L score of 0.43 in particular indicates that the automatically generated summaries cover the source texts well. Compared with the experimental baseline, ROUGE-1, ROUGE-2, and ROUGE-L improve by 9%, 16%, and 9%, and compared with an 8-layer Transformer the training time is reduced by about 5.5 hours while producing better summaries. We can therefore infer that combining the model with pretrained word embeddings improves system performance at a lower training cost.
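    The first paragraph of the abstract describes the core idea of the study: feeding pretrained word vectors into an attention-based Transformer encoder-decoder that generates the summary. Below is a minimal sketch of that idea in PyTorch, not the author's implementation; the toy vocabulary, embedding size, layer counts, and the vector file name fasttext_skipgram.vec are illustrative assumptions.

        import torch
        import torch.nn as nn
        from gensim.models import KeyedVectors

        EMB_DIM = 300                                            # assumed embedding size
        vocab = ["<pad>", "<unk>", "颱風", "侵襲", "台灣", "災情"]  # assumed toy vocabulary

        # Load pretrained FastText Skip-gram vectors exported in word2vec text format
        # (the file name is a placeholder, not the thesis resource).
        ft = KeyedVectors.load_word2vec_format("fasttext_skipgram.vec")

        # Copy the pretrained vectors into the embedding matrix; tokens missing from
        # the pretrained vocabulary keep zero vectors.
        weights = torch.zeros(len(vocab), EMB_DIM)
        for i, tok in enumerate(vocab):
            if tok in ft:
                weights[i] = torch.tensor(ft[tok])

        embedding = nn.Embedding.from_pretrained(weights, freeze=False, padding_idx=0)

        # Standard encoder-decoder Transformer; a full summarizer would also add
        # positional encodings and a linear + softmax generator over the vocabulary.
        transformer = nn.Transformer(d_model=EMB_DIM, nhead=6,
                                     num_encoder_layers=4, num_decoder_layers=4,
                                     batch_first=True)

        src_ids = torch.tensor([[2, 3, 4, 5]])                   # source article tokens
        tgt_ids = torch.tensor([[2, 5]])                         # decoder input tokens
        out = transformer(embedding(src_ids), embedding(tgt_ids))
        print(out.shape)                                         # torch.Size([1, 2, 300])

    The design choice illustrated here, initializing the embedding layer from pretrained vectors with freeze=False, lets the summarizer start from general-purpose word representations and fine-tune them during training, which is the mechanism the abstract credits for the shorter training time.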
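    The results paragraph reports ROUGE-1, ROUGE-2, and ROUGE-L scores. As a reference for how the ROUGE-L figure is obtained, here is a minimal sketch that computes ROUGE-L from the longest common subsequence (LCS) between a candidate summary and a single reference; word-level tokens and the common beta = 1.2 weighting are assumptions, not the thesis's exact evaluation setup.

        def lcs_length(a, b):
            """Length of the longest common subsequence via dynamic programming."""
            dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
            for i, x in enumerate(a, 1):
                for j, y in enumerate(b, 1):
                    dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
            return dp[len(a)][len(b)]

        def rouge_l(candidate, reference, beta=1.2):
            """ROUGE-L F-measure combining LCS-based recall and precision."""
            lcs = lcs_length(candidate, reference)
            if lcs == 0:
                return 0.0
            recall = lcs / len(reference)
            precision = lcs / len(candidate)
            return (1 + beta**2) * recall * precision / (recall + beta**2 * precision)

        # Toy example with pre-segmented tokens (Chinese text would be segmented first).
        cand = "颱風 侵襲 台灣 造成 災情".split()
        ref = "颱風 侵襲 造成 台灣 嚴重 災情".split()
        print(round(rouge_l(cand, ref), 3))

    Because ROUGE-L rewards long in-order overlaps with the reference rather than isolated n-grams, a high ROUGE-L such as the 0.43 reported above indicates that the generated summaries track the content and ordering of the reference summaries well.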
    Appears in Collections: [Graduate Institute of Information Management] Theses & Dissertations

    Files in this item:

    index.html (HTML, 0 KB, 242 views)


    All items in NCUIR are protected by copyright, with all rights reserved.
