Thesis 107423027: Full Metadata Record

DC Field | Value | Language
dc.contributor | Department of Information Management (資訊管理學系) | zh_TW
dc.creator | 王美淋 | zh_TW
dc.creator | Mei-Lin Wang | en_US
dc.date.accessioned | 2020-07-20T07:39:07Z
dc.date.available | 2020-07-20T07:39:07Z
dc.date.issued | 2020
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=107423027
dc.contributor.department | Department of Information Management (資訊管理學系) | zh_TW
dc.description | National Central University (國立中央大學) | zh_TW
dc.description | National Central University | en_US
dc.description.abstract | Given the abundance of online resources today, people cannot process such massive amounts of data in a short time, and the task of automatic text summarization has emerged in response. This study uses a two-stage model that combines an extractive model with an abstractive model, and applies additional inputs and pre-trained models so that the abstractive model produces fluent Chinese summaries, improving the performance of automatic text summarization. It also examines how well a term-frequency model versus a deep-learning model serves the extractive stage, and how word-level versus sentence-level output from the extractive model affects the abstractive model. According to the experimental results, the proposed two-stage model outperforms the Transformer across all experiments, and its Rouge-1 and Rouge-2 scores are also better than those reported by other scholars. The best-performing model uses TF-IDF with word-level output in the first-stage extractive model, reaching Rouge-1, Rouge-2, and Rouge-L scores of 0.447, 0.268, and 0.407. Deep learning in the extractive stage also performs well: the best such result, a three-layer BiLSTM with an attention mechanism, achieves Rouge-1, Rouge-2, and Rouge-L scores of 0.4435, 0.2669, and 0.400, close to the TF-IDF results. This confirms that the proposed system effectively improves Chinese abstractive summarization in both fluency and informativeness, and the reported experimental data can support follow-up research. | zh_TW
dc.description.abstract | In an era of information expansion, internet resources are abundant, but people cannot process such huge volumes of data in a short time, so the automatic text summarization task has been proposed in response. This research applies a two-stage model to the abstractive summarization task, combining an extractive model with an abstractive model, and uses additional inputs and pre-trained models to make the abstractive model produce logical and semantically smooth Chinese summaries, improving the performance of automatic text summarization. We also investigate whether a word-frequency model or a deep-learning model works better in the extractive stage, and how word-level versus sentence-level output from the extractive stage affects the abstractive model. According to the experimental results, the proposed two-stage model outperforms the Transformer, and its Rouge-1 and Rouge-2 scores are also better than those of models proposed by other scholars. The best model uses TF-IDF with word-level output in the first-stage extractive model; its Rouge-1, Rouge-2, and Rouge-L scores are 0.447, 0.268, and 0.407. Deep-learning models in the extractive stage also perform well: the best result, a three-layer BiLSTM with an attention mechanism, achieves Rouge-1, Rouge-2, and Rouge-L scores of 0.4435, 0.2669, and 0.400, which is close to the TF-IDF performance. We therefore conclude that the proposed system effectively improves the performance of Chinese abstractive summary generation and provides experimental data for follow-up research. | en_US
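The abstracts report Rouge-1, Rouge-2, and Rouge-L scores. As a minimal illustrative sketch (not the thesis's actual evaluation code), Rouge-N recall can be computed as the n-gram overlap between a candidate summary and a reference summary over pre-tokenized words:

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Recall-oriented Rouge-N: fraction of reference n-grams found in the candidate."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Clipped overlap: each reference n-gram counts at most as often as it appears.
    overlap = sum(min(cand[g], ref[g]) for g in ref)
    return overlap / max(sum(ref.values()), 1)

# Toy example with hypothetical pre-tokenized Chinese words:
reference = ["自動", "文本", "摘要", "任務"]
candidate = ["自動", "摘要", "任務"]
print(rouge_n(candidate, reference, n=1))  # 3 of 4 reference unigrams matched -> 0.75
```

In practice the thesis evaluates with the standard Rouge toolkit; this sketch only shows why a higher score means more reference content was preserved.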
dc.subject | Chinese Abstractive Summarization (中文萃取式摘要) | zh_TW
dc.subject | Natural Language Processing (自然語言處理) | zh_TW
dc.subject | Recurrent Neural Network (遞迴神經網路) | zh_TW
dc.subject | Word Embedding (詞向量) | zh_TW
dc.subject | Transformer | zh_TW
dc.subject | Chinese Abstractive Summarization | en_US
dc.subject | NLP | en_US
dc.subject | Recurrent neural network | en_US
dc.subject | word embedding | en_US
dc.subject | Transformer | en_US
dc.title | A Study on Combining Extractive and Abstractive Models in a Two-Stage Framework to Improve Summarization Performance | zh_TW
dc.language.iso | zh-TW | zh-TW
dc.type | Master's/doctoral thesis (博碩士論文) | zh_TW
dc.type | thesis | en_US
dc.publisher | National Central University | en_US
