Thesis 105423021: Detailed Record




Name  Wen-Lin Tsai (蔡汶霖)   Department  Information Management
Thesis title  以詞向量模型增進基於遞歸神經網路之中文文字摘要系統效能 (Using Word Embedding Models to Improve the Performance of a Recurrent-Neural-Network-Based Chinese Text Summarization System)
Related theses
★ 網路合作式協同教學設計平台-以國中九年一貫課程為例
★ 內容管理機制於常用問答集(FAQ)之應用
★ 行動多重代理人技術於排課系統之應用
★ 存取控制機制與國內資安規範之研究
★ 信用卡系統導入NFC手機交易機制探討
★ App應用在電子商務的推薦服務-以P公司為例
★ 建置服務導向系統改善生產之流程-以W公司PMS系統為例
★ NFC行動支付之TSM平台規劃與導入
★ 關鍵字行銷在半導體通路商運用-以G公司為例
★ 探討國內田徑競賽資訊系統-以103年全國大專田徑公開賽資訊系統為例
★ 航空地勤機坪作業盤櫃追蹤管理系統導入成效評估—以F公司為例
★ 導入資訊安全管理制度之資安管理成熟度研究-以B個案公司為例
★ 資料探勘技術在電影推薦上的應用研究-以F線上影音平台為例
★ BI視覺化工具運用於資安日誌分析—以S公司為例
★ 特權帳號登入行為即時分析系統之實證研究
★ 郵件系統異常使用行為偵測與處理-以T公司為例
  1. This electronic thesis has been approved for immediate open access.
  2. The open-access electronic full text is licensed only for personal, non-profit retrieval, reading, and printing for the purpose of academic research.
  3. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese)  In an era of information overload, it is difficult for people to absorb large amounts of information in a short time, and automatic summarization techniques have emerged in response. This study builds an abstractive summarization system based on a recurrent neural network (RNN) and uses different word embedding models, including word2vec, GloVe, and fastText, as pre-trained word vectors for the network in order to improve the quality of the summaries.
The word embeddings are pre-trained on two corpora: a large-scale general-purpose corpus from Wikipedia and a corpus drawn from the LCSTS dataset. Cross-testing word embedding models of several dimensionalities against recurrent neural networks of different hidden sizes, we find that pre-trained word embeddings do improve system performance, and that the best results are obtained when a word embedding model of moderate dimensionality is paired with a high-dimensional recurrent neural network.
We also apply the system to Chinese articles, yielding a highly versatile and effective abstractive Chinese summarization system. Besides scoring 30% higher than previous work on automatic evaluation metrics, this study provides a qualitative analysis listing summaries ranging from good to poor for reference, and finally tests and validates the system on real news articles from Taiwan.
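As a concrete illustration of the pre-training step described above, the sketch below trains word vectors on a word-segmented Chinese corpus at a chosen dimensionality. This is only a minimal sketch under assumptions: the record does not state the exact tooling, so jieba (cited in the references) is assumed for segmentation and gensim 4.x for training word2vec and fastText; the file name corpus.txt and the hyperparameters are illustrative. GloVe vectors would be trained separately with Stanford's own glove tool, which gensim does not provide.

```python
# Minimal sketch (assumed tooling: jieba + gensim 4.x; not the thesis's actual pipeline).
import jieba
from gensim.models import Word2Vec, FastText

def tokenize(lines):
    """Segment each Chinese line into a list of words with jieba."""
    return [jieba.lcut(line.strip()) for line in lines if line.strip()]

# Hypothetical plain-text corpus, one sentence or article per line
# (e.g. Wikipedia dump text or LCSTS short texts).
with open("corpus.txt", encoding="utf-8") as f:
    sentences = tokenize(f)

# Skip-gram word2vec vectors of a moderate dimensionality (300 here).
# Note: gensim 3.x uses `size=` instead of `vector_size=`.
w2v = Word2Vec(sentences, vector_size=300, window=5, min_count=5, sg=1)
w2v.wv.save("w2v_300.kv")

# fastText additionally uses character n-grams, which helps with rare words.
ft = FastText(sentences, vector_size=300, window=5, min_count=5)
ft.wv.save("ft_300.kv")
```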
Abstract (English)  In an era of information expansion, it is difficult for people to digest a large amount of information in a short time; this is why automatic summarization technology emerged. In this study, an abstractive text summarization system based on a recurrent neural network (RNN) is established. Various pre-trained word embedding models, such as word2vec, GloVe, and fastText, are used with the RNN model to improve the quality of the summarization system.
We used two corpora to pre-train the word embedding models: a large-scale general corpus from Wikipedia and a corpus from the LCSTS dataset. In a series of experiments, we built RNN models with different hidden-unit sizes and word embedding models with different dimensionalities, and found that pre-trained word embedding models do help improve system performance. To achieve the best results, we recommend pairing word embeddings of a suitable, moderate dimensionality with an RNN with a larger hidden-unit size.
The summarization system is also applied to Chinese articles, resulting in an abstractive Chinese summarization system with high versatility and strong performance. Our system exceeds previous work by 30% on automatic evaluation metrics, and we also provide qualitative analyses of the generated summaries. Lastly, we use news articles from Taiwan to test and verify the system's performance.
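To make the idea of feeding pre-trained embeddings into the RNN concrete, the sketch below loads saved word vectors into the embedding layer of a small encoder-decoder LSTM. It is a simplified stand-in (no attention or copy mechanism), written directly in PyTorch rather than a full toolkit such as the OpenNMT cited in the references; `vocab` (a token-to-index dict) and `keyed_vectors` (e.g. gensim KeyedVectors) are hypothetical inputs.

```python
# Minimal sketch of initialising a seq2seq summariser's embeddings from pre-trained vectors.
# Assumptions: PyTorch, a `vocab` dict {token: index}, and gensim-style `keyed_vectors`.
import torch
import torch.nn as nn

def build_embedding(vocab, keyed_vectors, dim):
    """Copy pre-trained vectors into the embedding matrix; unseen tokens keep small random vectors."""
    weights = torch.randn(len(vocab), dim) * 0.01
    for token, idx in vocab.items():
        if token in keyed_vectors:
            weights[idx] = torch.as_tensor(keyed_vectors[token])
    return nn.Embedding.from_pretrained(weights, freeze=False)  # fine-tuned during training

class Seq2SeqSummarizer(nn.Module):
    def __init__(self, vocab, keyed_vectors, emb_dim=300, hidden=512):
        super().__init__()
        self.embedding = build_embedding(vocab, keyed_vectors, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.decoder = nn.LSTM(emb_dim, 2 * hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, len(vocab))

    def forward(self, src, tgt):
        _, (h, c) = self.encoder(self.embedding(src))
        # Merge the forward/backward final encoder states to initialise the decoder.
        h0 = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)
        c0 = torch.cat([c[0], c[1]], dim=-1).unsqueeze(0)
        dec_out, _ = self.decoder(self.embedding(tgt), (h0, c0))
        return self.out(dec_out)  # per-step logits over the vocabulary
```

Varying emb_dim and hidden in such a setup mirrors the embedding-dimension versus hidden-size experiments reported in Chapter 4.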
Keywords (Chinese) ★ word vector (詞向量)
★ word embedding (詞嵌入)
★ Chinese summarization (中文摘要)
★ abstractive summarization (萃取式摘要)
★ recurrent neural network (遞歸神經網路)
Keywords (English) ★ word vector
★ word embedding
★ Chinese summarization
★ abstractive summarization
★ RNN
Table of contents  Abstract (in Chinese) i
Abstract (in English) ii
Acknowledgements iii
Table of Contents iv
List of Figures vii
List of Tables viii
Chapter 1. Introduction 1
1-1 Research Background 1
1-2 Research Motivation 2
1-3 Research Objectives 3
1-4 Thesis Organization 4
Chapter 2. Literature Review 5
2-1 Automatic Document Summarization 5
2-2 Word Embeddings 6
2-2-1 word2vec 7
2-2-2 GloVe 9
2-2-3 fastText 9
2-3 Recurrent Neural Networks 10
2-3-1 Classic Recurrent Neural Networks 10
2-3-2 Long Short-Term Memory 12
2-3-3 Sequence-to-Sequence Models 15
2-3-4 Attention Mechanism 17
Chapter 3. Research Method 20
3-1 Research Framework 20
3-2 System Architecture 21
3-2-1 Data Preprocessing 21
3-2-2 Word Embedding Training 21
3-2-3 Recurrent Neural Network Training 22
3-2-4 Evaluation 26
Chapter 4. Experiments and Discussion 27
4-1 Dataset and Preprocessing 27
4-2 Experimental Environment 29
4-3 Evaluation Method 29
4-4 Experimental Design and Results 32
4-4-1 Experiment 1: Effect of the Word Embedding Model 32
4-4-2 Experiment 2: Effect of the Word Embedding Corpus 34
4-4-3 Experiment 3: Effect of the Word Embedding Dimensionality 36
4-4-4 Experiment 4: Effect of the RNN Hidden Size 39
4-5 Experimental Analysis and Recommendations 41
4-6 Qualitative Analysis 42
4-6-1 Examples with Very High ROUGE Scores 43
4-6-2 Examples with Fair ROUGE Scores 44
4-6-3 Examples with Very Low ROUGE Scores 45
4-7 Practical Application 47
Chapter 5. Conclusions and Future Research Directions 51
5-1 Conclusions 51
5-2 Research Limitations 52
5-3 Future Research Directions 53
References 54
Appendix 1. Experimental Results with Pre-trained Word Embeddings of Dimension 50 58
With the wiki corpus 58
With the LCSTS corpus 58
Appendix 2. Experimental Results with Pre-trained Word Embeddings of Dimension 128 59
With the wiki corpus 59
With the LCSTS corpus 60
Appendix 3. Experimental Results with Pre-trained Word Embeddings of Dimension 300 61
With the wiki corpus 61
With the LCSTS corpus 61
Appendix 4. Experimental Results with Pre-trained Word Embeddings of Dimension 500 62
With the wiki corpus 62
With the LCSTS corpus 62
References
English references
[1]. Bahdanau, D. et al. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
[2]. Baxendale, P. B. (1958). Machine-made index for technical literature—an experiment. IBM Journal of research and development, 2(4), 354-361.
[3]. Bengio, Y. et al. (2003). A neural probabilistic language model. Journal of machine learning research, 3(Feb), 1137-1155.
[4]. Bojanowski, P. et al. (2016). Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606.
[5]. BYVoid et al. (2017). OpenCC. GitHub repository. Retrieved from https://github.com/BYVoid/OpenCC
[6]. Chopra, S. et al. (2016). Abstractive sentence summarization with attentive recurrent neural networks. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
[7]. Collobert, R. et al. (2011). Natural language processing (almost) from scratch. Journal of machine learning research, 12(Aug), 2493-2537.
[8]. Conroy, J. M. & O'Leary, D. P. (2001). Text summarization via hidden Markov models. Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval.
[9]. Das, D. & Martins, A. F. (2007). A survey on automatic text summarization. Literature Survey for the Language and Statistics II course at CMU, 4, 192-195.
[10]. Deerwester, S. et al. (1990). Indexing by latent semantic analysis. Journal of the American society for information science, 41(6), 391-407.
[11]. Deng, L. (2014). A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing, 3.
[12]. Edmundson, H. P. (1969). New methods in automatic extracting. Journal of the ACM (JACM), 16(2), 264-285.
[13]. Elman, J. L. (1990). Finding structure in time. Cognitive science, 14(2), 179-211.
[14]. fxsjy et al. (2018). jieba. GitHub repository. Retrieved from https://github.com/fxsjy/jieba
[15]. Graves, A. & Jaitly, N. (2014). Towards end-to-end speech recognition with recurrent neural networks. International Conference on Machine Learning.
[16]. Gu, J. et al. (2016). Incorporating copying mechanism in sequence-to-sequence learning. arXiv preprint arXiv:1603.06393.
[17]. Hassan, H. et al. (2018). Achieving Human Parity on Automatic Chinese to English News Translation. arXiv preprint arXiv:1803.05567.
[18]. Hinton, G. E. (1986). Learning distributed representations of concepts. Proceedings of the eighth annual conference of the cognitive science society.
[19]. Hochreiter, S. & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
[20]. Hu, B. et al. (2015). LCSTS: A large scale Chinese short text summarization dataset. arXiv preprint arXiv:1506.05865.
[21]. Huang, E. H. et al. (2012). Improving word representations via global context and multiple word prototypes. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1.
[22]. Joulin, A. et al. (2016). Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
[23]. Karpathy, A. & Fei-Fei, L. (2015). Deep visual-semantic alignments for generating image descriptions. Proceedings of the IEEE conference on computer vision and pattern recognition.
[24]. Klein, G. et al. (2017). OpenNMT: Open-source toolkit for neural machine translation. arXiv preprint arXiv:1701.02810.
[25]. Kupiec, J. et al. (1995). A trainable document summarizer. Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval.
[26]. Lai, S. et al. (2016). How to generate a good word embedding. IEEE Intelligent Systems, 31(6), 5-14.
[27]. Lin, C.-Y. (1999). Training a selection function for extraction. Proceedings of the eighth international conference on Information and knowledge management.
[28]. Lin, C.-Y. (2004). Rouge: A package for automatic evaluation of summaries. Text Summarization Branches Out.
[29]. Lin, C.-Y. & Hovy, E. (1997). Identifying topics by position. Proceedings of the fifth conference on Applied natural language processing.
[30]. Lipton, Z. C. et al. (2015). A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:1506.00019.
[31]. Luhn, H. P. (1958). The automatic creation of literature abstracts. IBM Journal of research and development, 2(2), 159-165.
[32]. Mesnil, G. et al. (2013). Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. Interspeech.
[33]. Mikolov, T. et al. (2013a). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[34]. Mikolov, T. et al. (2010). Recurrent neural network based language model. Eleventh Annual Conference of the International Speech Communication Association.
[35]. Mikolov, T. et al. (2013b). Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems.
[36]. Mnih, A. & Hinton, G. (2007). Three new graphical models for statistical language modelling. Proceedings of the 24th international conference on Machine learning.
[37]. Nallapati, R. et al. (2016). Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv preprint arXiv:1602.06023.
[38]. Olah, C. (2015). Understanding LSTM Networks. Retrieved from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[39]. Osborne, M. (2002). Using maximum entropy for sentence extraction. Proceedings of the ACL-02 Workshop on Automatic Summarization-Volume 4.
[40]. Pennington, J. et al. (2014). GloVe: Global vectors for word representation. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP).
[41]. Radev, D. R. et al. (2002). Introduction to the special issue on summarization. Computational linguistics, 28(4), 399-408.
[42]. Schuster, M. & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
[43]. Sutskever, I. et al. (2014). Sequence to sequence learning with neural networks. Advances in neural information processing systems.
[44]. Svore, K. et al. (2007). Enhancing single-document summarization by combining RankNet and third-party sources. Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL).
[45]. Turian, J. et al. (2010). Word representations: a simple and general method for semi-supervised learning. Proceedings of the 48th annual meeting of the association for computational linguistics.
[46]. Wang, P. et al. (2015a). A unified tagging solution: Bidirectional LSTM recurrent neural network with word embedding. arXiv preprint arXiv:1511.00215.
[47]. Wang, P. et al. (2015b). Word embedding for recurrent neural network based TTS synthesis. Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on.
[48]. Zuckerberg, M. (2017, June 27). As of this morning, the Facebook community is now officially 2 billion people! We're making progress connecting the world, and now let's bring the world closer together. It's an honor to be on this journey with you [Facebook Status Update]. Retrieved from https://www.facebook.com/zuck/posts/10103831654565331
Chinese references
[49]. 張昇暉 (2017). 中文文件串流之摘要擷取研究 (Master's thesis). National Central University.
Advisor  Shi-Jen Lin (林熙禎)   Date of approval  2018-7-27
