References
Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. ArXiv:1607.06450 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1607.06450
Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. Proceedings of the 2015 International Conference on Learning Representations (ICLR). Retrieved from http://arxiv.org/abs/1409.0473
Cao, Z., Wei, F., Dong, L., Li, S., & Zhou, M. (2015). Ranking with Recursive Neural Networks and Its Application to Multi-document Summarization. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Chen, M. X., Firat, O., Bapna, A., Johnson, M., Macherey, W., Foster, G., … Hughes, M. (2018). The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 76–86. Retrieved from https://www.aclweb.org/anthology/P18-1008
Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1724–1734. https://doi.org/10.3115/v1/D14-1179
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186. Retrieved from https://aclweb.org/anthology/papers/N/N19/N19-1423/
Elman, J. L. (1990). Finding Structure in Time. Cognitive Science, 14(2), 179–211. https://doi.org/10.1207/s15516709cog1402_1
Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017). Convolutional Sequence to Sequence Learning. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1243–1252. Retrieved from http://dl.acm.org/citation.cfm?id=3305381.3305510
Gong, Y., & Liu, X. (2001). Generic text summarization using relevance measure and latent semantic analysis. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR ’01, 19–25. https://doi.org/10.1145/383952.383955
Graves, A., & Jaitly, N. (2014). Towards End-to-End Speech Recognition with Recurrent Neural Networks. Proceedings of the 31st International Conference on Machine Learning.
Gu, J., Lu, Z., Li, H., & Li, V. O. K. (2016). Incorporating Copying Mechanism in Sequence-to-Sequence Learning. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1631–1640. https://doi.org/10.18653/v1/P16-1154
Hassan, H., Aue, A., Chen, C., Chowdhary, V., Clark, J., Federmann, C., … Zhou, M. (2018). Achieving Human Parity on Automatic Chinese to English News Translation. ArXiv:1803.05567 [Cs]. Retrieved from http://arxiv.org/abs/1803.05567
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. https://doi.org/10.1109/CVPR.2016.90
Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Jean, S., Cho, K., Memisevic, R., & Bengio, Y. (2015). On Using Very Large Target Vocabulary for Neural Machine Translation. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 1–10. https://doi.org/10.3115/v1/P15-1001
Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A Convolutional Neural Network for Modelling Sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 655–665. https://doi.org/10.3115/v1/P14-1062
Kim, B., Kim, H., & Kim, G. (2019). Abstractive Summarization of Reddit Posts with Multi-level Memory Networks. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2519–2531. Retrieved from https://www.aclweb.org/anthology/N19-1260
Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1746–1751. https://doi.org/10.3115/v1/D14-1181
Lei, T., Zhang, Y., Wang, S. I., Dai, H., & Artzi, Y. (2018). Simple Recurrent Units for Highly Parallelizable Recurrence. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 4470–4481. Retrieved from https://www.aclweb.org/anthology/D18-1477
Loper, E., & Bird, S. (2002). NLTK: The Natural Language Toolkit. In Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics. Retrieved from http://arxiv.org/abs/cs/0205028
Mihalcea, R., & Tarau, P. (2004). TextRank: Bringing Order into Text. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 404–411. Retrieved from https://www.aclweb.org/anthology/W04-3252
Nallapati, R., Zhou, B., dos Santos, C., Gulcehre, C., & Xiang, B. (2016). Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond. Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, 280–290. https://doi.org/10.18653/v1/K16-1028
See, A., Liu, P. J., & Manning, C. D. (2017). Get To The Point: Summarization with Pointer-Generator Networks. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1073–1083. https://doi.org/10.18653/v1/P17-1099
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 3104–3112). Retrieved from http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
Vaswani, A., Bengio, S., Brevdo, E., Chollet, F., Gomez, A. N., Gouws, S., … Uszkoreit, J. (2018). Tensor2Tensor for Neural Machine Translation. ArXiv:1803.07416 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1803.07416
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 5998–6008). Retrieved from http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
Vosoughi, S., Vijayaraghavan, P., & Roy, D. (2016). Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder. Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1041–1044. https://doi.org/10.1145/2911451.2914762
Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., … Dean, J. (2016). Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. ArXiv:1609.08144 [Cs]. Retrieved from http://arxiv.org/abs/1609.08144
Zhang, A., Pueyo, L. G., Wendt, J. B., Najork, M., & Broder, A. (2017). Email Category Prediction. Companion Proceedings of the 26th International Conference on World Wide Web, 495–503.
Zhang, Y., Er, M. J., Zhao, R., & Pratama, M. (2017). Multiview Convolutional Neural Networks for Multidocument Extractive Summarization. IEEE Transactions on Cybernetics, 47(10), 3230–3242. https://doi.org/10.1109/TCYB.2016.2628402