Chatbot Based on a Two-Layer Parts-of-Speech Seq2Seq Model (基於雙層詞性序列對序列模型之對話機器人)

Name

Chia-Hui Lu (呂家慧)

Department

In-service Master Program, Department of Information Management

Thesis title

Chatbot based on two layer parts-of-speech Seq2Seq Model
(基於雙層詞性序列對序列模型之對話機器人)

Files

Full text can be viewed in the system (available after 2025-7-1)

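Abstract (Chinese)

Intelligent responses from chatbots not only provide fast customer service but also help companies save considerable manpower, so offering such a service is itself a competitive advantage. Tuning a chatbot's performance, however, often consumes time and labor for maintenance. We aim to provide a generative chatbot that builds a model for naturally generated dialogue by applying deep learning to a large amount of data. To improve the chatbot's response accuracy, we add a parts-of-speech dimension to the training process. Parts of speech let the model learn the structure and grammar of a sentence, so that the answers it composes are closer to human language.
According to prior research, generative chatbots are mostly sequence-to-sequence deep learning models. We build a sequence-to-sequence framework from a gated recurrent unit (GRU) encoder and decoder, then add parts of speech to design four new parts-of-speech sequence-to-sequence models. In the evaluation after training, three of the designs outperform the baseline sequence-to-sequence framework, and the two-layer parts-of-speech sequence-to-sequence model performs best.
After repeated experimental validation, the two-layer parts-of-speech sequence-to-sequence model should be applicable to training chatbots in industry. The improved performance can lower maintenance labor costs, and more accurate answers to customer questions can increase customer satisfaction.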
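As a rough illustration of what adding a parts-of-speech dimension to the model input could look like, the sketch below concatenates each word vector with a one-hot vector for its parts-of-speech tag before the sequence would be passed to an encoder. The record does not describe how the thesis wires this in, so the tag set `POS_TAGS`, the helper names `one_hot` and `add_pos_dimension`, and the toy embeddings are all assumptions made for this example.

```python
# Hypothetical tag set and toy values; none of these come from the thesis.
POS_TAGS = ["n", "v", "r", "a", "x"]  # noun, verb, pronoun, adjective, other


def one_hot(tag, tags=POS_TAGS):
    """Return a one-hot list for `tag`; unknown tags fall into the last slot."""
    index = tags.index(tag) if tag in tags else len(tags) - 1
    return [1.0 if i == index else 0.0 for i in range(len(tags))]


def add_pos_dimension(word_vectors, pos_tags):
    """Concatenate each word vector with the one-hot vector of its PoS tag."""
    if len(word_vectors) != len(pos_tags):
        raise ValueError("each token needs exactly one PoS tag")
    return [vec + one_hot(tag) for vec, tag in zip(word_vectors, pos_tags)]


# Toy 3-dimensional embeddings for a three-token sentence (pronoun, verb, noun).
sentence_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
sentence_tags = ["r", "v", "n"]

# Each encoder input now has 3 embedding values plus 5 PoS values.
encoder_inputs = add_pos_dimension(sentence_vectors, sentence_tags)
```

Concatenation keeps the word embedding untouched and lets the network decide how much weight to give the grammatical signal. Other designs, such as a separate parts-of-speech layer, are equally plausible readings of the abstract.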

Abstract (English)

In this paper, I develop a deep learning model to build a chatbot. To improve the chatbot's response accuracy, I add a parts-of-speech dimension to the model so that the machine can learn the structure and grammar of a sentence.
This research is based on the GRU Seq2Seq framework; adding a parts-of-speech dimension yields four new models for comparison.
According to the evaluation results, three of the models (the 1hPosSeq2Seq, CVPoSSeq2Seq, and 2LPoSSeq2Seq models) outperform the baseline sequence-to-sequence framework. Among them, the 2LPoSSeq2Seq model performs best, with a performance improvement of 40.08%.
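The thesis's table of contents lists categorical cross-entropy and perplexity among its evaluation measures. The record does not say which measure the reported improvement refers to, so the sketch below only shows the standard relationship between the two: perplexity is the exponential of the mean per-token cross-entropy. The function names and the toy probabilities are assumptions for illustration.

```python
import math


def categorical_cross_entropy(predicted_probs, target_index):
    """Negative log-probability the model assigns to the correct token."""
    return -math.log(predicted_probs[target_index])


def perplexity(step_probs, target_indices):
    """Exponential of the mean per-token cross-entropy over a sequence."""
    losses = [
        categorical_cross_entropy(probs, target)
        for probs, target in zip(step_probs, target_indices)
    ]
    return math.exp(sum(losses) / len(losses))


# Toy case: a model that is uniformly unsure over a 4-token vocabulary.
uniform_steps = [[0.25, 0.25, 0.25, 0.25]] * 3
uniform_ppl = perplexity(uniform_steps, [0, 1, 2])
```

A uniform guess over a vocabulary of size V gives a perplexity of V, so lower values mean the model concentrates more probability on the correct tokens.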

Keywords (Chinese)

★ Chatbot
★ Parts of speech
★ Sequence-to-sequence

Keywords (English)

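Table of Contents

Thesis Authorization Form
Advisor's Recommendation Letter
Oral Defense Committee Approval Form
Chinese Abstract
Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Contributions
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Pre-trained Word Vectors (Word Embeddings)
2.2 Current Use of Sequence-to-Sequence Models in Chatbots
2.3 Applications of Parts of Speech in Sequence-to-Sequence Models
2.4 Current State of Chatbot Technology
Chapter 3 Methodology
3.1 Word Embeddings
3.1.1 Continuous Bag of Words (CBOW)
3.1.2 Skip-gram
3.2 Conditional Recurrent Neural Networks (Conditional RNN)
3.2.1 Long Short-Term Memory (LSTM)
3.2.2 Gated Recurrent Unit (GRU)
3.3 Parts-of-Speech Sequence-to-Sequence Models
3.3.1 Encoder/Decoder
3.3.2 Attention Mechanism
3.3.3 Building the Parts-of-Speech Seq2Seq Models
3.3.4 One-Hot Parts-of-Speech Sequence-to-Sequence Model
3.3.5 Constant-Vector Parts-of-Speech Sequence-to-Sequence Model
3.3.6 Two-Layer Parts-of-Speech Neural Network-to-Sequence Model
3.3.7 Two-Layer Parts-of-Speech Sequence-to-Sequence Model
3.4 Model Training and Evaluation
3.4.1 Categorical Cross-Entropy Loss Function
3.4.2 Perplexity
3.4.3 Bilingual Evaluation Understudy (BLEU)
Chapter 4 Analysis of Results
4.1 Dataset
4.2 Experimental Design
4.2.1 Four New Seq2Seq Models
4.3 Discussion of Experimental Parameters
4.3.1 Batch Size
4.3.2 Effect of the Number of Hidden-Layer Neurons
4.3.3 Data Shuffling
Chapter 5 Conclusion
5.1 Research Results
5.2 Research Findings
5.3 Research Limitations
5.4 Future Research and Recommendations
References
Appendix
Jieba Parts-of-Speech Tags
Chatbot Interface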

Advisor

Yi-Cheng Chen (陳以錚)

Review date

2020-7-28
