Thesis 105522615: Detailed Record




Name: 李胤龍 (Sébastien Montella)    Department: Computer Science and Information Engineering (資訊工程學系)
Thesis Title: 基於注意和情緒觸發的序列到序列短文本對話學習模型設計研究
(Emotionally-Triggered Short Text Conversation using Attention-Based Sequence Generation Models)
Related Theses:
★ A study on identifying appointment-invitation emails and extracting irregular time expressions
★ Design of the NCUFree campus wireless network platform and development of application services
★ Design and implementation of a semi-structured web data extraction system
★ Mining and application of non-simple browsing paths
★ Improvements to association rule mining over incremental data
★ Applying the chi-square test of independence to associative classification
★ Design and study of a Chinese data extraction system
★ Visualization of non-numerical data and clustering with both subjective and objective criteria
★ A study of associated word groups in document summarization
★ Cleaning web pages: page segmentation and data-region extraction
★ Design and study of sentence classification and ranking for question answering systems
★ Efficient mining of compact frequent consecutive event patterns in temporal databases
★ Axis arrangement in star coordinates for cluster visualization
★ Automatic generation of web-scraping programs from browsing histories
★ Template and data analysis of dynamic web pages
★ Automated integration of homogeneous web page data
  1. The author has agreed to make the electronic full text of this thesis openly available immediately.
  2. The open-access electronic full text is licensed to users only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
  3. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese): In emotional cognition, semantic understanding is highly valued. Combined with natural language generation, producing responses that consistently convey a specific emotion brings us a step further toward humanizing the machine. Sentiment analysis of documents and sentences has been widely studied and improved, yet generating an emotion appropriate to the content of a sentence has long been neglected. Meanwhile, thanks to the Generative Adversarial Network (GAN), generative models have recently received a series of improvements, yielding satisfying results in both natural language processing and computer vision. However, when applied to text generation, adversarial learning can produce low-quality text and suffer from mode collapse. In this thesis, we use single-round conversation data from social media to propose a new training method for generating grammatically correct and emotionally consistent answers for the Short-Text Conversation task (STC-3) of the NTCIR-14 workshop. Following the StarGAN framework, we use an attention-based sequence-to-sequence model as our text generator, and we use emotion embeddings and the output of an emotion classifier to guide training. To avoid the adversarial-network problems mentioned above, we train the generator by alternating between maximum likelihood and adversarial loss.
Abstract (English): Emotional intelligence is a field in which awareness is rapidly being raised. Coupled with language generation, it is expected to further humanize the machine and bring it a step closer to the user by generating responses that are consistent with a specific emotion. The analysis of sentiment within documents or sentences has been widely studied and improved, while the generation of emotional content remains under-researched. Meanwhile, generative models have recently seen a series of improvements thanks to the Generative Adversarial Network (GAN), and promising results are frequently reported in both natural language processing and computer vision. However, when applied to text generation, adversarial learning may lead to poor-quality sentences and mode collapse. In this thesis, we leverage one-round conversation data from social media and propose a novel approach to generate grammatically correct and emotionally consistent answers for the Short-Text Conversation task (STC-3) of the NTCIR-14 workshop. We use an attention-based sequence-to-sequence model as our generator, inspired by the StarGAN framework. We provide emotion embeddings and direct feedback from an emotion classifier to guide the generator. To avoid the aforementioned issues with adversarial networks, we alternately train our generator using maximum likelihood and adversarial loss.
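The abstract describes the training scheme only at a high level. The following is a minimal, hypothetical sketch, assuming PyTorch and toy random data, of the general idea: a sequence-to-sequence generator whose decoder is conditioned on an emotion embedding, updated by alternating a maximum-likelihood step with an adversarial-style step driven by an emotion classifier. All names (EmotionSeq2Seq, emo_clf) and sizes are illustrative, the attention mechanism is omitted for brevity, and this is not the thesis's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, N_EMOTIONS = 1000, 64, 128, 6           # toy sizes

class EmotionSeq2Seq(nn.Module):
    """GRU encoder-decoder whose decoder input is concatenated with an emotion embedding."""
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB, EMB)
        self.emo_emb = nn.Embedding(N_EMOTIONS, EMB)      # emotion embedding
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.decoder = nn.GRU(EMB * 2, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src, tgt_in, emotion):
        _, h = self.encoder(self.tok_emb(src))            # encode the post
        emo = self.emo_emb(emotion).unsqueeze(1).expand(-1, tgt_in.size(1), -1)
        dec_in = torch.cat([self.tok_emb(tgt_in), emo], dim=-1)
        dec_out, _ = self.decoder(dec_in, h)              # decode, conditioned on emotion
        return self.out(dec_out)                          # (batch, time, VOCAB) logits

# Frozen stand-in for the separately trained emotion classifier that scores responses.
emo_clf = nn.Sequential(nn.Linear(VOCAB, HID), nn.ReLU(), nn.Linear(HID, N_EMOTIONS))
for p in emo_clf.parameters():
    p.requires_grad_(False)

gen = EmotionSeq2Seq()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)

for step in range(100):
    src = torch.randint(0, VOCAB, (8, 12))                # random "posts"
    tgt = torch.randint(0, VOCAB, (8, 10))                # random "responses"
    emotion = torch.randint(0, N_EMOTIONS, (8,))          # target emotion labels
    logits = gen(src, tgt[:, :-1], emotion)

    if step % 2 == 0:
        # Maximum-likelihood step: teacher-forced cross-entropy on the reference response.
        loss = F.cross_entropy(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
    else:
        # Adversarial-style step: push a soft relaxation of the generated response toward
        # the target emotion as judged by the classifier (a crude stand-in for GAN feedback).
        soft_bow = F.softmax(logits, dim=-1).mean(dim=1)  # bag-of-words over the vocabulary
        loss = F.cross_entropy(emo_clf(soft_bow), emotion)

    opt.zero_grad()
    loss.backward()
    opt.step()

Alternating the two objectives, rather than relying on the adversarial signal alone, is the remedy the abstract proposes for the low-quality outputs and mode collapse that purely adversarial training of text generators tends to produce.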
Keywords: ★ Short-Text Conversation task
★ Attention-based Sequence-to-Sequence
★ Generative Adversarial Network (GAN)
Table of Contents
1 Introduction
  1.1 Motivation
  1.2 Thesis Structure
2 Related Work
3 Background
  3.1 Word Representation
  3.2 Recurrent Neural Models
    3.2.1 Recurrent Neural Network (RNN)
    3.2.2 Long Short-Term Memory
    3.2.3 Sequence-to-Sequence Model
  3.3 Training Algorithms
    3.3.1 Maximum Likelihood Estimation
    3.3.2 Adversarial Training
4 Dataset
  4.1 NTCIR Workshop
  4.2 Data Description
5 STC with NTCIR Labels
  5.1 Seq2Seq (Baseline)
  5.2 Attention-Based SeqGAN
    5.2.1 SeqGAN Framework
    5.2.2 Generative and Discriminative Models
  5.3 Experiment
    5.3.1 Training Setup
    5.3.2 Evaluation Results
6 STC with our Classifier Labels
  6.1 Emotion Classifier
    6.1.1 Data Collection
    6.1.2 Model
  6.2 Experiment
    6.2.1 Experiment Details
    6.2.2 Training Settings
    6.2.3 Experimental Results
  6.3 StarGAN
    6.3.1 StarGAN Framework
    6.3.2 Model Adaptation
  6.4 Experiment
7 Conclusion & Future Work
  7.1 Performance Discussion
  7.2 Automatic Evaluation
  7.3 Final Conclusion
References
[1] Asghar, N., Poupart, P., Hoey, J., Jiang, X., and Mou, L. Affective neural response generation. In 40th European Conference on Information Retrieval (ECIR 2018) (France, 2018).
[2] Bahdanau, D., Cho, K., and Bengio, Y. Neural machine translation by jointly learning to align and translate. In International Conference on Learning Representations (ICLR) (San Diego, California, 2015).
[3] Serban, I. V., Sordoni, A., Lowe, R., Charlin, L., Pineau, J., Courville, A., and Bengio, Y. A hierarchical latent variable encoder-decoder model for generating dialogues. In Association for the Advancement of Artificial Intelligence (AAAI-17) (San Francisco, California, 2017).
[4] Choi, Y., Choi, M., Kim, M., Ha, J.-W., Kim, S., and Choo, J. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Conference on Computer Vision and Pattern Recognition (CVPR) (Salt Lake City, Utah, 2018).
[5] Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[6] Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS 2014) (2014).
[7] Hochreiter, S., and Schmidhuber, J. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[8] Kingma, D. P., and Ba, J. L. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR) (2015).
[9] Lin, C.-Y. ROUGE: A package for automatic evaluation of summaries. In Proc. ACL Workshop on Text Summarization Branches Out (2004), p. 10.
[10] Arjovsky, M., and Bottou, L. Towards principled methods for training generative adversarial networks. In Proc. of ICLR (2017).
[11] Mikolov, T., Chen, K., Corrado, G., and Dean, J. Efficient estimation of word representations in vector space. In Proceedings of the International Conference on Learning Representations (ICLR) (2013).
[12] Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Philadelphia, Pennsylvania, 2002).
[13] Pestian, J. P., Matykiewicz, P., Linn-Gust, M., South, B., Uzuner, O., Wiebe, J., Cohen, K., Hurdle, J., and Brew, C. Sentiment analysis of suicide notes: A shared task. Biomedical Informatics Insights (2012), 3–6.
[14] Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L. Deep contextualized word representations. In Proc. of NAACL (2018).
[15] Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T. Enriching word vectors with subword information. In Transactions of the Association for Computational Linguistics (2017).
[16] Serban, I. V., Sordoni, A., Bengio, Y., Courville, A., and Pineau, J. Building end-to-end dialogue systems using generative hierarchical neural network models. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) (Phoenix, Arizona, 2016).
[17] Shang, L., Lu, Z., and Li, H. Neural responding machine for short-text conversation. In The 2015 Conference of the Association for Computational Linguistics (2015).
[18] Sutskever, I., Vinyals, O., and Le, Q. V. Sequence to sequence learning with neural networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems (Montreal, Canada, December 8–13, 2014), NIPS.
[19] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. In 31st Conference on Neural Information Processing Systems (Long Beach, California, 2017).
[20] Vinyals, O., and Le, Q. V. A neural conversational model. In ICML Deep Learning Workshop (Lille, France, 2015).
[21] Wang, K., and Wan, X. SentiGAN: Generating sentimental texts via mixture adversarial networks. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI) (2018).
[22] Li, X., Liu, J., W. Z., X. W., Y. Z., and Dou, Z. RUCIR at NTCIR-14 STC-3 task. In Proceedings of the 14th NTCIR Conference on Evaluation of Information Access Technologies (Tokyo, Japan, 2019).
[23] Zhou, Y., Liu, Z., X. K., Y. W., and Ren, F. TKUIM at NTCIR-14 STC-3 CECG task. In Proceedings of the 14th NTCIR Conference on Evaluation of Information Access Technologies (Tokyo, Japan, 2019).
[24] Zhou, Y., Liu, Z., X. K., Y. W., and Ren, F. TUA1 at the NTCIR-14 STC-3 task. In Proceedings of the 14th NTCIR Conference on Evaluation of Information Access Technologies (Tokyo, Japan, 2019).
[25] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems (Lake Tahoe, Nevada, 2013).
[26] Yu, L., Zhang, W., Wang, J., and Yu, Y. SeqGAN: Sequence generative adversarial nets with policy gradient. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) (San Francisco, California, 2017).
[27] Guo, J., Lu, S., Cai, H., Zhang, W., Yu, Y., and Wang, J. Long text generation via adversarial training with leaked information. In Association for the Advancement of Artificial Intelligence (AAAI-18) (New Orleans, Louisiana, 2018).
[28] Zhou, H., Huang, M., Zhang, T., Zhu, X., and Liu, B. Emotional chatting machine: Emotional conversation generation with internal and external memory. In Association for the Advancement of Artificial Intelligence (AAAI-18) (2018).
[29] Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In The IEEE International Conference on Computer Vision (ICCV) (Venice, Italy, 2017).
Advisor: 張嘉惠 (Chia-Hui Chang)    Date of Approval: 2019-07-25
