Application of Reinforcement Learning in Multi-Faceted Story Chatbot Response Action Selection (應用強化式學習於多面向對話回應模組之研究)

Name

Yi-Hsuan Chen (陳臆玄)

Department

Department of Computer Science and Information Engineering

Thesis title

Application of Reinforcement Learning in Multi-Faceted Story Chatbot Response Action Selection
(應用強化式學習於多面向對話回應模組之研究)

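The author has authorized immediate open access to this electronic thesis.
Once open, the electronic full text is licensed only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the text without authorization.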

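Abstract (translated from the Chinese)

We hope to spark students' interest in English through English reading, so that they connect English with their own lives and develop language ability in a social context. However, this kind of sociocultural construction process requires a large number of teachers, which is not feasible given the currently limited human resources.
Our dialogue system therefore centers on talking about stories, establishing a common topic with the user and opening a conversation. Our team hopes that students' interaction with the chatbot is not limited to discussing the story, but can also include question answering in everyday conversation, or let students appropriately take the lead in the conversation. This remains a challenge, because integrating multi-faceted modules requires the chatbot to have sufficient natural language understanding and dialogue-strategy selection ability, so that it can automatically and efficiently provide responses that fit the current situation.
The main task of this thesis is to describe how we train an educational chatbot model that can perceive a student's situation from multiple states and then mine the response that corresponds to that combination of states. The model is trained with a reinforcement learning framework, in order to achieve the ultimate goal of this thesis: building a relationship with the user and keeping the conversation going for a long time.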

Abstract (English)

We hope that reading in English will make students interested in the language, so that they can connect English with their own lives and develop their language ability in a social context. However, such a sociocultural construction process requires a large number of teachers, which is not feasible with the currently limited human resources.
Therefore, our dialogue system takes the story as its main axis to establish a common topic and start a dialogue with users. Our team hopes that the interaction between students and the chatbot is not only a simple discussion of stories, but can also include question answering in daily conversation or allow students to appropriately take the lead in the conversation. However, this is still a challenge, because integrating multi-faceted modules requires the chatbot to have sufficient natural language understanding and dialogue strategy selection capabilities, so that it can automatically and efficiently provide responses that fit the current situation.
The main task of this thesis is to introduce how we train an educational chatbot model that can detect the student's situation from various states and then explore the reply corresponding to that combination of states. The model is trained with a reinforcement learning architecture to achieve the ultimate goal of this thesis: establishing a relationship with the user and keeping the dialogue going.
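
To make the idea of selecting a response from a combination of states more concrete, here is a minimal sketch using tabular Q-learning. This record does not describe the thesis's actual state features, action set, reward design, or training algorithm, so everything in the block is an assumption made for illustration: the action names, the state labels, the reward values, and the helper functions `q_update`, `select_action`, and `train` are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical action set: which response module answers the student.
ACTIONS = ["story_discussion", "daily_chat_qa", "yield_initiative"]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def select_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over the response modules."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(transitions, epochs=50):
    """Replay a fixed list of logged transitions for several epochs."""
    q = defaultdict(float)
    for _ in range(epochs):
        for state, action, reward, next_state in transitions:
            q_update(q, state, action, reward, next_state)
    return q


# Invented toy data: (state, action, reward, next_state).
TRANSITIONS = [
    ("asks_question", "daily_chat_qa", 1.0, "engaged"),
    ("asks_question", "story_discussion", -1.0, "disengaged"),
    ("engaged", "story_discussion", 1.0, "engaged"),
    ("disengaged", "yield_initiative", 0.5, "engaged"),
]

q_table = train(TRANSITIONS)
# With epsilon=0.0 the choice is purely greedy with respect to the table.
best = select_action(q_table, "asks_question", epsilon=0.0)
```

In this toy setting the greedy policy answers a student question with the question-answering module, because that transition is rewarded and leads to a state with further reward. A real system would presumably use richer state features and a function approximator rather than a lookup table.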

Keywords (Chinese)

★ Educational chatbot (教育型聊天機器人)
★ Reinforcement learning (強化式學習)

Keywords (English)

★ Educational chatbot
★ Reinforcement learning

Table of Contents

Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Problem Description
1.2 Motivation
1.3 Research Objectives
2. Related Work
2.1 Dialogue Systems
2.2 Dialogue Manager
2.3 Educational Chatbots
2.4 Deep Reinforcement Learning
2.5 Reinforcement Learning Combined with Chatbots
3. Method
3.1 Task Definition
3.2 Feature Extraction for the State Set
3.3 Feature Extraction for the Dialogue History
3.4 Methods and Models
3.4.1 Training Methods and Algorithms
3.5 Rule-Based Response Module
4. Data Preparation and Dataset
4.1 Data Sources
4.2 Annotation Process
4.3 Annotated Dialogue Data
4.3.1 Data Statistics and Analysis
4.3.2 Dialogue Response Selection and Similarity Analysis
4.3.3 Dialogue Response Rating Statistics
4.3.4 Dialogue State Statistics
5. Experiments
5.1 Experimental Analysis
5.2 Model Learning-Curve Performance
5.3 Comparison of Reinforcement Learning Methods
5.4 Comparison of Reinforcement Learning Methods with the Rule-Based Approach
5.5 Summary
6. Conclusion and Future Work
References

Advisor

Chia-Hui Chang (張嘉惠)

Approval date

2022-9-22
