Improving Dialogue State Tracking by incorporating multiple Automatic Speech Recognition results

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	蕭又誠	zh_TW
DC.creator	Yu-Cheng Hsiao	en_US
dc.date.accessioned	2018-01-25T07:39:07Z
dc.date.available	2018-01-25T07:39:07Z
dc.date.issued	2018
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=104522059
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	近年來，對話系統的發展改變了人們與電腦交流的方式。過去人們需要透過特定指令或動作才能命令電腦進行動作，而今追求的是電腦可以從對話中理解使用者的意圖，並協助達到使用者目的。相較於純聊天的對話機器人，任務式的對話機器人以完成使用者的任務為主，也因此需要克服的問題相當多。一、系統要能透過自然語言理解來明白使用者的意圖;二、系統需要進行對話管理來決策目前對話的狀態以及下個步驟;三、系統需要產生自然語言的句子回饋給使用者。而其中對話管理在對話系統中可以算是其中最為困難的課題，能否準確追蹤對話的狀態將會大大影響對話系統的結果。目前語音辨識結果中只有30%的錯誤率，雖然很多都是直接採用最好的語音辨識結果做為輸入來做對話狀態追蹤，但我們的目標是能夠藉由多個語音辨識結果的輸入來有效的改善對話狀態追蹤的準確率，此外還可以有效的允許錯誤的語音輸入結果。我們將以多個語音辨識結果為輸入，透過強化學習的方式，來決定每一輪對話中需要考慮的語音辨識結果有哪些，在聚合多個結果，根據機率選擇最有可能的作為本輪對話的狀態。而我們的方法可以在測試資料集中達到59.98%的準確率，比只使用最優語音辨識結果的系統要來的好。	zh_TW
dc.description.abstract	The development of dialogue systems has changed the way humans and computers communicate. In the past, people had to issue specific commands or instructions to make a computer perform a task. Today, we expect the computer to understand the user's intent from the dialogue and to accomplish the user's goal. Unlike chit-chat bots, task-oriented dialogue systems (TDS) aim to accomplish specific tasks, such as booking a restaurant, so a TDS is more complex than a chit-chat bot. First, a TDS needs to understand the user's intent through language understanding (LU). Second, it requires dialog management to perform dialog state tracking (DST) and dialog policy selection. Finally, the system generates a natural-language response for the user. Dialog management is the most difficult component of a task-oriented dialogue system. Our research focuses on dialog state tracking. We use the Dialog State Tracking Challenge 2 (DSTC2) dataset in our experiments. According to the statistics, the word error rate of automatic speech recognition (ASR) is 30%. Most studies use only the top ASR result as the input to their DST models. We propose to use multiple ASR results. We use reinforcement learning to select useful lower-ranked ASR results in addition to the top-1 result, and use a DST model to predict the dialog state for each selected ASR result. The final step aggregates all the predicted dialog states into the system's output. Our method achieves an accuracy of 59.98% on the test set, which is better than the baseline that uses only the top ASR result as input. In future work, we plan to incorporate language understanding information from the ASR results into our method.	en_US
DC.subject	對話系統	zh_TW
DC.subject	自動語音辨識	zh_TW
DC.subject	狀態追蹤	zh_TW
DC.subject	深度學習	zh_TW
DC.subject	強化學習	zh_TW
DC.subject	Dialogue system	en_US
DC.subject	Automatic Speech Recognition	en_US
DC.subject	State Tracking	en_US
DC.subject	Deep Learning	en_US
DC.subject	Reinforcement Learning	en_US
DC.title	藉由加入多重語音辨識結果來改善對話狀態追蹤	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Improving Dialogue State Tracking by incorporating multiple Automatic Speech Recognition results	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 104522059: full metadata record
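
The aggregation step described in the abstract (predict a dialog state for each selected ASR result, then choose the most probable value) could look roughly like the sketch below. This is a minimal illustration and not the thesis's implementation: the function name `aggregate_states`, the slot/value dictionary layout, and the optional per-hypothesis `weights` are all assumptions made for the example, and the reinforcement-learning selection of ASR results is assumed to have already happened upstream.

```python
from collections import defaultdict


def aggregate_states(predictions, weights=None):
    """Combine per-hypothesis slot distributions into one dialog state.

    predictions: list of dicts, one per selected ASR hypothesis, each
                 mapping slot name -> {value: probability}.
    weights:     optional per-hypothesis weights (for example ASR
                 confidences); uniform if omitted.
    Both the data layout and the weighting are illustrative assumptions.
    """
    if not predictions:
        return {}
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)

    # Accumulate a weighted average probability for every slot value.
    scores = defaultdict(lambda: defaultdict(float))
    for pred, w in zip(predictions, weights):
        for slot, dist in pred.items():
            for value, p in dist.items():
                scores[slot][value] += w * p / total

    # Keep the most probable value for each slot.
    return {slot: max(dist, key=dist.get) for slot, dist in scores.items()}


# Two hypothetical DST outputs, one per selected ASR hypothesis.
preds = [
    {"food": {"thai": 0.6, "chinese": 0.4}},
    {"food": {"thai": 0.2, "chinese": 0.8}, "area": {"north": 0.9}},
]
print(aggregate_states(preds))
```

With uniform weights every selected hypothesis counts equally; passing ASR confidences as `weights` would let the top-ranked hypothesis dominate when the recognizer is confident.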