Abstract (English)
In real-world settings, utterances are transcribed by automatic speech recognition (ASR) systems, which usually produce multiple candidate transcriptions (hypotheses). Most of the time, the first hypothesis is the best and is the one most commonly used. In noisy environments, however, the first ASR hypothesis often misses words that are important for language understanding (LU), and these words can often be found in the second hypothesis. On the whole, the first ASR hypothesis is still significantly better than the second, so discarding it merely because it lacks some words is not the best choice. If we can refer to the second ASR hypothesis to repair the missing or redundant words in the first hypothesis, we can obtain utterances closer to the user's true intention. In this paper we propose a method that automatically corrects the first ASR hypothesis with a reinforcement learning model, revising the first hypothesis word by word with the help of the other hypotheses. Our method raises the BLEU score of the first ASR hypothesis from 70.18 to 76.74.
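To make the idea concrete, the sketch below frames the correction as a per-word decision over the first hypothesis, with the second hypothesis supplying substitute words, and scores the result with sentence-level BLEU (the metric reported above). This is a minimal illustration, not the thesis implementation: the KEEP/SWAP/DELETE action set, the position-wise alignment between hypotheses, the example sentences, and the hand-picked action sequence are all assumptions standing in for the learned policy.

```python
# Minimal sketch (assumed names and action set, not the thesis implementation):
# per-word correction of the first ASR hypothesis using words from the second
# hypothesis, scored with sentence-level BLEU.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

ACTIONS = ("KEEP", "SWAP", "DELETE")  # assumed per-word action space


def apply_actions(hyp1, hyp2, actions):
    """Rewrite the first hypothesis word by word.

    KEEP   - keep the word from the first hypothesis
    SWAP   - substitute the word at the same position in the second hypothesis
             (assumes the two hypotheses are position-aligned)
    DELETE - drop the word entirely
    """
    out = []
    for i, (word, act) in enumerate(zip(hyp1, actions)):
        if act == "KEEP":
            out.append(word)
        elif act == "SWAP" and i < len(hyp2):
            out.append(hyp2[i])
        # DELETE (or SWAP past the end of hyp2) emits nothing
    return out


def bleu_score(reference, candidate):
    """Sentence-level BLEU against the true transcript; a learned policy could use this as its reward."""
    return sentence_bleu([reference], candidate,
                         smoothing_function=SmoothingFunction().method1)


if __name__ == "__main__":
    reference = "show me flights from boston to denver".split()
    hyp1 = "show me lights from boston to denver".split()   # first hypothesis, one word misrecognized
    hyp2 = "so me flights from boston to denver".split()    # second hypothesis, carries the missing word

    # A hand-picked action sequence stands in for the learned policy.
    actions = ["KEEP", "KEEP", "SWAP", "KEEP", "KEEP", "KEEP", "KEEP"]
    corrected = apply_actions(hyp1, hyp2, actions)

    print("hyp1 BLEU     :", bleu_score(reference, hyp1))
    print("corrected BLEU:", bleu_score(reference, corrected))
```

In this toy run the single SWAP recovers the misrecognized word and the corrected hypothesis matches the reference, so its BLEU rises accordingly; in the paper's setting the action sequence would instead be chosen by the reinforcement learning model.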