Abstract (English)
In real-world settings, utterances are transcribed by automatic speech recognition (ASR) systems, which usually produce multiple candidate transcriptions (hypotheses). Most of the time, the first hypothesis is the best and is the one most commonly used. In noisy environments, however, the first ASR hypothesis often misses words that are important for language understanding (LU), and these words can often be found in the second hypothesis. On the whole, the first ASR hypothesis is still significantly better than the second, so discarding it merely because it lacks some words is not the best choice. If we can refer to the second ASR hypothesis to repair the missing or redundant words in the first hypothesis, we can obtain utterances closer to the user's true intention. In this paper we propose a method that automatically corrects the first ASR hypothesis with a reinforcement learning model, revising the first hypothesis word by word with the help of the other hypotheses. Our method raises the BLEU score of the first ASR hypothesis from 70.18 to 76.74.
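To make the idea concrete, the sketch below frames the correction as a per-word decision over the first hypothesis, with the second hypothesis supplying substitute words, and scores the result with sentence-level BLEU (the metric reported above). This is a minimal illustration, not the thesis implementation: the KEEP/SWAP/DELETE action set, the position-wise alignment between hypotheses, the example sentences, and the hand-picked action sequence are all assumptions standing in for the learned policy.

```python
# Minimal sketch (assumed names and action set, not the thesis implementation):
# per-word correction of the first ASR hypothesis using words from the second
# hypothesis, scored with sentence-level BLEU.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

ACTIONS = ("KEEP", "SWAP", "DELETE")  # assumed per-word action space


def apply_actions(hyp1, hyp2, actions):
    """Rewrite the first hypothesis word by word.

    KEEP   - keep the word from the first hypothesis
    SWAP   - substitute the word at the same position in the second hypothesis
             (assumes the two hypotheses are position-aligned)
    DELETE - drop the word entirely
    """
    out = []
    for i, (word, act) in enumerate(zip(hyp1, actions)):
        if act == "KEEP":
            out.append(word)
        elif act == "SWAP" and i < len(hyp2):
            out.append(hyp2[i])
        # DELETE (or SWAP past the end of hyp2) emits nothing
    return out


def bleu_score(reference, candidate):
    """Sentence-level BLEU against the true transcript; a learned policy could use this as its reward."""
    return sentence_bleu([reference], candidate,
                         smoothing_function=SmoothingFunction().method1)


if __name__ == "__main__":
    reference = "show me flights from boston to denver".split()
    hyp1 = "show me lights from boston to denver".split()   # first hypothesis, one word misrecognized
    hyp2 = "so me flights from boston to denver".split()    # second hypothesis, carries the missing word

    # A hand-picked action sequence stands in for the learned policy.
    actions = ["KEEP", "KEEP", "SWAP", "KEEP", "KEEP", "KEEP", "KEEP"]
    corrected = apply_actions(hyp1, hyp2, actions)

    print("hyp1 BLEU     :", bleu_score(reference, hyp1))
    print("corrected BLEU:", bleu_score(reference, corrected))
```

In this toy run the single SWAP recovers the misrecognized word and the corrected hypothesis matches the reference, so its BLEU rises accordingly; in the paper's setting the action sequence would instead be chosen by the reinforcement learning model.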