References
[1] G. Hinton, S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
[2] G. Hinton and R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504-507, 2006.
[3] Y. Bengio, A. Courville, and P. Vincent, “Representation Learning: A Review and New Perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798-1828, Aug. 2013.
[4] Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal processing magazine 29.6 (2012): 82-97.
[5] G. Hinton and R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, pp. 504-507, 2006.
[6] A. Ng, “Sparse autoencoder,” CS294A Lecture notes, vol. 72, 2011.
[7] S. Nie, H. Zhang, X. Zhang, and W. Liu, “Deep stacking networks with time series for speech separation,” in Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, 2014, pp. 6667-6671.
[8] M. Hermans and B. Schrauwen, “Training and analyzing deep recurrent neural networks,” in Proceedings Advances in Neural Information Processing Systems, 2013, pp. 190-198.
[9] R. Pascanu, C. Gulcehre, K. Cho, and Y. Bengio, “How to construct deep recurrent neural networks,” in Proceedings International Conference on Learning Representations, 2014.
[10] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
[11] LeCun, Yann, and Yoshua Bengio. "Convolutional networks for images, speech, and time series." The handbook of brain theory and neural networks 3361.10 (1995): 1995.
[12] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
[13] He, Kaiming, et al. "Identity mappings in deep residual networks." European conference on computer vision. Springer, Cham, 2016.
[14] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
[15] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
[16] O'Shaughnessy, Douglas. "Linear predictive coding." IEEE potentials 7.1 (1988): 29-32.
[17] Logan, Beth. "Mel Frequency Cepstral Coefficients for Music Modeling." ISMIR. Vol. 270. 2000.
[18] Molau, Sirko, et al. "Computing mel-frequency cepstral coefficients on the power spectrum." Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP'01). 2001 IEEE International Conference on. Vol. 1. IEEE, 2001.
[19] McCulloch, Warren S., and Walter Pitts. "A logical calculus of the ideas immanent in nervous activity." The bulletin of mathematical biophysics 5.4 (1943): 115-133.
[20] Wu, Bo, et al. "A reverberation-time-aware approach to speech dereverberation based on deep neural networks." IEEE/ACM Transactions on Audio, Speech, and Language Processing 25.1 (2017): 102-111.
[21] Han, Kun, Yuxuan Wang, and DeLiang Wang. "Learning spectral mapping for speech dereverberation." Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
[22] Feng, Xue, Yaodong Zhang, and James Glass. "Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition." Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
[23] Wang, D. S., Y. X. Zou, and W. Shi. "A deep convolutional encoder-decoder model for robust speech dereverberation." Digital Signal Processing (DSP), 2017 22nd International Conference on. IEEE, 2017.
[24] Park, Sunchan, et al. "Linear prediction-based dereverberation with very deep convolutional neural networks for reverberant speech recognition." Electronics, Information, and Communication (ICEIC), 2018 International Conference on. IEEE, 2018.
[25] Weninger, Felix, et al. "Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition." Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
[26] Ahmad, Abdul Manan, Saliza Ismail, and D. F. Samaon. "Recurrent neural network with backpropagation through time for speech recognition." Communications and Information Technology, 2004. ISCIT 2004. IEEE International Symposium on. Vol. 1. IEEE, 2004.
[27] Weninger, Felix, et al. "Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition." Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
[28] Santos, Joao Felipe, and Tiago H. Falk. "Speech Dereverberation With Context-Aware Recurrent Neural Networks." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26.7 (2018): 1236-1246.
[29] He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet classification." Proceedings of the IEEE international conference on computer vision. 2015.
[30] Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks." Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010.
[31] Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).
[32] Vincent, Pascal, et al. "Extracting and composing robust features with denoising autoencoders." Proceedings of the 25th international conference on Machine learning. ACM, 2008.
[33] Feng, Xue, Yaodong Zhang, and James Glass. "Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition." Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014.
[34] Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. "Learning representations by back-propagating errors." Nature 323.6088 (1986): 533-536.
[35] Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv