References
Allen, J. B., & Berkley, D. A. (1979). Image method for efficiently simulating small-room acoustics. Journal of the Acoustical Society of America, vol. 65, no. 4, pp. 943-950.
Bees, D., Blostein, M., & Kabal, P. (1991). Reverberant speech enhancement using cepstral processing. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 977-980.
Benesty, J., Sondhi, M. M., & Huang, Y. (2007). Springer Handbook of Speech Processing, Ch. 4.6. Springer.
Delcroix, M., Yoshioka, T., Ogawa, A., & Kubo, Y. (2014). Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge. In Proc. REVERB Challenge, pp. 1-8.
Delfarah, M., & Wang, D. L. (2017). Features for masking-based monaural speech separation in reverberant conditions. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 5, pp. 1085-1094.
Erdogan, H., Hershey, J. R., Watanabe, S., & Le Roux, J. (2015). Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 708-712.
Garofolo, J. S., Lamel, L. F., Fisher, W. M., Fiscus, J. G., & Pallett, D. S. (1993). DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST Speech Disc 1-1.1. Tech. Rep., vol. 93.
Gillespie, B. W., Malvar, H. S., & Florencio, D. A. (2001). Speech dereverberation via maximum-kurtosis subband adaptive filtering. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3701-3704.
Habets, E. A. (2010). Room impulse response generator. Technische Universiteit Eindhoven.
Han, K., Wang, Y., & Wang, D. (2014). Learning spectral mapping for speech dereverberation. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4661-4665.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778.
Hussain, T., Siniscalchi, S. M., Lee, C.-C., Wang, S.-S., Tsao, Y., & Liao, W.-H. (2017). Experimental study on extreme learning machine applications for speech enhancement. IEEE Access, vol. 5, pp. 25542-25554.
Jin, Z., & Wang, D. L. (2009). Supervised learning approach to monaural segregation of reverberant speech. IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 4, pp. 625-638.
Lee, W. J., Wang, S. S., Chen, F., Lu, X., Chien, S. Y., & Tsao, Y. (2018). Speech dereverberation based on integrated deep and ensemble learning algorithm. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5454-5458.
Li, J., Akagi, M., & Suzuki, Y. (2006). Noise reduction based on microphone array and post-filtering for robust hands-free speech recognition in adverse environments. Ph.D. dissertation, School of Information Science, Japan Advanced Institute of Science and Technology, Japan.
Loizou, P. C. (2007). Speech Enhancement: Theory and Practice. CRC Press.
Lu, X., Tsao, Y., Matsuda, S., & Hori, C. (2014). Ensemble modeling of denoising autoencoder for speech spectrum restoration. In Proc. INTERSPEECH.
Ma, J., Hu, Y., & Loizou, P. C. (2009). Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions. Journal of the Acoustical Society of America, vol. 125, no. 5, pp. 3387-3405.
Miyoshi, M., & Kaneda, Y. (1988). Inverse filtering of room acoustics. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 2, pp. 145-152.
Mohammadiha, N., & Doclo, S. (2016). Speech dereverberation using nonnegative convolutive transfer function and spectro-temporal modeling. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 2, pp. 276-289.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proc. International Conference on Machine Learning, pp. 807-814.
Neely, S. T., & Allen, J. B. (1979). Invertibility of a room impulse response. Journal of the Acoustical Society of America, vol. 66, pp. 165-169.
Nisa, H. K. (2021). Speech dereverberation based on HELM framework for cochlear implant coding strategy. Master's thesis, Institute of Electrical Engineering, National Central University.
Radlovic, B. D., Williamson, R. C., & Kennedy, R. A. (2000). Equalization in an acoustic reverberant environment: robustness results. IEEE Transactions on Speech and Audio Processing, vol. 8, no. 3, pp. 311-319.
Rix, A. W., Beerends, J. G., Hollier, M. P., & Hekstra, A. P. (2001). Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 749-752.
Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Highway networks. CoRR, vol. abs/1505.00387.
Boll, S. F. (1979). Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, pp. 113-120.
Taal, C. H., Hendriks, R. C., Heusdens, R., & Jensen, J. (2011). An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech. IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 7, pp. 2125-2136.
Virtanen, T., Gemmeke, J., & Raj, B. (2013). Active-set Newton algorithm for overcomplete non-negative representations of audio. IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 11, pp. 2277-2289.
Wang, D., & Lim, J. (1982). The unimportance of phase in speech enhancement. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 30, no. 4, pp. 679-681.
Wang, Y., Narayanan, A., & Wang, D. (2014). On training targets for supervised speech separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 1849-1858.
Williamson, D. S., Wang, Y., & Wang, D. L. (2016). Complex ratio masking for monaural speech separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 3, pp. 483-492.
Williamson, D. S., & Wang, D. L. (2017). Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 7, pp. 1492-1501.
Wu, M., & Wang, D. L. (2006). A two-stage algorithm for one-microphone reverberant speech enhancement. IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 3, pp. 774-784.
Xiao, X., et al. (2016). Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation. EURASIP Journal on Advances in Signal Processing, vol. 2016, no. 1, pp. 1-18.
Yoshioka, T., & Nakatani, T. (2012). Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening. IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 10, pp. 2707-2720.
Zhang, X. L., & Wang, D. L. (2016). A deep ensemble learning method for monaural speech separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 5, pp. 967-977.
Department of Statistics, Ministry of Health and Welfare, Taiwan (R.O.C.). (2020). Retrieved from https://dep.mohw.gov.tw/DOS/cp-2976-13827-113.html
高士喆. (2014). Speech enhancement using a Bayesian estimator of perceptually motivated spectral amplitude. Master's thesis, Institute of Electrical Engineering, National Taipei University of Technology.
陳星瑋. (2019). Multi-channel sound source direction estimation and speech enhancement based on deep neural networks. Master's thesis, Institute of Communications Engineering, National Chiao Tung University.
黃國原. (2009). Effects of simulated cochlear implant channel number, stimulation rate, and binaural hearing on Mandarin speech recognition in noise. Master's thesis, Institute of Electrical Engineering, National Central University.
黃銘緯. (2005). Mandarin speech perception in noise test in Taiwan. Master's thesis, Institute of Speech and Hearing Science, National Taipei College of Nursing.
楊宗翰. (2012). A dereverberation method using adaptive beamforming and gain-attenuation post-filtering. Master's thesis, Institute of Electrical and Control Engineering, National Chiao Tung University.