References
[1] Juang, B. H., “Speech recognition in adverse environments,” Computer Speech and Language, vol. 5, pp. 275-294, 1991.
[2] Imai, S., “Cepstral analysis synthesis on the mel frequency scale,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '83), vol. 8, pp. 93-96, 1983.
[3] Mansour, D. and Juang, B. H., “The short-time modified coherence representation and noisy speech recognition,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 6, pp. 795-804, 1989.
[4] Singer, H., Umezaki, T. and Itakura, F., “Low bit quantization of the smoothed group delay spectrum for speech recognition,” 1990 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-90), vol. 2, pp. 761-764, 1990.
[5] Shannon, B. J. and Paliwal, K. K., “Feature extraction from higher-lag autocorrelation coefficients for robust speech recognition,” Speech Communication, vol. 48, pp. 1458-1485, 2006.
[6] Wu, J. and Yu, J., “An improved arithmetic of MFCC in speech recognition system,” 2011 International Conference on Electronics, Communications and Control (ICECC), pp. 719-722, 2011.
[7] Zhao, X. and Wang, D., “Analyzing noise robustness of MFCC and GFCC features in speaker identification,” 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7204-7208, 2013.
[8] Qi, J., Wang, D., Jiang, Y. and Liu, R., “Auditory features based on Gammatone filters for robust speech recognition,” 2013 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 305-308, 2013.
[9] Davis, S. and Mermelstein, P., “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357-366, 1980.
[10] Weng, Z., Li, L. and Guo, D., “Speaker recognition using weighted dynamic MFCC based on GMM,” 2010 International Conference on Anti-Counterfeiting Security and Identification in Communication (ASID), pp. 285-288, 2010.
[11] Mitra, V., Franco, H., Graciarena, M. and Mandal, A., “Normalized amplitude modulation features for large vocabulary noise-robust speech recognition,” 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4117-4120, 2012.
[12] Devi, M. R. and Ravichandran, T., “A novel approach for speech feature extraction by Cubic-Log compression in MFCC,” 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering (PRIME), pp. 182-186, 2013.
[13] Wilpon, J. G., Rabiner, L., Lee, C.-H. and Goldman, E. R., “Automatic recognition of keywords in unconstrained speech using hidden Markov models,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, no. 11, pp. 1870-1878, 1990.
[14] Yu, D., Deng, L. and Seide, F., “The Deep Tensor Neural Network With Applications to Large Vocabulary Speech Recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 2, pp. 388-396, 2013.
[15] Le, H.-S., Oparin, I., Allauzen, A., Gauvain, J. and Yvon, F., “Structured Output Layer Neural Network Language Models for Speech Recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 1, pp. 197-206, 2013.
[16] 王小川, “語音訊號處理” [Speech Signal Processing], Chuan Hwa Book Co., Ltd., 2009.
[17] Shamsul Alam, S. M. and Khan, S., “Response of different window methods in speech recognition by using dynamic programming,” 2014 International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT), pp. 1-6, 2014.
[18] Nickel, R. M., “Automatic speech character identification,” IEEE Circuits and Systems Magazine, vol. 6, no. 4, pp. 10-31, Fourth Quarter 2006.
[19] 王祐邦, “Advanced DSP Final Report: Speech Signal Time-Frequency Analysis and Mel-Filter Cepstral Coefficient - A Tutorial,” 2010.
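[20] 林品宏, “關鍵詞萃取系統及語音聲控車之應用” [A keyword extraction system and its application to a voice-controlled car], Master's thesis, National Central University, 2012.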
[21] Rosenberg, A. E., Lee, C. H. and Soong, F. K., “Cepstral channel normalization techniques for HMM-based speaker verification,” International Conference on Spoken Language Processing (ICSLP), pp. 1835-1838, 1994.
[22] Viikki, O. and Laurila, K., “Cepstral domain segmental feature vector normalization for noise robust speech recognition,” Speech Communication, vol. 25, pp. 133-147, 1998.
[23] Tibrewala, S. and Hermansky, H., “Multiband and adaptation approaches to robust speech recognition,” Eurospeech '97, pp. 107-110, 1997.
[24] Rose, R. C. and Paul, D. B., “A hidden Markov model based keyword recognition system,” 1990 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-90), vol. 1, pp. 129-132, 1990.
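[25] 張智傑, “多種語音特徵的合併及其在智慧型手機上之應用” [Combining multiple speech features and its application on smartphones], Master's thesis, National Central University, 2014.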
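[26] 蔡炎興, “關鍵詞萃取即語者辨識系統之研製” [Development of a keyword extraction and speaker recognition system], Master's thesis, National Central University, 2003.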
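[27] 簡忠弘, “關鍵詞辨認系統的研究與實現” [Research and implementation of a keyword recognition system], Master's thesis, National Tsing Hua University, 1997.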
[28] Jian, Z.-H. and Yang, Z., “Voice conversion using Viterbi algorithm based on Gaussian mixture model,” 2007 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2007), pp. 32-35, 2007.
[29] “大五碼” [Big5 character encoding], Institute for Information Industry, Taiwan, 1983.
[30] Oxenham, A. J. and Plack, C. J., “Suppression and the upward spread of masking,” Journal of the Acoustical Society of America, vol. 104, no. 6, pp. 3500-3510, 1998.
[31] “遮蔽效應 Masking Effect,” Audio-Video Processing Laboratory, National Central University. http://vaplab.ce.ncu.edu.tw/chinese/pcchang/course2009a/avsp/Masking%20Effect.pdf
[32] Zhu, X., Chen, Y., Liu, J. and Liu, R., “Feature selection in Mandarin large vocabulary continuous speech recognition,” 2002 6th International Conference on Signal Processing, vol. 1, pp. 508-511, 2002.
[33] 呂易宸, “語音門禁系統” [A voice-based access control system], Master's thesis, National Central University, 2011.
[34] Ney, H., “The use of a one-stage dynamic programming algorithm for connected word recognition,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, no. 2, pp. 263-271, 1984.
[35] Wang, J.-F., Wu, C.-H., Haung, C.-C. and Lee, J.-Y., “Integrating neural nets and one-stage dynamic programming for speaker independent continuous Mandarin digit recognition,” 1991 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-91), vol. 1, pp. 69-72, 1991.
[36] 林佑輯, “互動式語音導覽系統” [An interactive voice guide system], Master's thesis, National Central University, 2010.
[37] “MAT Speech Database,” Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
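[38] 高志杰, “粒子群演算法應用於梅爾濾波器組之研究” [A study of particle swarm optimization applied to Mel filter banks], Master's thesis, National Central University, 2013.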