Abstract (English) |
The purpose of this research was to add a speech enhancement stage to an automatic scene classification and auto-matching noise reduction system, applied after the adaptive directional microphone strategy, in order to further improve speech intelligibility and system performance. The speech enhancement system is divided into two parts: a noise-estimation strategy and a speech-estimation function. The noise-estimation algorithms used in this research were Minimum Statistics (MS), Minima-Controlled Recursive Averaging (MCRA), Improved Minima-Controlled Recursive Averaging (IMCRA), Minima-Controlled Recursive Averaging-Loizou (MCRA-L), Constrained Variance Spectral Smoothing (CVS), and Forward-Backward MCRA (MCRA-FB). The speech-estimation functions were Maximum-Likelihood (ML), Log-Spectral Amplitude (LSA), Maximum A Posteriori Amplitude (MAPA), Wiener-type, and Wiener Filter.
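The two-part structure described above can be illustrated with a minimal sketch. It is not the thesis implementation: the noise tracker is a heavily simplified running-minimum estimator (real Minimum Statistics adds bias compensation and adaptive smoothing), and the gain is a basic Wiener-type rule with a maximum-likelihood a priori SNR estimate. The function names, `window`, `smooth`, and `floor` are assumptions chosen for illustration.

```python
def track_noise_min(power_frames, window=8, smooth=0.8):
    """Part 1 (noise estimation): per-bin minimum of recursively
    smoothed power over the last `window` frames (simplified)."""
    n_bins = len(power_frames[0])
    smoothed = list(power_frames[0])
    history = []
    noise = []
    for frame in power_frames:
        # First-order recursive smoothing of the noisy power spectrum.
        smoothed = [smooth * s + (1.0 - smooth) * p
                    for s, p in zip(smoothed, frame)]
        history.append(smoothed)
        recent = history[-window:]
        # Speech is intermittent, so the recent minimum approximates noise.
        noise.append([min(h[k] for h in recent) for k in range(n_bins)])
    return noise


def wiener_gain(noisy_power, noise_power, floor=0.05):
    """Part 2 (speech estimation): Wiener-type spectral gain."""
    snr_post = noisy_power / max(noise_power, 1e-12)  # a posteriori SNR
    xi = max(snr_post - 1.0, 0.0)                     # ML a priori SNR
    return max(xi / (1.0 + xi), floor)                # floor limits artifacts


# One frequency bin: noise power 1.0 with a three-frame speech burst.
frames = [[1.0]] * 5 + [[101.0]] * 3 + [[1.0]] * 5
noise = track_noise_min(frames)
gains = [wiener_gain(f[0], n[0]) for f, n in zip(frames, noise)]
```

In this toy input the noise estimate stays near the noise power during the burst, so burst frames receive a gain close to 1 while noise-only frames are attenuated to the floor.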
In this research, MATLAB (The MathWorks, Natick, Massachusetts, USA) was first used to simulate the speech enhancement system. The simulation mainly evaluated the speech quality of the enhanced signal for noisy speech inputs at different signal-to-noise ratios (SNRs), in order to select the best combination of noise-estimation algorithm and speech-estimation function. Finally, the selected speech enhancement system was implemented together with the automatic scene classification and auto-matching noise reduction system on a TMS320C6713 DSP Starter Kit (Texas Instruments, Dallas, Texas, USA), and its output was compared with that of the original noise reduction system. To show the performance of the selected speech enhancement system, the objective perceptual evaluation of speech quality (PESQ) measure and the subjective speech reception threshold (SRT) were used to evaluate speech quality over an SNR range from 30 dB to -30 dB.
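Test inputs at a chosen SNR are commonly produced by scaling the noise before adding it to clean speech. The sketch below shows that arithmetic under the assumption that SNR is defined as the ratio of mean speech power to mean noise power over the whole signal. The abstract does not state how the test signals were actually mixed, and the helper names are hypothetical.

```python
import math


def mean_power(x):
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech + scaled noise has the requested SNR."""
    # Solve Ps / (scale^2 * Pn) = 10^(snr_db / 10) for scale.
    scale = math.sqrt(mean_power(speech)
                      / (mean_power(noise) * 10.0 ** (snr_db / 10.0)))
    mixed = [s + scale * n for s, n in zip(speech, noise)]
    return mixed, scale


speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
mixed, scale = mix_at_snr(speech, noise, 10.0)

# Check: recompute the SNR from the scaled noise.
scaled_noise = [scale * n for n in noise]
achieved_db = 10.0 * math.log10(mean_power(speech) / mean_power(scaled_noise))
```

Sweeping `snr_db` from 30 down to -30 would reproduce the range of conditions named in the text.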
In the objective evaluation, the simulation results showed that the PESQ score increased by 0.45 when CVS noise estimation with MAPA speech estimation was applied to an input signal with 30 dB SNR, and by 0.65 for 10 dB SNR. For the hardware implementation, only MCRA with MAPA was used for real-time processing. The experimental results indicated that the speech enhancement system decreased speech quality by 0.36 for the input signal with 30 dB SNR. When the SNR was below 10 dB, the automatic scene classification system automatically selected the microphone noise reduction strategy. With the speech enhancement system, the overall hardware implementation effectively reduced speech distortion and improved speech quality: the PESQ score increased by 0.27 for the input signal with 0 dB SNR.
For the subjective evaluation, the SRTs of five normal-hearing subjects (23 to 26 years old) were measured under different noise conditions with the HINT Pro system (Bio-logic, Chicago, IL, USA). The experimental results showed that speech enhancement did not improve the subjects' SRT, which was worse than with the original system. The average SRT increased by 8.54 dB because the level of the signal processed by the speech enhancement system became too low, even though the objective speech quality improved. These results suggest that the speech enhancement system provides better speech quality at high SNR when a shorter frame length is used, despite some distortion at low SNR. Nevertheless, the speech enhancement system was able to greatly improve speech intelligibility when a longer frame length was used. If an amplifier stage were included in the system, the whole system could achieve the same performance as in the objective evaluation.
|