Development of a Speech Enhancement Dual-Microphone Noise Reduction System Utilizing TMS320C6713

Name

陳政鋒 (Zeng-fong Chen)

Department

Department of Electrical Engineering

Thesis Title

運用 TMS320C6713 開發可語音增強之雙麥克風除噪系統
(Development of a speech enhancement dual-microphone noise reduction system utilizing TMS320C6713)

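Related Theses

★ A study of sound signal separation in real environments using independent component analysis
★ Segmentation and three-dimensional gray-level interpolation of oral magnetic resonance images
★ Design of a digital peak flow monitoring system for asthma
★ Effect of combining a cochlear implant with a hearing aid on Mandarin speech recognition
★ Simulation of Mandarin speech recognition performance with the advanced combination encoder strategy for cochlear implants: analysis of combined hearing aid use
★ A functional magnetic resonance imaging study of the neural correlates of Mandarin speech production
★ Construction of a three-dimensional biomechanical tongue model using the finite element method
★ Construction of an MRI-based three-dimensional tongue atlas
★ A simulation study of the relationship between calcium oxalate concentration changes in the renal tubule and calcium oxalate stones
★ Automatic segmentation of tongue structures in oral magnetic resonance images
★ A study of electrical matching of microwave output windows
★ Development of a software-based hearing aid simulation platform: noise reduction
★ Development of a software-based hearing aid simulation platform: feedback cancellation
★ Simulated effects of cochlear implant channel number, stimulation rate, and binaural hearing on Mandarin speech recognition in noise
★ A neural network study of the neural correlates of Mandarin tone production
★ Construction of computer-simulated physiological systems for teaching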

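This electronic thesis is authorized for immediate open access.
Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.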

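Abstract (Chinese)

The purpose of this study is to add a speech enhancement function to the back end of a dual-microphone noise reduction system with automatic scene classification and compensation, so that the dual-microphone system can further improve its noise reduction and produce output with higher speech intelligibility. A speech enhancement system consists of two main parts: a noise-estimation strategy and a speech-estimation function. The noise-estimation strategies used in this study are Minimum Statistics (MS), Minima-Controlled Recursive Averaging (MCRA), Improved Minima-Controlled Recursive Averaging (IMCRA), Loizou's improved Minima-Controlled Recursive Averaging (MCRA-L), Constrained Variance Spectral Smoothing (CVS), and Forward-Backward Minima-Controlled Recursive Averaging (MCRA-FB). The speech-estimation functions are Maximum-Likelihood (ML), Log-Spectral Amplitude (LSA), Maximum A Posteriori Amplitude (MAPA), a modified Wiener filter (Wiener-type), and the Wiener filter.
MATLAB (The MathWorks, Natick, Massachusetts, USA) was used to simulate the speech enhancement system in software. The simulation evaluated the speech quality of noisy speech inputs at different signal-to-noise ratios (SNRs) after speech enhancement, and the best-performing combination was then selected. Finally, the speech enhancement system was combined with the dual-microphone noise reduction system with automatic scene classification and compensation, implemented on the TMS320C6713 development board (Texas Instruments, Dallas, Texas, USA), and compared in speech quality with the noise reduction system before speech enhancement was added. Speech quality was evaluated with the objective Perceptual Evaluation of Speech Quality (PESQ) measure and the subjective Speech Reception Threshold (SRT), with input SNRs ranging from 30 dB to -30 dB.
In the objective evaluation, the software simulation showed that CVS combined with MAPA improved PESQ by 0.45 at an input SNR of 30 dB and by as much as 0.65 at an input SNR of 10 dB. To allow real-time computation on the development board, only MCRA combined with MAPA could be used in hardware. The hardware experiments showed that PESQ dropped by 0.36 at an SNR of 30 dB. When the SNR fell below 10 dB, the automatic scene classification system automatically switched on the directional microphone. Under the combined action of the directional microphone and speech enhancement, speech distortion was effectively reduced and speech quality improved, with PESQ increasing by up to 0.27 at an SNR of 0 dB.
In the subjective evaluation, a HINT Pro audiometric system (Bio-logic, Chicago, IL, USA) was used to measure SRT in different noise environments for five subjects aged 23 to 26. The results showed that the subjects' average SRT rose by 8.54 dB: adding the speech enhancement system did not improve SRT and instead made it worse than before. This is because the volume of the sound after speech enhancement became too low, so that although speech quality improved, SRT rose rather than fell. These results confirm that, with a shorter frame length, adding speech enhancement causes slight distortion in low-noise environments but still effectively improves speech quality in high-noise environments, and that with a longer frame length and overlapping frames, speech enhancement improves speech intelligibility far more substantially. Because real-time computation on the development board allowed only a shorter frame length, adding an amplifier would help the whole system achieve in practical use the same results as in the objective evaluation.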

Abstract (English)

The purpose of this research was to add a speech enhancement stage after the adaptive directional microphone strategy of a dual-microphone noise reduction system with automatic scene classification and auto-matching, in order to further improve the system's noise reduction performance and speech intelligibility. The speech enhancement system is divided into two parts: a noise-estimation strategy and a speech-estimation function. The noise-estimation algorithms used in this research were Minimum Statistics (MS), Minima-Controlled Recursive Averaging (MCRA), Improved Minima-Controlled Recursive Averaging (IMCRA), Minima-Controlled Recursive Averaging-Loizou (MCRA-L), Constrained Variance Spectral Smoothing (CVS), and Forward-Backward MCRA (MCRA-FB). The speech-estimation functions were Maximum-Likelihood (ML), Log-Spectral Amplitude (LSA), Maximum A Posteriori Amplitude (MAPA), Wiener-type, and the Wiener filter.
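As a rough illustration of how a noise estimate and a speech-estimation function fit together, the following Python sketch computes a per-bin Wiener-style gain from a noisy power spectrum and a noise power estimate. It is a minimal sketch under stated assumptions, not the thesis implementation: the function name `wiener_gain`, the simple maximum-likelihood-style a priori SNR estimate, and the gain floor of 0.05 are all choices made for this example.

```python
def wiener_gain(noisy_power, noise_power, floor=0.05):
    """Return one Wiener-style gain per frequency bin.

    noisy_power: |Y(k)|^2 for each bin of the noisy frame.
    noise_power: estimated noise power for each bin.
    floor: hypothetical lower bound on the gain (an assumption of this sketch).
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if n <= 0.0:
            # No noise estimated in this bin: pass the bin through unchanged.
            gains.append(1.0)
            continue
        # Simple a priori SNR estimate: (noisy power / noise power) - 1, clipped at 0.
        xi = max(y / n - 1.0, 0.0)
        # Wiener gain xi / (1 + xi), limited from below by the floor.
        gains.append(max(xi / (1.0 + xi), floor))
    return gains


# A bin 6 dB above the noise is mostly kept; a bin at the noise level is floored.
print(wiener_gain([4.0, 1.0], [1.0, 1.0]))  # [0.75, 0.05]
```

The enhanced spectrum would then be the noisy spectrum multiplied bin by bin by these gains. The other estimators named above (ML, LSA, MAPA) replace the gain formula, while the noise-estimation strategy supplies `noise_power`.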
In this research, MATLAB (The MathWorks, Natick, Massachusetts, USA) was first used to simulate the speech enhancement system. The simulation evaluated the speech quality of noisy speech inputs at different signal-to-noise ratios (SNRs) after speech enhancement, and the best-performing combination of noise-estimation strategy and speech-estimation function was then selected. Finally, the selected speech enhancement system was implemented together with the automatic scene classification and auto-matching noise reduction system on the TMS320C6713 DSP Starter Kit (Texas Instruments, Dallas, Texas, USA), and its output was compared with that of the original noise reduction system. Speech quality was evaluated with the objective perceptual evaluation of speech quality (PESQ) measure and the subjective speech reception threshold (SRT), for input SNRs ranging from 30 dB to -30 dB.
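Evaluating a system across a range of input SNRs requires noisy inputs with a known SNR. One common way to prepare them, sketched below in Python, is to scale the noise so that the speech-to-noise power ratio matches the target. The helper names `mean_power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, and the record does not say how the thesis prepared its test material, so this is only an illustration of the SNR definition.

```python
import math


def mean_power(x):
    """Average power (mean of squared samples) of a sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech after scaling it to reach the requested SNR in dB."""
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must have non-zero power")
    # SNR(dB) = 10 * log10(P_speech / P_noise), solved for the noise scale factor.
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR of a mixture, given the clean speech that went into it."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10.0 * math.log10(mean_power(speech) / mean_power(residual))


speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
print(round(measured_snr_db(speech, noisy), 6))  # 10.0
```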
In the objective evaluation, the simulation results showed that speech enhancement using CVS with MAPA increased the PESQ score by 0.45 for an input signal at 30 dB SNR and by 0.65 at 10 dB SNR. For the hardware implementation, only MCRA with MAPA could be used, because of the real-time processing constraint. The experimental results indicated that the speech enhancement system decreased the PESQ score by 0.36 for an input signal at 30 dB SNR. When the SNR was below 10 dB, the automatic scene classification system automatically switched on the directional microphone. With the directional microphone and speech enhancement acting together, the overall hardware implementation effectively reduced speech distortion and improved speech quality; the PESQ score increased by up to 0.27 for an input signal at 0 dB SNR.
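The idea behind minima-controlled noise tracking can be sketched for a single frequency bin as follows. This Python example is a much-simplified illustration in the spirit of MCRA, not Cohen's published algorithm and not the thesis code: it uses a hard speech-present decision instead of a smoothed speech-presence probability, and the function name `track_noise` and the parameter values (`alpha_s`, `alpha_d`, `delta`, `window`) are assumptions chosen for the example.

```python
def track_noise(power_frames, alpha_s=0.8, alpha_d=0.95, delta=5.0, window=8):
    """Track the noise power of one frequency bin across frames.

    power_frames: noisy power |Y|^2 of the bin, one value per frame.
    alpha_s: smoothing factor for the power used in the speech decision.
    alpha_d: smoothing factor for the noise update.
    delta: ratio of smoothed power to local minimum above which speech is assumed.
    window: number of recent frames searched for the local minimum.
    """
    smoothed = power_frames[0]
    noise = power_frames[0]
    history = [smoothed]
    estimates = []
    for p in power_frames:
        # Recursive smoothing of the noisy power over time.
        smoothed = alpha_s * smoothed + (1.0 - alpha_s) * p
        history.append(smoothed)
        history = history[-window:]
        # The local minimum of the smoothed power approximates the noise level.
        local_min = min(history)
        speech_present = smoothed > delta * local_min
        if not speech_present:
            # Update the noise estimate only when speech is judged absent.
            noise = alpha_d * noise + (1.0 - alpha_d) * p
        estimates.append(noise)
    return estimates


# Steady noise of power 1.0 with a short loud burst standing in for speech.
frames = [1.0] * 20 + [100.0] * 5 + [1.0] * 20
estimates = track_noise(frames)
print(max(estimates) < 1.01)  # True: the burst does not leak into the noise estimate
```

Because the update is frozen whenever the smoothed power rises well above its recent minimum, the estimate ignores short speech-like bursts but still follows slow changes in the noise level. Each frame needs only a few multiply-adds and a short minimum search per bin, which gives a sense of why a recursive-averaging scheme can be attractive under a real-time budget.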
For the subjective evaluation, the SRTs of five normal-hearing subjects (aged 23 to 26) were measured in different noise conditions with the HINT Pro system (Bio-logic, Chicago, IL, USA). The experimental results showed that speech enhancement did not improve the subjects' SRT but made it worse than with the original system: the average SRT increased by 8.54 dB, because the volume of the signal processed by the speech enhancement system became too low, even though the objective speech quality improved. These results suggest that, with a shorter frame length, the speech enhancement system improves speech quality at low SNR (high noise) despite causing some distortion at high SNR (low noise). With a longer frame length and overlapping frames, the speech enhancement system improved speech intelligibility considerably more. Because only a shorter frame length allowed real-time processing on the development board, adding an amplifier stage could help the whole system achieve in practice the same performance as in the objective evaluation.
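The role of frame length and frame overlap can be made concrete with a small overlap-add sketch in Python. It assumes a periodic Hann window with 50% overlap, a common choice in frame-based speech enhancement; the record does not state which window, frame lengths, or overlap the thesis used. The function names `hann` and `process_overlap_add` are hypothetical, and the per-frame processing is reduced to a single `gain` so that reconstruction can be checked.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def process_overlap_add(signal, frame_len, gain=1.0):
    """Window the signal into 50%-overlapping frames, scale each, and overlap-add.

    frame_len must be even. With gain=1.0, every sample covered by two frames is
    reconstructed exactly, because shifted periodic Hann windows sum to one.
    """
    hop = frame_len // 2
    window = hann(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        for i in range(frame_len):
            # A real enhancer would transform the frame and apply per-bin gains here.
            out[start + i] += gain * window[i] * signal[start + i]
    return out


signal = [math.sin(0.1 * i) for i in range(64)]
out = process_overlap_add(signal, frame_len=16)
# Samples 8..55 are covered by two frames and match the input.
print(all(abs(out[i] - signal[i]) < 1e-9 for i in range(8, 56)))  # True
```

Overlapping frames roughly double the number of frames processed per second at 50% overlap, and longer frames cost more per transform, which is consistent with the trade-off between enhancement quality and real-time operation described above.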

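Keywords (Chinese)

★ 語音增強 (speech enhancement)
★ 適應性方向性麥克風 (adaptive directional microphone)
★ TMS320C6713
★ 自動情境分類 (automatic scene classification)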

Keywords (English)

★ speech enhancement
★ adaptive directional microphone strategy
★ TMS320C6713
★ automatic scene classification

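Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation
1.3 Literature Review
1.3.1 Speech Spectrum Estimation
1.3.2 Noise Spectrum Estimation
1.3.3 Related Research in Taiwan
1.4 Research Objectives
1.5 Thesis Organization
Chapter 2 Speech Enhancement Strategies
2.1 Speech-Estimation Functions
2.1.1 Introduction to Common Coefficients
2.1.2 Maximum-Likelihood Strategy
2.1.3 Minimum Mean-Square Error Estimation
2.1.4 Log-Spectral Amplitude Estimation
2.1.5 Maximum A Posteriori Amplitude Estimation
2.1.6 Modified Wiener Filter
2.2 Noise-Estimation Methods
2.2.1 Minimum Statistics
2.2.2 Minima-Controlled Recursive Averaging
2.2.3 Improved Minima-Controlled Recursive Averaging
2.2.4 Loizou's Improved Minima-Controlled Recursive Averaging
2.2.5 Constrained Variance Spectral Smoothing and Minima Tracking
2.2.6 Forward-Backward Minima-Controlled Recursive Averaging
Chapter 3 Software Simulation
3.1 Speech Material and Noise Material
3.2 Implementation and Experimental Procedure of the Noise Spectrum Estimation Strategies
3.2.1 Implementation Method and Experimental Environment
3.2.2 Simulation Experiments on the Noise Spectrum Estimation Strategies
3.3 Simulation and Experimental Procedure of the Speech Enhancement System
3.3.1 Experimental Procedure and Speech Material
3.3.2 Speech Enhancement System Simulation Experiment 1
3.3.3 Speech Enhancement System Simulation Experiment 2
3.4 Simulation Results for the Effect of the Speech Enhancement System on PESQ Scores
Chapter 4 Hardware Implementation and Comparison of Results
4.1 TMS320C6713 Development Board and Microphone Circuit
4.2 Hardware Implementation of the Speech Enhancement System and Results
4.2.1 Implementation of the Speech Enhancement System and Experimental Environment
4.2.2 Experiment 1
4.2.3 Experiment 2
4.3 Subjective Evaluation of the Hardware Implementation of the Speech Enhancement System
4.3.1 Experimental Environment and Procedure
4.3.2 Experimental Method and Discussion of Results
4.4 Discussion of the Experimental Results on the Hardware Platform
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
Appendix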

References

Arslan, L., McCree, A., and Viswanathan, V. (1995). “New methods for adaptive noise suppression,” IEEE International Conference on Acoustics, Speech, and Signal Processing, 1, 812-815.

Boll, S. F. (1979). “Suppression of Acoustic Noise in Speech Using Spectral Subtraction,” IEEE Transactions on Acoustics, Speech, and Signal Processing, 27, 113-120.

Cohen, I., and Berdugo, B. (2002). "Noise estimation by minima controlled recursive averaging for robust speech enhancement," IEEE Signal Processing Letters, 9, 12-15.

Cohen, I. (2002). "Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator," IEEE Signal Processing Letters, 9, 113-116.

Cohen, I. (2003). "Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging," IEEE Transactions on Speech and Audio Processing, 11, 466-475.

Derakhshan, N., Akbari, A., and Ayatollahi, A. (2009). “Noise power spectrum estimation using constrained variance spectral smoothing and minima tracking,” Speech Communication, 51, 1098-1113.

Doblinger, G. (1995). “Computationally efficient speech enhancement by spectral minima tracking in subbands,” Proc. Eurospeech, 2, 1513-1516.

Ephraim, Y., and Malah, D. (1984). "Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator," IEEE Transactions on Acoustics, Speech and Signal Processing, 32, 1109-1121.

Ephraim, Y., and Malah, D. (1985). "Speech enhancement using a minimum mean-square error log-spectral amplitude estimator," IEEE Transactions on Acoustics, Speech and Signal Processing, 33, 443-445.

Hamacher, V., Chalupper, J., Eggers, J., Fischer, E., Kornagel, U., Puder, H., and Rass, U. (2005). “Signal processing in high-end hearing aids: state of the art, challenges, and future trends,” EURASIP Journal on Applied Signal Processing, 18, 2915-2929.

Hirsch, H. G., and Ehrlicher, C. (1995). “Noise estimation techniques for robust speech recognition,” IEEE International Conference on Acoustics, Speech, and Signal Processing, 1, 153-156.

Hu, Y., and Loizou, P. C. (2004). “Speech enhancement based on wavelet thresholding the multitaper spectrum,” IEEE Transactions on Speech and Audio Processing, 12, 59-67.

ITU-T (2001). Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs. ITU-T P.862.

Li, J. (2006). “Noise reduction based on microphone array and post-filtering for robust hands-free speech recognition in adverse environments,” Signal Processing, 2006 8th International Conference, 1.

Lotter, T., and Vary, P. (2005). “Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model,” EURASIP Journal on Applied Signal Processing, 7, 1110-1126.

Martin, R. (1994). "Spectral subtraction based on minimum statistics," 7th European Signal Processing Conference, 94, 1182-1185.

Martin, R. (2001). “Noise power spectral density estimation based on optimal smoothing and minimum statistics,” IEEE Transactions on Speech and Audio Processing, 9, 504-512.

McAulay, R. J., and Malpass, M. L. (1980). “Speech enhancement using a soft-decision noise suppression filter,” IEEE Transactions on Acoustics, Speech, and Signal Processing, 28, 137-145.

NOISEX-92 (1993). NOISEX-92 noise database, Signal Processing Information Base by the Signal Processing Society and the National Science Foundation. http://spib.rice.edu/spib.html.

Rangachari, S., and Loizou, P. C. (2006). “A noise-estimation algorithm for highly non-stationary environments,” Speech Communication, 48, 220-231.

TI (2003). "TMS320C6713 DSK Technical Reference," 506735-0001 Rev. B.

Wolfe, P. J., and Godsill, S. J. (2003). “Efficient alternatives to the Ephraim and Malah suppression rule for audio signal enhancement,” EURASIP Journal on Applied Signal Processing, 10, 1043-1051.

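Social and Family Affairs Administration, Ministry of Health and Welfare, website, 2013 data.
http://www.sfaa.gov.tw/SFAA/default.aspx

Department of Statistics, Ministry of the Interior, Executive Yuan, website, 2014 data.
http://www.moi.gov.tw/stat/index.aspx

科林聽力團隊 (2011). “Hearing Aids” (助聽器學), 科林儀器股份有限公司, New Taipei City, Taiwan.

黃銘緯 (2005). “Mandarin speech perception test in noise for the Taiwan region,” Master's thesis, Graduate Institute of Speech and Hearing Disorders and Sciences, National Taipei College of Nursing.

陽吉文 (2006). “A study of speech enhancement using a microphone array and speech estimation,” Master's thesis, Department of Electrical Engineering, National Tsing Hua University.

李銘浚 (2007). “A study of speech enhancement using independent component analysis, log-spectral estimation, and frequency component adjustment,” Master's thesis, Department of Electrical Engineering, National Tsing Hua University.

黃承德 (2009). “A study of speech enhancement based on a microphone array and speech estimation,” Master's thesis, Department of Electrical Engineering, National Tsing Hua University.

陳淼海 (2009). “Distant noisy speech recognition based on blind source separation speech enhancement,” Master's thesis, Graduate Institute of Telecommunications Engineering, National Cheng Kung University.

蕭任柏 (2009). “Speech enhancement using subspace analysis on perceptual signals,” Master's thesis, Graduate Institute of Electrical Engineering, National Chiao Tung University.

廖育志 (2011). “A speech enhancement system combining noise suppression and voiced speech reconstruction,” Master's thesis, Department of Electrical Engineering, National Tsing Hua University.

洪千焙 (2011). “A study of forward-backward minima-controlled recursive averaging noise estimation for speech enhancement,” Master's thesis, Graduate Institute of Electrical Engineering, Southern Taiwan University of Science and Technology.

許詠傑 (2009). “Development of a software-based hearing aid simulation platform: noise reduction,” Master's thesis, Graduate Institute of Electrical Engineering, National Central University.

沈宗穎 (2011). “Development of a software-based hearing aid simulation platform: simulation of Unitron, Widex, and Oticon noise reduction strategies,” Master's thesis, Graduate Institute of Electrical Engineering, National Central University.

劉庭安 (2012). “Development of a dual-microphone noise reduction system with automatic scene classification using TMS320C6713,” Master's thesis, Graduate Institute of Electrical Engineering, National Central University.

楊彥明 (2014). “Development of an auto-matching dual-microphone noise reduction system using TMS320C6713,” Master's thesis, Graduate Institute of Electrical Engineering, National Central University.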

Advisor

吳炤民 (Chao-Min Wu)

Approval Date

2015-7-1