Speech Dereverberation Based on H-ELM framework for Cochlear Implant Coding Strategy


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/85111

Title:	Speech Dereverberation Based on H-ELM framework for Cochlear Implant Coding Strategy
Author:	女哲藹; Nisa, Harisma Khoirun
Contributor:	Department of Electrical Engineering
Keywords:	Hierarchical Extreme Learning Machine (HELM); reverberation; dereverberation; feature learning; mapping target; masking target; cochlear implant; CI strategies
Date:	2021-01-26
Upload time:	2021-03-18 17:41:53 (UTC+8)
Publisher:	National Central University
Abstract:	In real-world conditions, human speech is distorted by background noise and reverberation, which degrade speech intelligibility and speech quality, especially for cochlear implant (CI) users. Environmental noise under reverberant conditions is one of the challenges for CI users' speech understanding in everyday life. The purpose of this study is to increase the intelligibility and perceived quality of speech using machine learning. The Hierarchical Extreme Learning Machine (HELM) framework, in two variants (original HELM and Highway HELM), can be trained quickly across different reverberant conditions and attenuates reverberation effectively. Two training targets were applied to this framework: a mapping target and ideal ratio masking (IRM). The Taiwan Mandarin Hearing in Noise Test (TMHINT) corpus and the short-time objective intelligibility (STOI) measure were used to evaluate the performance of the HELM framework. The experimental results showed that the average STOI scores obtained with the mapping training target (from 0.677 to 0.683) were better than those obtained with the masking training target (from 0.677 to 0.641) for attenuating reverberation. However, the two variants, original HELM (from 0.683 to 0.706) and Highway HELM (from 0.683 to 0.707), showed no notable difference in dereverberation results. The dereverberated speech produced by the HELM framework was further processed by cochlear implant sound coding strategies, namely Advanced Combination Encoder (ACE), Envelope Enhancement (EE), and Fundamental Frequency Modulation (F0mod), to simulate the listening performance of CI users. The results showed that the HELM framework with the mapping target could improve speech intelligibility for both the ACE and EE strategies.
Appears in collections:	[Graduate Institute of Electrical Engineering] Theses and Dissertations
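
Illustrative sketch (not taken from the thesis): the abstract names two training targets, a mapping target and an ideal ratio mask (IRM), and an extreme learning machine as the learner. The Python code below shows one common way these pieces are defined, under stated assumptions. The names ideal_ratio_mask and ELMRegressor are hypothetical, the mask treats the excess energy of the reverberant spectrum over the clean spectrum as the interference term, and the learner is a plain single-hidden-layer ELM (random fixed hidden weights, ridge-regularized least-squares output weights). It does not reproduce the hierarchical or highway variants evaluated in the thesis.

```python
import numpy as np


def ideal_ratio_mask(clean_mag, reverb_mag, eps=1e-8):
    """Simplified ratio mask in [0, 1] (assumed definition, see lead-in).

    Both inputs are magnitude spectrograms of the same shape.
    """
    clean_pow = clean_mag ** 2
    # Assumption: energy beyond the clean signal counts as reverberant residue.
    resid_pow = np.maximum(reverb_mag ** 2 - clean_pow, 0.0)
    return np.sqrt(clean_pow / (clean_pow + resid_pow + eps))


class ELMRegressor:
    """Single-hidden-layer extreme learning machine for regression."""

    def __init__(self, n_hidden=64, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random, untrained hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, T):
        # Hidden weights are drawn once and never updated.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # Closed-form ridge solution for the output weights.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Usage on synthetic data standing in for spectral feature frames.
rng = np.random.default_rng(1)
reverb_feats = rng.standard_normal((200, 10))              # input frames
clean_feats = reverb_feats @ rng.standard_normal((10, 5))  # mapping target
model = ELMRegressor(n_hidden=64).fit(reverb_feats, clean_feats)
estimate = model.predict(reverb_feats)
```

With a mapping target the model regresses clean features directly, as in the usage lines above. With a masking target it would instead be fitted to the mask values, and the predicted mask would be multiplied with the reverberant spectrum.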

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	114	View/Open

All items in NCUIR are protected by the original copyright.
