Master's/Doctoral Thesis 108522059: Full Metadata Record

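DC Field | Value | Language
DC.contributor | 資訊工程學系 (Department of Computer Science and Information Engineering) | zh_TW
DC.creator | 陳柏勳 | zh_TW
DC.creator | Po-Hsun Chen | en_US
dc.date.accessioned | 2021-09-27T07:39:07Z |
dc.date.available | 2021-09-27T07:39:07Z |
dc.date.issued | 2021 |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=108522059 |
dc.contributor.department | 資訊工程學系 (Department of Computer Science and Information Engineering) | zh_TW
DC.description | 國立中央大學 | zh_TW
DC.description | National Central University | en_US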
dc.description.abstract | 使用發音的運動資訊合成語音,能為現實應用帶來益處,例如聲帶受損的病患、需要靜音通話的場景,或是在高噪音的環境中。在這項研究中,我們探索了另類數據,即電子硬顎圖 (Electropalatography, EPG),並提出了一種新穎的多模態 EPG 轉語音 (EPG-to-Speech, EPG2S) 合成系統。我們的模型有兩項目標:(1) 僅使用 EPG 信號合成語音。 (2) 如果我們可以在有噪聲的環境中同時獲得語者的語音信號,我們就可以利用 EPG 信號進行語音增強 (SE)。在 EPG2S 系統中我們研究了兩種融合策略,分別為後期融合 (Late Fusion, LF) 和早期融合 (Early Fusion, EF)。在漢語語料庫上的實驗結果表明,第一個目標中,與加入真實世界噪聲的語音相比,所提出的多模態 EPG2S 系統平均皆優於 SNR 為 -5dB 或更低的背景噪聲。第二個目標中,這些系統在 PESQ、STOI 和 ESTOI 這些語音評估指標中,優於僅使用語音訊號的 SE 系統。這些結果驗證了使用 EPG 信號合成語音的可行性以及將其納入 SE 系統的有效性。 | zh_TW
dc.description.abstract | Synthesizing speech from articulatory movement information can benefit real-world applications, such as patients with vocal cord disorders, situations that require silent communication, and high-noise environments. In this study, we explore an alternative data source, electropalatography (EPG), and propose a novel multimodal EPG-to-speech (EPG2S) synthesis system. Our model has two goals: (1) to synthesize speech using only the EPG signal, and (2) to use the EPG signal for speech enhancement (SE) when the speaker's audio signal can be captured simultaneously in a noisy environment. Two fusion strategies are investigated for the EPG2S system: late fusion (LF) and early fusion (EF). Experimental results on a Mandarin corpus show that, for the first goal, the proposed multimodal EPG2S systems on average outperform speech corrupted by real-world noise at SNR levels of -5 dB or lower. For the second goal, these systems outperform their audio-only SE counterparts on the PESQ, STOI, and ESTOI speech evaluation metrics. These results verify the feasibility of synthesizing speech from EPG signals and the effectiveness of incorporating them into SE systems. | en_US
DC.subject | 多模態 | zh_TW
DC.subject | 電子硬顎圖 | zh_TW
DC.subject | 語音合成 | zh_TW
DC.subject | 語音增強 | zh_TW
DC.subject | multimodal | en_US
DC.subject | electropalatography | en_US
DC.subject | speech synthesis | en_US
DC.subject | speech enhancement | en_US
DC.title | EPG2S:基於電子硬顎圖訊號的語音生成技術 | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | EPG2S: Speech Synthesis Technology Based on Electropalatography Signal | en_US
DC.type | 博碩士論文 | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US

For questions about this thesis, please contact the Extension Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.