NCU Institutional Repository (中大機構典藏): theses and dissertations, past exam papers, journal articles, and research projects available for download. Item 987654321/98642
    Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98642


    Title: A New Speech Intelligibility Enhancement Algorithm Based on Holo-Spectrum (SIEH) (original title: 基於語音理解度與Holo-Spectrum之語音強化演算法)
    Author: Chen, Zhe-Wei (陳哲維)
    Contributor: Department of Mechanical Engineering
    Keywords: Speech Intelligibility; Holo-Spectrum; Amplitude Modulation; Speech Enhancement; EMD; UPEMD
    Date: 2025-07-09
    Uploaded: 2025-10-17 13:02:29 (UTC+8)
    Publisher: National Central University
    Abstract: Speech enhancement aims to improve the quality and intelligibility of speech in noisy environments, providing a foundation for applications such as speech communication and automatic speech recognition. Enhancement can be performed in the time domain (e.g., subspace methods) or in the Fourier frequency domain (e.g., Wiener filtering, spectral subtraction, and statistical model-based methods such as MMSE). However, these approaches often introduce distortion while improving speech quality, and may even produce annoying "musical noise." Moreover, they are not specifically designed to improve speech intelligibility.
    This study proposes a novel speech intelligibility enhancement algorithm that targets the amplitude modulation (AM) components of speech signals, aiming to improve both speech intelligibility and quality under noisy conditions. According to the literature, the 1–16 Hz AM band in speech carries critical linguistic information and is highly correlated with speech intelligibility. In contrast, AM components below 1 Hz or above 16 Hz are mostly non-linguistic and less relevant to intelligibility. Experiments also show that most natural background noise fluctuates slowly (for example, with AM below 1 Hz).
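    To make the 1–16 Hz AM band concrete, the following is a minimal sketch that extracts an amplitude envelope and keeps only its 1–16 Hz fluctuations. It is not the thesis's method: it swaps in an FFT-based Hilbert envelope and an ideal FFT mask in place of the UPEMD-based Holo-spectrum, and the function names, the synthetic test signal, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def hilbert_envelope(x):
    """Amplitude envelope |analytic signal|, computed with an FFT (assumed helper)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Analytic-signal weights: keep DC (and Nyquist), double positive frequencies,
    # zero negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def modulation_band(envelope, fs, f_lo=1.0, f_hi=16.0):
    """Keep envelope fluctuations between f_lo and f_hi (Hz) with an ideal FFT mask."""
    spectrum = np.fft.rfft(envelope)
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return np.fft.irfft(spectrum * mask, n=len(envelope))

# Synthetic example (illustrative values only): a 1 kHz carrier whose envelope
# mixes a 4 Hz "speech-like" modulation with a 0.5 Hz "noise-like" slow trend.
fs = 8000
t = np.arange(4 * fs) / fs
speech_like_am = 0.5 * np.sin(2 * np.pi * 4.0 * t)
noise_like_am = 0.3 * np.sin(2 * np.pi * 0.5 * t)
signal = (1.0 + speech_like_am + noise_like_am) * np.sin(2 * np.pi * 1000.0 * t)

envelope = hilbert_envelope(signal)
am_band = modulation_band(envelope, fs)  # 0.5 Hz trend and DC offset are removed
```

    In this toy case the band-limited envelope retains the 4 Hz component and discards the 0.5 Hz trend, which mirrors the idea of filtering out slow noise-related AM while preserving intelligibility-related AM. An ideal mask is used only because it is easy to verify; a real system would need a decomposition suited to non-stationary speech.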
    Based on these observations, a two-stage speech enhancement framework is designed. In the first stage, MMSE-STSA (Minimum Mean Square Error Short-Time Spectral Amplitude Estimator) performs initial noise reduction to suppress the overall background noise. In the second stage, a Holo-spectrum decomposition based on UPEMD (Uniform Phase Empirical Mode Decomposition) decomposes the AM structure of the speech signal into modes and enhances it. This stage further filters out low-frequency noise trend modes while preserving intelligibility-related components.
    Experimental results show that, across a variety of noise conditions, the proposed method significantly improves on MMSE-STSA alone in signal-to-noise ratio (SNR), the objective PESQ (Perceptual Evaluation of Speech Quality) metric, and subjective listening tests, confirming its practical benefit for both speech quality and intelligibility.
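    As a rough illustration of the first stage, the sketch below evaluates the standard Ephraim-Malah MMSE-STSA spectral gain for given a priori and a posteriori SNR values. The abstract does not state how the thesis estimates these SNRs or configures the estimator, so the function names and the example values are assumptions, and this should be read as the textbook gain formula, not the thesis's implementation.

```python
import numpy as np
from scipy.special import i0e, i1e

def mmse_stsa_gain(xi, gamma):
    """Textbook MMSE-STSA gain per frequency bin.

    xi    : a priori SNR (linear scale)
    gamma : a posteriori SNR (linear scale)
    """
    xi = np.asarray(xi, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    v = xi / (1.0 + xi) * gamma
    # i0e(x) = exp(-x) * I0(x) and i1e(x) = exp(-x) * I1(x) for x >= 0, so the
    # exp(-v / 2) factor of the formula is absorbed and large v cannot overflow.
    bessel_term = (1.0 + v) * i0e(v / 2.0) + v * i1e(v / 2.0)
    return (np.sqrt(np.pi) / 2.0) * (np.sqrt(v) / gamma) * bessel_term

def wiener_gain(xi):
    """Wiener gain, shown only as a reference point for comparison."""
    return np.asarray(xi, dtype=float) / (1.0 + np.asarray(xi, dtype=float))

# Illustrative values: one high-SNR bin and one low-SNR bin.
high_snr_gain = mmse_stsa_gain(100.0, 101.0)
low_snr_gain = mmse_stsa_gain(0.01, 1.0)

# The enhanced magnitude would be gain * noisy_magnitude for each
# time-frequency bin, with the noisy phase reused for resynthesis.
```

    At high SNR the gain approaches the Wiener gain `xi / (1 + xi)`, and at low SNR it attenuates the bin strongly. In the framework described above, the output of such a stage would then be passed to the AM-domain second stage.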
    Appears in Collections: [Graduate Institute of Mechanical Engineering] Theses and Dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    16


    All items in NCUIR are protected by original copyright.
