NCU Institutional Repository (中大機構典藏), providing downloads of theses and dissertations, past exam papers, journal articles, and research projects: Item 987654321/98642


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98642


    Title: A New Speech Intelligibility Enhancement Algorithm Based on Holo-Spectrum (SIEH) (基於語音理解度與Holo-Spectrum之語音強化演算法)
    Authors: Chen, Zhe-Wei (陳哲維)
    Contributors: Department of Mechanical Engineering
    Keywords: Speech Intelligibility; Holo-Spectrum; Amplitude Modulation; Speech Enhancement; EMD; UPEMD
    Date: 2025-07-09
    Issue Date: 2025-10-17 13:02:29 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: Speech enhancement aims to improve the quality and intelligibility of speech in noisy environments, providing a foundation for applications such as speech communication and automatic speech recognition. Enhancement can be performed in the time domain (e.g., subspace methods) or in the Fourier frequency domain (e.g., Wiener filtering, spectral subtraction, and statistical model-based methods such as MMSE). However, these approaches often introduce distortion while improving speech quality, and may even produce annoying "musical noise." Moreover, they are not specifically designed to improve speech intelligibility.
    This study proposes a novel speech intelligibility enhancement algorithm that targets the amplitude modulation (AM) components of speech signals, aiming to improve both intelligibility and quality under noisy conditions. According to the literature, AM in the 1–16 Hz band carries critical linguistic information and is highly correlated with speech intelligibility. In contrast, AM components below 1 Hz or above 16 Hz are mostly non-linguistic and less relevant to intelligibility. Experiments also show that most natural background noise fluctuates slowly, for example with AM below 1 Hz.
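To make the 1–16 Hz observation concrete, the following is a minimal sketch, not the thesis's method: it extracts a crude envelope by rectification and a moving average (an assumption; the thesis uses a UPEMD-based Holo-spectrum instead) and measures how much envelope modulation power falls inside 1–16 Hz. The function names, the 1000 Hz sampling rate, and the synthetic test tones are all hypothetical.

```python
import math

def envelope(x, win):
    """Crude amplitude envelope: rectify, then causal moving average over `win` samples."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += abs(v)
        if i >= win:
            acc -= abs(x[i - win])
        out.append(acc / min(i + 1, win))
    return out

def modulation_power(env, fs, f_mod):
    """Envelope power at one modulation frequency, via a single-bin DFT."""
    n = len(env)
    mean = sum(env) / n
    w = 2.0 * math.pi * f_mod / fs
    re = sum((e - mean) * math.cos(w * i) for i, e in enumerate(env))
    im = sum((e - mean) * math.sin(w * i) for i, e in enumerate(env))
    return (re * re + im * im) / (n * n)

def band_fraction(env, fs, lo=1.0, hi=16.0, f_max=20.0):
    """Fraction of envelope modulation power (up to f_max) lying in [lo, hi] Hz."""
    step = fs / len(env)  # DFT frequency resolution in Hz
    freqs = [k * step for k in range(1, int(f_max / step) + 1)]
    powers = [modulation_power(env, fs, f) for f in freqs]
    in_band = sum(p for f, p in zip(freqs, powers) if lo <= f <= hi)
    return in_band / sum(powers)

def am_tone(fs, seconds, f_carrier, f_mod, depth=0.8):
    """Synthetic signal: a sine carrier whose amplitude is modulated at f_mod Hz."""
    return [(1.0 + depth * math.cos(2 * math.pi * f_mod * i / fs))
            * math.sin(2 * math.pi * f_carrier * i / fs)
            for i in range(int(fs * seconds))]

FS = 1000  # Hz, illustrative sampling rate (assumption)
# win=8 spans one full period of the 125 Hz carrier at 1000 Hz, cancelling carrier ripple.
speech_like = envelope(am_tone(FS, 4, 125, 4.0), win=8)  # 4 Hz AM, inside 1-16 Hz
noise_like = envelope(am_tone(FS, 4, 125, 0.5), win=8)   # 0.5 Hz AM, below 1 Hz
frac_speech = band_fraction(speech_like, FS)
frac_noise = band_fraction(noise_like, FS)
```

In this toy setting, nearly all modulation power of the 4 Hz tone falls in the 1–16 Hz band, while the 0.5 Hz tone contributes almost none, which is the separation the abstract relies on.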
    Based on these observations, a two-stage speech enhancement framework is designed. In the first stage, MMSE-STSA (Minimum Mean Square Error Short-Time Spectral Amplitude Estimator) performs initial noise reduction to suppress the overall background noise. In the second stage, a Holo-spectrum decomposition based on UPEMD (Uniform Phase Empirical Mode Decomposition) decomposes and enhances the AM structure of the speech signal, further removing low-frequency noise trend modes while preserving intelligibility-related components.
    Experimental results show that, across a range of noise conditions, the proposed method significantly outperforms MMSE-STSA alone in signal-to-noise ratio (SNR), PESQ (Perceptual Evaluation of Speech Quality), and subjective listening tests, confirming its effectiveness in enhancing both speech quality and intelligibility.
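The abstract does not state how SNR is computed, so the following sketch assumes the common definition: the ratio of clean-signal power to residual-error power, in dB. The helper name `snr_db` and the synthetic signals are hypothetical, and the "enhanced" signal is produced simply by halving the noise, not by running MMSE-STSA or UPEMD.

```python
import math

def snr_db(clean, estimate):
    """SNR in dB of `estimate` measured against the reference `clean` signal."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((e - c) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / noise_power)

n = 800
clean = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
# Deterministic stand-in for noise: a weaker sinusoid at another frequency.
noise = [0.3 * math.sin(2 * math.pi * 37 * i / n) for i in range(n)]
noisy = [c + v for c, v in zip(clean, noise)]
# Hypothetical enhancer output: the same noise with half the amplitude.
enhanced = [c + 0.5 * v for c, v in zip(clean, noise)]

snr_before = snr_db(clean, noisy)
snr_after = snr_db(clean, enhanced)
improvement = snr_after - snr_before
```

Halving the noise amplitude raises the SNR by 20·log10(2), about 6.02 dB, which gives a sense of scale for reported SNR gains.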
    Appears in Collections: [Graduate Institute of Mechanical Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File          Description    Size    Format
    index.html                   0Kb     HTML


    All items in NCUIR are protected by copyright, with all rights reserved.
