NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, research projects): Item 987654321/88368
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/88368


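    Title: 可自定義之語者驗證系統與其特徵擷取模組之硬體實現; A Hardware Implementation of Feature Extraction for Self-Defined Speaker Verification System
    Author: 王喬立; Wang, Chiao-Li
    Contributor: Department of Electrical Engineering
    Keywords: speech recognition; speaker verification; speech feature extraction; FPGA; SoC
    Date: 2022-04-15
    Upload time: 2022-07-14 00:36:31 (UTC+8)
    Publisher: National Central University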
    Abstract: In recent years, voice systems that use speech recognition to drive or control devices have become increasingly common in human-computer interaction. Among these technologies, speaker verification, which analyzes speakers' voiceprints to find the feature differences between them, has been widely explored and its effectiveness has improved significantly. However, current approaches based on large, complex neural networks still have drawbacks: they either run only on high-specification edge devices, or require captured speech segments to be uploaded to the cloud for processing, which raises personal privacy concerns. A speaker verification system that runs on the end device is therefore an important task in speech-based human-computer interaction.
    This thesis proposes a self-defined speaker verification system and a hardware implementation of its feature extraction module. After profiling the execution time of each module, the Mel-Frequency Cepstral Coefficients (MFCC) pre-processing module is implemented on the Programmable Logic side of the Xilinx ZCU104 development board, and the extracted speech features are sent back over the AXI bus to the Processing System side for post-processing. The MFCC hardware architecture consumes 4.26 W on the FPGA and, at an operating frequency of 150 MHz, processes a 2-second utterance in 53.6 ms. High accuracy is retained in the subsequent post-processing, meeting the standard for a real-time system.
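For readers unfamiliar with the feature extraction step named in the abstract, the following is a minimal software sketch of a conventional MFCC pipeline (pre-emphasis, framing, windowing, FFT power spectrum, mel filterbank, log, DCT). It is not the thesis's hardware architecture. The 16 kHz sample rate, 25 ms frames, 10 ms hop, 512-point FFT, 26 filters, 13 coefficients, and 0.97 pre-emphasis factor are common textbook defaults assumed here, and the function names are hypothetical. Only the 2-second duration, 53.6 ms processing time, and 150 MHz clock come from the abstract.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) MFCC matrix. All defaults are assumptions."""
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft points).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, floored to avoid log(0).
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))
    # DCT-II (unnormalized) decorrelates the log energies into cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T

def real_time_factor(processing_s, audio_s):
    """Processing time divided by audio duration; below 1 is faster than real time."""
    return processing_s / audio_s

def clock_cycles(processing_s, clock_hz):
    """Clock cycles elapsed during the processing time."""
    return processing_s * clock_hz

# A 2-second utterance at the assumed 16 kHz rate (random noise as a stand-in).
audio = np.random.default_rng(0).standard_normal(2 * 16000)
features = mfcc(audio)
print(features.shape)
print(real_time_factor(0.0536, 2.0), clock_cycles(0.0536, 150e6))
```

With the figures reported in the abstract, the real-time factor is 53.6 ms / 2 s, about 0.027, and 53.6 ms at 150 MHz corresponds to roughly 8.04 million clock cycles. Under the assumed framing parameters, a 2-second input yields 198 frames of 13 coefficients; the thesis may use different settings.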
    Appears in collections: [Graduate Institute of Electrical Engineering] Theses and Dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    48


    All items in NCUIR are protected by original copyright.

