NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/77747
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.
    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/77747


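    Title: 強健性喚醒詞辨認之嵌入式系統實作; Embedded System Implementation of Robust Wake Word Detection
    Author: 邱毅青; Chiu, Yi-Chin
    Contributor: Department of Computer Science and Information Engineering
    Keywords: wake word; noise reduction; embedded system
    Date: 2018-08-17
    Upload time: 2018-08-31 14:54:48 (UTC+8)
    Publisher: National Central University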
    Abstract: In recent years, smart speakers have developed rapidly. Amazon's smart speaker, Echo, has successfully changed how consumers use home appliances, and its voice assistant, Alexa, lets consumers issue commands by voice, making daily life more convenient. Smart speaker technology is divided into front-end and back-end. The front end is the device side and includes noise reduction, speech enhancement, echo cancellation, voice activity detection, and wake word detection. The back end is the server side and includes speech recognition and semantic understanding. Vendors have invested considerable effort in all of these technologies.
    This thesis builds on previous research to implement a robust wake word detection system on an embedded platform. The system combines two key smart speaker techniques: wake word detection and noise reduction. For wake word detection, Mel-frequency cepstral coefficients (MFCC) are extracted from the audio as features and fed to a convolutional neural network, which outputs a probability for each wake word class; these probabilities determine whether a wake word has been detected. For noise reduction, the short-time Fourier transform (STFT) gives a time-frequency representation of the mixed signal, whose energy is fed to a recurrent neural network that estimates a noise mask and a speech mask. The masks are then applied in a generalized eigenvalue (GEV) beamformer to achieve noise reduction.
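The mask-driven GEV beamforming step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function name `gev_beamformer`, the array layout (channels, frames, frequency bins), and the diagonal loading term `eps` are assumptions made for illustration, and the masks are taken as given rather than predicted by a neural network. For each frequency bin, the sketch builds mask-weighted spatial covariance matrices for speech and noise and picks the filter that maximizes their generalized Rayleigh quotient.

```python
import numpy as np

def gev_beamformer(stft, speech_mask, noise_mask, eps=1e-6):
    """Mask-based GEV beamforming (illustrative sketch).

    stft:        complex array, shape (channels, frames, freqs)
    speech_mask: real array in [0, 1], shape (frames, freqs)
    noise_mask:  real array in [0, 1], shape (frames, freqs)
    Returns a single-channel complex STFT, shape (frames, freqs).
    """
    n_ch, n_frames, n_freqs = stft.shape
    out = np.zeros((n_frames, n_freqs), dtype=complex)
    for f in range(n_freqs):
        y = stft[:, :, f]  # (channels, frames)
        # Mask-weighted spatial covariance matrices for speech and noise.
        phi_s = (speech_mask[:, f] * y) @ y.conj().T / max(speech_mask[:, f].sum(), eps)
        phi_n = (noise_mask[:, f] * y) @ y.conj().T / max(noise_mask[:, f].sum(), eps)
        phi_n = phi_n + eps * np.eye(n_ch)  # diagonal loading (assumption)
        # Solve phi_s w = lambda * phi_n w by whitening with the
        # Cholesky factor of phi_n, then taking the top eigenvector.
        chol = np.linalg.cholesky(phi_n)
        chol_inv = np.linalg.inv(chol)
        whitened = chol_inv @ phi_s @ chol_inv.conj().T
        _, vecs = np.linalg.eigh(whitened)  # eigenvalues in ascending order
        w = chol_inv.conj().T @ vecs[:, -1]
        out[:, f] = w.conj() @ y  # apply the filter: w^H y for every frame
    return out
```

The filter for each bin is only defined up to a complex scale factor, so a practical system would normally add a normalization step after beamforming; that step is omitted here.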
    Appears in collections: [Graduate Institute of Computer Science and Information Engineering] Theses and dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    211


    All items in NCUIR are protected by their original copyright.


    Copyright National Central University. All rights reserved by the National Central University Library.