NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/77747


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/77747


    Title: 強健性喚醒詞辨認之嵌入式系統實作;Embedded System Implementation of Robust Wake Word Detection
    Authors: 邱毅青;Chiu, Yi-Chin
    Contributors: Department of Computer Science and Information Engineering
    Keywords: wake word; noise reduction; embedded system
    Date: 2018-08-17
    Issue Date: 2018-08-31 14:54:48 (UTC+8)
    Publisher: National Central University
    Abstract: In recent years, smart speakers have developed rapidly. Amazon's smart speaker, Echo, has changed how consumers use home appliances, and its voice assistant, Alexa, lets them issue commands by voice, making everyday life more convenient. Smart speaker technology is divided into a front end and a back end. The front end is the device itself and covers noise reduction, speech enhancement, echo cancellation, voice activity detection, wake word detection, and so on. The back end is the server side and covers speech recognition, semantic understanding, and so on. Vendors have invested considerable effort in all of these technologies.
    In this thesis, we build on previous research to implement a robust wake word detection system on an embedded platform. The system combines two key smart speaker techniques: wake word detection and noise reduction. For wake word detection, Mel-frequency cepstral coefficients (MFCC) are extracted from the audio as features and used to train a convolutional neural network, which outputs a probability for each wake word class; these probabilities determine whether a wake word has been detected. For noise reduction, the short-time Fourier transform (STFT) is applied to the mixed signal, and the energy of the resulting time-frequency representation is fed into a recurrent neural network, which is trained to produce a noise mask and a speech mask. The masks are then applied to a generalized eigenvalue (GEV) beamformer to achieve noise reduction.
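    The abstract states that the network outputs a probability for each wake word class and that these probabilities decide whether a wake word is detected, but it does not give the decision rule. The following is a minimal sketch of one common way to do this, assuming a softmax over the network's raw scores and a fixed confidence threshold. The function names, the class labels, the "background" class, and the threshold of 0.8 are all hypothetical and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Convert raw network scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detect_wake_word(logits, labels, threshold=0.8, background="background"):
    """Return the detected wake word label, or None if no class is confident.

    Assumption: one class stands for "no wake word" (background), and a
    detection requires the top class to reach the given probability threshold.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if labels[best] != background and probs[best] >= threshold:
        return labels[best]
    return None


# Hypothetical class labels, for illustration only.
LABELS = ["background", "wake_word_a", "wake_word_b"]

print(detect_wake_word([0.1, 4.0, 0.3], LABELS))  # confident score for wake_word_a
print(detect_wake_word([1.0, 1.2, 0.9], LABELS))  # no class is confident enough
```

    In an embedded deployment the threshold would typically be tuned to trade off false alarms against missed detections; the value used here is only a placeholder.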
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
