Embedded System Implementation of Robust Wake Word Detection

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	邱毅青	zh_TW
DC.creator	Yi-Chin Chiu	en_US
dc.date.accessioned	2018-08-17T07:39:07Z
dc.date.available	2018-08-17T07:39:07Z
dc.date.issued	2018
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=105522030
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	近年來，智慧音箱產品如火如荼的發展，亞馬遜的智慧音箱Echo成功改變消費者的家電使用習慣，語音助理Alexa使消費者能夠用語音即可下達指令，讓生活更加便利，與智慧音箱相關的技術有分前端及後端，前端指的是裝置端，也就是智慧音箱前端的技術，包含噪音消除、語音增強、回聲消除、聲音活動偵測、喚醒詞辨認等等，而後端為伺服器端，則包含語音辨識、語意理解等等，也使得各家廠商在這些技術上都投注了不少心血。本論文結合前人之研究來實作強健性喚醒詞辨認嵌入式系統，系統包含智慧音箱中的兩大技術，喚醒詞辨認以及噪音消除技術，喚醒詞辨認是將聲音經由梅爾倒頻譜係數(Mel-Frequency Cepstral Coefficients, MFCC)找出特徵後，利用卷積神經網路訓練，輸出各喚醒詞類別的機率來判定是否被辨認；噪音消除則是將聲音利用短時傅立葉轉換(Short-Time Fourier Transform, STFT)將混合訊號的時頻結果，取出能量後放入遞迴神經網路訓練，得到噪音及語音的遮罩，再應用於廣義特徵波束成形器(GEV Beamformer)上，達到噪音消除之效果。	zh_TW
dc.description.abstract	In recent years, smart speakers have developed rapidly. Amazon's smart speaker, Echo, has changed how consumers use home appliances, and its voice assistant, Alexa, lets them issue commands by voice. Smart speaker technology is divided into a front end and a back end. The front end is the device itself and covers noise reduction, speech enhancement, echo cancellation, voice activity detection, wake word detection, and so on; the back end is the server side and covers speech recognition, semantic understanding, and so on. Vendors have invested heavily in these technologies. In this thesis, we build on previous research to implement a robust wake word detection system on an embedded system. The system combines two smart speaker techniques: wake word detection and noise reduction. For wake word detection, Mel-frequency cepstral coefficients (MFCC) are extracted from the audio as features and used to train a convolutional neural network, whose output is the probability of each wake word class; these probabilities determine whether a wake word has been detected. For noise reduction, the short-time Fourier transform (STFT) is applied to the mixed signal, and the energy of the resulting time-frequency representation is used to train a recurrent neural network that outputs a noise mask and a speech mask; these masks are then applied to a generalized eigenvalue (GEV) beamformer to achieve noise reduction.	en_US
DC.subject	喚醒詞	zh_TW
DC.subject	噪音消除	zh_TW
DC.subject	嵌入式系統	zh_TW
DC.title	強健性喚醒詞辨認之嵌入式系統實作	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Embedded System Implementation of Robust Wake Word Detection	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 105522030: full metadata record