Thesis/Dissertation 105521050: Full Metadata Record

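DC field | Value | Language
DC.contributor | 電機工程學系 (Department of Electrical Engineering) | zh_TW
DC.creator | 郝平正 | zh_TW
DC.creator | Ping-Cheng Hao | en_US
dc.date.accessioned | 2019-11-05T07:39:07Z |
dc.date.available | 2019-11-05T07:39:07Z |
dc.date.issued | 2019 |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=105521050 |
dc.contributor.department | 電機工程學系 (Department of Electrical Engineering) | zh_TW
DC.description | 國立中央大學 (National Central University) | zh_TW
DC.description | National Central University | en_US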
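dc.description.abstract | This thesis proposes an offline, self-defined, speaker-dependent wake-up-word system that lets users define their own spoken wake-up word and use it to wake a device. The system runs in two phases: a training phase and a testing-comparison phase. In the training phase, the user defines and records a wake-up word in any language. Voice activity detection cuts out the speech segment, and Mel-frequency cepstral coefficients (MFCC) are computed as pre-processing to extract speech features for later use. The EM algorithm for Gaussian mixture models then trains a voiceprint model from these features, while the Baum-Welch algorithm for hidden Markov models with Gaussian emissions trains the corresponding speech sequence; together, the two form the data model for a specific speaker's speech. In the comparison phase, an arbitrary speech segment is input and its features are again extracted with MFCC. The log probability under the Gaussian mixture models is used to find the correct speaker in the dataset, and the Viterbi algorithm of the hidden Markov model then computes the state sequence of the unknown speech. Finally, the Gaussian mixture model similarity is computed, and the edit-distance algorithm measures how well the state sequence of the unknown speech matches that of the enrolled speech; if the thresholds are passed, the device is woken. The system achieves accurate results with a small amount of training data, and by searching for the voiceprint first and comparing the speech afterwards, the comparison phase avoids the time an exhaustive hidden Markov model search over the whole dataset would take. The system is implemented on an embedded development board to evaluate and verify performance; the results show that it achieves high accuracy and a low false wake-up rate while running in real time. | zh_TW (English translation)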
dc.description.abstract | We propose a self-defined wake-up-word recognition system and its embedded system implementation. The system runs in two phases: a training phase and a testing-comparison phase. In the training phase, a wake-up word in any language is recorded, the voice segment is cut out using voice activity detection, and Mel-frequency cepstral coefficients are computed as pre-processing to extract speech features for later use. The expectation-maximization algorithm is used to train the Gaussian mixture model, and the Baum-Welch algorithm is used to train the hidden Markov model. Together, these two models form the data model of a speaker's speech. In the testing-comparison phase, an unknown voice segment is input, and voice activity detection and Mel-frequency cepstral coefficients are again used for cutting and feature extraction. Next, the log likelihood of the features under the Gaussian mixture model is used to find the corresponding speaker, and the Viterbi algorithm is used to compute the state sequence of the unknown speech with the hidden Markov model. Finally, we calculate the Gaussian mixture model similarity and use the Levenshtein distance to compare the enrolled state sequence with the state sequence of the unknown speech. If both pass their thresholds, the wake-up succeeds; otherwise, it fails. The system works well with a small amount of training data and is implemented on an embedded board to test performance. The results show that it achieves high accuracy and a low false-alarm rate in real-time operation. | en_US
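DC.subject | 自定義喚醒詞 (customized wake-up word) | zh_TW
DC.subject | 梅爾倒譜係數 (Mel-frequency cepstral coefficients) | zh_TW
DC.subject | 高斯混合模型 (Gaussian mixture model) | zh_TW
DC.subject | 隱藏馬可夫模型 (hidden Markov model) | zh_TW
DC.subject | 編輯距離 (edit distance) | zh_TW
DC.subject | 嵌入式系統 (embedded system) | zh_TW
DC.subject | Customized Wake-Up-Word | en_US
DC.subject | Mel-Frequency Cepstral Coefficients | en_US
DC.subject | Gaussian Mixture Model | en_US
DC.subject | Hidden Markov Model | en_US
DC.subject | Levenshtein Distance | en_US
DC.subject | Embedded System | en_US
DC.title | 離線自定義語音語者喚醒詞系統與嵌入式開發實現 (Offline Self-Defined Speaker-Dependent Wake-Up-Word System and Its Embedded Implementation) | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Self-defined Wake-Up-Word Recognition and its Embedded System Implementation | en_US
DC.type | 博碩士論文 (thesis/dissertation) | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US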
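
The final comparison step described in the abstract (Levenshtein distance between state sequences, followed by a two-threshold decision) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `levenshtein`, `sequence_match` and `is_wake_up`, the normalization by the longer sequence, the integer state labels, and all threshold values are assumptions made only for illustration.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a reaches the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def sequence_match(a: Sequence, b: Sequence) -> float:
    """Similarity in [0, 1]; normalizing by the longer sequence is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_wake_up(gmm_score: float,
               enrolled_states: Sequence[int],
               observed_states: Sequence[int],
               gmm_threshold: float,
               match_threshold: float) -> bool:
    """Wake only if both the GMM score and the sequence match pass (hypothetical thresholds)."""
    return (gmm_score >= gmm_threshold
            and sequence_match(enrolled_states, observed_states) >= match_threshold)
```

Here `gmm_score` stands for a log likelihood from the speaker model and the state sequences stand for Viterbi outputs; both would come from earlier stages that are not shown.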
