An Embedded Neural Network Classifier for Keywords Spotting and Speaker Recognizing

DC field	Value	Language
DC.contributor	軟體工程研究所	zh_TW
DC.creator	廖冠富	zh_TW
DC.creator	Kuan-Fu Liao	en_US
dc.date.accessioned	2019-07-18T07:39:07Z
dc.date.available	2019-07-18T07:39:07Z
dc.date.issued	2019
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=106525005
dc.contributor.department	軟體工程研究所	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	關鍵字辨識(KWS)系統為語音助理之類的智慧型系統提供了啟動的便利性和耗能的平衡，但是裝置安全和使用者隱私依然難以獲得充分的保障。本研究提出一個應用於語音關鍵字辨識與語者辨識的嵌入式神經網路分類器。首先使用SOM神經網路對語音特徵進行非監督式的大分類，再使用多層前饋式神經網路分別進行KWS辨識和語者辨識。在Google Speech Commands資料集上，SOM-MFNN比起傳統的MFNN網路減少了82.12%的運算量和32.45%的記憶體使用量，取得了比MFNN高4%的辨識率提升，在自建的中文KWS資料集，我們的系統可以提升1%的辨識率，證明SOM-MFNN確實能提高語音指令辨識率的同時降低資源使用量。在語者辨識上，傳統MFNN已有98.43%的辨識率，足以保護裝置安全性。運算量與記憶體使用量小於傳統MFNN的SOM-MFNN，能夠提供須常駐運行的KWS系統一個優良的分類器模型，並且可以結合語者辨識的功能保護裝置安全性。	zh_TW
dc.description.abstract	Keyword spotting (KWS) systems give voice assistants and similar intelligent systems a balance between easy activation and low energy consumption. However, such systems still cannot fully guarantee device security and user privacy. This study proposes an embedded neural network classifier for voice KWS and speaker identification. First, a self-organizing map (SOM) neural network coarsely groups the voice features in an unsupervised manner. Next, multilayer feed-forward neural networks (MFNNs) perform KWS and speaker identification separately. On the Google Speech Commands Dataset, the SOM-MFNN required 82.12% less computation and 32.45% less memory than the conventional MFNN, and its identification rate exceeded that of the MFNN by 4%. On a self-built Chinese KWS dataset, the proposed system improved the identification rate by 1%, showing that the SOM-MFNN can improve the identification of voice commands while reducing resource consumption. For speaker identification, the conventional MFNN already achieved an identification rate of 98.43%, which is sufficient to protect device security. In sum, the SOM-MFNN, which uses less computation and memory than the conventional MFNN, can serve as a good classifier model for KWS systems that must run continuously, and it can be combined with a speaker identification function to protect device security.	en_US
DC.subject	語音關鍵字辨識	zh_TW
DC.subject	語者辨識	zh_TW
DC.subject	自組織圖神經網路	zh_TW
DC.subject	多重前饋式神經網路	zh_TW
DC.subject	嵌入式神經網路	zh_TW
DC.subject	分類器	zh_TW
DC.title	應用於語音關鍵字辨識與語者辨識的嵌入式神經網路分類器	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	An Embedded Neural Network Classifier for Keywords Spotting and Speaker Recognizing	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Full metadata record for thesis 106525005