Graduate Thesis 965201096: Complete Metadata Record

DC Field    Value    Language
dc.contributor    電機工程學系 (Department of Electrical Engineering)    zh_TW
dc.creator    凌欣暉    zh_TW
dc.creator    Xing-hung Lan    en_US
dc.date.accessioned    2010-08-03T07:39:07Z
dc.date.available    2010-08-03T07:39:07Z
dc.date.issued    2010
dc.identifier.uri    http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=965201096
dc.contributor.department    電機工程學系 (Department of Electrical Engineering)    zh_TW
dc.description    國立中央大學 (National Central University)    zh_TW
dc.description    National Central University    en_US
dc.description.abstract    This thesis consists of three parts: keyword spotting, statistical normalization of speech features, and speaker verification. For keyword spotting, right-context-dependent phone models drawn from sub-syllable units are concatenated to build the keyword and filler models. Speech recognition systems often suffer a sharp drop in accuracy under environmental mismatch; feature-statistics normalization techniques have the advantages of low complexity and fast computation. Performance is evaluated on the Aurora 2 corpus: combining histogram equalization with an ARMA low-pass filter raises the recognition rate of histogram equalization alone from 84.93% to 86.37%, while combining histogram equalization with an adaptive ARMA filter raises it to 86.91%. The speaker verification system uses a parametric kernel function to combine Gaussian mixture models (GMM) with support vector machines (SVM) to improve performance. The GMM parameters of each speaker are used to build a supervector, which is compensated by nuisance attribute projection (NAP). In the training stage the supervectors are normalized, and the normalized supervectors are then used to train the SVM model. For impostor selection, the top n impostor utterances whose characteristics are most similar to the target speaker are chosen, which makes the trained SVM model more discriminative. At test time, test-score normalization is applied to adjust the distance scores. Experiments on the NIST 2001 corpus show that the 64-mixture parametric-kernel (NAP) system combined with test-score normalization achieves the best equal error rate and decision cost function, 4.17% and 0.0491 respectively.    zh_TW
dc.description.abstract    This thesis consists of three main parts: keyword spotting, cepstral feature normalization, and speaker verification. For keyword spotting, sub-syllable models are used to build the keyword and filler models. Environmental mismatch is the major source of performance degradation in speech recognition. Cepstral feature normalization is widely used as a powerful approach to producing robust features, and a common advantage of these methods is their low computational complexity. Experimental results on the Aurora 2 database show that the histogram equalization plus ARMA filter front-end achieves an 86.37% digit recognition rate, and the histogram equalization plus adaptive ARMA filter front-end achieves 86.91%. The speaker verification system combines Gaussian mixture models (GMM) and support vector machines (SVM) through a kernel function. Starting from the universal background model (UBM), MAP adaptation is used to obtain the GMM parameters of each speaker. These parameters are used to build target and impostor supervectors, which are then compensated by the NAP process. In the training stage, the target and impostor supervectors are used to train the SVM model. For impostor selection, the top n speakers whose characteristics are most similar to the target are chosen, which makes the model more discriminative. In the testing stage, test normalization is applied to adjust the distance scores. Experiments on the NIST 2001 SRE show that the 64-mixture parametric kernel with NAP, combined with test normalization, yields the best EER and DCF, 4.17% and 0.0491 respectively.    en_US
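The robust-feature part of the abstract combines histogram equalization (HEQ) of the cepstral features with an ARMA low-pass filter over the equalized trajectories. Below is a minimal NumPy sketch of that idea; the function names, the standard-normal reference distribution, and the filter order m are illustrative assumptions rather than the thesis implementation.

```python
import numpy as np
from scipy.stats import norm

def histogram_equalize(feat):
    """Map each feature dimension onto a standard-normal reference
    distribution through its empirical CDF (rank-based HEQ)."""
    n_frames, n_dims = feat.shape
    out = np.empty_like(feat, dtype=float)
    for d in range(n_dims):
        ranks = np.argsort(np.argsort(feat[:, d]))      # 0 .. n_frames-1
        cdf = (ranks + 0.5) / n_frames                  # empirical CDF in (0, 1)
        out[:, d] = norm.ppf(cdf)                       # reference quantiles
    return out

def arma_smooth(feat, m=2):
    """Order-m ARMA low-pass filter along time: each smoothed frame averages
    the m previous outputs with the current and m following inputs."""
    out = feat.astype(float).copy()
    n_frames = feat.shape[0]
    for t in range(m, n_frames - m):
        out[t] = (out[t - m:t].sum(axis=0) +
                  feat[t:t + m + 1].sum(axis=0)) / (2 * m + 1)
    return out

# Example on a (frames x coefficients) MFCC matrix:
# robust = arma_smooth(histogram_equalize(mfcc), m=2)
```

The adaptive ARMA variant mentioned in the abstract would vary the filter order per utterance; the fixed m above is only the simplest case.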
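For the speaker-verification part, each utterance is turned into a GMM-mean supervector, nuisance attribute projection (NAP) removes a session/noise subspace, and a per-target SVM is trained against the n most similar impostor supervectors. A hedged sketch follows, assuming scikit-learn, diagonal-covariance UBM statistics, and a precomputed NAP matrix U; the helper names (gmm_supervector, nap_project, top_n_impostors) are placeholders, not the thesis code.

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

def gmm_supervector(adapted_means, ubm_weights, ubm_vars):
    """Stack the MAP-adapted component means, scaled by sqrt(w_c)/sigma_c
    (the usual KL-divergence-motivated GMM-supervector kernel)."""
    scaled = [np.sqrt(w) * mu / np.sqrt(var)
              for w, mu, var in zip(ubm_weights, adapted_means, ubm_vars)]
    return np.concatenate(scaled)

def nap_project(sv, U):
    """NAP: subtract the component of the supervector that lies in the
    nuisance subspace spanned by the columns of U."""
    return sv - U @ (U.T @ sv)

def top_n_impostors(target_sv, impostor_svs, n=100):
    """Keep the n impostor supervectors with the highest cosine similarity
    to the target, so the SVM sees the most confusable negatives."""
    sims = impostor_svs @ target_sv / (
        np.linalg.norm(impostor_svs, axis=1) * np.linalg.norm(target_sv) + 1e-12)
    return impostor_svs[np.argsort(-sims)[:n]]

def train_target_svm(target_svs, impostor_svs, U, n=100):
    """One-versus-impostors linear SVM on normalized, NAP-compensated supervectors."""
    pos = np.array([nap_project(sv, U) for sv in target_svs])
    neg_pool = np.array([nap_project(sv, U) for sv in impostor_svs])
    neg = top_n_impostors(pos.mean(axis=0), neg_pool, n=n)
    X = normalize(np.vstack([pos, neg]))                 # supervector normalization
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
    return LinearSVC(C=1.0).fit(X, y)
```

The SVM decision value for a test supervector plays the role of the "distance" that the abstract then adjusts with test-score normalization.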
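The test-score normalization step (T-norm) standardizes each raw score by the statistics of the same test utterance scored against a cohort of impostor models. A small sketch under that reading; the cohort, threshold, and variable names are assumptions.

```python
import numpy as np

def t_norm(raw_score, cohort_scores):
    """T-norm: shift and scale the test score by the mean and standard
    deviation of the scores the test utterance receives from a cohort of
    impostor models, making one threshold usable across target models."""
    cohort_scores = np.asarray(cohort_scores, dtype=float)
    return (raw_score - cohort_scores.mean()) / (cohort_scores.std() + 1e-12)

# Accept the claimed identity when the normalized score clears a tuned threshold:
# accept = t_norm(svm_score, cohort_svm_scores) > threshold
```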
dc.subject    語音辨識 (Speech Recognition)    zh_TW
dc.subject    語者確認 (Speaker Verification)    zh_TW
dc.subject    支撐向量機 (Support Vector Machine)    zh_TW
dc.subject    強健特徵參數 (Robust Features)    zh_TW
dc.subject    關鍵詞萃取 (Keyword Spotting)    zh_TW
dc.subject    Keyword Spotting    en_US
dc.subject    Speech Recognition    en_US
dc.subject    Speaker Verification    en_US
dc.subject    Support Vector Machine    en_US
dc.subject    Robust Features    en_US
dc.title    強健性語音辨識及語者確認之研究 (A Study of Robust Speech Recognition and Speaker Verification)    zh_TW
dc.language.iso    zh-TW    zh-TW
dc.title    A Study of Robust Speech Recognition and Speaker Verification    en_US
dc.type    博碩士論文 (Master's/doctoral thesis)    zh_TW
dc.type    thesis    en_US
dc.publisher    National Central University    en_US
