A Hybrid Model of GMM and SVM with Representative Labels for Speaker Verification

Name

Chih-hsiang Yu (游智翔)

Department

Department of Electrical Engineering

Thesis title

A Hybrid Model of GMM and SVM with Representative Labels for Speaker Verification
(original title: 整合高斯混合與具性能指標支撐向量機模型之語者確認研究)

Files

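This electronic thesis is authorized for immediate open access.
Electronic full texts that have reached their open-access date are licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.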

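Abstract (translated from Chinese)

This thesis proposes a new recognition procedure for speaker verification that improves system performance. The architecture integrates a Gaussian mixture model (GMM) with a support vector machine (SVM) with representative labels.
The SVM with representative labels appends the defined labels to the original feature vectors, which raises their dimension and makes the whole system more discriminative. In the proposed architecture, the test utterance is scored against all enrolled models to determine its class label. The Top1 score minus the Top2 score is compared with a threshold. If the difference is greater than or equal to the threshold, the Top1 class label is used to augment the feature vectors of the test utterance, and a distance value is computed with the SVM trained with class labels. Otherwise, the utterance is passed to the conventional speaker verification system.
The experimental results show that, with a 128-mixture GMM and a threshold of 0.3, the proposed architecture achieves its best equal error rate (EER) and decision cost function (DCF) of 14.43% and 0.1743. Compared with the SVM speaker verification system (17.86% and 0.2175), this is an improvement of 3.43% and 0.0414. Compared with the conventional speaker verification system (15.87% and 0.1912), it is an improvement of 1.44% and 0.0169.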

Abstract (English)

This thesis proposes a new recognition procedure that improves the performance of speaker verification. The proposed system combines a Gaussian Mixture Model (GMM) with a Support Vector Machine (SVM) with representative labels.
The SVM with representative labels is built by appending the defined class labels to the original feature vectors, which raises their dimension and makes the system more discriminative. In the proposed system, each input segment is scored against all enrolled models by log-likelihood ratio to determine its class label. If the difference between the Top1 and Top2 scores is greater than or equal to a chosen threshold, the class label of the Top1 speaker is appended as extra features to the original feature vectors, and the augmented vectors are passed to the SVM classifier. Otherwise, the speaker is verified with the GMM-UBM baseline system.
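The routing rule in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis implementation: the function names (`top_two`, `augment`, `route`), the one-hot label encoding, and the toy scores are all hypothetical, and the abstract does not say how the representative labels are actually encoded.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def top_two(scores: Dict[str, float]) -> Tuple[str, float, float]:
    """Return the best-scoring speaker ID, its score, and the runner-up score."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_id, best), (_, second) = ranked[0], ranked[1]
    return best_id, best, second


def augment(frames: Sequence[Sequence[float]],
            label: Sequence[float]) -> List[List[float]]:
    """Append the label components to every feature vector (raises the dimension)."""
    return [list(frame) + list(label) for frame in frames]


def route(scores: Dict[str, float],
          frames: Sequence[Sequence[float]],
          labels: Dict[str, Sequence[float]],
          threshold: float = 0.3) -> Tuple[str, Optional[str], List[List[float]]]:
    """Choose the SVM path when Top1 - Top2 >= threshold, else the baseline path."""
    best_id, best, second = top_two(scores)
    if best - second >= threshold:
        # Confident Top1 decision: augment with the Top1 label for the SVM.
        return "svm", best_id, augment(frames, labels[best_id])
    # Ambiguous decision: fall back to the baseline verifier, features unchanged.
    return "baseline", None, [list(frame) for frame in frames]


# Toy usage with made-up scores and an assumed one-hot label per speaker.
scores = {"spk_a": 1.2, "spk_b": 0.7, "spk_c": -0.1}
labels = {"spk_a": [1.0, 0.0, 0.0], "spk_b": [0.0, 1.0, 0.0], "spk_c": [0.0, 0.0, 1.0]}
frames = [[0.5, -0.2], [0.1, 0.4]]
path, speaker, features = route(scores, frames, labels)
print(path, speaker, features[0])
```

The default threshold of 0.3 is the value the abstract reports as giving the best results. In a real system the scores would be log-likelihood ratios from the enrolled GMMs, and the augmented vectors would be passed to a trained SVM.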
The experimental results show that, with a 128-mixture GMM and a threshold of 0.3, the proposed system lowers the equal error rate (EER) by 3.43 percentage points and the decision cost function (DCF) by 0.0414 compared with the SVM speaker verification system, and lowers the EER by 1.44 percentage points and the DCF by 0.0169 compared with the baseline system.
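The EER quoted above is the operating point at which the false-rejection rate equals the false-acceptance rate. The sketch below shows one simple way to approximate it from two lists of trial scores. The function names and the toy scores are hypothetical, and the thesis may compute the EER differently (for example by interpolating along the DET curve).

```python
from typing import Sequence, Tuple


def error_rates(target_scores: Sequence[float],
                impostor_scores: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FRR, FAR) when a trial is accepted if its score >= threshold."""
    frr = sum(s < threshold for s in target_scores) / len(target_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return frr, far


def equal_error_rate(target_scores: Sequence[float],
                     impostor_scores: Sequence[float]) -> float:
    """Approximate the EER by sweeping every observed score as a threshold
    and taking the point where |FRR - FAR| is smallest."""
    best_gap = None
    eer = 0.0
    for t in sorted(set(target_scores) | set(impostor_scores)):
        frr, far = error_rates(target_scores, impostor_scores, t)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2
    return eer


# Toy usage with made-up scores: one target trial scores below one impostor trial.
targets = [2.0, 1.5, 1.0, 0.2]
impostors = [0.5, 0.1, -0.3, -1.0]
print(equal_error_rate(targets, impostors))  # 0.25
```

With only a handful of trials the sweep is coarse. On a real evaluation set with many trials, the two error rates cross much more finely, and the midpoint taken here becomes a closer estimate.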

Keywords

★ Gaussian mixture model
★ support vector machine
★ speaker verification

Table of contents

Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Overview of Speaker Recognition
1.3 Research Direction
1.4 Chapter Overview
Chapter 2 Basic Techniques of Speech Processing and Speaker Recognition
2.1 Feature Extraction
2.2 Speaker Model Construction
2.2.1 Gaussian Mixture Speaker Model
2.2.2 Speaker Model Training Procedure
2.2.3 Vector Quantization
2.2.4 EM Algorithm
2.3 Bayesian Adaptation
2.4 Speaker Identification
2.5 Speaker Verification
2.6 Equal Error Rate and Detection Error Tradeoff Curve
Chapter 3 System Architecture
3.1 Support Vector Machine
3.2 Speaker Model Training with SVM
3.3 SVM with Representative Labels
3.4 Speaker Verification Systems
3.4.1 SVM Speaker Verification System
3.4.2 Speaker Verification System Integrating GMM and SVM with Representative Labels
Chapter 4 Speaker Recognition Experiments
4.1 Speech Corpus
4.2 Applying SVM to Speaker Verification
4.2.1 Experiment 1: SVM Speaker Verification System
4.2.2 Experiment 2: Hybrid Speaker Verification System
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References

Advisor

Yau-Tarng Juang (莊堯棠)

Approval date

2008-06-23
