PSO Algorithm for Speaker Verification Systems (粒子群演算法之語者確認系統)

Author

蘇樺 (Hua Su)

Department

Department of Electrical Engineering

Thesis Title

粒子群演算法之語者確認系統
(PSO Algorithm for Speaker Verification Systems)

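Related Theses

★ A Study of Miniaturized GSM/GPRS Mobile Communication Modules
★ A Study of Speaker Recognition
★ Robustness Analysis of Perturbed Singular Systems Using the Projection Method
★ Speaker Verification with Alternative-Hypothesis Characterization Functions Improved by Support Vector Machine Models
★ Speaker Verification Combining Gaussian Mixture Supervectors and Derivative Kernel Functions
★ An Agile-Movement Particle Swarm Optimization Method
★ Application of an Improved Particle Swarm Method to Lossless Predictive Image Coding
★ A Study of Particle Swarm Optimization Applied to Speaker Model Training and Adaptation
★ A Study of Improved Mel-Frequency Cepstral Coefficients Combined with Multiple Speech Features
★ A Speaker Verification System Using Speaker-Specific Background Models
★ An Intelligent Remote Monitoring System
★ Stability Analysis and Controller Design for Output Feedback of Positive Systems
★ A Hybrid Interval-Search Particle Swarm Optimization Algorithm
★ A Study of Gesture Recognition Based on Deep Neural Networks
★ A Posture-Correction Necklace with Image-Recognition Auto-Calibration and a Mobile Phone Alert System
★ A Study of Unsupervised Fast Speaker Adaptation Algorithms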

Files

Full text: not open to the public (permanently restricted)

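Abstract (translated from the Chinese)

This thesis focuses on the back end of speaker verification. Given a test utterance, the goal is to achieve the best possible verification performance on it, so the main line of work concerns how the test speech is processed against each enrolled speaker model. First, the system adopts score normalization and adds particle swarm optimization (PSO) to optimize the model parameters. PSO is an optimization algorithm that searches for the best solution by imitating the way a flock of birds or a school of fish searches for food, and it belongs to the family of swarm intelligence methods. Its particles have memory, and the algorithm is computationally simple and converges quickly. It is therefore applied to modeling the speaker verification data, where its optimizing behavior is used to build more accurate speaker models and make the system more discriminative. Second, the thesis applies simple linear regression to the speaker verification system. Simple linear regression is an important analysis method in statistics that is commonly used to analyze the correlation between data. Here a simple linear regression model is built from the speaker verification results. Using ordinary least squares estimation and an analysis of the coefficient of determination, the verification results are combined so that the system verifies test speech more precisely, which in turn improves its verification performance.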
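The particle swarm search described in the abstract can be illustrated with a minimal sketch. It assumes a generic global-best PSO with an inertia weight, minimizing a toy sphere function. The objective, the parameter values, and the names `pso_minimize` and `sphere` are illustrative assumptions; this record does not give the thesis's actual objective over speaker model parameters.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside a box with a basic global-best PSO.

    w is the inertia weight; c1 and c2 weight the pull toward each
    particle's own best position and the swarm's best position.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions; velocities start at zero.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers the best position it has visited so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_val)
```

In a speaker verification setting the toy objective would be replaced by some cost computed from the model parameters, which is the part this sketch leaves open.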

Abstract (English)

This thesis focuses on speaker verification between test speech and registered speaker models. First, it introduces score normalization approaches into the speaker verification system. It then applies the particle swarm optimization (PSO) algorithm to optimize the model parameters. The main idea of PSO resembles the foraging behavior of fish: every particle has memory, and the algorithm is simple to compute and converges quickly. Using this optimization to build more accurate speaker models makes the system more discriminative.
In addition, the thesis introduces a regression analysis method into the speaker verification system. Regression analysis is a useful statistical analysis method. A regression model is built for each speaker using ordinary least squares estimation and an analysis of the coefficient of determination. Experiments show that the proposed method improves the performance of the speaker verification system.
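The ordinary least squares fit and the coefficient of determination mentioned in the abstract can be sketched as follows. This is a generic simple linear regression, not the thesis's score-combination procedure, which this record does not describe. The function names and the sample numbers are illustrative assumptions and are not data from the thesis.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the OLS line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def coefficient_of_determination(x, y, intercept, slope):
    """Return R^2 = 1 - SS_res / SS_tot for the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2
                 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical paired values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_linear_regression(x, y)
r2 = coefficient_of_determination(x, y, intercept, slope)
print(intercept, slope, r2)
```

An R^2 close to 1 means the fitted line explains most of the variance in `y`, which is the kind of evidence a coefficient-of-determination analysis relies on.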

Keywords (Chinese)

★ 粒子群演算法
★ 語者確認

Keywords (English)

★ particle swarm optimization algorithm
★ speaker verification

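Table of Contents

Contents
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Overview of the Speaker Recognition Architecture
1.3 Overview of Speaker Adaptation
1.4 Research Direction
1.5 Literature Review
1.6 Chapter Overview
Chapter 2 Techniques of Speaker Verification Systems
2.1 Feature Extraction
2.2 Gaussian Mixture Model
2.3 Speaker Model Training
2.3.1 Vector Quantization
2.3.2 EM Algorithm
2.4 Speaker Model Adaptation
2.4.1 Bayesian Adaptation
Chapter 3 Speaker Verification
3.1 GMM-UBM
3.2 Speaker Verification with KL Distance
3.3 Test Normalization
Chapter 4 Particle Swarm Optimization
4.1 Concept of Particle Swarm Optimization
4.2 Inertia Weight
4.3 Particle Swarm Optimization Applied to Speaker Verification
Chapter 5 Regression Analysis
5.1 Concept of Regression Analysis
5.2 Ordinary Least Squares
5.3 Coefficient of Determination
5.4 Regression Analysis of Speaker Verification Scores
5.5 Combination of Speaker Verification Scores
Chapter 6 Experiments and Discussion
6.1 Speech Corpus
6.2 Performance Evaluation of Speaker Verification
6.2.1 Equal Error Rate
6.2.2 Decision Cost Function
6.3 Experimental Results
6.3.1 Experiment 1: Comparison of Three Verification Systems
6.3.2 Experiment 2: Regression Analysis Applied to Speaker Verification
6.3.3 Experiment 3: Particle Swarm Optimization Applied to Speaker Verification
6.3.4 Experiment 4: Regression Analysis Combined with Particle Swarm Optimization
7.1 Conclusion
7.2 Future Work
References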

Advisor

莊堯棠 (Yau-tarng Juang)

Approval Date

2014-7-7