Name Kuang-Yao Wang (王光耀)   Department Department of Computer Science and Information Engineering
Thesis Title A Study on Sparse Representation Based Speaker Recognition
(基於稀疏表示之語者辨識之研究)
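Abstract (Chinese) Speaker recognition has long been a popular topic in speech research and has a wide range of applications, with access control systems as a representative example. In current research, systems that use i-vectors as features perform very well; in addition, the Sparse Representation Classifier (SRC) is currently a mainstream approach in the recognition field. We therefore take a system based on i-vectors and SRC as our baseline and propose improvements to it.
This thesis proposes a recognition system based on sparse representation and adds improvements within the original processing pipeline. The first concerns feature extraction: we construct the Supervector with PPCA and add a hypothesis test to adjust eigenvalue selection, so that the dimension of each component can adapt to the data. Next, we strengthen the sparse dictionary by proposing a method for selecting the dictionary's principal components and by compensating for session and channel variability, which makes the dictionary more discriminative. The third part is the noise dictionary: we propose three ways of collecting variability, which use the concepts of Robust PCA, NAP, and JFA respectively to separate out a noise term, in the hope that the noise basis absorbs the variability and achieves a de-noising effect. Finally, we solve for the coefficients with Approximate Bayesian Compressed Sensing (ABCS), which is grounded in Bayesian probability and places a Semi-Gaussian prior on the coefficients to enforce their sparsity.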
Abstract (English) Speaker recognition has long been a popular topic in speech research and is applied in many areas, with access control systems as a representative application. Currently, i-vector based speaker recognition systems achieve very good performance. In addition, much recent research on recognition concentrates on the Sparse Representation Classifier (SRC). We therefore base our system on these two concepts, i-vector and SRC, and propose several methods to improve it.
For feature extraction, we construct a Supervector with Probabilistic Principal Component Analysis (PPCA) and choose the number of eigenvalues with the Bartlett test, so that an appropriate dimension can be selected for each component. In the second part of the system, we enhance the sparse dictionary by choosing the primary elements of the dictionary and compensating for session and channel variability, which makes the dictionary more discriminative. In the third part, we propose a noise dictionary built by collecting the noise terms obtained from Robust PCA, Nuisance Attribute Projection (NAP), and Joint Factor Analysis (JFA). We expect the noise basis to absorb some of the variability and thereby achieve a de-noising effect. Finally, we solve for the sparse coefficients using Approximate Bayesian Compressed Sensing (ABCS), a Bayesian probabilistic method, and enforce sparsity by assuming a Semi-Gaussian prior on the coefficients.
Experimental results verify that the selected features, the dictionary processing, and the method for solving the coefficients improve the recognition rate to a certain extent.
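The SRC decision rule mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `omp` and `src_classify`, the synthetic data, and the use of a greedy Orthogonal Matching Pursuit solver in place of ABCS are all assumptions made for illustration. Each dictionary column stands in for one training feature vector (for example an i-vector) of a known speaker, and the test vector is assigned to the speaker whose columns reconstruct it with the smallest residual.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy Orthogonal Matching Pursuit (a stand-in solver, not ABCS).

    Approximates y using at most n_nonzero columns of D.
    """
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Least-squares refit on all atoms selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coef[support] = sol
    return coef


def src_classify(D, labels, y, n_nonzero=5):
    """Assign y to the speaker with the smallest class-wise residual."""
    labels = np.asarray(labels)
    # Normalize atoms to unit length (assumes no all-zero column).
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    x = omp(D, y, n_nonzero)
    residuals = {}
    for spk in np.unique(labels):
        mask = labels == spk
        # Reconstruct y using only the coefficients of this speaker.
        residuals[spk] = float(np.linalg.norm(y - D[:, mask] @ x[mask]))
    best = min(residuals, key=residuals.get)
    return best, residuals


if __name__ == "__main__":
    # Hypothetical synthetic data: 3 speakers, 4 vectors each, dimension 20.
    rng = np.random.default_rng(0)
    centers = rng.normal(size=(3, 20))
    labels = np.repeat(np.arange(3), 4)
    D = np.stack([centers[k] + 0.1 * rng.normal(size=20) for k in labels], axis=1)
    y = centers[1] + 0.1 * rng.normal(size=20)
    print(src_classify(D, labels, y)[0])
```

Replacing `omp` with a different sparse solver changes only how the coefficient vector is obtained; the class-wise residual comparison stays the same.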
Keywords (Chinese) ★ Speaker recognition
★ Sparse representation
★ Probabilistic principal component analysis
★ Supervector
★ i-vector
Table of Contents Abstract (Chinese)
Abstract (English)
Chapter Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation and Objectives
1.3 Research Methods and Chapter Overview
Chapter 2 Introduction to Speaker Recognition and Literature Review
2.1 Introduction
2.2 Feature Parameters
2.2.1 Linear Predictive Cepstrum Coefficients (LPCC)
2.2.2 Mel-scale Frequency Cepstral Coefficients (MFCC)
2.2.3 Prosodic Feature Extraction
2.2.4 GMM-Supervector
2.3 Variability Compensation Algorithms and Classifier Methods
2.3.1 Gaussian Mixture Model (GMM)
2.3.2 Support Vector Machine (SVM)
2.3.3 Kernel Function
2.3.4 Nuisance Attribute Projection (NAP)
2.3.5 Joint Factor Analysis (JFA)
2.3.6 ZT-norm
Chapter 3 PPCA-Based Supervector Extraction
3.1 Introduction
3.2 GMM-Supervector
3.3 Factor Analysis Model Based on Probabilistic Principal Component Analysis
3.4 Bartlett Test
3.5 i-vector
3.6 Feature Extraction Framework
Chapter 4 Sparse Representation Based Speaker Recognition
4.1 Introduction
4.2 Sparse Representation Classifier (SRC)
4.3 Dictionary Processing and Variability Compensation
4.4 Noise Dictionary
4.5 Approximate Bayesian Compressed Sensing (ABCS)
Chapter 5 Experimental Results
5.1 Experimental Setup and Environment
5.2 Comparison of PPCA-Supervector with Baseline Methods
5.3 Performance Comparison of Dictionary Processing and Variability Compensation
5.3.1 Effect of Constructing the Dictionary with SVD and RPCA
5.3.2 Effect of NAP on Variability Compensation
5.3.3 Effect of Kernel SRC
5.4 Effect of the Noise Dictionary on SRC
5.5 Effect of Solving Coefficients with ABCS
Chapter 6 Conclusions and Future Research Directions
References
Advisor 王家慶   Approval Date 2013-8-27
