基於稀疏表示之語者辨識之研究; A Study on Sparse Representation Based Speaker Recognition

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/61589

Title:	基於稀疏表示之語者辨識之研究;A Study on Sparse Representation Based Speaker Recognition
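Authors:	王光耀; Wang, Kuang-Yao
Contributors:	Department of Computer Science and Information Engineering
Keywords:	Speaker recognition; Sparse representation; Probabilistic principal component analysis; Supervector; i-vector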
Date:	2013-08-27
Issue Date:	2013-10-08 15:23:08 (UTC+8)
Publisher:	National Central University (國立中央大學)
Abstract:	Speaker recognition has long been a popular topic in speech research and has a wide range of applications, with access control systems as a representative example. Among current approaches, systems that use i-vectors as features perform very well, and the Sparse Representation Classifier (SRC) has become a mainstream method in the recognition field. We therefore take a system based on i-vectors and SRC as our starting point and propose improvements to it. The proposed sparse-representation-based recognition system adds improvements at four stages of the existing pipeline. First, for feature extraction, we construct the Supervector with Probabilistic Principal Component Analysis (PPCA) and use Bartlett's test to choose how many eigenvalues to retain, so that the dimension of each component adapts to the data. Second, we strengthen the sparse dictionary: we propose a method for selecting the principal components of the dictionary and compensate for session and channel variability, which makes the dictionary more discriminative. Third, we propose a noise dictionary with three ways of collecting variability, which use the ideas of Robust PCA, Nuisance Attribute Projection (NAP), and Joint Factor Analysis (JFA) to separate out the noise term; the noise basis is intended to absorb variability and thereby achieve a de-noising effect. Finally, we solve for the sparse coefficients with Approximate Bayesian Compressed Sensing (ABCS), a Bayesian method that places a Semi-Gaussian prior on the coefficients to enforce sparsity. Experimental results show that the feature improvements, the dictionary processing, and the coefficient-solving method each improve the recognition rate to some degree.
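As a rough illustration of the sparse-representation classification idea named in the abstract, the following minimal Python sketch builds a dictionary whose columns are training vectors, solves for a sparse coefficient vector, and assigns the class with the smallest class-wise reconstruction residual. It is not the thesis's implementation: the solver here is a plain orthogonal matching pursuit substituted for ABCS, the vectors are toy data standing in for i-vectors, and the names `omp`, `src_classify`, and `n_nonzero` are hypothetical.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed stand-in for ABCS)."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the unused atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = -np.inf
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coef[support] = sol
    return coef

def src_classify(D, labels, y, n_nonzero=5):
    """Return the label whose atoms alone best reconstruct y."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    x = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        x_c = np.where(labels == c, x, 0.0)  # keep only class-c coefficients
        residuals.append(np.linalg.norm(y - D @ x_c))
    return classes[int(np.argmin(residuals))]

# Toy usage: two "speakers", four noisy training vectors each (hypothetical data).
rng = np.random.default_rng(0)
dim = 10
atoms, labels = [], []
for speaker in (0, 1):
    for _ in range(4):
        v = np.zeros(dim)
        v[speaker] = 1.0
        atoms.append(v + 0.05 * rng.standard_normal(dim))
        labels.append(speaker)
D = np.stack(atoms, axis=1)
labels = np.array(labels)

test_vec = np.zeros(dim)
test_vec[1] = 1.0
test_vec += 0.05 * rng.standard_normal(dim)
print(src_classify(D, labels, test_vec, n_nonzero=3))
```

The residual-based decision rule is the usual SRC formulation; a noise dictionary of the kind the abstract describes would, under the same assumptions, be appended as extra columns whose coefficients are excluded from every class-wise residual.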
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

Files in This Item:

File	Description	Size	Format
index.html		0Kb	HTML
