基於稀疏表示之語者辨識之研究; A Study on Sparse Representation Based Speaker Recognition


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/61589

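Title:	基於稀疏表示之語者辨識之研究; A Study on Sparse Representation Based Speaker Recognition
Author:	王光耀; Wang, Kuang-Yao
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Speaker recognition; sparse representation; probabilistic principal component analysis (PPCA); Supervector; i-vector
Date:	2013-08-27
Upload time:	2013-10-08 15:23:08 (UTC+8)
Publisher:	National Central University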
Abstract:	Speaker recognition has long been a popular topic in speech research and has a wide range of applications, with access control systems as a representative example. Systems that use i-vectors as features currently achieve very good performance, and the Sparse Representation Classifier (SRC) is an active line of research in the recognition field. We therefore take a system built on i-vectors and SRC as our baseline and propose improvements to each stage of its pipeline. First, for feature extraction, we construct the Supervector with Probabilistic Principal Component Analysis (PPCA) and choose the number of retained eigenvalues with Bartlett's test, so that the dimension of each component can adapt to the data. Second, we strengthen the sparse dictionary: we propose a method for selecting its principal components and compensate for session and channel variability, which makes the dictionary more discriminative. Third, we propose a noise dictionary, with three ways of collecting the variability that draw on the concepts of Robust PCA, Nuisance Attribute Projection (NAP), and Joint Factor Analysis (JFA) to separate out a noise term; the intent is for the noise basis to absorb variability and thereby produce a de-noising effect. Finally, we solve for the sparse coefficients with Approximate Bayesian Compressed Sensing (ABCS), a Bayesian approach that places a Semi-Gaussian prior on the coefficients to enforce their sparsity. Experimental results show that the improved features, the dictionary processing, and the method for solving the coefficients each improve the recognition rate to some extent.
Appears in Collections:	[Department of Computer Science and Information Engineering] Theses & Dissertations
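
The abstract builds on sparse representation classification, so a small sketch of that general idea may help: a test vector is coded as a sparse combination of dictionary columns (training vectors from all speakers), and the speaker whose columns reconstruct it with the smallest residual is chosen. This is only an illustration under loud assumptions and not the thesis's method: the data are synthetic vectors rather than real i-vectors, the solver is a plain greedy orthogonal matching pursuit rather than ABCS, and there is no noise dictionary or variability compensation. The names `omp` and `src_classify` are hypothetical.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (stand-in solver, not ABCS)."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = -np.inf  # never pick an atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef


def src_classify(D, labels, y, n_nonzero=5):
    """Return the label whose atoms reconstruct y with the smallest residual."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    x = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = [
        # Keep only the coefficients that belong to class c.
        np.linalg.norm(y - D @ np.where(labels == c, x, 0.0))
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]


# Synthetic stand-in for speaker vectors: 3 speakers, 8 training vectors each.
rng = np.random.default_rng(0)
dim, per_speaker, n_speakers = 20, 8, 3
centers = rng.normal(size=(n_speakers, dim))
D = np.hstack([
    centers[s][:, None] + 0.05 * rng.normal(size=(dim, per_speaker))
    for s in range(n_speakers)
])
labels = np.repeat(np.arange(n_speakers), per_speaker)

# A held-out vector drawn near speaker 1.
test_vector = centers[1] + 0.05 * rng.normal(size=dim)
predicted = src_classify(D, labels, test_vector)
print("predicted speaker:", predicted)
```

The residual-per-class decision rule is the part that carries over to any SRC system; the choice of sparse solver and the construction of the dictionary are exactly the pieces the abstract proposes to change.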

