Compensation Techniques of the Speaker Model and Adaptation Algorithm to Speaker Verification and Recognition (I)


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/42604

Title:	語者模型之補償技術與調適演算法在語者確認及辨識之應用(I); Compensation Techniques of the Speaker Model and Adaptation Algorithm to Speaker Verification and Recognition (I)
Author:	莊堯棠
Contributor:	Department of Electrical Engineering
Keywords:	Cohort Model; Universal Background Model; Hierarchical EigenVoice; Minimum Classification Error; Bias Compensation; Verbal Information Verification; Information Science--Software
Date:	2005-07-01
Upload time:	2010-11-30 17:03:21 (UTC+8)
Publisher:	National Science Council, Executive Yuan
Abstract:	In speaker verification, the cohort model and the Universal Background Model (UBM) have each been used for score normalization. The two approaches represent different paradigms for decision-making, and each has its own strengths and weaknesses. We therefore propose a new scoring method, a two-stage decision procedure, that combines the advantages of both to improve verification performance. EigenVoice adaptation generally performs well when only a small amount of training data is available, but it still leaves room for improvement. We therefore propose a new speaker adaptation method, Hierarchical EigenVoice (HEV), which extends EigenVoice by clustering the Gaussian components of the HMMs into a hierarchical tree structure. This makes it possible to control automatically the number of adaptation parameters (model complexity) according to the amount of adaptation utterances available from a new speaker. Furthermore, to improve robustness when only poor training data is available, we try a new training approach that combines Minimum Classification Error (MCE) training with eigenvoices, in the hope of obtaining a better training criterion. We will compare the proposed method with the classical maximum-likelihood (ML)/eigenvoice method on a speaker identification task to verify the resulting recognition rate. In addition, because EigenVoice adaptation shows no further significant improvement as the amount of adaptation data increases, we propose an adaptation method based on combining a bias compensation model with the eigenvoice adaptation model, in order to improve the speed of speaker adaptation. Finally, we hope to integrate all of the new speaker adaptation techniques and training methods into a Verbal Information Verification (VIV) system, presenting a framework that combines a Speaker Verification (SV) system with a VIV system.
Research period:	9308 ~ 9407 (ROC calendar year and month, i.e., August 2004 to July 2005)
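Relation:	Science and Technology Policy Research and Information Center, National Applied Research Laboratories
Appears in collections:	[Department of Electrical Engineering] Research Projects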

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	358

All items in NCUIR are protected by original copyright.
