改良式梅爾倒頻譜係數混合多種語音特徵之研究;Improved Mel Frequency Cepstral Coefficients Combined with Multiple Speech Features


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/68744

Title:	改良式梅爾倒頻譜係數混合多種語音特徵之研究;Improved Mel Frequency Cepstral Coefficients Combined with Multiple Speech Features
Author:	Tang, Chu-Liang (唐曲亮)
Contributor:	Department of Electrical Engineering
Keywords:	speech recognition; feature combination; MFCC; keyword spotting
Date:	2015-07-13
Upload time:	2015-09-23 14:22:58 (UTC+8)
Publisher:	National Central University
Abstract:	This thesis studies feature extraction and feature compensation in speech recognition systems. For feature extraction, several speech features are combined, and cascading Linear Prediction Cepstral Coefficients (LPCC) with Mel-Frequency Cepstral Coefficients (MFCC) gives the best result among the combinations tested. The MFCCs are computed with a Gaussian-shaped Mel filter bank in place of the conventional triangular filter bank. Experiments show that combining LPCC and MFCC at a 1:1 ratio performs best. In addition to feature combination, Cepstral Mean and Variance Normalization (CMVN) is applied to compensate the cepstral coefficients, which further improves the recognition performance of the overall system.
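
The two steps the abstract names, frame-wise feature combination and CMVN, can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names `combine_features` and `cmvn` are hypothetical, the toy values are made up, and the sketch assumes that "1:1" means an equal number of LPCC and MFCC coefficients per frame, which the abstract does not spell out. CMVN is applied per utterance here, which is also an assumption.

```python
import math


def combine_features(lpcc_frames, mfcc_frames):
    """Concatenate LPCC and MFCC vectors frame by frame (hypothetical helper)."""
    if len(lpcc_frames) != len(mfcc_frames):
        raise ValueError("LPCC and MFCC must have the same number of frames")
    return [l + m for l, m in zip(lpcc_frames, mfcc_frames)]


def cmvn(frames, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    frames: list of equal-length feature vectors, one per frame.
    Each coefficient dimension is shifted to zero mean and scaled to
    (approximately) unit variance across the frames of the utterance.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    # eps guards against division by zero for a constant dimension.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dims)]
        for f in frames
    ]


# Toy data: 3 frames, 2 LPCC and 2 MFCC coefficients each (made-up values).
lpcc = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mfcc = [[0.0, 1.0], [2.0, 1.5], [4.0, 2.0]]

combined = combine_features(lpcc, mfcc)   # 3 frames x 4 coefficients
normalized = cmvn(combined)
```

Normalizing after concatenation puts the LPCC and MFCC dimensions on a common scale, so neither feature set dominates purely because of its numeric range. Whether the thesis normalizes before or after combining is not stated in the abstract.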
Appears in collections:	[Graduate Institute of Electrical Engineering] Master's and Doctoral Theses

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	321

All items in NCUIR are protected by the original copyright.
