An Empirical Mode Decomposition Method To Speech Recognition

Name

Hou-Jyun Chen (陳厚君)

Department

Department of Electrical Engineering

Thesis Title

An Empirical Mode Decomposition Method To Speech Recognition
(經驗模態分解法之語音辨識)

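This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.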

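Abstract (Chinese)

This thesis focuses on the analysis and processing of speech signals. Huang et al. proposed a new data-processing method, empirical mode decomposition (EMD), which uses the intrinsic time scales of a system's variation to extract energy directly and expresses the data as intrinsic mode functions (IMFs). These functions serve as a basis for the original input signal, and the basis is complete, nearly orthogonal, and adaptive. Because it is adaptive, the basis can express the physical characteristics of the original function, which makes the method suitable for nonlinear and non-stationary time series. Speech is also a nonlinear time series, and knowing the textual characteristics of the spoken content and the characteristics of the speaker helps speech recognition, so a basis that expresses the physical characteristics of the original input signal should further help in training speech models. Improving on traditional signal analysis so that the signal reveals its characteristics is therefore a major topic of this study.
This thesis uses EMD to find the input basis functions most closely related to textual characteristics and trains a set of models on them, with the aim of obtaining a better recognition rate in the recognition process.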
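The decomposition described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. It makes two simplifying assumptions so that it runs on the Python standard library alone: the envelopes are piecewise linear (the method of Huang et al. uses cubic splines), and each IMF is sifted a fixed number of times rather than until a stopping criterion is met. The names `local_extrema`, `envelope`, `sift`, and `emd` and the test signal are hypothetical.

```python
import math

def local_extrema(x):
    """Return the indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through the points x[idx] (assumption:
    the original method uses cubic splines). Held flat outside the extrema."""
    n = len(x)
    env = [0.0] * n
    for i in range(0, idx[0]):
        env[i] = x[idx[0]]
    for i in range(idx[-1], n):
        env[i] = x[idx[-1]]
    for a, b in zip(idx, idx[1:]):
        for i in range(a, b):
            t = (i - a) / (b - a)
            env[i] = (1 - t) * x[a] + t * x[b]
    return env

def sift(x, n_sift=10):
    """Extract one IMF candidate by repeatedly removing the envelope mean.
    A fixed iteration count stands in for a proper stopping criterion."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=8):
    """Decompose x into a list of IMFs plus a residue."""
    imfs = []
    residue = list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to sift
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue

# Hypothetical test signal: a slow tone plus a weaker fast tone.
signal = [math.sin(2 * math.pi * 5 * n / 1000)
          + 0.5 * math.sin(2 * math.pi * 40 * n / 1000)
          for n in range(1000)]
imfs, residue = emd(signal)
print(len(imfs), "IMFs extracted")
```

Because each IMF is subtracted from the running residue, the IMFs and the final residue sum back to the input, which is the completeness property mentioned in the abstract. In a recognition pipeline such as the one the thesis describes, selected IMFs (or combinations of them) would then be passed on to feature extraction in place of the raw waveform.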

Keywords (Chinese)

★ 語音辨識 (Speech Recognition)
★ 經驗模態分解法 (Empirical Mode Decomposition Method)

Keywords (English)

★ Empirical Mode Decomposition Method
★ Speech Recognition

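Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Chapter Overview
Chapter 2 Basic Techniques of Speech Recognition
2.1 Feature Parameter Extraction
2.2 Hidden Markov Models
2.3 Model Construction and Training
2.3.1 Viterbi Search Algorithm
2.3.2 Training Procedure
2.4 Continuous Speech Recognition Methods
2.5 Recognition Procedure
Chapter 3 Hilbert-Huang Transform Theory
3.1 Instantaneous Frequency
3.2 Intrinsic Mode Functions
3.3 Empirical Mode Decomposition
3.4 Hilbert-Huang Spectrum
3.5 Applying Empirical Mode Decomposition to Speech Recognition
Chapter 4 Experiments and Discussion
4.1 Experimental Environment
4.2 Experiments and Discussion
4.2.1 Experiment 1: Comparison of Recognition Rates of Intrinsic Mode Functions
4.2.2 Experiment 2: Comparison of Recognition Rates of Combined Intrinsic Mode Functions
4.2.3 Experiment 3: Comparison of Intrinsic Mode Functions in Connected-Digit Recognition
4.2.4 Experiment 4: Comparison of Recognition Rates of Intrinsic Mode Functions in Noisy Environments
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References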

References
[1] E. Bedrosian, “A product theorem for Hilbert transform,” Proc. IEEE, vol. 51, pp. 868-869, 1963.
[2] L. Cohen, Time-frequency analysis, Englewood Cliffs, NJ: Prentice-Hall, 1995.
[3] J. R. Deller, Jr., J. G. Proakis, J. H. L. Hansen, Discrete-time processing of speech signals, Wiley-IEEE Press, 2000.
[4] D. Gabor, “Theory of communication,” Proc. IEE, vol. 93, pp. 429-457, 1946.
[5] N. E. Huang, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis,” NASA (manuscript), pp. 903-995, 1996.
[6] B. H. Juang, “The past, present, and future of speech processing,” IEEE Trans. Signal Processing, pp. 24-28, May 1998.
[7] T. Kawahara, C. H. Lee, and B. H. Juang, “Flexible speech understanding based on combined key-phrase detection and verification,” IEEE Trans. Speech and Audio Processing, vol. 6, Nov. 1998.
[8] M. W. Koo and S. J. Lee, “An utterance verification system based on subword modeling for a vocabulary independent speech,” Eurospeech, 1999.
[9] M. W. Koo, C. H. Lee, and B. H. Juang, “Speech recognition and utterance verification based on a generalized confidence score,” IEEE Trans. Speech and Audio Processing, vol. 9, Nov. 2001.
[10] C. M. Liu, C. C. Chiu, and H. Y. Chang, “Design of vocabulary-independent Mandarin keyword spotters,” IEEE Trans. Speech and Audio Processing, vol. 8, July 2000.
[11] Q. Li, B. H. Juang, Q. Zhou, and C. Lee, “Automatic verbal information verification for user authentication,” IEEE Trans. Speech and Audio Processing, vol. 8, Sep. 2000.
[12] S. E. Levinson, L. R. Rabiner, and M. M. Sondhi, “An introduction to the application of the theory of probabilistic function of a Markov process to automatic speech recognition,” The Bell System Technical Journal, vol. 62, April 1983.
[13] C. S. Liu, H. C. Wang, and C. H. Lee, “Speaker verification using normalized log-likelihood score,” IEEE Trans. Speech and Audio Processing, pp. 57-60, Jan. 1996.
[14] N. Moreau and D. Jouvet, “Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data,” Eurospeech, 1999.
[15] H. Ney, “The use of a one stage dynamic programming algorithm for connected word recognition,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 32, April 1984.
[16] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests of statistical hypotheses,” Phil. Trans. R. Soc. A, vol. 231, pp. 289-337, 1933.
[17] J. Neyman and E. S. Pearson, “On the use and interpretation of certain test criteria for purpose of statistical inference,” Biometrika, pt. I, vol. 20A, pp. 175-240, 1928.
[18] M. G. Rahim, C. H. Lee, and B. H. Juang, “Discriminative utterance verification for connected digits recognition,” IEEE Trans. Speech and Audio Processing, vol. 5, no. 3, May 1997.
[19] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989.
[20] L. R. Rabiner and B. H. Juang, Fundamentals of speech recognition, Prentice Hall, New Jersey, 1993.
[21] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, 1978.
[22] E. Rosenberg, J. DeLong, C. H. Lee, B. H. Juang, and F. K. Soong, “The use of cohort normalized scores for speaker recognition,” Proc. ICSLP 92, pp. 599-602, Oct. 1992.
[23] J. T. Tou, Pattern recognition principles, Addison-Wesley, 1974.
[24] L. X. and B. X. Wang, “Utterance verification for spontaneous Mandarin speech keyword spotting,” IEEE Proceedings ICII 2001, vol. 3, pp. 397-401.
[25] Y. Zhang, D. Zhang, and Z. Shu, “A novel text-independent speaker verification method based on the global speaker model,” IEEE Trans. Systems, Man, and Cybernetics, vol. 30, pp. 598-602, 2000.

Advisor

Yau-Tarng Juang (莊堯棠)

Approval Date

2005-7-4
