A System for Keyword Spotting and Speaker Recognition

Name

蔡炎興 (Yan-Hsing Tsai)

Department

Department of Electrical Engineering

Thesis Title

關鍵詞萃取及語者辨識系統之研製
(A System for Keyword Spotting and Speaker Recognition)

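This electronic thesis is authorized for immediate open access.
Full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.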

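Abstract (Chinese)

This thesis improves on earlier keyword spotting and verification techniques and combines them with speaker recognition to build a complete system. The work has three parts. For keyword spotting, the keyword and filler (non-keyword) modules are built from subsyllable models so that the resulting system is more portable. Besides finding the best pairing of mixture numbers for the keyword and filler modules, we apply the GCS (generalized confidence score) recognition algorithm to keyword spotting, which gives the recognizer partial rejection capability during recognition itself, and then apply a concept similar to Beam Search to speed up the system. For keyword verification, we again use subsyllable models for hypothesis testing and propose a method that does not require training a threshold for each subsyllable, so that future verification systems can be built more quickly. Finally, we briefly describe how we use speaker recognition. Building on an earlier speaker identification technique [27], we construct a system and wrap our speech recognition core with the MFC and SDK of Visual C++ to provide a windowed user interface, so that our theory can be tested online in real time.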
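The abstract mentions speeding up the system with a concept similar to Beam Search but gives no details. As a minimal sketch of the general idea only, and not the thesis's actual implementation, the following Python code runs Viterbi decoding over a small HMM and, after every frame, discards hypotheses whose log score falls more than a fixed margin below the best one. The names `viterbi_beam`, `prune`, and `beam_width`, and the data layout (plain nested lists of log probabilities), are assumptions made for illustration.

```python
import math

NEG_INF = -math.inf


def prune(hyps, beam_width):
    """Keep only hypotheses whose score is within beam_width of the best one."""
    if not hyps:
        return hyps
    best = max(score for score, _ in hyps.values())
    return {s: h for s, h in hyps.items() if h[0] >= best - beam_width}


def viterbi_beam(log_init, log_trans, log_emit, beam_width):
    """Viterbi decoding with per-frame beam pruning (illustrative sketch).

    log_init[s]     : log initial probability of state s
    log_trans[p][s] : log transition probability from state p to state s
    log_emit[t][s]  : log emission score of frame t in state s (at least one frame)
    beam_width      : margin in the log domain; larger means less pruning
    Returns (best_log_score, best_state_path).
    """
    n_states = len(log_init)

    # Active hypotheses after the first frame: state -> (score, path).
    active = {}
    for s in range(n_states):
        score = log_init[s] + log_emit[0][s]
        if score > NEG_INF:
            active[s] = (score, [s])
    active = prune(active, beam_width)

    for t in range(1, len(log_emit)):
        candidates = {}
        # Only states that survived pruning are expanded; this is the speedup.
        for prev, (prev_score, prev_path) in active.items():
            for s in range(n_states):
                score = prev_score + log_trans[prev][s] + log_emit[t][s]
                if score == NEG_INF:
                    continue
                if s not in candidates or score > candidates[s][0]:
                    candidates[s] = (score, prev_path + [s])
        active = prune(candidates, beam_width)

    if not active:
        raise ValueError("no hypothesis survived decoding")
    best_state = max(active, key=lambda s: active[s][0])
    return active[best_state]
```

Passing `math.inf` as `beam_width` disables pruning and gives ordinary Viterbi decoding. A narrower beam expands fewer states per frame at the risk of discarding the path that would have scored best overall, so the width is normally tuned against recognition accuracy.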

Keywords (Chinese)

★ Speaker Identification
★ Keyword Spotting
★ Speech Recognition

Keywords (English)

★ Speaker Recognition
★ Keyword Spotting

Table of Contents

Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Outline
Chapter 2 Basic Speech Recognition Techniques
2.1 Feature Extraction
2.2 Hidden Markov Models
2.3 Acoustic Models
2.4 Model Training and Parameter Estimation
2.4.1 Training Algorithm
2.4.2 Training Flowchart
Chapter 3 Keyword Spotting and Verification
3.1 Overview
3.2 Keyword Spotting Architecture
3.2.1 Keyword Module
3.2.2 Filler Model
3.2.3 Arrangement of the Recognition Modules
3.3 Recognition Algorithm
3.4 Recognition Procedure
3.5 Generalized Confidence Score
3.6 Keyword Verification
3.6.1 Verification Procedure
3.6.2 Subsyllable Hypothesis Testing
3.6.3 Error Rate Calculation
3.7 System Speedup
Chapter 4 Speaker Recognition and Verification
4.1 Speaker Recognition
4.2 Speaker Verification
4.2.1 Speaker Model
4.2.2 Global Speaker Model
Chapter 5 Experiments and Results
5.1 Experimental Environment
5.2 Keyword Spotting Experiments
5.2.1 Effect of the Number of Mixtures on the Recognition Rate
5.2.2 Generalized Confidence Score (GCS)
5.3 Keyword Verification Experimental Values
5.3.2 Keyword Verification
5.4 System Speedup
5.5 System Implementation
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References

References

[1] M.-W. Koo, C.-H. Lee, and B.-H. Juang, “Speech Recognition and Utterance Verification Based on a Generalized Confidence Score,” IEEE Trans. on Speech and Audio Processing, vol. 9, no. 8, Nov. 2001.
[2] Chi-Min Liu, Chin-Chih Chiu, and Hung-Yuan Chang, “Design of Vocabulary-Independent Mandarin Keyword Spotters,” IEEE Trans. on Speech and Audio Processing, vol. 8, no. 4, July 2000.
[3] Qi Li, B.-H. Juang, Qiru Zhou, and C.-H. Lee, “Automatic Verbal Information Verification for User Authentication,” IEEE Trans. on Speech and Audio Processing, vol. 8, no. 5, Sep. 2000.
[4] T. Kawahara, C.-H. Lee, and B.-H. Juang, “Flexible Speech Understanding Based on Combined Key-Phrase Detection and Verification,” IEEE Trans. on Speech and Audio Processing, vol. 6, no. 6, Nov. 1998.
[5] B. H. Juang, “The past, present, and future of speech processing,” IEEE Trans. on Signal Processing, pp. 24-28, May 1998.
[6] M. G. Rahim, C.-H. Lee, and B.-H. Juang, “Discriminative Utterance Verification for Connected Digits Recognition,” IEEE Trans. on Speech and Audio Processing, vol. 5, no. 3, May 1997.
[7] D. Burshtein, “Robust parametric modeling of duration in hidden Markov models,” IEEE Trans. on Speech and Audio Processing, vol. 4, pp. 240-242, May 1996.
[8] H. Ney, “The use of a one stage dynamic programming algorithm for connected word recognition,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, April 1984.
[9] S. E. Levinson, L. R. Rabiner, and M. M. Sondhi, “An Introduction to the Application of the Theory of Probabilistic Functions of a Markov Process to Automatic Speech Recognition,” The Bell System Technical Journal, vol. 62, no. 4, April 1983.
[10] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests of statistical hypotheses,” Phil. Trans. R. Soc. A, vol. 231, pp. 289-337, 1933.
[11] Lin Xin and Bing-Xi Wang, “Utterance Verification For Spontaneous Mandarin Speech Keyword Spotting,” IEEE Proceedings ICII 2001, Beijing, vol. 3, pp. 397-401.
[12] Myoung-Wan Koo and Sun-Jeong Lee, “An Utterance Verification System Based on Subword Modeling For A Vocabulary Independent Speech,” Eurospeech, 1999.
[13] N. Moreau and D. Jouvet, “Use of A Confidence Measure Based on Frame Level Likelihood Ratios for The Rejection of Incorrect Data,” Eurospeech, 1999.
[14] Tatsuya Kawahara, C.-H. Lee, and B.-H. Juang, “Combining Key-Phrase Detection and Subword-Based Verification For Flexible Speech Understanding,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Munich, Germany, May 1997, pp. 1159-1162.
[15] L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proceedings of the IEEE, vol. 77, no. 2, Feb. 1989.
[16] L. R. Rabiner and B. H. Juang, “Fundamentals of Speech Recognition,” Prentice Hall, New Jersey, 1993.
[17] John R. Deller, Jr., John G. Proakis, and John H. L. Hansen, “Discrete-Time Processing of Speech Signals,” 1987.
[18] L. R. Rabiner and R. W. Schafer, “Digital Processing of Speech Signals,” Prentice-Hall Co. Ltd, 1978.
[19] J. T. Tou, “Pattern Recognition Principles,” Addison-Wesley, 1974.
[20] J. Neyman and E. S. Pearson, “On the use and interpretation of certain test criteria for purposes of statistical inference,” Biometrika, pt. I, vol. 20A, pp. 175-240, 1928.
[21] Y. Zhang, D. Zhang, and Z. Shu, “A novel text-independent speaker verification method based on the global speaker model,” IEEE Trans. on Systems, Man, and Cybernetics, vol. 30, no. 5, pp. 598-602, 2000.
[22] Chi-Shi Liu, Hsiao-Chuan Wang, and Chin-Hui Lee, “Speaker verification using normalized log-likelihood score,” IEEE Trans. on Speech and Audio Processing, pp. 57-60, Jan. 1996.
[23] E. Rosenberg, J. Delong, C. H. Lee, B. H. Juang, and F. K. Soong, “The Use of Cohort Normalized Scores for Speaker Recognition,” Proc. ICSLP 92, Banff, pp. 599-602, Oct. 1992.
[24] 蔡永琪, “Keyword Recognition Based on Subsyllable Units,” Master's thesis, National Central University, June 1995.
[25] 黃國彰, “A Study of Keyword Spotting and Verification,” Master's thesis, National Central University, June 1996.
[26] 王維邦, “Research and Development of a Keyword Spotting System for Continuous Mandarin Speech,” Master's thesis, National Central University, June 1997.
[27] 吳金池, “A Study of Speaker Recognition Systems,” Master's thesis, National Central University, May 2001.

Advisor

莊堯棠 (Yau-Tarng Juang)

Approval Date

2003-6-10
