A Keyword Spotting Technique and Its Application to a Voice-Activated Car

Name

Ping-Hung Lin (林品宏)

Department

Department of Electrical Engineering

Thesis title

A Keyword Spotting Technique and Its Application to a Voice-Activated Car
(關鍵詞萃取系統及語音聲控車之應用)

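The author has agreed to make this electronic thesis openly available immediately.
Once open access takes effect, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as not to break the law.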

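Abstract (Chinese)

This thesis improves the feature extraction stage of earlier keyword spotting work by replacing the previously used LPC method with MFCC, and combines the result with a speech recognition system to build a voice-controlled car. The thesis has two main parts. In the keyword spotting part, the keyword models and the non-keyword (garbage) models are built from sub-syllable models, with the aim of making the system more portable. In the second part, the trained models are used within the Visual Basic 6 development environment, together with a one-stage dynamic recognition algorithm, to package the recognition technique as a windowed human-machine interface that recognizes speech in real time. Based on the recognition result, the interface is coupled with a commercially available remote-control car so that the car moves in the direction spoken by the user.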
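The abstract names MFCC as the feature that replaces LPC but does not give the processing chain. The following is a minimal sketch of a textbook MFCC front end (pre-emphasis, Hamming-windowed frames, power spectrum, triangular mel filterbank, log, DCT), not the thesis's implementation. Every parameter value here (8 kHz sampling rate, 256-sample frames, 128-sample hop, 20 filters, 12 coefficients, 0.97 pre-emphasis) and every function name is an assumption chosen for illustration.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames, each multiplied by a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[x[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(x) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but dependency-free)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append((re * re + im * im) / n)
    return spec


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + (high - low) * i / (num_filters + 1)
            for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):       # rising edge
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right):      # falling edge, peak of 1 at centre
            filt[k] = (right - k) / (right - centre)
        bank.append(filt)
    return bank


def mfcc(x, sample_rate=8000, frame_len=256, hop=128,
         num_filters=20, num_ceps=12, pre_emph=0.97):
    """Return one list of num_ceps cepstral coefficients per frame."""
    emphasized = [x[0]] + [x[n] - pre_emph * x[n - 1] for n in range(1, len(x))]
    bank = mel_filterbank(num_filters, frame_len, sample_rate)
    features = []
    for frame in frame_signal(emphasized, frame_len, hop):
        spec = power_spectrum(frame)
        # Log filterbank energies, floored to avoid log(0) on silent frames.
        log_e = [math.log(max(sum(f * s for f, s in zip(filt, spec)), 1e-10))
                 for filt in bank]
        # DCT-II of the log energies; coefficient 0 is skipped.
        features.append([
            sum(log_e[j] * math.cos(math.pi * i * (j + 0.5) / num_filters)
                for j in range(num_filters))
            for i in range(1, num_ceps + 1)
        ])
    return features
```

A real-time system would replace the naive DFT with an FFT; it is written out here only to keep the sketch free of external libraries.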

Abstract (English)

This thesis modifies the feature extraction stage of keyword spotting by replacing linear prediction coefficients (LPC) with Mel-frequency cepstral coefficients (MFCC), and uses the resulting speech recognizer to build a voice-activated car.
The thesis has two parts. The first part focuses on keyword spotting: both the keyword models and the garbage models are built from sub-syllable models, the advantage being that the system saves a lot of time. The second part uses Visual Basic 6 to build a human-machine interface for real-time recognition, and couples that interface with a remote-control car to produce a voice-activated car.
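The abstract describes keyword models competing with garbage models, and a recognition result that drives the car. A minimal sketch of that last step, under stated assumptions, could look as follows. The keyword vocabulary, the single-letter command codes, the `<garbage>` label, and the margin threshold are all hypothetical; the abstract does not specify any of them, and the thesis interface was written in Visual Basic 6 rather than Python.

```python
from typing import Dict, Optional

# Hypothetical keyword-to-command table (not taken from the thesis).
COMMANDS: Dict[str, str] = {
    "forward": "F",
    "backward": "B",
    "left": "L",
    "right": "R",
    "stop": "S",
}

GARBAGE = "<garbage>"  # assumed label for the non-keyword (garbage) model


def decide_command(scores: Dict[str, float], min_margin: float = 2.0) -> Optional[str]:
    """Return a car command code, or None when the utterance should be ignored.

    scores maps each model label (keywords plus GARBAGE) to a log-likelihood.
    An utterance is accepted only if the best keyword beats the garbage model
    by at least min_margin; otherwise it is treated as non-keyword speech.
    """
    keyword_scores = {k: v for k, v in scores.items() if k in COMMANDS}
    if not keyword_scores:
        return None
    best = max(keyword_scores, key=keyword_scores.get)
    garbage_score = scores.get(GARBAGE, float("-inf"))
    if keyword_scores[best] - garbage_score < min_margin:
        return None
    return COMMANDS[best]
```

For example, `decide_command({"left": -40.0, "right": -55.0, "<garbage>": -50.0})` returns `"L"`, while scores in which no keyword clears the garbage model by the margin return `None`, so the car is left alone.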

Keywords (Chinese)

★ Mel-frequency cepstral coefficients (梅爾倒頻譜係數)
★ keyword spotting (關鍵詞萃取)
★ voice control (語音聲控)

Keywords (English)

★ keyword spotting
★ Mel-frequency cepstral coefficients
★ voice-activated

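Table of contents

Chapter 1 Introduction
1.1 Motivation
1.2 Literature review
1.3 Thesis outline
Chapter 2 Speech signal processing
2.1 Short-time speech processing [41]
2.1.1 Framing
2.1.3 Energy computation
2.2 Feature extraction
2.2.1 Mel cepstrum
2.3 Hidden Markov models
2.4 Acoustic models
2.5 Model training and parameter re-estimation
Chapter 3 Keyword spotting
3.1 Keyword spotting architecture
3.1.1 Keyword models
3.1.2 Non-keyword (garbage) models
3.2 Recognition procedure
3.2.1 Arrangement of the recognition modules
3.2.2 Recognition algorithm
Chapter 4 Experiments and results
4.1 Experimental environment
4.2 Keyword spotting experiments
Chapter 5 System application
5.1 Recognition procedure
5.2 System overview
6.1 Conclusion
6.2 Future work
References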

References

[1] 蔡佳君, Mandarin Pronunciation and Methods (國語發音和方法), 台灣學生書局, 1993.
[2] L. R. Rabiner, R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, Inc., 1978.
[3] S. Furui, Digital Speech Processing, Synthesis, and Recognition, Marcel Dekker, Inc., 1989.
[4] B. H. Juang, “Speech Recognition in Adverse Environments,” Computer Speech and Language, vol. 5, pp. 275-294, 1991.
[5] A. V. Oppenheim, R. W. Schafer, J. R. Buck, Discrete-Time Signal Processing, 2nd ed., Chinese edition (離散時間訊號處理) translated by 曾建誠, 陳常侃, 王鵬華, 丁建均, 全華科技圖書股份有限公司, 2004.
[6] B. H. Juang, L. R. Rabiner, J. G. Wilpon, “On the Use of Bandpass Liftering in Speech Recognition,” IEEE Trans. on ASSP, vol. 35, no. 7, pp. 947-954, July 1987.
[7] Y. Tohkura, “Weighted Cepstral Distance Measure for Speech Recognition,” IEEE Trans. on ASSP, vol. 35, no. 10, pp. 1414-1422, Oct. 1987.
[8] F. K. Soong, M. M. Sondhi, “A Frequency-Weighted Itakura Spectral Distortion Measure and Its Application to Speech Recognition in Noise,” IEEE Trans. on ASSP, vol. 36, no. 1, Jan. 1988.
[9] K. K. Paliwal, M. M. Sondhi, “Recognition of Noisy Speech Using Cumulant-Based Linear Prediction Analysis,” Proc. ICASSP, pp. 429-432, 1990.
[10] S. Imai, “Cepstral Analysis Synthesis on the Mel Frequency Scale,” Proc. ICASSP, pp. 93-96, 1983.
[11] D. Mansour, B. H. Juang, “The Short-Time Modified Coherence Representation and Noisy Speech Recognition,” IEEE Trans. on ASSP, vol. 37, no. 6, pp. 795-804, June 1989.
[12] H. Singer, T. Umezaki, F. Itakura, “Low Bit Quantization of the Smoothed Group Delay Spectrum for Speech Recognition,” Proc. ICASSP, pp. 761-765, 1990.
[13] J. G. Wilpon et al., “Automatic Recognition of Keywords in Unconstrained Speech Using Hidden Markov Models,” IEEE Trans. on ASSP, vol. 38, no. 11, pp. 1870-1878, Nov. 1990.
[14] L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
[15] R. P. Lippmann, “An Introduction to Computing with Neural Nets,” IEEE ASSP Mag., vol. 4, pp. 4-22, 1987.
[16] D. E. Rumelhart, B. Widrow, M. A. Lehr, “The Basic Ideas in Neural Networks,” Communications of the ACM, vol. 37, no. 3, Mar. 1994.
[17] J. R. Rohlicek et al., “Continuous Hidden Markov Modeling for Speaker-Independent Word Spotting,” Proc. Int. Conf. on ASSP, pp. 627-630, 1989.
[18] R. C. Rose, D. B. Paul, “A Hidden Markov Model Based Keyword Recognition System,” Proc. Int. Conf. on ASSP, vol. 1, pp. 129-132, 1990.
[19] R. C. Rose, “Discriminant Wordspotting Techniques for Rejecting Non-Vocabulary Utterances in Unconstrained Speech,” Proc. Int. Conf. on ASSP, vol. 2, pp. 105-108, 1992.
[20] L. Bahl et al., “Maximum Mutual Information Estimation of Hidden Markov Model Parameters for Speech Recognition,” Proc. Int. Conf. on ASSP, vol. 11, pp. 49-52, 1986.
[21] C. Torre, A. Acero, “Discriminative Training of Garbage Model for Non-Vocabulary Utterance Rejection,” Proc. Int. Conf. on Spoken Language Processing, June 1994.
[22] Moreno et al., “Rejection Techniques in Continuous Speech Recognition Using Hidden Markov Model,” Proc. European Conf. on Signal Processing, pp. 1383-1386, 1990.
[23] M. W. Feng, B. Mazor, “Continuous Word Spotting for Applications in Telecommunication,” Proc. Int. Conf. on Spoken Language Processing, pp. 21-24, 1992.
[24] R. W. Christiansen, C. K. Rushforth, “Detecting and Locating Key Words in Continuous Speech Using Linear Predictive Coding,” IEEE Trans. on ASSP, vol. 25, no. 5, pp. 361-367, Oct. 1977.
[25] A. Higgins, R. Wohlford, “Keyword Recognition Using Template Concatenation,” Proc. IEEE Int. Conf. on ASSP, vol. 10, pp. 1233-1236, 1985.
[26] H. W. Hon, K. F. Lee, “CMU Robust Vocabulary-Independent Speech Recognition System,” Proc. IEEE Int. Conf. on ASSP, vol. 2, pp. 889-892, May 1991.
[27] J. R. Bellegarda, D. Nahamoo, “Tied Mixture Continuous Parameter Modeling for Speech Recognition,” IEEE Trans. on ASSP, vol. 38, pp. 2033-2045, 1990.
[28] B. H. Juang, L. Rabiner, “Mixture Autoregressive Hidden Markov Models for Speech Signals,” IEEE Trans. on ASSP, vol. 33, pp. 1404-1412, Dec. 1985.
[29] X. D. Huang, M. A. Jack, “Semi-Continuous Hidden Markov Models for Speech Signals,” Computer Speech and Language, vol. 3, pp. 239-257, 1989.
[30] L. F. Lamel, S. Seneff, “Speech Database Development: Design and Analysis of the Acoustic-Phonetic Corpus,” Proc. MIT Speech Recognition Workshop, July 1986.
[31] R. Schwartz, Y. L. Chow, “The N-Best Algorithms: An Efficient and Exact Procedure for Finding the N Most Likely Sentence Hypotheses,” Proc. ICASSP, pp. 81-84, 1990.
[32] S. R. Young, W. H. Ward, “Recognition Confidence Measures for Spontaneous Spoken Dialog,” Proc. European Conf. on Speech Communications, pp. 1177-1179, 1993.
[33] R. A. Sukkar, J. G. Wilpon, “A Two Pass Classifier for Utterance Rejection in Keyword Spotting,” Proc. Int. Conf. on ASSP, vol. 2, pp. 451-454, Apr. 1993.
[34] W. Chou, B. H. Juang, C. H. Lee, “Segmental GPD Training of HMM Based Speech Recognizer,” Proc. Int. Conf. on ASSP, vol. 1, pp. 473-476, 1992.
[35] L. Villarrubia, A. Acero, “Rejection Techniques for Digit Recognition in Telecommunication Applications,” Proc. Int. Conf. on ASSP, vol. 2, pp. 455-458, 1993.
[36] J. G. Wilpon, D. M. DeMarco, R. P. Mikkilineni, “Isolated Word Recognition Over the DDD Telephone Network: Results of Two Extensive Field Studies,” Proc. IEEE Int. Conf. on ASSP, vol. 1, pp. 55-58, 1988.
[37] B. Chigier, “Rejection and Keyword Spotting Algorithms for a Directory Assistance City Name Recognition Application,” Proc. ICASSP, pp. 93-96, 1992.
[38] Y. Gao et al., “Tangerine: A Large Vocabulary Mandarin Dictation System,” Proc. ICASSP, vol. 1, pp. 77-80, 1995.
[39] C. E. Mokbel, G. F. A. Chollet, “Automatic Word Recognition in Cars,” IEEE Trans. on ASSP, vol. 3, no. 5, pp. 346-356, Sept. 1995.
[40] 廖弘源, “Professor 吳宗憲's Multimedia Human-Machine Communication Inventions for a More Convenient Life” (吳宗憲教授便利生活的多媒體人機通訊發明), National Science Council Engineering Science and Technology E-paper, no. 122, Feb. 2011.
[41] 王小川, Speech Signal Processing (語音訊號處理), 2nd rev. ed., 全華圖書股份有限公司, Feb. 2009.
[42] R. Vergin, D. O’Shaughnessy, A. Farhat, “Generalized Mel Frequency Cepstral Coefficients for Large-Vocabulary Speaker-Independent Continuous-Speech Recognition,” IEEE Trans. on Speech and Audio Processing, vol. 7, no. 5, pp. 525-532, 1999.
[43] C. Ai et al., “Pipeline Damage and Leak Detection Based on Sound Spectrum LPCC and HMM,” Intelligent Systems Design and Applications, pp. 829-833, 2006.
[44] R. M. Nickel, “Feature - Automatic Speech Character Identification,” IEEE Circuits and Systems Magazine, vol. 6, pp. 10-31, 2006.
[45] 黃國彰, “A Study of Keyword Spotting and Verification” (關鍵詞萃取與確認之研究), Master's thesis, National Central University, June 1996.
[46] 蔡炎興, “Development of a Keyword Spotting and Speaker Recognition System” (關鍵詞萃取即語者辨識系統之研製), Master's thesis, National Central University, June 2003.
[47] L. Gu, S. A. Zahorian, “A New Robust Algorithm for Isolated Word Endpoint Detection,” IV-4161, International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, pp. 13-17, May 2002.
[48] 黃明哲, Easy Visual Basic 6 Programming: Introductory Fundamentals (易習Visual Basic 6 程式語言基礎入門), 經緯國際股份有限公司, Mar. 2009.
[49] 黃明哲, Easy Visual Basic 6 Programming: Advanced Applications (易習Visual Basic 6 程式語言進階應用), 經緯國際股份有限公司, Mar. 2009.
[50] 陳永達, 詹可文, Microcomputer Control: Project Implementation: VB Serial and Parallel Port Control (微電腦控制 : 專題製作 : VB串並列埠控制), 全華圖書股份有限公司, 1st ed., 2004.

Advisor

Yau-Tarng Juang (莊堯棠)

Approval date

2012-6-15