Application of a Fast Algorithm to Large-Vocabulary Keyword Spotting

Name

Zhen-Guang Yang (楊鎮光)

Department

Graduate Institute of Electrical Engineering

Thesis title

Application of a Fast Algorithm to Large-Vocabulary Keyword Spotting

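This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.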

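Abstract

In traditional whole-word-based keyword spotting systems, performance typically degrades as the keyword vocabulary grows: the recognition rate falls and the recognition time rises. The fast algorithm exploits the correlations in the lexical structure of the keywords to classify and structure them, so that a tree-structured search can greatly reduce recognition time while the recognition rate stays at a stable level as the vocabulary grows. This is the purpose of applying the fast algorithm to large-vocabulary keyword spotting.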
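In our approach, each keyword is first divided into several sub-parts (subsets), and the sub-parts of different keywords share common subwords, much like the branches of a tree. Once the N best common subwords have been recognized, the search range can be narrowed: keywords that cannot be selected are discarded, and a final verification is performed only on the keywords with higher similarity, which makes the search fast.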
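
The two-stage pruning idea described above can be sketched as follows. This is only an illustrative sketch, not the thesis's implementation: it assumes each keyword is given as a tuple of subword units, that the common subword is the leading unit, and that a first-pass recognizer has already produced one log-likelihood score per common subword. The names `build_subword_index` and `shortlist_keywords` and the sample vocabulary are hypothetical.

```python
from collections import defaultdict


def build_subword_index(keywords):
    """Group keywords by their leading subword (one 'branch' of the tree).

    Assumption: the common subword of a keyword is its first unit.
    """
    index = defaultdict(list)
    for keyword in keywords:
        index[keyword[0]].append(keyword)
    return index


def shortlist_keywords(keywords, subword_scores, n_best):
    """Keep only keywords whose common subword is among the N best.

    subword_scores maps a subword to a first-pass log-likelihood
    (higher is better); subwords without a score are ranked last.
    """
    index = build_subword_index(keywords)
    ranked = sorted(
        index,
        key=lambda sw: subword_scores.get(sw, float("-inf")),
        reverse=True,
    )
    survivors = []
    for subword in ranked[:n_best]:
        survivors.extend(index[subword])
    return survivors


# Hypothetical vocabulary and first-pass scores.
vocabulary = [
    ("tai", "pei"),
    ("tai", "chung"),
    ("kao", "hsiung"),
    ("hsin", "chu"),
]
scores = {"tai": -10.0, "kao": -12.5, "hsin": -30.0}
candidates = shortlist_keywords(vocabulary, scores, n_best=2)
```

Only the keywords in `candidates` would then go through the more expensive full-keyword verification, so the cost of the second pass depends on N rather than on the vocabulary size.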
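Beyond the algorithm itself, the thesis also evaluates several schemes for raising the recognition rate. The first applies a reduction weight to the probability that the filler (non-keyword) models assign to the speech features, so that the keyword segment boundaries become more accurate. The second uses a dynamic weight, so that each test utterance gets its own best reduction weight. In addition, because the test and training corpora were recorded in different environments (telephone and microphone, respectively), we process both the training and the test data with CMS plus cepstrum weighting and retrain the subsyllable models. Finally, the probabilities obtained before and after this processing (that is, without and with CMS and cepstrum weighting) are combined, and the best mixing ratio is found experimentally. The experiments show that using dynamic weighting together with probability mixing gives the best Top-1 recognition rate, 91.32%, whereas using only a single weight gives the worst result, a Top-1 rate of 83.67%.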
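
The channel-compensation and score-mixing steps can be illustrated with a short sketch. The abstract does not give the exact weighting function or mixing formula, so two assumptions are made here and should be read as such: the cepstrum weighting is taken to be the common raised-sine lifter w_k = 1 + (L/2) sin(pi k / L), and the two scores are mixed by linear interpolation of log-likelihoods with a ratio `alpha`. All function names are hypothetical.

```python
import math


def cepstral_mean_subtraction(frames):
    """Subtract the per-utterance mean from every cepstral coefficient.

    frames is a list of equal-length cepstral vectors, one per frame.
    """
    n_frames = len(frames)
    dim = len(frames[0])
    mean = [sum(frame[k] for frame in frames) / n_frames for k in range(dim)]
    return [[frame[k] - mean[k] for k in range(dim)] for frame in frames]


def raised_sine_lifter(dim):
    """Assumed weighting: w_k = 1 + (L/2) * sin(pi * k / L), k = 1..L."""
    return [1.0 + (dim / 2.0) * math.sin(math.pi * k / dim)
            for k in range(1, dim + 1)]


def apply_cepstrum_weighting(frames, weights):
    """Scale each cepstral coefficient by its lifter weight."""
    return [[c * w for c, w in zip(frame, weights)] for frame in frames]


def mix_scores(logp_plain, logp_compensated, alpha):
    """Assumed mixing rule: linear interpolation of two log-likelihoods.

    alpha = 1.0 uses only the uncompensated score, alpha = 0.0 only the
    score from the CMS + cepstrum-weighting models.
    """
    return alpha * logp_plain + (1.0 - alpha) * logp_compensated


# Toy two-frame, two-coefficient utterance.
utterance = [[1.0, 2.0], [3.0, 6.0]]
normalized = cepstral_mean_subtraction(utterance)
weights = raised_sine_lifter(2)
processed = apply_cepstrum_weighting(normalized, weights)
combined = mix_scores(-10.0, -20.0, alpha=0.25)
```

In an experiment like the one described, `alpha` would be swept over a grid on held-out data and the value giving the best Top-1 rate kept.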
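To make the keyword spotting system more complete, a keyword rejection capability is needed. In the experiments, the accuracy after adding keyword rejection is 81.51%.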
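
The thesis outline (Section 4.5) indicates that rejection uses an anti-model and a per-keyword threshold τk. A minimal sketch of one common way to combine them is a frame-normalized log-likelihood ratio test, shown below. The normalization by frame count, the decision rule, and the function names are assumptions for illustration, not details confirmed by the abstract.

```python
def accept_keyword(logp_keyword, logp_anti, num_frames, tau):
    """Accept the hypothesis if the normalized log-likelihood ratio
    between the keyword model and its anti-model reaches tau.

    Assumption: the ratio is divided by the segment length in frames.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    ratio = (logp_keyword - logp_anti) / num_frames
    return ratio >= tau


def error_rates(decisions, is_keyword):
    """Return (false-rejection rate, false-acceptance rate).

    decisions[i] is True if utterance i was accepted; is_keyword[i] is
    True if utterance i really contains the hypothesized keyword.
    """
    false_rejects = sum(1 for d, k in zip(decisions, is_keyword) if k and not d)
    false_accepts = sum(1 for d, k in zip(decisions, is_keyword) if d and not k)
    n_keyword = sum(1 for k in is_keyword if k)
    n_other = len(is_keyword) - n_keyword
    fr_rate = false_rejects / n_keyword if n_keyword else 0.0
    fa_rate = false_accepts / n_other if n_other else 0.0
    return fr_rate, fa_rate


# Hypothetical scores for two candidate segments of 10 frames each.
accepted = accept_keyword(-100.0, -130.0, num_frames=10, tau=2.0)
rejected = accept_keyword(-100.0, -110.0, num_frames=10, tau=2.0)
rates = error_rates([True, False, True, False], [True, True, False, False])
```

Raising `tau` trades false acceptances for false rejections, which is why a threshold of this kind is usually tuned per keyword on training data.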

Keywords

★ Cepstrum Weighting
★ CMS
★ Fast algorithm
★ Tree structure
★ Keyword spotting

Table of Contents

Chapter 1 Introduction
1.1 Motivation
1.2 Basic definition of keyword spotting
1.3 Concept of the fast algorithm
1.4 Review of prior techniques
1.5 Thesis outline
Chapter 2 Basic Techniques of Speech Recognition
2.1 Overview
2.2 Feature parameter extraction
2.3 Hidden Markov models
2.3.1 Description of hidden Markov models
Chapter 3 System Architecture
3.1 Overview
3.2 Model parameters
3.3 Training and recognition algorithms
3.3.1 Training algorithm
3.3.2 Recognition module and recognition algorithm
Chapter 4 Fast Algorithm
4.1 Overview
4.2 Fast algorithm
4.3 Reduction weights applied by the filler (non-keyword) module to feature probabilities
4.3.1 Static reduction weight
4.3.2 Dynamically adjusted reduction weight
4.4 Two feature-processing methods: Cepstrum Mean Subtraction and Cepstrum Weighting
4.4.1 Cepstrum Mean Subtraction and Cepstrum Weighting
4.4.2 Weighted combination of the probabilities from Cepstrum+Delta Cepstrum and Cepstrum+Delta Cepstrum+CMS+Cepstrum Weighting
4.5 Keyword rejection capability
4.5.1 Principle of keyword rejection
4.5.2 Training the anti-model
4.5.3 Training the threshold τk
4.5.4 Keyword rejection algorithm
4.5.5 Error rate calculation
Chapter 5 Experiments and Results
5.1 Overview
5.2 Experimental environment
5.3 Large-vocabulary keyword spotting experiments
Chapter 6 Conclusion
6.1 Conclusion
6.2 Future work
References

Advisor

Yau-Tarng Juang (莊堯棠)

Approval date

2001-6-6
