Mandarin Speech Recognition Combining Hidden Markov Models and Neural Networks


Author

林志榮 (Zhe-Run Lin)

Department

Department of Electrical Engineering

Thesis title

Mandarin Speech Recognition Combining Hidden Markov Models and Neural Networks


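This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.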

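Abstract (Chinese)

In the speaker-dependent system, the recognition rates of all three systems exceed 90 percent, and the HMM-NN-Net and NN-NN-Net state models reach 100 percent. After the convergence criterion is suitably adjusted, the recognition rate of the HMM-NN-Net state model exceeds that of the hidden Markov model by a small margin.
In the speaker-independent system, the HMM-NN-Net state model leads the other models with a recognition rate of 94.25%, which further demonstrates the feasibility of the new method. In addition, a comparison of the HMM-NN-Net and NN-NN-Net state models is used to give a thorough analysis of the neural network convergence problem.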

Abstract (English)

The hidden Markov model (HMM) has been widely used for speech recognition and has proved useful in handling the statistical and sequential aspects of the speech signal. However, its discriminative power is weak when it is trained with the maximum likelihood criterion. Neural networks (NN), on the other hand, have powerful classification capability but are not well suited to time-varying input patterns. This study presents a hybrid HMM-NN speech recognition system that combines the advantages of both models. Three neural net state models, HMM-NN-Net, HMM-HMM-Net, and NN-NN-Net, are developed for the proposed hybrid HMM-NN system. All experimental results are compared with those obtained from the HMM.
In the speaker-dependent experiment, the recognition rates of all three models are above 90 percent. Furthermore, except for the HMM-HMM-Net model, all error rates approach zero after the convergence criterion is adjusted.
In the speaker-independent case, the HMM-NN-Net model achieves a recognition rate of 94.25 percent, the best performance among the models compared. The NN-NN-Net model requires less training time than the HMM-NN-Net model, although its recognition capability does not match that of the HMM-NN-Net model.
The experimental results indicate that the hybrid HMM-NN recognition system based on the HMM-NN-Net model improves on the performance of the traditional HMM system. They also show that the convergence criterion of the neural net state models is related to recognition capability.

Keywords (Chinese)

★ Hidden Markov model
★ Neural network model
★ Speaker-dependent system
★ Speaker-independent system

Keywords (English)

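Table of contents

Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Literature Review
1.3 Research Objectives
1.4 Overview of the Method
1.4.1 Building Recognition Models from Training Samples
1.4.2 Recognizing Input Test Samples
1.5 Thesis Outline
Chapter 2 Theoretical Background
2.1 Feature Parameter Extraction
2.2 Hidden Markov Models
2.3 Neural Networks
2.3.1 Definition and Learning Principle of Back-Propagation Networks
2.3.2 Training Method for Back-Propagation Networks
Chapter 3 Speech Recognition System Combining Hidden Markov Models and Neural Network Models
3.1 Model Training Phase
3.1.1 Hidden Markov Model Frame Assignment System
3.1.2 Self-Supervised Neural Network Model Frame Assignment System
3.1.3 Complete Training Procedure
3.2 Model Recognition Phase
3.2.1 Hidden Markov Model Recognition Method
3.2.2 Neural Network State Model Recognition Method
Chapter 4 Experimental Results and Discussion
4.1 System Settings
4.2 Speaker-Dependent Recognition System
4.3 Speaker-Independent Recognition System
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References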


Advisor

莊堯棠 (Yau-Tarng Juang)

Approval date

2000-6-13
