VLSI Architecture Design for SVM-Based Speaker Verification

Name

Li-Xun Lian (連禮勳)

Department

Department of Computer Science and Information Engineering

Thesis Title

VLSI Architecture Design for SVM-Based Speaker Verification
(基於支持向量機之語者驗證超大型積體電路架構設計)

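Related Theses

★ Single and Multi-Label Environmental Sound Recognition with Gaussian Process	★ Embedded System Implementation of Beamforming and Audio Preprocessing
★ Application and Design of Speech Synthesis and Speaker Conversion	★ A Semantics-Based Public Opinion Analysis System
★ Design and Application of a High-Quality Dictation System	★ Recognition and Detection of Calcaneal Fractures in CT Images Using Deep Learning and Speeded-Up Robust Features
★ A Personalized Collaborative-Filtering Clothing Recommendation System Based on a Style Vector Space	★ Applying RetinaNet to Face Detection
★ Trend Prediction for Financial Products	★ A Study on Predicting Age and Aging Genes by Integrating Deep Learning Methods
★ A Study of End-to-End Speech Synthesis for Mandarin Chinese	★ Application and Improvement of ORB-SLAM2 on the ARM Architecture
★ Deep-Learning-Based Trend Prediction for Exchange-Traded Funds	★ Exploring the Correlation Between Financial News and Financial Trends
★ Emotional Speech Analysis Based on Convolutional Neural Networks	★ Predicting Alzheimer's Disease Progression and Stroke Surgery Survival Using Deep Learning Methods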

Files

Full text in the thesis system: never open to the public

Abstract (Chinese)

This thesis proposes a new VLSI architecture that implements the SVM and GMM-Supervector methods and can be applied to speaker verification or speaker identification.
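In the SVM-based method, the decision function is not only computationally heavy but also involves the complex operations inside the RBF kernel function. The decision function must also be evaluated against a large number of support vectors, so achieving real-time operation together with a good recognition rate is quite difficult in hardware. We therefore propose a new Gaussian kernel unit that contains several Gaussian-PEs operating in parallel. Each Gaussian-PE includes an exponential unit built on an improved CORDIC architecture. The improved exponential unit computes the exponential function quickly and occupies less area, which increases the number of PEs that can run in parallel. We have also completed a VLSI design for SVM-based speaker verification that supports a large number of support vectors while maintaining good computation speed.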
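To illustrate how a CORDIC-style unit can evaluate an exponential, here is a minimal software sketch of the textbook hyperbolic CORDIC in rotation mode. It is not the improved architecture proposed in the thesis, whose details are not given in this record. The function name `cordic_exp`, the iteration count, and the range reduction by multiples of ln 2 are assumptions made for illustration; a hardware unit would use fixed-point shifts and a small lookup table instead of floating-point calls.

```python
import math

def cordic_exp(z, iterations=16):
    """Approximate exp(z) with textbook hyperbolic CORDIC (rotation mode)."""
    # Range reduction (assumed here): z = q*ln2 + r with |r| <= ln2/2,
    # so exp(z) = 2**q * exp(r) and r lies inside the convergence range.
    q = round(z / math.log(2))
    r = z - q * math.log(2)

    # Shift sequence 1, 2, 3, ...; shifts 4 and 13 are repeated, which the
    # hyperbolic variant needs in order to converge.
    shifts = []
    i = 1
    while len(shifts) < iterations:
        shifts.append(i)
        if i in (4, 13) and len(shifts) < iterations:
            shifts.append(i)
        i += 1

    # Accumulated gain of the hyperbolic micro-rotations.
    gain = 1.0
    for s in shifts:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * s))

    # Start at (1/gain, 0) so that x -> cosh(r) and y -> sinh(r).
    x, y = 1.0 / gain, 0.0
    for s in shifts:
        d = 1.0 if r >= 0 else -1.0
        t = 2.0 ** (-s)               # a right shift by s bits in hardware
        x, y = x + d * y * t, y + d * x * t
        r -= d * math.atanh(t)        # a table lookup in hardware

    # exp(r) = cosh(r) + sinh(r); undo the range reduction with 2**q.
    return (x + y) * 2.0 ** q
```

For the Gaussian kernel the argument is never positive, so in this sketch the power-of-two factor is always a right shift.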
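In the GMM-Supervector method, MAP adaptation is already a complex operation for hardware, and a large number of Gaussians makes real-time operation considerably harder. We therefore propose a new GMM-Supervector hardware architecture. It includes adjustments to the Gaussian mixture model module (GMM Module) that speed up the evaluation of GMM values. We also propose a new MAP module containing several parallel MAP-PEs, each of which quickly computes the adapted Gaussian mean values and thereby raises the overall system speed. This architecture can be combined with our proposed SVM-based hardware architecture to form a new VLSI architecture for GMM-Supervector speaker verification that achieves a good recognition rate and good efficiency.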

Abstract (English)

This thesis presents a VLSI chip design for speaker verification based on the support vector machine (SVM) and the GMM-Supervector (Gaussian mixture model supervector).
In the SVM-based method, the proposed chip consists mainly of a speaker feature extraction (SFE) module, an SVM module, and a decision module. The SFE module performs autocorrelation analysis, linear predictive coefficient (LPC) extraction, and LPC-to-cepstrum conversion. The SVM module includes a Gaussian kernel unit and a scaling unit. The Gaussian kernel unit first evaluates the kernel value of a test vector and a support vector. Four Gaussian kernel processing elements (GK-PEs) are designed to process four support vectors simultaneously. Each GK-PE is pipelined and performs the 2-norm and exponential operations. An enhanced CORDIC architecture is presented to calculate the exponential value. The scaling unit performs the scaling multiplications and completes the remaining operations of the SVM decision value evaluation. Finally, the decision module accumulates the frame scores generated by all the test frames and compares the sum with a threshold to decide whether the test utterance was spoken by the claimed speaker. The chip design is characterized by its high speed and its ability to handle a large number of support vectors in the SVM.
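The data flow described above (kernel evaluation, scaling, and score accumulation) can be sketched in software as follows. This is an illustrative model under stated assumptions, not the chip's implementation: the function names, the parameter `gamma`, the use of the raw decision value as the frame score, and the `>=` comparison are all hypothetical choices, since the record does not specify them.

```python
import math

def rbf_kernel(x, s, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - s||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))   # squared 2-norm
    return math.exp(-gamma * sq_dist)

def svm_frame_score(x, support_vectors, coeffs, bias, gamma):
    """Decision value for one frame: sum_i coeff_i * K(x, s_i) + bias.

    coeffs[i] is assumed to hold alpha_i * y_i for support vector i.
    """
    total = 0.0
    for s, c in zip(support_vectors, coeffs):
        total += c * rbf_kernel(x, s, gamma)            # kernel, then scaling
    return total + bias

def verify(frames, support_vectors, coeffs, bias, gamma, threshold):
    """Accumulate the frame scores and compare the sum with a threshold."""
    score = sum(
        svm_frame_score(f, support_vectors, coeffs, bias, gamma)
        for f in frames
    )
    return score >= threshold, score
```

In this sketch the loop over support vectors is sequential; the four GK-PEs mentioned in the abstract would correspond to evaluating four iterations of that loop at once.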
In the GMM-Supervector method, the proposed chip consists mainly of a speaker feature extraction (SFE) module, a Gaussian mixture model (GMM) module, a MAP module, and an SVM module. The GMM module computes the GMM result quickly. We also propose a new MAP module containing a number of parallel MAP-PEs, each of which quickly calculates the adapted Gaussian mean values, thereby increasing the speed of the overall system.
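As a rough illustration of what a MAP stage computes, the sketch below applies the widely used relevance-factor MAP update to the means of a diagonal-covariance GMM and stacks the adapted means into a supervector. The exact formulation used in the thesis is not stated in this record, so the update rule, the function names, and the default `relevance=16.0` are assumptions.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
        for a, m, v in zip(x, mean, var)
    )

def map_adapt_supervector(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation; returns the stacked adapted means."""
    K, D = len(weights), len(means[0])
    n = [0.0] * K                             # soft frame counts per Gaussian
    first = [[0.0] * D for _ in range(K)]     # posterior-weighted frame sums

    for x in frames:
        logp = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(logp)                      # subtract max to avoid underflow
        p = [math.exp(lp - peak) for lp in logp]
        total = sum(p)
        for k in range(K):
            post = p[k] / total               # posterior of Gaussian k
            n[k] += post
            for d in range(D):
                first[k][d] += post * x[d]

    # Adapted mean = alpha*E[x] + (1 - alpha)*prior mean, alpha = n/(n + r),
    # written as (sum + r*mean)/(n + r) so that n = 0 needs no special case.
    supervector = []
    for k in range(K):
        for d in range(D):
            supervector.append(
                (first[k][d] + relevance * means[k][d]) / (n[k] + relevance)
            )
    return supervector
```

Each Gaussian's update is independent of the others, which is what makes this stage a natural fit for parallel processing elements.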

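Keywords (Chinese)

★ speaker verification
★ VLSI architecture design
★ support vector machine
★ Gaussian mixture model
★ supervector
★ coordinate rotation digital computer (CORDIC)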

Keywords (English)

★ speaker verification
★ VLSI
★ support vector machine (SVM)
★ Gaussian mixture model (GMM)
★ supervector
★ CORDIC

Table of Contents

Abstract (Chinese)
Abstract
List of Figures
List of Tables
Contents
Chapter 1 Introduction
1.1 Foreword
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Overview of Speaker Verification Systems
2.1 Introduction
2.2 Common Algorithms for Speaker Verification
2.3 Speech Feature Extraction
2.3.1 Linear Predictive Cepstral Coefficients (LPCC)
2.4 Support Vector Machine
2.4.1 Kernel Function
2.5 GMM-Supervector
2.5.1 Gaussian Mixture Model
2.5.2 Adaptation of Speaker Model
Chapter 3 Architecture Design for SVM-Based Speaker Verification
3.1 System Architecture
3.2 SVM Module
3.2.1 Gaussian Kernel Unit
3.2.2 Exponential Unit
3.2.3 Scaling Unit
3.3 Decision Unit
Chapter 4 Architecture Design for GMM-Supervector-Based Speaker Verification
4.1 System Architecture
4.2 GMM Module
4.3 MAP Module
4.4 Overall GMM-Supervector Architecture
Chapter 5 Experimental Results
5.1 VLSI Architecture for SVM-Based Speaker Verification
Chapter 6 Conclusions and Future Work
References

Advisor

Jia-Ching Wang (王家慶)

Approval Date

2013-8-26
