Ensemble and Multimodal Learning for Pathological Voice Classification


Name	Whenty Ariyanti (黎亞媞)	Department	Computer Science and Information Engineering
Thesis Title	Ensemble and Multimodal Learning for Pathological Voice Classification (論文題目集成和多模態學習用於病理性語音分類)
Files	Full text in the system: never open to the public
Abstract	Voice disorders are among the most common medical conditions in modern society, especially for people with occupational voice demands. In this thesis, we investigate a stacked ensemble learning method that classifies pathological voice disorders by combining acoustic signals and medical records. In the proposed ensemble learning framework, stacked support vector machines (SVMs) form a set of weak classifiers, and a deep neural network (DNN) serves as the meta-learner. Given the high model complexity of the DNN, acoustic features and medical records are combined to attain better classification performance. The proposed method outperforms single SVM and DNN classifiers by a notable margin.
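The stacking idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the data are synthetic, the feature counts (13 "acoustic" columns, 5 "medical record" columns), the RBF kernel, and the network size are all hypothetical choices, and scikit-learn's `StackingClassifier` is used as a stand-in for whatever stacking procedure the thesis actually uses. One SVM is trained per modality, and a small neural network combines their decision scores.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): binary labels, 13 "acoustic"
# columns and 5 "medical record" columns, each shifted by the class label.
rng = np.random.default_rng(0)
n_samples, n_acoustic, n_medical = 400, 13, 5
y = rng.integers(0, 2, size=n_samples)
acoustic = rng.normal(size=(n_samples, n_acoustic)) + 0.8 * y[:, None]
medical = rng.normal(size=(n_samples, n_medical)) + 0.8 * y[:, None]
X = np.hstack([acoustic, medical])

acoustic_cols = list(range(n_acoustic))
medical_cols = list(range(n_acoustic, n_acoustic + n_medical))


def modality_svm(columns):
    """Build an SVM that sees only one modality's columns, standardized."""
    select = ColumnTransformer([("select", StandardScaler(), columns)])
    return make_pipeline(select, SVC(kernel="rbf"))


# Base learners: one SVM per modality. Meta-learner: a small neural
# network trained on the cross-validated decision scores of the SVMs.
stack = StackingClassifier(
    estimators=[
        ("svm_acoustic", modality_svm(acoustic_cols)),
        ("svm_medical", modality_svm(medical_cols)),
    ],
    final_estimator=MLPClassifier(
        hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0
    ),
    cv=5,
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
stack.fit(X_train, y_train)
predictions = stack.predict(X_test)
accuracy = float((predictions == y_test).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Training the meta-learner on cross-validated base-learner outputs (`cv=5`) rather than on in-sample scores is a common way to keep the meta-learner from simply trusting overfit base models; whether the thesis follows the same protocol is not stated in this record.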
Keywords	★ Pathological Voice ★ Acoustic Signal ★ Ensemble Learning ★ Binary Classification
Table of Contents
ABSTRACT (CHINESE)
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
CHAPTER 1 INTRODUCTION
CHAPTER 2 REVIEW OF LITERATURE
2.1 Pathological Voice Disorders
2.1.1 Classification of Voice Disorders
2.2 Support Vector Machine Classifier (SVM)
2.2.1 Multiclass Support Vector Machine
2.3 Deep Neural Network (DNN)
2.4 Ensemble Learning
2.4.1 Ensemble Learning Process
CHAPTER 3 IDENTIFICATION STRATEGIES
3.1 Overview
3.1 Classifier Design
3.2 Pre-Processor
3.2.1 Transformation
3.2.2 Feature Extraction
3.2.3 Normalization
3.2.4 Splitting
3.3 Performance Measures
CHAPTER 4 ENSEMBLE AND MULTIMODAL FOR PATHOLOGICAL VOICE CLASSIFICATION
4.1 Overview of Dataset
4.1.1 Acoustic Signals
4.1.2 Medical Records
4.2 System Implementation
4.2.1 Pre-processing
4.2.2 Experiment Design
4.3 Experiment Results
4.3.1 Single Feature Results
4.3.2 Ensemble and Multimodal Learning Results
CHAPTER 5 CONCLUSIONS
5.1 Accomplishments
5.2 Limitations
5.3 Future Research Directions
BIBLIOGRAPHY
Advisors	Jia-Ching Wang (王家慶), Yu Tsao (曹昱)	Approval Date	2020-8-20

Master's and Doctoral Theses: Detailed Record for 107522619