Master's/Doctoral Thesis 100522603 Detailed Record




Author: Ernestasia Siahaan (西雅恩)   Department: Computer Science and Information Engineering
Thesis title: Single and Multi-Label Environmental Sound Recognition with Gaussian Process
(基於高斯程序之單一及多重標籤環境聲音辨識)
Files: full text is not available for online browsing (permanently closed).
Abstract (English): Sound recognition applications play an important role in many aspects of human life, and research effort has gone into recognition systems for different kinds of sounds, such as speech, music, and environmental sounds. This thesis addresses environmental sound recognition, a particularly interesting part of sound recognition research because of the range of potential applications that benefit from it. We address the two parts of a recognition system that matter most for recognition accuracy: feature extraction and classification.
We propose features extracted from the wavelet domain of a signal, which is considered to provide better analysis of environmental audio signals. We compute the wavelet packet decomposition of an audio signal and derive the signal's spectral centroid, sparsity, flatness, and spread from the wavelet nodes, together with a set of wavelet-based cepstral coefficients. In addition, we propose a set of histogram features calculated from the wavelet-based features. We compare the performance of the different feature sets in our experiments.
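As a rough illustration of the node-based features named above, the following sketch computes a full wavelet packet decomposition and derives centroid, spread, flatness, and sparsity from the leaf-node energies. Everything specific here is an assumption for illustration only: the abstract does not name the wavelet (a Haar filter is used because it is the simplest), the decomposition depth, or the exact feature formulas (common textbook definitions are used, with the Hoyer measure standing in for sparsity). The function names are hypothetical.

```python
import math

def haar_split(x):
    """One Haar analysis step: return the low-pass and high-pass halves of x."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high

def wavelet_packet_leaves(signal, levels):
    """Split every node at every level; return the 2**levels leaf nodes.

    Leaves are in natural (filter-bank) order, not sorted by frequency.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes

def node_features(nodes, eps=1e-12):
    """Assumed feature definitions, computed over leaf-node energies."""
    energy = [sum(c * c for c in node) for node in nodes]
    n = len(energy)
    total = sum(energy) + eps
    p = [e / total for e in energy]  # normalised energy distribution
    centroid = sum(k * pk for k, pk in enumerate(p))
    spread = math.sqrt(sum((k - centroid) ** 2 * pk for k, pk in enumerate(p)))
    # Flatness: geometric mean over arithmetic mean of node energies.
    geo_mean = math.exp(sum(math.log(e + eps) for e in energy) / n)
    flatness = geo_mean / (total / n)
    # Sparsity: Hoyer measure (1 = all energy in one node, 0 = uniform).
    l1 = sum(energy)
    l2 = math.sqrt(sum(e * e for e in energy))
    sparsity = (math.sqrt(n) - l1 / (l2 + eps)) / (math.sqrt(n) - 1.0)
    return {"energy": energy, "centroid": centroid, "spread": spread,
            "flatness": flatness, "sparsity": sparsity}

# Usage: a constant frame puts all energy in the first node,
# while a single impulse spreads energy evenly over all nodes.
constant = node_features(wavelet_packet_leaves([1.0] * 8, 3))
impulse = node_features(wavelet_packet_leaves([1.0] + [0.0] * 7, 3))
```

In a real system these per-frame values would be stacked into a feature vector; a library such as PyWavelets would normally replace the hand-written Haar step.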
For classification, we propose a Gaussian process based classifier. We propose a multiple kernel approach that combines the linear kernel and the probability product kernel, so that the learning algorithm sees two different notions of similarity in the data. We present the probability product kernel between two kernel density estimates, and then combine it with the linear kernel using either a weighted linear combination or a multiplication.
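The kernel combination described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: each sound is assumed to be a list of frame-level feature vectors, the kernel density estimates are assumed to use isotropic Gaussian kernels with a shared bandwidth `h`, the probability product kernel is taken with exponent 1 (the expected likelihood kernel of Jebara et al., which has a closed form for Gaussian KDEs), and the linear kernel is assumed to act on the mean frame vector. All function names and the weight `w` are hypothetical.

```python
import math

def linear_kernel(u, v):
    """Plain dot product between two fixed-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mean_vector(frames):
    """Average the frame vectors of one sound (an assumed summary)."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def ppk_kde(X, Y, h=1.0):
    """Probability product kernel (rho = 1) between two Gaussian KDEs.

    For p = mean_i N(x_i, h^2 I) and q = mean_j N(y_j, h^2 I),
    the integral of p*q is mean_ij N(x_i - y_j; 0, 2 h^2 I).
    """
    d = len(X[0])
    norm = (4.0 * math.pi * h * h) ** (-d / 2.0)
    acc = 0.0
    for x in X:
        for y in Y:
            sq = sum((a - b) ** 2 for a, b in zip(x, y))
            acc += math.exp(-sq / (4.0 * h * h))
    return norm * acc / (len(X) * len(Y))

def combined_sum(X, Y, w=0.5, h=1.0):
    """Weighted linear combination of the two kernels, with 0 <= w <= 1."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    k_lin = linear_kernel(mean_vector(X), mean_vector(Y))
    return w * k_lin + (1.0 - w) * ppk_kde(X, Y, h)

def combined_product(X, Y, h=1.0):
    """Product of the two kernels."""
    return linear_kernel(mean_vector(X), mean_vector(Y)) * ppk_kde(X, Y, h)

# Usage: two toy "sounds", each a list of 2-D frame vectors.
A = [[0.0, 1.0], [2.0, 1.0]]
B = [[1.0, 0.0]]
k_sum = combined_sum(A, B, w=0.3)
k_prod = combined_product(A, B)
```

Both combinations stay valid kernels: a non-negative weighted sum and a product of positive semi-definite kernels are themselves positive semi-definite, so either result can serve as the covariance function of a Gaussian process classifier.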
Two kinds of recognition problems are studied in this thesis: single-label and multi-label. Our experiments show that the proposed features and classification approach yield satisfactory recognition results in both single-label and multi-label classification. Moreover, using multiple features with multiple kernels in a Gaussian process further improves system performance.
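Abstract (Chinese, translated): Applications of sound recognition play an important role in many aspects of human life, and current research on sound recognition focuses on recognition systems for different kinds of sounds, for example speech, music, and environmental sounds. This thesis discusses the problem of environmental sound recognition; because this research has a wide range of potential applications, it is a particularly interesting part of the sound recognition field. We address the two parts of a recognition problem that play an important role in raising the recognition rate: feature extraction and classification.
We use features extracted from the wavelet domain of the signal, because these features provide better analysis of environmental sound signals. We compute the wavelet packet decomposition of the audio signal and a set of wavelet-based cepstral coefficients, and use the wavelet nodes to derive the signal's spectral centroid, sparsity, flatness, and spread. In addition, we use a set of histogram features computed from the wavelet-based features. We compare the performance of the different feature sets in our experiments.
For the classification part of the recognition system, we propose a Gaussian process based classifier. We propose a multiple kernel method that combines the linear kernel and the probability product kernel to represent two notions of similarity in our data within the learning algorithm. We describe the probability product kernel between two kernel density estimates, and combine it with the linear kernel using a weighted linear combination and a multiplication.
This thesis addresses two kinds of recognition problems: single-label and multi-label. Through experiments, we show that the proposed features and classification method give satisfactory recognition results for both single-label and multi-label classification. In addition, using multiple features with multiple kernels in a Gaussian process further improves the performance of the recognition system.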
Keywords (Chinese) ★ 高斯程序
★ 環境聲音辨識
Keywords (English) ★ Gaussian Process
★ Environmental Sound Recognition
Table of contents
Chinese Abstract (摘要)
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
I. INTRODUCTION
II. RELATED WORK
2-1. Environmental Sound Recognition
2-1-1. Feature Selection
2-1-2. Challenges and Applications
2-2. Multi-Label Classification Problem
2-3. Gaussian Process for Classification
2-4. Kernel Methods
2-4-1. Multiple Kernel
III. METHODOLOGY
3-1. System Overview
3-2. Feature Extraction
3-3. Multiple Kernel for Gaussian Process Classification
3-3-1. Multiple Kernel for Multi Features
3-4. Multi-Label Sound Recognition
3-4-1. Sound Recognition from Continuous Audio Stream
3-4-2. Mixed Sound Recognition
3-5. Evaluation
IV. EXPERIMENT AND RESULTS
4-1. Singular Sound Event Classification
4-1-1. Comparison of Feature Extraction and Classification Approaches
4-1-2. Test of Robustness
4-2. Multi-Label Environmental Sound Recognition
4-2-1. Continuous Audio Stream Test Case
4-2-2. Mixed Sound Test Case
V. CONCLUSION
BIBLIOGRAPHY
Advisor: Jia-Ching Wang (王家慶)   Approval date: 2013-8-14

For questions about this thesis, contact the National Central University Library, Outreach Services Section, TEL: (03)422-7151 ext. 57407, or by e-mail.