Single and Multi-Label Environmental Sound Recognition with Gaussian Process; 基於高斯程序之單一及多重標籤環境聲音辨識


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/61566

Title:	Single and Multi-Label Environmental Sound Recognition with Gaussian Process; 基於高斯程序之單一及多重標籤環境聲音辨識
Author:	西雅恩; Siahaan, Ernestasia
Contributor:	Department of Computer Science and Information Engineering (資訊工程學系)
Keywords:	高斯程序; 環境聲音辨識; Gaussian Process; Environmental Sound Recognition
Date:	2013-08-14
Upload time:	2013-10-08 15:22:27 (UTC+8)
Publisher:	National Central University (國立中央大學)
Abstract:	Sound recognition applications play an important role in many aspects of human life, and research has targeted recognition systems for different kinds of sound, namely speech, music, and environmental sounds. This thesis addresses environmental sound recognition, an area of particular interest because of the range of potential applications that benefit from it. We address the two parts of a recognition system that matter most for recognition accuracy: feature extraction and classification. For feature extraction, we propose features extracted from the wavelet domain of a signal, which is considered to provide a better analysis of environmental audio signals. We compute the wavelet packet decomposition of an audio signal and derive the signal's spectral centroid, sparsity, flatness, and spread from the wavelet nodes, together with a set of wavelet-based cepstral coefficients. In addition, we propose a set of histogram features calculated from the wavelet-based features, and we compare the performance of the different feature sets in our experiments. For classification, we propose a Gaussian Process based classifier with a multiple-kernel approach, in which we combine the linear kernel and the probability product kernel to represent two different notions of similarity in the data within the learning algorithm. We describe the probability product kernel between two kernel density estimates, and then combine it with the linear kernel using both a weighted linear combination and a multiplication approach. Two kinds of recognition problems are studied in this thesis: single-label and multi-label problems. Our experiments show that the proposed features and classification approach yield satisfactory recognition results in both single-label and multi-label classification. Moreover, using multiple features with multiple kernels in a Gaussian Process further improves system performance.
聲音辨識的應用在人類生活中許多方面扮演了重要的角色，而現在對於聲音辨識的研究主要在不同種類聲音的辨識系統上，例如:語音、音樂、環境聲音。本篇論文討論環境聲音辨識的問題，因為環境聲音辨識的研究有廣泛的潛在性應用，因此它在聲音辨識的領域中是個十分令人感興趣的部分。我們要解決兩個在辨識問題中扮演提高辨識率的重要角色的部分，分別是特徵值選取與分類方法。我們使用從訊號的小波域中選取的特徵值，因為這些特徵值提供了更好的環境聲音訊號的分析。我們取出聲音訊號的小波包分解以及一組基於小波轉換的倒頻譜係數，並且用小波節點推導出訊號的頻譜中心、稀疏性、平整度及分散度。此外，我們使用從基於小波的特徵值計算出來的一組直方圖特徵值。我們在實驗中比較不同組特徵值的效果。在辨識系統的分類方法部分，我們提出基於高斯程序的分類器。我們提出一個多重核心的方法，此方法是結合線性核心和機率乘積核心來表示我們在學習演算法中資料的兩種相似性概念。我們描述了在兩種核心密度估計中的機率乘積核心，並且用加權線性組合與乘法方法將機率乘積核心與線性核心結合。本篇論文敘述兩種辨識問題-單數標籤與多重標籤問題。經由實驗，我們證明了我們提出的特徵值以及分類方法滿足單數標籤與多重標籤分類問題的辨識結果。此外，在高斯程序中，多重特徵值在多重核心中的使用進一步提升了辨識系統的效能。
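As an illustration of the kernel combination the abstract describes, here is a minimal NumPy sketch. It is not the thesis's implementation, and everything specific in it is an assumption made for illustration: each clip is a matrix of frame-level feature vectors, the density of a clip is an isotropic Gaussian kernel density estimate with a shared bandwidth `h`, the probability product kernel uses exponent 1 (the expected likelihood kernel, which has a closed form for Gaussian mixtures), and the linear kernel is taken between per-clip mean vectors. The names `ppk_kde`, `combined_grams`, `w`, and `h` are hypothetical.

```python
import numpy as np


def ppk_kde(A, B, h=1.0):
    """Probability product kernel (exponent 1) between two Gaussian KDEs.

    A and B are (frames x dims) arrays; each row is the centre of an
    isotropic Gaussian with variance h**2. The integral of the product
    of two such Gaussians is a Gaussian in the centre difference with
    variance 2 * h**2, so the kernel is an average over all row pairs.
    """
    d = A.shape[1]
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    norm = (4.0 * np.pi * h * h) ** (-d / 2.0)
    return norm * np.exp(-sq_dist / (4.0 * h * h)).mean()


def combined_grams(clips, w=0.5, h=1.0):
    """Return the weighted-sum and elementwise-product Gram matrices."""
    n = len(clips)
    # Assumption: the linear kernel acts on per-clip mean feature vectors.
    means = np.stack([c.mean(axis=0) for c in clips])
    K_lin = means @ means.T
    K_ppk = np.array([[ppk_kde(clips[i], clips[j], h) for j in range(n)]
                      for i in range(n)])
    K_sum = w * K_lin + (1.0 - w) * K_ppk   # weighted linear combination
    K_prod = K_lin * K_ppk                  # multiplication (Schur product)
    return K_sum, K_prod


# Synthetic stand-in data: 4 clips, 20 frames each, 3 features per frame.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(20, 3)) for _ in range(4)]
K_sum, K_prod = combined_grams(clips)
```

Both combinations keep the Gram matrix positive semidefinite (a non-negative weighted sum and an elementwise product of valid kernels are themselves valid kernels), so either matrix could be passed to a Gaussian Process classifier that accepts a precomputed covariance.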
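Appears in collections:	[Graduate Institute of Computer Science and Information Engineering (資訊工程研究所)] Theses and Dissertations (博碩士論文)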

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	812	View/Open

All items in NCUIR are protected by the original copyright.
