NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/48514


    Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/48514


    Title: Non-uniform Scale-Frequency Map for Environmental Sound Recognition (基於非均勻尺度-頻率圖之環境聲音辨識)
    Author: Min-Kang Tsai (蔡旻剛)
    Contributor: Graduate Institute of Computer Science and Information Engineering
    Keywords: matching pursuit; non-uniform scale-frequency map; environmental sound recognition; environmental sound classification; Gabor function; feature extraction
    Date: 2011-08-23
    Upload time: 2012-01-05 14:56:51 (UTC+8)
    Abstract: This study presents a novel feature extraction technique for environmental sound recognition, called the non-uniform scale-frequency map (SFM). For each audio frame, the matching pursuit algorithm selects the most important atoms from a Gabor dictionary. Phase and position information is discarded, and the scale and frequency of the selected atoms are used to construct a scale-frequency map. Principal component analysis (PCA) and linear discriminant analysis (LDA) are then applied to the scale-frequency map, yielding a 16-dimensional feature vector. In the recognition phase, a segment-level multiclass support vector machine (SVM) is used. Experiments on a 17-class sound database show that the proposed approach achieves an accuracy of 86.47%, and a comparison with other time-frequency features demonstrates the clear superiority of the proposed feature.

    In addition, the study presents a novel feature extraction technique for speech emotion recognition, called the SFM descriptor. For each frame, the matching pursuit algorithm again selects atoms, from which a scale-frequency map is constructed. Descriptor features are then extracted from each scale-frequency map, and the proposed SFM descriptor is combined with the non-uniform SFM feature and MFCC and fed into a multiclass SVM. In the recognition phase, a file-level (utterance-level) multiclass SVM is used. Experiments on a 7-class emotional speech database yield a recognition rate of 73.96%.
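
The frame-level step described in the abstract (matching pursuit over a Gabor dictionary, then a map indexed only by scale and frequency) can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the function names, the scale and frequency grids, the position hop, the fixed zero phase, and the choice to accumulate absolute coefficients in each map cell are all assumptions made for the example. The abstract does not specify any of these details.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gabor_atom(n, scale, freq, pos):
    """Unit-norm real Gabor atom: Gaussian window times a cosine.

    Phase is fixed at 0 here as a simplification (assumption).
    """
    atom = [
        math.exp(-math.pi * ((t - pos) / scale) ** 2)
        * math.cos(2.0 * math.pi * freq * (t - pos))
        for t in range(n)
    ]
    norm = math.sqrt(dot(atom, atom))
    return [a / norm for a in atom]


def build_dictionary(n, scales, freqs, hop):
    """Return a list of (scale_index, freq_index, atom) over a position grid."""
    dictionary = []
    for si, scale in enumerate(scales):
        for fi, freq in enumerate(freqs):
            for pos in range(0, n, hop):
                dictionary.append((si, fi, gabor_atom(n, scale, freq, pos)))
    return dictionary


def matching_pursuit(frame, dictionary, n_atoms):
    """Greedy matching pursuit: repeatedly pick the atom most correlated
    with the residual and subtract its projection."""
    residual = list(frame)
    picked = []
    for _ in range(n_atoms):
        si, fi, atom = max(dictionary, key=lambda e: abs(dot(residual, e[2])))
        coef = dot(residual, atom)
        residual = [r - coef * a for r, a in zip(residual, atom)]
        picked.append((si, fi, coef))
    return picked, residual


def scale_frequency_map(picked, n_scales, n_freqs):
    """Accumulate |coefficient| per (scale, frequency) cell.

    Position is dropped, as in the abstract; weighting cells by
    |coefficient| is an assumption of this sketch.
    """
    sfm = [[0.0] * n_freqs for _ in range(n_scales)]
    for si, fi, coef in picked:
        sfm[si][fi] += abs(coef)
    return sfm


if __name__ == "__main__":
    scales = [4.0, 8.0, 16.0]       # hypothetical scale grid (samples)
    freqs = [0.05, 0.1, 0.2, 0.4]   # hypothetical non-uniform grid (cycles/sample)
    dictionary = build_dictionary(64, scales, freqs, hop=8)
    frame = [math.sin(2.0 * math.pi * 0.1 * t) for t in range(64)]
    picked, _ = matching_pursuit(frame, dictionary, n_atoms=5)
    for row in scale_frequency_map(picked, len(scales), len(freqs)):
        print(["%.2f" % v for v in row])
```

In a full pipeline along the lines of the abstract, each flattened map would then pass through PCA and LDA before reaching the SVM. Those stages are omitted here because the abstract gives no parameters for them beyond the 16-dimensional output.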
    Appears in collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     714


    All items in NCUIR are protected by the original copyright.
