中大機構典藏-NCU Institutional Repository-提供博碩士論文、考古題、期刊論文、研究計畫等下載:Item 987654321/74567
English  |  正體中文  |  简体中文  |  Items with full text/Total items : 78937/78937 (100%)
Visitors : 39802979      Online Users : 601
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.
Scope Tips:
  • please add "double quotation mark" for query phrases to get precise results
  • please goto advance search for comprehansive author search
  • Adv. Search
    HomeLoginUploadHelpAboutAdminister Goto mobile version


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/74567


    Title: 非對稱摺積神經網路之聲音場景分類;Asymmetric Kernel Convolutional Neural Network for Acoustic Scenes Classification
    Authors: 伍聿旂;Wu, Yu-Chi
    Contributors: 通訊工程學系
    Keywords: 計算聽覺場景分析;聲音場景辨分類;深度學習;摺積神經網路;Computational Auditory Scene Analysis;Acoustic scenes classification;Deep learning;Convolutional neural network
    Date: 2017-07-26
    Issue Date: 2017-10-27 14:02:13 (UTC+8)
    Publisher: 國立中央大學
    Abstract: 隨著人類追求便利性,我們使用電腦使其學習並了解人類所熟知的事物,我們希望通過分析聲音使電腦認識自己的環境,自2013年首次舉辦IEEE Audio and Acoustic Signal Processing (AASP) 聲音場景與事件辨識(Detection and Classification of Acoustic Scenes and Events, DCASE) 競賽,掀起了聲音場景分類 (Acoustic scene classification, ASC)的風波,邁向統一ASC的資料庫與評估方法的第一步,更於2016年舉辦第二屆 DCASE2016競賽。
    本論文利用深度學習中的摺積神經網路 (Convolutional Neural Net-work, CNN) 作為ASC的方法。由於CNN之輸入資料為頻譜,而頻譜包含時域資訊與頻域資訊,因此我們假設時域資訊與頻域資訊的資料變化量不一,因此使用長形的摺積核 (kernel) ,也就是本論文提出之非對稱摺積核 (Asymmetric Kernel) (相對於以往的方形的對稱摺積核),並在訓練期間做資料正規化 (Normalization)加速訓練。我們發現即使現在多以寬又深的網路作為趨勢,發展更佳的資料分類方法,但其實本論文所提出的架構,兩層不用預訓練 (Pre-train)的CNN即可達到相較DCASE2016排名第五名更佳的效果。
    ;Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge have held in three times. The first DCASE Challenge was held in 2013. Then, DCASE2016 Challenge was the 2nd times of DCASE Challenge. The result why IEEE Audio and Acoustic Signal Processing (AASP) held the 2nd challenge after 3 years is to reset a brand new dataset and united the rule of ASC.
    In this work, we use the dataset of ASC from DCASE2016 to propose an Asymmetric Kernel Convolutional Neural Network (AKCNN), whose kernel shape is very different from the traditionally squared kernel. The width and height of the kernel are asymmetric which means that the shape of the kernel is a rectangular kernel. Also, the proposed uses weight normalization (WN) to accelerate the training time because it can early converge the training loss and testing accuracy during training. The best of all, WN can help increase the accuracy of ASC. The result shows that AKCNN achieves accuracy 86.7%. If we rank the score in DCASE2016 ASC Challenge, it would show that we have a better score than the 5th place.
    Appears in Collections:[Graduate Institute of Communication Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File Description SizeFormat
    index.html0KbHTML393View/Open


    All items in NCUIR are protected by copyright, with all rights reserved.

    社群 sharing

    ::: Copyright National Central University. | 國立中央大學圖書館版權所有 | 收藏本站 | 設為首頁 | 最佳瀏覽畫面: 1024*768 | 建站日期:8-24-2009 :::
    DSpace Software Copyright © 2002-2004  MIT &  Hewlett-Packard  /   Enhanced by   NTU Library IR team Copyright ©   - 隱私權政策聲明