dc.description.abstract | With the development of smart cities and autonomous driving technology, the information contained in environmental sounds is becoming increasingly important in daily life. When correctly and effectively converted, this information allows us to further analyze and understand the environment we live in.
In recent years, with the advancement of GPUs and the advent of the big-data era, deep learning has continued to bring major breakthroughs in various fields, especially computer vision and natural language processing, greatly enriching people's lives.
In the field of sound source separation, an important concept is Computational Auditory Scene Analysis (CASA). One of its central goals is to place a robot in an acoustic scene, such as a street intersection, an airport lobby, or even a shopping center, where it can fully understand the acoustic environment it is located in, know the position of each sound source, and identify which sound sources are present. Everyday life is full of diverse audio-receiving devices, and through the concept of the Internet of Things it has become more convenient to use mobile devices as a source of data collection.
This thesis proposes to classify the acoustic scenes of the public TAU Urban Acoustic Scenes 2020 Mobile dataset from the DCASE Challenge 2020 using deep neural networks. The challenge, endorsed by the IEEE AASP, is the largest competition in this field and was held for the sixth time; it is co-organized by CMU, INRIA of France, and Tampere University of Finland, and co-sponsored by Google and Audio Analytic (an audio processing company in Cambridge, UK). Log-mel spectrograms serve as the main acoustic features, and the neural network is based on the DenseNet architecture. The system classifies the 10 acoustic scene types in the dataset and achieves 65.84% accuracy, which is higher than the baseline system. | en_US |
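To make the described pipeline concrete, the following is a minimal Python sketch of the approach the abstract outlines: a log-mel spectrogram is extracted from an audio clip and fed to a DenseNet classifier with 10 output classes. This is not the thesis implementation; the parameter values (sample rate, n_mels, hop length), the DenseNet variant (torchvision's densenet121), and the example file name are all illustrative assumptions.

import librosa
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import densenet121

def log_mel_spectrogram(path, sr=44100, n_fft=2048, hop_length=1024, n_mels=128):
    # Load a clip and compute its log-scaled mel spectrogram.
    y, _ = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, n_frames)

# torchvision's DenseNet expects 3-channel images; adapt the first
# convolution to single-channel spectrogram input and the classifier
# head to the 10 acoustic scene classes.
model = densenet121(num_classes=10)
model.features.conv0 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                 padding=3, bias=False)
model.eval()

# Classify one clip (the file name is a hypothetical example).
feat = log_mel_spectrogram("airport-barcelona-0-0-a.wav")
x = torch.from_numpy(feat).float().unsqueeze(0).unsqueeze(0)  # (1, 1, mels, frames)
with torch.no_grad():
    logits = model(x)               # shape: (1, 10)
pred = logits.argmax(dim=1).item()  # predicted scene class index

DenseNet's global adaptive pooling makes this sketch work for spectrograms of any duration, which is one reason the architecture is a convenient basis for clip-level scene classification.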