dc.description.abstract | With the development of smart cities and autonomous driving technology, the information contained in environmental sounds is becoming increasingly important in daily life. When correctly and effectively converted, this information allows us to further analyze and understand the environment we live in.
In recent years, with the advancement of GPUs and the advent of the big-data era, deep learning has continued to bring major breakthroughs in various fields, especially computer vision and natural language processing, greatly enriching people's lives.
In the field of sound source separation, an important concept is Computational Auditory Scene Analysis (CASA). One of its central goals is to place a robot in an acoustic scene, such as a street intersection, an airport lobby, or even a shopping center, where it can fully understand the acoustic environment it is located in, know the position of each sound source, and identify which sound sources are present. Everyday life is full of diverse audio-receiving devices, and through the concept of the Internet of Things it has become more convenient to use mobile devices as a source of data collection.
This thesis proposes to classify the acoustic scenes of the public TAU Urban Acoustic Scenes 2020 Mobile dataset from the DCASE Challenge 2020 using deep neural networks. The challenge, endorsed by the IEEE AASP, is the largest competition in this field and was held for the sixth time; it is co-organized by CMU, INRIA of France, and Tampere University of Finland, and co-sponsored by Google and Audio Analytic (an audio processing company in Cambridge, UK). Log-mel spectrograms serve as the main acoustic features, and the neural network is based on the DenseNet architecture. The system classifies the 10 acoustic scene types in the dataset and achieves 65.84% accuracy, which is higher than the baseline system. | en_US |
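To make the described pipeline concrete, the following is a minimal Python sketch of the approach the abstract outlines: a log-mel spectrogram is extracted from an audio clip and fed to a DenseNet classifier with 10 output classes. This is not the thesis implementation; the parameter values (sample rate, n_mels, hop length), the DenseNet variant (torchvision's densenet121), and the example file name are all illustrative assumptions.

import librosa
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import densenet121

def log_mel_spectrogram(path, sr=44100, n_fft=2048, hop_length=1024, n_mels=128):
    # Load a clip and compute its log-scaled mel spectrogram.
    y, _ = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, n_frames)

# torchvision's DenseNet expects 3-channel images; adapt the first
# convolution to single-channel spectrogram input and the classifier
# head to the 10 acoustic scene classes.
model = densenet121(num_classes=10)
model.features.conv0 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                 padding=3, bias=False)
model.eval()

# Classify one clip (the file name is a hypothetical example).
feat = log_mel_spectrogram("airport-barcelona-0-0-a.wav")
x = torch.from_numpy(feat).float().unsqueeze(0).unsqueeze(0)  # (1, 1, mels, frames)
with torch.no_grad():
    logits = model(x)               # shape: (1, 10)
pred = logits.argmax(dim=1).item()  # predicted scene class index

DenseNet's global adaptive pooling makes this sketch work for spectrograms of any duration, which is one reason the architecture is a convenient basis for clip-level scene classification.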