端到端輕量化音樂源分離深度學習模型;Lightweight End-to-End Deep Learning Model for Music Source Separation

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81225

Title:	端到端輕量化音樂源分離深度學習模型;Lightweight End-to-End Deep Learning Model for Music Source Separation
Authors:	王耀霆;Wang, Yao-Ting
Contributors:	Department of Computer Science and Information Engineering
Keywords:	Deep Learning;Speech Enhancement;Audio Source Separation
Date:	2019-07-31
Issue Date:	2019-09-03 15:39:52 (UTC+8)
Publisher:	National Central University
Abstract:	Deep neural networks (DNNs) have made rapid progress in the field of audio processing. In the past, most approaches operated on spectral information obtained via the short-time Fourier transform (STFT), but many of them processed only the real part. In recent years, to avoid the information loss caused by ignoring the complex-valued information, end-to-end deep learning models that perform audio source separation directly in the time domain have been proposed. However, these models are large and have many parameters, which makes them difficult to use on devices with limited computing resources. In addition, they generally need a long input to achieve good separation, which implies high latency and makes them less useful for applications that require low latency. Building on previous research, this thesis proposes a lightweight end-to-end deep learning model for music source separation that reduces the number of parameters and accelerates computation, and it introduces a novel decoder that further improves separation quality when the input context length is limited. Experimental results show that the proposed method achieves better separation results than previous methods while using 10% or fewer of their parameters.
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

Files in This Item:

File	Description	Size	Format
index.html		0Kb	HTML	116
