NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/81225


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81225


    Title: 端到端輕量化音樂源分離深度學習模型;Lightweight End-to-End Deep Learning Model for Music Source Separation
    Authors: 王耀霆;Wang, Yao-Ting
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Deep Learning; Speech Enhancement; Audio Source Separation
    Date: 2019-07-31
    Issue Date: 2019-09-03 15:39:52 (UTC+8)
    Publisher: National Central University
    Abstract: Deep neural networks (DNNs) have made rapid progress in audio processing. Earlier approaches mostly operated on spectral information obtained through the short-time Fourier transform (STFT), but many of them processed only the real part. To avoid the information loss caused by ignoring the complex-valued information, end-to-end deep learning models that separate audio sources directly in the time domain have been proposed in recent years. These models, however, are large and have many parameters, which makes them difficult to use on devices with limited computing power. They also generally need a long input to separate sources well, which means high latency and makes them less useful for applications that require low latency.
    Building on previous research, this thesis proposes a lightweight end-to-end deep learning model for music source separation that reduces the number of parameters and speeds up computation. It also proposes a novel decoder that further improves separation quality when the input context length is limited. Experimental results show that the proposed method achieves better separation than previous results while using 10% or fewer of the parameters of earlier models.
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File         Description   Size   Format
    index.html   -             0Kb    HTML


    All items in NCUIR are protected by copyright, with all rights reserved.
