

    Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/74713


    Title: 全複數深度遞迴類神經網路應用於歌曲人聲分離; Complex-Valued Deep Recurrent Neural Network for Singing Voice Separation
    Author: 俞果; Yu, Kuo
    Contributor: Department of Computer Science and Information Engineering
    Keywords: Deep Neural Network; Singing Voice Separation; Phase Information
    Date: 2017-08-14
    Upload time: 2017-10-27 14:37:03 (UTC+8)
    Publisher: National Central University
    Abstract: Deep neural networks (DNNs) have performed impressively in multimedia signal processing. Most DNN-based approaches were developed to handle real-valued data; very few have been designed for complex-valued data, even though such data are essential for processing many types of multimedia signals. Accordingly, this work presents a complex-valued deep recurrent neural network (C-DRNN) for singing voice separation. The C-DRNN operates directly on the complex-valued output of the short-time Fourier transform (STFT), and its weights and activation functions are themselves complex-valued. The goal is to reconstruct the singing voice and the background music from a mixed signal. For error back-propagation, CR-calculus is used to compute the complex-valued gradients of the objective function. To reinforce model regularity, two constraints are incorporated into the C-DRNN. The first is an additional complex ratio masking layer at the output that ensures the sum of the separated sources equals the input mixture. The second is a discriminative term in the cost function that preserves the mutual difference between the two separated sources. Finally, the proposed method is evaluated on a singing voice separation task using the MIR-1K dataset. Experimental results demonstrate that the proposed method outperforms state-of-the-art DNN-based methods.
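    The two constraints named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the record gives no formulas, so the magnitude-based soft mask and the form of the discriminative cost below are assumptions borrowed from common practice in DNN-based source separation, and the thesis's own mask may well be complex-valued. The function names (`ratio_mask`, `discriminative_loss`) and the parameters `eps` and `gamma` are hypothetical. Each list holds complex STFT bins for one frame.

```python
# Illustrative sketch only; names, mask form, and cost form are assumptions.

def ratio_mask(est_voice, est_music, mixture, eps=1e-12):
    """Split each complex mixture bin between two sources with a soft mask.

    The mask weight is the voice estimate's share of the total estimated
    magnitude, so the two outputs add back up to the mixture.
    """
    voice, music = [], []
    for v, m, x in zip(est_voice, est_music, mixture):
        weight = abs(v) / (abs(v) + abs(m) + eps)  # real-valued, in [0, 1]
        masked = weight * x
        voice.append(masked)
        music.append(x - masked)  # remainder goes to the other source
    return voice, music


def discriminative_loss(out_voice, out_music, ref_voice, ref_music, gamma=0.05):
    """Squared error to the correct targets minus gamma times the squared
    error to the swapped targets, which rewards outputs that stay distinct
    from the other source."""
    def sq_err(a, b):
        return sum(abs(p - q) ** 2 for p, q in zip(a, b))

    match = sq_err(out_voice, ref_voice) + sq_err(out_music, ref_music)
    cross = sq_err(out_voice, ref_music) + sq_err(out_music, ref_voice)
    return match - gamma * cross


# Toy usage with three complex STFT bins (made-up values).
mixture = [1 + 2j, -0.5 + 0.25j, 3 - 1j]
est_voice = [0.5 + 1j, 0.1j, 2 + 0j]
est_music = [0.5 + 1j, -0.5 + 0j, 1 - 1j]
voice, music = ratio_mask(est_voice, est_music, mixture)
loss = discriminative_loss(voice, music, est_voice, est_music)
```

    Because the music output is computed as the remainder of the mixture, the two masked outputs sum to the input up to floating-point rounding, which is the property the abstract attributes to the masking layer.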
    Appears in collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in this item:

    File          Description   Size   Format   Views
    index.html    -             0Kb    HTML     282


    All items in NCUIR are protected by the original copyright.
