Convex Nonnegative Matrix Factorization and Compressive Sensing Architecture for Convolutive Blind Source Separation

Name

Yu-chiang Chang (張裕江)

Department

Department of Computer Science and Information Engineering

Thesis Title

Convex Nonnegative Matrix Factorization and Compressive Sensing Architecture for Convolutive Blind Source Separation
(結合凸非負矩陣分解與壓縮感測之旋積盲訊號源分離技術)

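Related Theses

★ Single and Multi-Label Environmental Sound Recognition with Gaussian Process
★ Embedded System Implementation of Beamforming and Audio Preprocessing
★ Application and Design of Speech Synthesis and Speaker Conversion
★ A Semantics-Based Public Opinion Analysis System
★ Design and Application of a High-Quality Dictation System
★ Calcaneal Fracture Recognition and Detection in CT Images Using Deep Learning and Speeded-Up Robust Features
★ A Personalized Collaborative-Filtering Clothing Recommendation System Based on a Style Vector Space
★ RetinaNet Applied to Face Detection
★ Trend Prediction for Financial Products
★ A Study on Integrating Deep Learning Methods to Predict Age and Aging Genes
★ A Study of End-to-End Speech Synthesis for Mandarin Chinese
★ Application and Improvement of ORB-SLAM2 on the ARM Architecture
★ Deep-Learning-Based Trend Prediction for Exchange-Traded Funds
★ Exploring the Correlation Between Financial News and Financial Trends
★ Emotional Speech Analysis Based on Convolutional Neural Networks
★ Predicting Alzheimer's Disease Progression and Stroke Surgery Survival Using Deep Learning Methods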

Files

View the thesis in the system (never open to public access)

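Abstract (Chinese)

This thesis proposes a novel blind source separation technique that consists of two stages. The first stage performs a preliminary separation of the time-frequency signals: we propose using convex nonnegative matrix factorization combined with a complex representation to estimate the mixing matrix, and a frequency-based compressive sensing framework to recover the source signals. In this stage, the level ratio and phase difference are first extracted as features for each time-frequency point. Next, outlier elimination is applied to the features band by band, and the convex nonnegative matrix factorization algorithm is used to extract the feature bases. After the permutation problem has been resolved, the extracted bases are converted into the mixing matrix through the complex representation. This mixing matrix is then combined with the proposed compressive sensing framework to separate the sources. Within this framework, the mixing matrix is extended into a measurement matrix and, together with a pre-trained global dictionary, the sparse coefficients are solved frame by frame with orthogonal matching pursuit; multiplying the dictionary by the sparse coefficients completes the separation. In the second stage, we propose combining hidden Markov models with the preliminarily separated signals from the first stage to perform knowledge-based enhancement of the separated signals. The basic phonemes of the source signals are first converted into Mel-frequency cepstral coefficients (MFCCs), which are used to train a model for each phoneme. When the preliminarily separated signal is passed to these hidden Markov models for recognition, each frame yields a different log-likelihood value for each phoneme, and the phoneme with the maximum likelihood is taken as the phoneme knowledge for that frame. The global dictionary is extended according to this phoneme knowledge to increase its adaptivity, and the compressive sensing framework of the first stage is then applied again to complete the source separation. Experimental results show that the proposed method significantly improves the signal-to-interference ratio (SIR) compared with the traditional method.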
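
As a rough illustration of the feature-extraction step described in the abstract, the sketch below computes a level ratio and a phase difference for every time-frequency point of a two-channel mixture. The two-microphone setup, the naive `stft` helper, the function name `level_ratio_phase_diff`, and the exact feature definitions (magnitude of channel 1 normalized by the norm across both channels, and the phase of channel 2 relative to channel 1) are assumptions made for this example; this record does not state the definitions the thesis actually uses.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Naive STFT (assumed helper): Hann-windowed frames followed by a real FFT.

    Returns an array of shape (frequency_bins, frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T

def level_ratio_phase_diff(X1, X2, eps=1e-12):
    """Per time-frequency-point features for a two-channel mixture.

    level_ratio: |X1| divided by the Euclidean norm over both channels (assumed form).
    phase_diff:  phase of X2 relative to X1, in radians within (-pi, pi].
    """
    norm = np.sqrt(np.abs(X1) ** 2 + np.abs(X2) ** 2) + eps
    level_ratio = np.abs(X1) / norm
    phase_diff = np.angle(X2 * np.conj(X1))
    return level_ratio, phase_diff

# Toy input: channel 2 is an attenuated, undelayed copy of channel 1.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(2048)
x2 = 0.5 * x1
X1, X2 = stft(x1), stft(x2)
level_ratio, phase_diff = level_ratio_phase_diff(X1, X2)
```

For this toy input every time-frequency point shares the same attenuation and has no inter-channel delay, so the features collapse to a single cluster; with several real sources, the points dominated by each source would form separate clusters, which is what the clustering step operates on.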

Abstract (English)

This thesis proposes a novel blind source separation (BSS) technique based on compressive sensing. The technique has two phases. The first phase performs a preliminary source separation: convex nonnegative matrix factorization (convex NMF) combined with a complex representation is proposed to estimate the mixing matrix, and a frequency-based compressive sensing (CS) framework is proposed to separate the sources. In this phase, the level ratio and phase difference are first extracted as features for each time-frequency point. Next, outliers are eliminated from the features band by band, and the remaining samples are clustered with the convex NMF algorithm to obtain the feature bases. The bases are then transformed into the mixing matrix through the complex representation. The mixing matrix is used for source separation within the proposed compressive sensing framework. In this framework, the mixing matrix is extended into a measurement matrix and, together with a pre-trained global dictionary, the sparse coefficients are solved frame by frame with orthogonal matching pursuit (OMP). Finally, the global dictionary is multiplied by the sparse coefficients to complete the source separation. In the second phase, knowledge-based enhancement of the separated signals is performed by combining hidden Markov models (HMMs) with the preliminarily separated sources. The basic phonemes of the sources are converted into Mel-frequency cepstral coefficients (MFCCs) to train a model for each phoneme. When the preliminarily separated source is fed into the HMMs, each frame yields a different log-likelihood for each phoneme, and the phoneme with the maximum log-likelihood is chosen as the phoneme knowledge for that frame. The global dictionary is extended according to this phoneme knowledge to increase its adaptivity. Finally, the compressive sensing step of the first phase is repeated to complete the source separation. Experimental results show that the proposed method improves the signal-to-interference ratio (SIR) compared with the traditional method.
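
The sparse-coefficient step mentioned in the abstract can be illustrated with a minimal, real-valued orthogonal matching pursuit. This is only a sketch under stated assumptions: the measurement matrix `M`, the dictionary `D`, their sizes, and the random test signal are hypothetical placeholders, whereas the thesis derives its measurement matrix from the estimated mixing matrix and uses a pre-trained global dictionary. The function `omp` is a generic textbook formulation, not the author's implementation.

```python
import numpy as np

def omp(A, y, n_nonzero, tol=1e-10):
    """Greedy orthogonal matching pursuit: find a sparse s with y ≈ A @ s."""
    col_norms = np.linalg.norm(A, axis=0) + 1e-12
    residual = y.copy()
    support = []
    s = np.zeros(A.shape[1])
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual) / col_norms
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    s[support] = coef
    return s

# Hypothetical setup: dictionary D (n x K), measurement matrix M (m x n).
rng = np.random.default_rng(0)
n, K, m = 32, 64, 24
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
M = rng.standard_normal((m, n))

s_true = np.zeros(K)
s_true[[3, 17, 40]] = [1.0, -2.0, 1.5]  # sparse code of one frame
y = M @ (D @ s_true)                    # observed measurement

A = M @ D                               # effective sensing matrix
s_hat = omp(A, y, n_nonzero=3)          # solve the sparse coefficients
x_hat = D @ s_hat                       # dictionary times coefficients
```

The last line mirrors the final step in the abstract: once the sparse coefficients of a frame are known, multiplying the dictionary by them gives the reconstructed frame. Whether the recovered support matches the true one depends on the matrices involved, so no particular output is claimed here.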

Keywords (Chinese)

★ blind source separation
★ compressive sensing
★ convex nonnegative matrix factorization

Keywords (English)

★ blind source separation
★ compressive sensing
★ convex NMF

Table of Contents

Chapter Contents
Abstract (Chinese)
Abstract (English)
Chapter Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Foreword
1-2 Research Motivation and Objectives
1-3 Overview of Chapter Organization
Chapter 2 Related Work and Literature Review
2-1 Literature on Blind Source Separation
2-1-1 Types of Models
2-1-2 Overview of Approaches
2-2 Sparse Coding and the Compressive Sensing Framework
2-2-1 Sparse Coding
2-2-2 Compressive Sensing Framework
2-3 K-SVD Dictionary and the Orthogonal Matching Pursuit Algorithm
2-3-1 K-SVD Dictionary Training Algorithm
2-3-2 Orthogonal Matching Pursuit Algorithm
Chapter 3 Feature Extraction and Outlier Elimination
3-1 Introduction
3-2 Beamforming Theorem
3-3 Sample Types of the Features
3-4 Definition of the Feature Forms
3-5 K-Nearest Neighbor Algorithm and Its Application
3-5-1 K-Nearest Neighbor Algorithm
3-5-2 Outlier Elimination
Chapter 4 Conventional Mixing Matrix Estimation
4-1 Introduction
4-2 K-means Clustering Algorithm
4-3 Permutation Problem
4-4 Mixing Coefficients and Mixing Matrix
4-4-1 Mixing Coefficients
4-4-2 Mixing Matrix
Chapter 5 Blind Source Separation System
5-1 System Architecture Overview
5-2 Convex NMF Basis Extraction Algorithm
5-2-1 Single-Band Basis Extraction
5-2-2 Properties of Convex NMF
5-2-3 Update Rules of the Convex NMF Clustering Algorithm
5-3 Mixing Matrix Estimation
5-4 Proposed Time-Frequency-Domain Compressive Sensing Framework
5-5 Proposed Knowledge-Based Separation Enhancement
5-5-1 Hidden Markov Model
5-5-1-1 Forward Algorithm
5-5-1-2 Expectation-Maximization Algorithm
5-5-2 Local Dictionary Selection
Chapter 6 Experimental Results
6-1 Experimental Environment and Setup
6-2 Comparison of the Proposed System with the Baseline
6-2-1 Comparison of Separation Performance
Chapter 7 Conclusions and Future Work
7-1 Conclusions
7-2 Future Work
References


Advisor

Jia-ching Wang (王家慶)

Approval Date

2014-8-26
