BIC-Based Source Number Estimation and Phase Compensation Technique for Convolutive BSS (應用於旋積盲訊號源分離之BIC基礎式訊號源數目估測及相位補償技術)


Name

Hsiang-lung Chuang (莊祥瓏)

Department

Department of Computer Science and Information Engineering

Thesis Title

BIC-Based Source Number Estimation and Phase Compensation Technique for Convolutive BSS
(應用於旋積盲訊號源分離之BIC基礎式訊號源數目估測及相位補償技術)



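Abstract (translated from Chinese)

Over the past decade, research on signal separation has attracted considerable attention, and blind source separation (BSS) in particular. "Blind" means that nothing is known about the source signals or the mixing process; the only information available is the recorded mixtures. This thesis addresses sparse, underdetermined, convolutive blind source separation when the number of sources is unknown. The algorithm has two stages: it first estimates the mixing matrix and then uses that matrix to separate the sources.
To estimate the mixing matrix, we first define two features, the level ratio and the phase difference. A KNN graph is then used to remove outlier samples, and the K-Means algorithm clusters the remaining data. We propose combining K-Means with the Bayesian information criterion (BIC) to estimate the number of sources.
The column vectors of the mixing matrix obtained by K-Means suffer from a permutation ambiguity. We detect the direction of arrival (DOA) of each source and use the angle associated with each column vector to resolve the permutation, which yields the mixing matrix estimate. We then apply phase compensation to this matrix to obtain a more accurate estimate. Once the mixing matrix is available, the sources are separated by solving the underdetermined minimum-L1-norm linear optimization problem derived from the maximum a posteriori (MAP) criterion.
In the simulations, we compare the proposed method with a conventional baseline that estimates the mixing matrix by hierarchical clustering and separates the sources by minimum-L1-norm optimization. The experiments also confirm that the proposed method remains feasible in reverberant environments.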
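The combination of K-Means and BIC described in the abstract can be illustrated with a minimal sketch. The record does not give the thesis's exact BIC formula, so the code below assumes a common spherical-Gaussian form with hard assignments; the function names (`kmeans`, `bic`, `estimate_num_sources`), the farthest-point initialisation, and the synthetic 2-D samples are all hypothetical stand-ins, not the author's implementation.

```python
import math
import random


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-Means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from every center chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels


def bic(points, centers, labels):
    """BIC (lower is better) under an assumed spherical-Gaussian cluster model."""
    n, d, k = len(points), len(points[0]), len(centers)
    sse = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    var = max(sse / (d * (n - k)), 1e-12)  # pooled per-dimension variance
    counts = [labels.count(j) for j in range(k)]
    log_lik = sum(c * math.log(c / n) for c in counts if c > 0)  # mixing weights
    log_lik -= 0.5 * n * d * math.log(2 * math.pi * var)
    log_lik -= 0.5 * sse / var
    n_params = (k - 1) + k * d + 1  # weights, centers, shared variance
    return -2.0 * log_lik + n_params * math.log(n)


def estimate_num_sources(points, k_max=6):
    """Run K-Means for k = 1..k_max and return the k with the lowest BIC."""
    scores = {}
    for k in range(1, k_max + 1):
        centers, labels = kmeans(points, k)
        scores[k] = bic(points, centers, labels)
    return min(scores, key=scores.get), scores


# Synthetic stand-in for (level-ratio, phase-difference) samples from three sources.
rng = random.Random(0)
true_centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
samples = [(cx + rng.gauss(0, 0.4), cy + rng.gauss(0, 0.4))
           for cx, cy in true_centers for _ in range(100)]
k_hat, scores = estimate_num_sources(samples)
print("estimated number of sources:", k_hat)
```

In this sketch the mixing-weight term of the likelihood and the parameter penalty are what discourage splitting a genuine cluster, so the BIC stops improving once k exceeds the number of well-separated groups.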

Abstract (English)

Over the past decade, much attention has been given to the separation of source signals, in particular the blind case, in which both the sources and the mixing process are unknown and only recordings of the mixtures are available. This thesis addresses the sparse, underdetermined, convolutive blind source separation (BSS) problem for the case where the number of sources is unknown. We propose a two-step BSS algorithm: the first step estimates the mixing matrix, and the second step uses the estimated mixing matrix to separate the source signals.
For mixing matrix estimation, we first define two features, the level ratio and the phase difference. Next, we eliminate outliers with a KNN graph and apply K-Means clustering to the remaining samples. The proposed method combines K-Means with the Bayesian information criterion (BIC) to estimate the number of sources. A direction-of-arrival (DOA) detection method is then used to solve the permutation problem, and a phase compensation technique refines the mixing matrix estimate. Source recovery is based on the maximum a posteriori (MAP) approach: given the estimated mixing matrix, we solve an optimization problem that minimizes the L1 norm. In the experiments, we compare the proposed method with a baseline that estimates the mixing matrix by hierarchical clustering and also separates the sources by L1-norm minimization. Furthermore, we demonstrate that the proposed method also works well in reverberant environments.
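As a rough illustration of the L1-norm source recovery step, the sketch below handles a simplified case: two sensors, a real-valued mixing matrix, and a single time-frequency point. In that setting the minimum-L1 solution of A s = x can be found among the "basic" solutions that use only two columns of A, so the code enumerates column pairs. The function name `min_l1_recover` and the example matrix are hypothetical; the thesis itself works with complex-valued convolutive mixtures, which this sketch does not attempt to reproduce.

```python
from itertools import combinations


def min_l1_recover(A, x):
    """Minimum-L1 solution of A s = x for a real 2 x N matrix A (list of two rows).

    Every pair of columns is tried; the pair whose exact 2 x 2 solution has the
    smallest sum of absolute values wins, and all other sources are set to zero.
    """
    n = len(A[0])
    best, best_norm = None, float("inf")
    for i, j in combinations(range(n), 2):
        det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
        if abs(det) < 1e-12:
            continue  # columns i and j are (nearly) parallel
        # Cramer's rule for the 2 x 2 system [a_i a_j] [s_i, s_j]^T = x.
        s_i = (x[0] * A[1][j] - A[0][j] * x[1]) / det
        s_j = (A[0][i] * x[1] - x[0] * A[1][i]) / det
        norm = abs(s_i) + abs(s_j)
        if norm < best_norm:
            best_norm = norm
            best = [0.0] * n
            best[i], best[j] = s_i, s_j
    return best


# Three sources, two sensors; the columns are unit-norm mixing directions.
A = [[1.0, 0.6, 0.0],
     [0.0, 0.8, 1.0]]
# Only the second source is active in this frame: x = 2 * column 1.
x = [1.2, 1.6]
s_hat = min_l1_recover(A, x)
print(s_hat)
```

Recovery is exact here because only one source is active at this point. When several sources overlap at the same time-frequency point, the minimum-L1 solution can differ from the true sources, which is why the method relies on the sources being sparse.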

Keywords

★ Bayesian information criterion
★ speech signal processing
★ blind source separation
★ sparse representation

Table of Contents

Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Description of Symbols
Chapter 1 Introduction
1.1 Preface
1.2 Motivation and purpose
1.3 Research method and the organization of thesis
Chapter 2 Literature Survey of Blind Source Separation and Sparse Component Analysis
Chapter 3 Feature Extraction
3.1 Coefficient and mixing matrix
3.2 Sample form
Chapter 4 Estimation of the Source Signal Number by BIC and K-Means
4.1 Bayesian Information Criterion
4.2 K-Means clustering algorithm
4.3 Estimating the number of source signals
Chapter 5 Estimation of the Mixing Matrix
5.1 K-th nearest neighbor algorithm
5.2 Outlier elimination
5.3 Estimation of the column vectors of mixing matrix
5.4 Method of direction of arrival
5.5 Phase compensation technique
5.6 Source recovery
Chapter 6 Experimental Results
6.1 Experiment environment and installation
6.2 DOA with respect to the distance between each sensor
6.3 Performance comparison of the separation sources
6.4 The connection with distance between each sensor and performance
6.5 The performance under reverberation environment
6.6 Estimation of the number of source signals
Chapter 7 Conclusion and Future Work
References


Advisor

Jia-ching Wang (王家慶)

Approval Date

2011-8-23
