Blind Source Separation Using a Fast Time Frequency Mask Technique

Name

劉佩昀 (Pei-yun Liu)

Department

Department of Electrical Engineering

Thesis Title

Blind Source Separation Using a Fast Time Frequency Mask Technique
(一個加速時頻域遮罩之盲訊號分離演算法)

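Related Theses

★ Low-Memory Hardware Design for Real-Time SIFT Feature Point Extraction
★ Access Control System with Real-Time Face Detection and Face Recognition
★ Self-Propelled Vehicle with Real-Time Automatic Following
★ Lossless Compression Algorithm and Implementation for Multi-Lead ECG Signals
★ Offline Customizable Voice and Speaker Wake-Word System with Embedded Implementation
★ Wafer Map Defect Classification and Embedded System Implementation
★ Densely Connected Convolutional Network for Speech Applied to Small-Footprint Keyword Spotting
★ G2LGAN: Data Augmentation for Imbalanced Datasets Applied to Wafer Map Defect Classification
★ Algorithm Design Techniques for Compensating the Finite Precision of Multiplierless Digital Filters
★ Design and Implementation of a Programmable Viterbi Decoder
★ Low-Cost Vector Rotator IP Design Based on Extended Elementary Angle CORDIC
★ Analysis and Architecture Design of a JPEG2000 Still Image Coding System
★ Low-Power Turbo Code Decoder for Communication Systems
★ Platform-Based Design for Multimedia Communication
★ Design and Implementation of a Digital Watermarking System for MPEG Encoders
★ Algorithm Development for Video Error Concealment and Its Data Reuse Considerations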

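This electronic thesis is authorized for immediate open access.
For full texts that have reached their open-access date, users are licensed only to search, read, and print them for personal, non-profit academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.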

Abstract (Chinese)

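Blind source separation (BSS) mainly addresses the cocktail party problem. The idea is that at a party where several people speak at once, we can still easily follow one person's conversation despite the surrounding interference, because the human brain separates the signals naturally. For digital circuits, however, this process is complex. The goal of BSS is to record simultaneously in a room with several microphones placed at different positions and to use these recordings to recover the original sound sources.
Applications are wide-ranging and include mobile communications, multiuser communication systems, and speech enhancement in noisy environments.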
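BSS is a signal reconstruction technique built on the assumption of convolutive mixtures. The mixtures are transformed to the frequency domain with the short-time Fourier transform (STFT). Because the sources are sparse, the time-frequency points can be clustered according to their spatial features. The key idea behind feature extraction is that two stationary sources each emit sound waves that reach the two microphones, and because the microphones lie at different distances from each source, the waves arrive at different times. In general, the phase difference and level ratio between the two microphones for each source can serve as spatial features. These features are represented as complex numbers scattered over the complex plane. The k-means algorithm then partitions the feature points into N classes, each representing one source. Next, a binary time-frequency mask marks the classified time-frequency points: a point is set to 1 if it belongs to the target speech and 0 otherwise. Finally, the complete mask is multiplied element-wise with the mixture to obtain the separated signal, which is transformed back to the time domain with the inverse STFT.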
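The clustering and masking steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation: it assumes the STFT coefficients of one frequency bin are already available as lists of complex numbers, it stores each feature as a (level ratio, phase difference) pair rather than as a single complex number, and all function names (`spatial_features`, `kmeans`, `binary_masks`, `apply_mask`) and the toy data are hypothetical.

```python
import cmath

def spatial_features(x1, x2, eps=1e-12):
    """Level ratio and phase difference of microphone 2 relative to microphone 1."""
    feats = []
    for a, b in zip(x1, x2):
        ratio = abs(b) / (abs(a) + eps)          # level (magnitude) ratio
        phase = cmath.phase(b * a.conjugate())   # arg(b / a), in (-pi, pi]
        feats.append((ratio, phase))
    return feats

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, centroids, max_iters=50):
    """Plain Lloyd's k-means on 2-D points; phase wrap-around is ignored here."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:   # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:            # keep the old centroid if a cluster is empty
                centroids[k] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return labels, centroids

def binary_masks(labels, n_sources):
    """One 0/1 mask per source: 1 where the point was assigned to that source."""
    return [[1 if lab == k else 0 for lab in labels] for k in range(n_sources)]

def apply_mask(mask, x):
    """Element-wise product of a mask with the mixture coefficients."""
    return [m * v for m, v in zip(mask, x)]

# Toy data (assumed): six time frames of one frequency bin, with exactly one
# active source per frame. Each source has its own level ratio and phase shift.
src = [0, 1, 0, 1, 1, 0]
amp = [1.0, 2.0, 1.5, 0.5, 1.0, 3.0]
gain = {0: (0.5, 0.4), 1: (2.0, -0.8)}   # (level ratio, phase difference in rad)
x1 = [complex(a, 0.0) for a in amp]
x2 = [a * gain[s][0] * cmath.exp(1j * gain[s][1]) for a, s in zip(amp, src)]

feats = spatial_features(x1, x2)
labels, centroids = kmeans(feats, centroids=[feats[0], feats[1]])
masks = binary_masks(labels, n_sources=2)
separated = [apply_mask(m, x1) for m in masks]
```

In a full system this would run for every frequency bin of the STFT, and each masked spectrogram would then be passed through the inverse STFT.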
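To solve the convolutive BSS problem, this thesis proposes a BSS algorithm based on an accelerated time-frequency mask. We first define two feature parameters, the level ratio and the phase difference of the signals. We then reduce the variance of the data so that the two features have similar variances, which helps K-means converge, and apply K-means to cluster the data in each frequency band. Finally, the time-frequency mask is computed from the clustered feature points.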
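The abstract does not spell out how the variances are made similar. One common way to do so, shown here purely as an assumed stand-in, is to divide each feature by its own standard deviation before clustering, so that neither feature dominates the Euclidean distance used by K-means. The function name `equalize_variance` and the sample values are hypothetical.

```python
import statistics

def equalize_variance(features):
    """Scale each feature column to unit variance (assumed normalization)."""
    columns = list(zip(*features))
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    scales = [statistics.pstdev(col) or 1.0 for col in columns]
    return [tuple(v / s for v, s in zip(row, scales)) for row in features]

# Hypothetical (level ratio, phase difference) pairs from one frequency band.
raw = [(0.5, 0.40), (2.0, -0.80), (0.6, 0.35), (1.9, -0.75)]
scaled = equalize_variance(raw)
variances = [statistics.pvariance(col) for col in zip(*scaled)]
```

After scaling, both columns have population variance 1 up to rounding, and the scaled pairs can be fed to K-means band by band.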
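In a real environment, we can separate the signals directly from the microphone recordings and then assess signal quality with the signal-to-distortion ratio (SDR) and the signal-to-interference ratio (SIR). The method speeds up clustering without degrading signal quality, and the algorithm is simple.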
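For readers unfamiliar with these metrics, the following sketch uses the widely used energy-ratio definitions, in which the estimated signal is assumed to be already decomposed into a target component, an interference component, and an artifact component. Whether the thesis follows exactly this decomposition is not stated in the abstract, and the names `sir_db`, `sdr_db` and the sample values are hypothetical.

```python
import math

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

def sir_db(target, interference):
    """Signal-to-interference ratio in dB."""
    return 10.0 * math.log10(energy(target) / energy(interference))

def sdr_db(target, interference, artifacts):
    """Signal-to-distortion ratio in dB; distortion = interference + artifacts."""
    distortion = [i + a for i, a in zip(interference, artifacts)]
    return 10.0 * math.log10(energy(target) / energy(distortion))

# Hypothetical decomposition of one separated signal.
s_target = [1.0, -1.0, 1.0, -1.0]
e_interf = [0.1, 0.1, -0.1, -0.1]
e_artif = [0.0, 0.1, 0.0, -0.1]

sir = sir_db(s_target, e_interf)
sdr = sdr_db(s_target, e_interf, e_artif)
```

Higher values are better for both metrics. SDR is never larger than SIR when the artifacts add energy to the distortion, as they do in this toy example.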

Abstract (English)

The goal of blind source separation (BSS) is to solve the cocktail party problem. Imagine a room with several people and several microphones for recording. When people speak at the same time, each microphone registers a different mixture of the individual speakers' audio signals. The task of BSS is to untangle these mixtures into their sources. Applications include mobile telephony, multiuser communication systems, and speech enhancement in noisy environments.
The mixtures recorded by the microphones are transformed to the frequency domain with the short-time Fourier transform (STFT). Owing to the sparseness and disjointness of the source signals, we can obtain spatial features from the mixtures during the feature extraction step. The features are represented as complex numbers. Afterwards, using the K-means algorithm, we divide the features into N groups, where N is the number of sources. Before transforming the separated signal back to the time domain, we apply a mask to label the target signal: a time-frequency point is labeled one if it belongs to the target speech signal and zero otherwise.
To solve the convolutive blind source separation (BSS) problem, this thesis presents a new method that uses a fast time-frequency mask technique. We first define two features, the normalized level ratio and the phase difference. Next, we reduce the variance of the features so that K-means clustering needs fewer iterations. Afterwards, with the K-means algorithm, we cluster the features by assigning each to the nearest group. Finally, a time-frequency mask is generated from the clustered features. The method is not only simple but also faster, without reducing the quality of the target signal. In a real environment, we also evaluate the separated signals in terms of signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR).
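In symbols, the two features and the binary mask can be written roughly as follows. The notation is assumed for illustration and is not taken from the thesis: X_1 and X_2 denote the STFTs of the two microphone signals, f the frequency bin, t the frame index, and C_k the k-th cluster found by K-means. The normalization of the level ratio mentioned in the abstract is not specified there and is left out.

```latex
% Assumed notation; the thesis's own normalization of the level ratio is omitted.
\begin{align}
  \alpha(f,t) &= \frac{|X_2(f,t)|}{|X_1(f,t)|}, &
  \phi(f,t)   &= \arg\frac{X_2(f,t)}{X_1(f,t)}, \\
  M_k(f,t)    &= \begin{cases}
                   1, & \bigl(\alpha(f,t), \phi(f,t)\bigr) \in C_k, \\
                   0, & \text{otherwise},
                 \end{cases} &
  Y_k(f,t)    &= M_k(f,t)\, X_1(f,t).
\end{align}
```

The separated time-domain signal for source k is then the inverse STFT of Y_k.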

Keywords (Chinese)

★ blind source separation
★ convolutive mixture
★ time-frequency mask

Keywords (English)

★ blind separation
★ convolutive mixture
★ time frequency mask

Table of Contents

ABSTRACT (CHINESE)
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 MOTIVATION
1.2 THESIS ORGANIZATION
CHAPTER 2 BACKGROUND
2.1 BSS BASED ON INDEPENDENT COMPONENT ANALYSIS
2.2 BSS BASED ON SPARSENESS
CHAPTER 3 ALGORITHM OVERVIEW
3.1 BINARY MASK BASED APPROACH
3.2 NON-BINARY MASK BASED APPROACH
3.3 FREQUENCY-DOMAIN INDEPENDENT COMPONENT ANALYSIS
CHAPTER 4 PROPOSED ALGORITHM
4.1 BINARY MASK BASED APPROACH WITH REDUCTION OF DOA VARIANCE
4.2 AN EXAMPLE OF EXPERIMENT RESULT
CHAPTER 5 EXPERIMENT RESULTS
5.1 PERFORMANCE EVALUATION
5.2 EXPERIMENTAL SETTING
5.3 EXPERIMENTAL RESULTS OF FREQUENCY DOMAIN ICA
5.4 EXPERIMENTAL RESULTS OF BINARY/NON-BINARY MASK APPROACH
CHAPTER 6 CONCLUSION
REFERENCES

Advisor

蔡宗漢 (Tsung-han Tsai)

Review Date

2014-7-30
