Blind Source Separation Using a Fast Time Frequency Mask Technique (一個加速時頻域遮罩之盲訊號分離演算法)

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/65766

Title:	Blind Source Separation Using a Fast Time Frequency Mask Technique (一個加速時頻域遮罩之盲訊號分離演算法)
Author:	Liu, Pei-yun (劉佩昀)
Contributor:	Department of Electrical Engineering
Keywords:	blind separation; convolutive mixture; time frequency mask
Date:	2014-07-30
Upload time:	2014-10-15 17:09:50 (UTC+8)
Publisher:	National Central University
Abstract:	Blind source separation (BSS) addresses the cocktail party problem. At a party where several people speak at once, a listener can still follow one speaker despite the surrounding interference, because the human brain separates the signals naturally; for digital circuits the same task is complex. The goal of BSS is to record simultaneously with several microphones placed at different positions in a room, where each microphone registers a different mixture of the speakers' audio signals, and to recover the original sources from these mixtures. Applications include mobile telephony, multiuser communication systems, and speech enhancement in noisy environments.

The reconstruction assumes a convolutive mixture model. The mixtures are transformed to the frequency domain with the short-time Fourier transform (STFT). Because the source signals are sparse and disjoint, time-frequency points can be clustered according to their spatial features. The key idea behind feature extraction is that two stationary sources each emit a sound wave that travels to both microphones; since the microphones are at different distances from each source, the wave reaches them at different times. In general, the phase difference and the level ratio between the two microphones serve as the spatial features of each source. The features are represented as complex numbers scattered over the complex plane. The K-means algorithm then divides the feature points into N groups, where N is the number of sources and each group represents one source. Next, a binary time-frequency mask labels the classified points: a time-frequency point is set to 1 if it belongs to the target speech and to 0 otherwise. Multiplying the complete mask element-wise with the mixture yields the separated signal, which is finally transformed back to the time domain with the inverse STFT.

To solve the convolutive BSS problem, this thesis presents a BSS algorithm based on a fast time-frequency mask technique. We first define two features, the normalized level ratio and the phase difference. Next, we reduce the variance of the data so that the two features have similar variances, which helps K-means converge in fewer iterations. K-means then clusters the data in each frequency band by assigning each feature to the nearest group. Finally, a time-frequency mask is generated from the clustered features. In a real environment, the signals can be separated directly from the microphone recordings, and we evaluate the separated signals in terms of SDR (signal-to-distortion ratio) and SIR (signal-to-interference ratio). The method is simple and speeds up clustering without reducing the quality of the target signal.
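
The per-frequency-bin steps described in the abstract (feature extraction, variance equalization, K-means clustering, binary masking) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the exact form of the normalized level ratio, the z-score scaling used to give both features similar variance, and the farthest-point initialization are all hypothetical choices. The STFT and inverse STFT are omitted, so the inputs are taken to be the STFT coefficients of two microphones at a single frequency bin.

```python
import cmath
import math


def spatial_features(x1, x2, eps=1e-12):
    """Return (level_ratio, phase_difference) per frame for one frequency bin.

    x1, x2: lists of complex STFT coefficients from microphones 1 and 2.
    """
    feats = []
    for a, b in zip(x1, x2):
        norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2) + eps
        level = abs(a) / norm                   # normalized level ratio in [0, 1]
        phase = cmath.phase(b * a.conjugate())  # arg(x2 / x1), wrapped to [-pi, pi]
        feats.append((level, phase))
    return feats


def standardize(feats):
    """Scale each feature to zero mean and unit variance.

    Assumption: z-scoring stands in for the variance-reduction step, whose
    exact form the abstract does not specify.
    """
    scaled = []
    for col in zip(*feats):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        scaled.append([(v - mean) / std for v in col])
    return list(zip(*scaled))


def kmeans(points, n_clusters, max_iter=100):
    """Lloyd's K-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(n_clusters), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def binary_masks(labels, n_clusters):
    """One 0/1 mask per cluster: 1 where the frame belongs to that cluster."""
    return [[1 if lab == k else 0 for lab in labels] for k in range(n_clusters)]


def separate_bin(x1, x2, n_sources):
    """Cluster the frames of one bin and mask the microphone-1 mixture."""
    labels = kmeans(standardize(spatial_features(x1, x2)), n_sources)
    masks = binary_masks(labels, n_sources)
    separated = [[m * a for m, a in zip(mask, x1)] for mask in masks]
    return separated, masks
```

Calling `separate_bin(x1, x2, 2)` on the frames of one frequency bin returns one masked copy of the microphone-1 mixture per cluster, together with the masks. A full separator would repeat this for every frequency bin, resolve which cluster corresponds to which source across bins, and apply the inverse STFT. The sketch also clusters the wrapped phase directly, so it assumes the phase differences of the sources stay away from the ±π boundary.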
Appears in Collections:	[Graduate Institute of Electrical Engineering] Master's and Doctoral Theses

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	544

All items in NCUIR are protected by original copyright.
