Speech separation is a challenging signal processing problem that plays an important role in real-world applications such as speech recognition systems and telecommunications. Its main goal is to estimate the speech of each individual speaker from a mixture in which several speakers talk at the same time. Because speech signals recorded in natural environments are frequently corrupted by noise or by other voices, speech separation has remained an attractive research topic for several decades. The Gaussian process (GP) is a flexible kernel-based machine learning method that has found widespread application in signal processing. In this thesis, we propose a supervised method for speech separation that uses Gaussian process regression (GPR) to model the nonlinear mapping between mixed and clean speech signals; the reconstructed speech signal is given by the predictive mean of the GP model. The hyper-parameters of the model are optimized with the nonlinear conjugate gradient method. Experiments on a subset of the TIMIT speech dataset indicate that the proposed approach performs well and confirm its validity.
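The pipeline summarized above (GP regression from mixed to clean speech, reconstruction by the predictive mean, hyper-parameters tuned by nonlinear conjugate gradient) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the squared-exponential kernel, the sliding-window input features, the synthetic sinusoidal "speech" in place of TIMIT data, the clipping of log hyper-parameters, and all function names. SciPy's `method="CG"` is used as one available nonlinear conjugate gradient optimizer.

```python
import numpy as np
from scipy.optimize import minimize


def rbf_kernel(A, B, length_scale, signal_var):
    """Squared-exponential kernel between rows of A and rows of B (assumed kernel)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def unpack(log_params):
    """Map unconstrained log-parameters to positive hyper-parameters.

    The clipping range is an assumption of this sketch; it keeps the
    Cholesky factorization numerically stable during optimization.
    """
    return np.exp(np.clip(log_params, -5.0, 5.0))


def neg_log_marginal_likelihood(log_params, X, y):
    """Negative log marginal likelihood of a zero-mean GP with Gaussian noise."""
    length_scale, signal_var, noise_var = unpack(log_params)
    K = rbf_kernel(X, X, length_scale, signal_var) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (
        0.5 * y @ alpha
        + np.sum(np.log(np.diag(L)))
        + 0.5 * len(X) * np.log(2.0 * np.pi)
    )


def fit_hyperparameters(X, y):
    """Minimize the negative log marginal likelihood with nonlinear CG."""
    result = minimize(
        neg_log_marginal_likelihood, x0=np.zeros(3), args=(X, y), method="CG"
    )
    return result.x


def predictive_mean(X_train, y_train, X_test, log_params):
    """GP predictive mean: K_* (K + noise * I)^{-1} y."""
    length_scale, signal_var, noise_var = unpack(log_params)
    K = rbf_kernel(X_train, X_train, length_scale, signal_var)
    K += noise_var * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train, length_scale, signal_var)
    return K_star @ np.linalg.solve(K, y_train)


def make_frames(mixed, clean, width):
    """Pair each window of the mixture with the clean sample at its center."""
    half = width // 2
    X = np.array(
        [mixed[i - half : i + half + 1] for i in range(half, len(mixed) - half)]
    )
    y = clean[half : len(clean) - half]
    return X, y


# Synthetic stand-in for speech: a slow "target" tone plus a faster "interferer".
t = np.linspace(0.0, 1.0, 160)
clean = np.sin(2.0 * np.pi * 3.0 * t)
mixed = clean + 0.5 * np.sin(2.0 * np.pi * 25.0 * t)

X, y = make_frames(mixed, clean, width=9)
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

log_params = fit_hyperparameters(X_train, y_train)
estimate = predictive_mean(X_train, y_train, X_test, log_params)
print("test MSE of GP estimate:", np.mean((estimate - y_test) ** 2))
```

The hyper-parameters are optimized in log space so that the length scale, signal variance, and noise variance stay positive without a constrained optimizer. Exact GP inference costs cubic time in the number of training frames, so a real speech experiment would need a small training subset or an approximation.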