本論文在探討如何利用深度學習來進行語音辨識,而使用的辨識方法是先透過梅爾倒頻譜係數((Mel frequency cepstral coefficients, MFCCs)取得語音特徵參數,並輸入卷積神經網路(Convolutional Neural Network, CNN)進行語音辨識。 此法與傳統語音辨識方法最大不同是在於不需要建立聲學模型,以中文為例就省去建立大量聲母(consonant)、韻母(vowel)比對的時間。藉由透過MFCCs取得特徵參數後就可以透過卷積神經網路實現語音辨識,並且不會受到語言種類的限制。 ;The thesis developed a speech recognition method for automatic speech recognition. In this speech recognition method, we obtained the speech feature parameters through Mel frequency cepstral coefficients and input a Convolutional Neural Network. The main difference between this Convolutional Neural Network speech recognition method and traditional speech recognition method is that it does not need to establish an acoustic model. For example, in Chinese, it saved a lot of time without establishing a large number of consonant and vowel models. After obtaining the speech feature parameters through the MFCCs, speech recognition is finished through Convolutional Neural Network.