NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/89772
    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89772


    Title: Speech Recognition via Attention Mechanism on Raspberry Pi
    Author: 申自強; Shen, Tzu-Chiang
    Contributor: In-service Master Program, Department of Computer Science and Information Engineering
    Keywords: Speech Recognition; Attention Mechanism; Raspberry Pi; Isolated Word; Wake-Up-Word
    Date: 2022-09-23
    Upload time: 2022-10-04 11:59:10 (UTC+8)
    Publisher: National Central University
    Abstract: Speech recognition serves as a new form of computer interface. It enables voice assistants (e.g., Alexa and Siri), which help with many services such as obtaining daily information and setting up a driving navigation system. Speech recognition has been studied extensively since the early 1990s. However, as more and more portable embedded devices (e.g., navigation systems and language translators) appear on the market, there is a need for offline speech recognition on low-computation devices. In this research, we focus on applying an encoder-decoder neural network to a low-power device such as the Raspberry Pi. In contrast to Alexa and Siri, which transmit recorded voice to expensive servers for computation and inference, we build a speech recognition model that infers speech samples entirely locally. Our model uses a CNN as the encoder and an LSTM or GRU with an attention mechanism as the decoder. In addition, TensorFlow Lite is used to deploy the model to the Raspberry Pi for speech inference. The experimental results indicate that using the attention mechanism improved the model's recall on isolated words by about 2% to 5% on the Raspberry Pi. Because of the limited computing power of the low-power device, inference times on the Raspberry Pi are very long.
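    The attention step in the decoder described in the abstract can be illustrated with a minimal sketch. The abstract does not say which attention variant the thesis uses, so this assumes plain dot-product scoring; the function names `softmax` and `attend`, the toy frame values, and the vector sizes are all hypothetical and chosen only for illustration. It uses pure Python rather than TensorFlow so that it runs without extra dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_frames):
    """Dot-product attention (an assumed variant, not confirmed by the abstract).

    decoder_state: list of floats, the current decoder hidden state.
    encoder_frames: list of equally sized float lists, one per encoded audio frame.
    Returns (weights, context): one weight per frame, and the weighted sum of frames.
    """
    # Score each encoder frame by its dot product with the decoder state.
    scores = [sum(d * f for d, f in zip(decoder_state, frame))
              for frame in encoder_frames]
    weights = softmax(scores)
    # The context vector is the attention-weighted average of the frames.
    dim = len(decoder_state)
    context = [sum(w * frame[i] for w, frame in zip(weights, encoder_frames))
               for i in range(dim)]
    return weights, context


# Toy input: three encoded frames of dimension 2 (made-up numbers).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(state, frames)
```

    In this toy case the first and third frames align equally well with the decoder state, so they receive equal weight and the second frame receives less. In an encoder-decoder model of the kind the abstract describes, the context vector would be combined with the LSTM or GRU state at each decoding step.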
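    Appears in collections: [In-service Master Program, Department of Computer Science and Information Engineering] Theses and Dissertations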

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    85


    All items in NCUIR are protected by the original copyright.
