NCU Institutional Repository (中大機構典藏): Item 987654321/8363
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/8363


    Title: A System for Keyword Spotting (語音關鍵詞辨識擷取系統)
    Authors: Yu-sheng Pai (白育昇)
    Contributors: Executive Master Program of Communication Engineering
    Keywords: speech recognition; keyword; Markov; HMM; speech; spotting; HTK
    Date: 2009-05-26
    Issue Date: 2009-09-22 11:23:46 (UTC+8)
    Publisher: National Central University Library
    Abstract: This thesis studies speech recognition techniques and implements a speech keyword spotting system designed to be portable, flexible, practical, and accurate. The system consists of three main parts. The corpus reading program and the keyword speech extraction program run on Windows XP SP2 and were developed with Borland C++ Builder 5; the keyword recognition program runs on Linux Fedora 5 and was developed with HTK 3.3. We use HTK to build HMM-based acoustic models covering the 411 syllables formed from 21 initials and 36 finals. The best acoustic model, with 6 HMM states and 17 Gaussian mixtures, reaches a detection rate of 92% and a false alarm rate below 13% on the training data. On non-training data, the pure-keyword module's detection rate and false alarm rate each change by only about 3%, to 89% and 16% respectively. Finally, this 6-state, 17-mixture acoustic model is used to build the keyword spotting system, together with an interface program that makes the system easy to operate.
    Appears in Collections: [Executive Master of Communication Engineering] Electronic Thesis & Dissertation


    All items in NCUIR are protected by copyright, with all rights reserved.


    Copyright National Central University Library (國立中央大學圖書館).