NCU Institutional Repository (中大機構典藏) - downloads of theses and dissertations, past exam papers, journal articles, and research projects: Item 987654321/98329
RC Version 7.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98329


    Title: 通過語音命令修正實現對話式用戶界面;Toward Conversational User Interface via Voice Command Correction
    Authors: 丁仕杰;Ding, Shi-Jie
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Automatic Speech Recognition;Error Correction;Voice Command;Automatic Correction System;Spelling Error Correction;Chinese Natural Language Processing
    Date: 2025-07-25
    Issue Date: 2025-10-17 12:38:23 (UTC+8)
    Publisher: National Central University
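    Abstract: In recent years, artificial intelligence has advanced rapidly, and automatic speech recognition (ASR) has made notable progress; it is now widely used in everyday settings such as dialogue systems, smart home appliances, and voice assistants. In practice, however, ASR still makes frequent errors and is especially susceptible to pronunciation differences and homophones, so the recognized text can depart from what the speaker meant. For example, 「這個程式很棒」 ("this program is great") is misrecognized as 「這個城市很棒」 ("this city is great").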
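    Prior research has focused mostly on automatic error correction, which is effective to a degree but still struggles with proper nouns such as personal names. This study therefore proposes a voice-command-based ASR error correction system that lets users issue natural-language commands by voice, such as "add", "delete", and "modify", to correct recognition results precisely and reduce keyboard input.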
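    The system has three core modules: (1) an input classifier, which determines whether a spoken input is a statement or a command; (2) a command classifier, which identifies the command type; and (3) a command labeler, which marks the error position and the corresponding replacement content. To train these modules, we use the SIGHAN-15 and zh-tw-wikipedia corpora, simulate errors with TTS and ASR, and then use large language models together with Chinese character components combined with commonly used characters and words to generate natural commands that mimic how users would make corrections in real use.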
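    Experimental results show that the two original models each correctly fix more than 80% of the erroneous sentences on their respective datasets, demonstrating good accuracy and fault tolerance. We also mix the two datasets to train a Model-Mix model, which likewise shows stable, strong correction performance overall. In addition, we deploy the system as an API that other speech recognition applications can integrate, and we continue to collect real command data to improve the models. We also bring large language models into the system to improve command understanding and broaden the system's scope of application, and we test whether LLMs can understand modification commands.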
    In summary, this study proposes a novel and practical ASR error correction workflow that both overcomes the limitations of automatic correction and significantly reduces users' manual input cost.;Recent advances in AI have improved ASR performance, enabling its widespread use in dialogue systems and smart devices. However, real-world ASR still struggles with errors caused by pronunciation variations and homophones.
    To address limitations in prior automatic correction methods, especially with proper nouns and user-specific terms, we propose a speech-command-based ASR correction system. It allows users to issue natural language voice instructions to refine recognition results and reduce manual input.
    The system consists of three modules: an input classifier to detect commands, a command classifier to determine instruction type, and a command labeler to locate correction targets. We train these modules using data from SIGHAN-15 and zh-tw-wikipedia, simulate ASR errors via TTS/ASR, and generate realistic correction commands using LLMs and linguistic features.
    Experiments show that the original models each achieved over 80% correction accuracy, and a combined model maintained strong, stable performance. The system is deployed as an API for integration with ASR applications, with real user data continuously collected for optimization. LLMs are also integrated to enhance instruction understanding and expand application scope.
    In summary, our method provides a practical, flexible ASR correction workflow that reduces user effort and improves correction precision.
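
The three-module flow described in the abstract (input classifier, command classifier, command labeler) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the thesis trains models for each stage, whereas the sketch below stands in for them with hand-written regular expressions. The command templates (「把X改成Y」, 「刪除X」, 「在X後面新增Y」), the `Edit` type, and all function names are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edit:
    kind: str          # "replace", "delete", or "insert"
    target: str        # span of the transcript the command points at
    content: str = ""  # new text for replace/insert commands


# Hypothetical command templates, one per command type (assumed, not from the thesis).
PATTERNS = {
    "replace": re.compile(r"^把(?P<target>.+?)(?:修改成|改成)(?P<content>.+)$"),
    "delete": re.compile(r"^刪除(?P<target>.+)$"),
    "insert": re.compile(r"^在(?P<target>.+?)後面新增(?P<content>.+)$"),
}


def classify_command(utterance: str) -> Optional[str]:
    """Stage 2 stand-in: return the command type, or None if no template matches."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(utterance):
            return kind
    return None


def is_command(utterance: str) -> bool:
    """Stage 1 stand-in: decide whether the input is a command or a statement."""
    return classify_command(utterance) is not None


def label_command(utterance: str, kind: str) -> Edit:
    """Stage 3 stand-in: extract the error span and the replacement content."""
    match = PATTERNS[kind].match(utterance)
    if match is None:
        raise ValueError(f"utterance does not match the {kind!r} template")
    groups = match.groupdict()
    return Edit(kind, groups["target"], groups.get("content", ""))


def apply_edit(transcript: str, edit: Edit) -> str:
    """Apply one edit to the first occurrence of its target span."""
    start = transcript.find(edit.target)
    if start < 0:
        return transcript  # target not found: leave the transcript unchanged
    end = start + len(edit.target)
    if edit.kind == "replace":
        return transcript[:start] + edit.content + transcript[end:]
    if edit.kind == "delete":
        return transcript[:start] + transcript[end:]
    if edit.kind == "insert":
        return transcript[:end] + edit.content + transcript[end:]
    raise ValueError(f"unknown edit kind: {edit.kind!r}")


def handle_utterance(transcript: str, utterance: str) -> str:
    """Statements are appended to the transcript; commands edit it."""
    if not is_command(utterance):
        return transcript + utterance
    kind = classify_command(utterance)
    return apply_edit(transcript, label_command(utterance, kind))


# The homophone error from the abstract, fixed by a spoken command.
print(handle_utterance("這個城市很棒", "把城市改成程式"))  # 這個程式很棒
```

Fixed templates like these misfire on ordinary statements that happen to begin with 「把」 or 「刪除」, and they cannot handle free-form phrasing. That limitation is one reason a learned classifier and labeler, as the abstract describes, is a better fit for real use.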
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.

