NCU Institutional Repository (中大機構典藏) - theses and dissertations, past exam papers, journal articles, and research projects available for download: Item 987654321/9396


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/9396


    Title: 語者調適之應用研究;The Research of Speaker Adaptation
    Authors: 廖家慶;Chia-Ching Liau
    Contributors: Graduate Institute of Electrical Engineering
    Keywords: speaker adaptation (語者調適)
    Date: 2002-06-04
    Issue Date: 2009-09-22 11:46:53 (UTC+8)
    Publisher: National Central University Library
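    Abstract: In speech recognition, speaker-dependent systems offer high recognition rates, but applying one to a new speaker requires a large amount of training speech and time. Speaker-independent and multi-speaker systems need no new training speech for a new speaker beyond the data used to build the system initially, but their recognition rates are generally lower. A speaker-adaptive system uses the information already held in a well-trained reference system, together with a small amount of speech from the new speaker, to reach a recognition rate close to that of a speaker-dependent system. This thesis therefore studies speaker-adaptive systems.
    The thesis has two main research themes. The first is how to improve the adaptation algorithm when only a small amount of adaptation data is available, so as to raise the recognition rate and the quality of adaptation. The second is applying the improved adaptation algorithm to online recognition and adaptation.
    In the first theme, the focus is on how to weight the contributions of the initial model and maximum likelihood linear regression (MLLR), finding the best balance between the two to improve adaptation performance. We then consider vector field smoothing (VFS), which adapts models through a field of transfer vectors: models for which no adaptation data were observed are adjusted by reference to the models that do have adaptation data, and we study the effect of combining this property with the weighted MLLR method. Next, speaker-dependent and speaker-independent models are used to construct an eigenvector space; the point representing the speaker is located in this space, and the system model parameters are adjusted accordingly. In the second theme, the adaptation algorithm developed to work with small amounts of adaptation data is applied to an online system, so that the speaker can experience recognition and adaptation changing in real time.
    Speaker adaptation has been applied to speech recognition to obtain a speaker-dependent system with good performance. Most adaptation techniques use the initial model as a starting point and then introduce speaker-specific information. By using the adapted parameters, the recognition performance can be significantly improved. In this thesis, we present a variation that improves the performance of maximum likelihood linear regression (MLLR) when little adaptation data is available: the transformed Gaussian means are interpolated with the means of the initial models. The VFS algorithm proceeds in the following steps. First, the transfer vectors are estimated. Then, interpolation and smoothing are performed using the transfer vectors. We also apply the idea of eigenvoices, a set of orthogonal basis vectors derived from the parameters of speaker-dependent models trained on reference speakers.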
    Appears in Collections:[Graduate Institute of Electrical Engineering] Electronic Thesis & Dissertation


    All items in NCUIR are protected by copyright, with all rights reserved.

