NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/84098


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/84098


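    Title: 基於遷移學習之低資源語音辨識; Low-Resource Speech Recognition Based on Transfer Learning
    Author: 蔡緯鴻; Tsai, Wei-Hong
    Contributor: Department of Computer Science and Information Engineering (資訊工程學系)
    Keywords: speech recognition; low-resource; end-to-end
    Date: 2020-07-30
    Upload time: 2020-09-02 18:04:16 (UTC+8)
    Publisher: National Central University (國立中央大學)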
    Abstract: In recent years, end-to-end speech recognition has become a major research trend. Many studies aim to improve its accuracy, and they have indeed achieved higher accuracy on various well-known corpora. However, this accuracy depends on large amounts of training data, and many minority languages around the world do not have enough data to build a speech recognition system. Systems built for such languages often have accuracy too low to be reliable or usable in real-world settings. How to build a robust speech recognition system from a small amount of data therefore remains an important issue in speech recognition.
    This thesis uses the ESPnet toolkit to implement a sequence-to-sequence (Seq2Seq) end-to-end speech recognition model, and the Fairseq toolkit to implement an unsupervised pre-training model that assists speech recognition. Unlabeled speech data is used to help extract speech features, and through transfer learning, a speech recognition model trained on a language with a larger corpus is transferred to Hakka speech recognition, for which less data is available. The goal is a more robust low-resource Hakka speech recognition system.
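    Appears in collections: [Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses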


