

    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/78672


    Title: 基於深度智能之口語處理技術 (I); Deep Intelligence Based Spoken Language Processing (I)
    Authors: 王家慶;陳柏琳;王新民;李宏毅;蔡宗翰;張寶基;曹昱
    Contributors: Department of Computer Science and Information Engineering, National Central University
    Keywords: spoken language processing; speech separation; code-switching speech recognition; spoken language translation; speech emotion recognition; dialogue system; deep learning
    Date: 2018-12-19
    Issue Date: 2018-12-20 13:42:43 (UTC+8)
    Publisher: Ministry of Science and Technology (Taiwan)
    Abstract: Speech is the primary and most natural means of communication among people, and the most effective means of human-computer interaction. Enabling a computer to process spoken language as fluently and intelligently as a human is a major problem that researchers have pursued for decades. With the success of deep learning, this goal is now within reach. This project, "Deep Intelligence Based Spoken Language Processing", uses deep learning to develop spoken language processing systems that integrate signal processing, acoustic processing, language processing, and deep learning. It targets five key technologies: intelligent multi-channel processing and mixed speech signal separation, code-switching speech recognition, spoken language translation, speech emotion recognition, and open-domain spoken dialogue. The project focuses on the local languages Mandarin, Minnan, and Hakka.

    For intelligent multi-channel processing and speech separation, the project plans to build a deep learning architecture that removes background noise, echo, and interfering sources, including competing speech, to improve the back-end code-switching speech recognizer. For code-switching speech recognition, the project will first develop monolingual recognizers for Mandarin, English, Minnan, and Hakka, built on deep learning-based language models and acoustic models, and will then use ensemble learning to develop a code-switching recognizer. For spoken language translation, the project will build on the recognized words to develop Mandarin-Minnan, Mandarin-Hakka, and Mandarin-English bidirectional translation systems that can handle disfluent speech. For speech emotion recognition, acoustic and semantic emotional features will be extracted to develop a recognizer that accounts for both the variability within each emotion class and the overlap between classes. For spoken dialogue, the project plans to develop three kinds of dialogue systems (task-oriented, question-answering, and chit-chat) that combine open-domain language understanding with awareness of the user's emotional state. Finally, the key technologies developed in the project will be used on an intelligent interactive platform for healthcare and home care applications.

    The project's objective is to solve the problem of machine spoken language processing. The team will collaborate with like-minded researchers and with leading laboratories at home and abroad to develop world-leading spoken language processing technology, with the goal of establishing an international-level research center for intelligent spoken language processing. Beyond training top talent in artificial intelligence, the project aims for its technologies and results to hold an important position internationally, helping Taiwan counter a possible monopoly by Google and other large American companies in related industries and strengthening the competitiveness of its industry.
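    Relation: Science and Technology Policy Research and Information Center, National Applied Research Laboratories
    Appears in Collections: [Department of Computer Science and Information Engineering] Research Projects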

    Files in This Item:

    File        Description  Size  Format
    index.html               0Kb   HTML


    All items in NCUIR are protected by copyright, with all rights reserved.
