NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/80989


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/80989


    Title: Speech Emotion Recognition Based on Joint Training by Self-Assessment Manikins and Emotion Labels (結合心理特徵與情緒標籤訓練之語音情感辨識技術)
    Author: Chen, Jing-Ming (陳靖明)
    Contributor: Department of Communication Engineering
    Keywords: Speech emotion recognition; Self-Assessment Manikin (psychological state features); Deep learning; Convolutional recurrent neural network
    Date: 2019-07-29
    Upload time: 2019-09-03 15:23:46 (UTC+8)
    Publisher: National Central University
    Abstract: With the development of artificial intelligence, interaction between humans and machines has become increasingly frequent; chatbots and home care systems are common human-computer interaction applications. Emotion recognition can improve this interaction, and emotion-aware robots can also be applied in medical settings, for example to identify patients' emotions. The objective of this work is to use deep learning to learn the emotional characteristics of speech signals and thereby recognize emotion.
    This work, "Speech Emotion Recognition Based on Joint Training by Self-Assessment Manikins and Emotion Labels", proposes combining emotional features that describe the degree of psychological state with emotion labels when training the neural network, in order to raise the speech emotion recognition rate. The system uses both a regression model and a classification model: the regression model predicts the degree of psychological state, and the classification model recognizes the emotion label. On a dataset that mixes scripted and improvised scenarios, the method reaches an accuracy of 64.70%; on a dataset with only improvised scenarios, it reaches 66.34%. Compared with recognition that does not incorporate psychological-state features, accuracy increases by 2.95% and 2.09%, respectively, so psychological-state features effectively help speech emotion recognition.
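    The abstract describes training with both a regression target (degree of psychological state) and a classification target (emotion label) but does not state how the two objectives are combined. The following is only a minimal sketch of one common way to do so, a weighted sum of a cross-entropy term and a mean-squared-error term. The function names, the `sam_weight` parameter, and the use of three Self-Assessment Manikin dimensions (valence, arousal, dominance) are assumptions made for illustration and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label_index):
    """Classification loss for a single utterance and its emotion label."""
    return -math.log(softmax(logits)[label_index])


def mean_squared_error(predicted, target):
    """Regression loss over the psychological-state dimensions."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def joint_loss(emotion_logits, emotion_label, sam_predicted, sam_target,
               sam_weight=1.0):
    """Hypothetical joint objective: classification loss plus weighted regression loss.

    sam_weight is an assumed hyperparameter; the source does not give one.
    """
    return (cross_entropy(emotion_logits, emotion_label)
            + sam_weight * mean_squared_error(sam_predicted, sam_target))


# Example call with made-up numbers: four emotion classes, three SAM dimensions.
example = joint_loss(
    emotion_logits=[1.2, 0.3, -0.5, 0.1],
    emotion_label=0,
    sam_predicted=[0.6, 0.4, 0.5],
    sam_target=[0.7, 0.5, 0.5],
    sam_weight=0.5,
)
```

    In a sketch like this, setting `sam_weight` to zero reduces the objective to plain emotion-label classification, which corresponds to the baseline without psychological-state features that the abstract compares against.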
    Appears in collections: [Graduate Institute of Communication Engineering] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     221


    All items in NCUIR are protected by the original copyright.
