NCU Institutional Repository (中大機構典藏): Item 987654321/92242


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92242


    Title: 於對話中定位特定發音之研究 – 以滿意為例; Locating Satisfaction in Vocal Dialogue
    Author: 蘇筱凌; Su, Hsiao-Ling
    Contributor: 企業管理學系 (Department of Business Administration)
    Keywords: Keyword search; Customer satisfaction; Mel-frequency cepstral coefficients; Cross-attention mechanism; Speech recognition
    Date: 2023-07-13
    Upload time: 2024-09-19 15:26:38 (UTC+8)
    Publisher: 國立中央大學 (National Central University)
    Abstract: In business, understanding customer satisfaction with products or services is crucial for improving repurchase rates and willingness to recommend. An efficient speech recognition method that can accurately analyze customer service audio is therefore urgently needed. However, locating the positions in a long speech signal where customers voice satisfaction is a challenging task.
    This research combines keyword search with cross-attention to locate specific sounds effectively. It uses a specific-pronunciation dataset containing voices from different speakers, together with an industry telephone-interview audio dataset. By analyzing and cross-matching these recordings, the goal is to find the positions in a long speech signal where positive or negative satisfaction is voiced. The data are first preprocessed and acoustic features are extracted. The processed data are then fed into a cross-attention model, which computes attention scores between the two different feature vectors and locates the satisfaction utterance at the position with the highest attention score.
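The matching step described above can be illustrated with a minimal sketch. It assumes the features (for example MFCC frames) are already extracted as lists of floats, uses plain scaled dot-product attention with no learned projections, and ranks fixed-length windows of the long signal by the attention mass they receive. The function names, the windowing scheme, and the reading of "shift stride" as the step between windows are assumptions for illustration, not the thesis's implementation.

```python
import math

def attention_weights(query, keys):
    """Scaled dot-product attention: one softmax row per query frame."""
    d = len(query[0])
    rows = []
    for q in query:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def rank_windows(query, keys, window, stride, top_k):
    """Return start frames of the top_k windows with the most attention mass."""
    rows = attention_weights(query, keys)
    # Attention mass each long-signal frame receives, summed over query frames.
    mass = [sum(row[j] for row in rows) for j in range(len(keys))]
    starts = range(0, len(keys) - window + 1, stride)
    scored = [(sum(mass[s:s + window]), s) for s in starts]
    # Highest score first; ties are broken by the earlier start frame.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [s for _, s in scored[:top_k]]

# Toy data (hypothetical): 8 frames of 2-dimensional features, where
# frames 4 and 5 resemble the 2-frame keyword query.
long_frames = [[0.0, 0.0] for _ in range(8)]
long_frames[4] = [3.0, 3.0]
long_frames[5] = [3.0, 3.0]
query_frames = [[3.0, 3.0], [3.0, 3.0]]

best = rank_windows(query_frames, long_frames, window=2, stride=2, top_k=1)
print(best)
```

In this toy case the top-ranked window starts at frame 4, where the long signal matches the query. Returning several candidates instead of one (a larger `top_k`) is what makes a top-k evaluation possible.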
    The experimental results show that the number of filter banks and the shift stride are important factors affecting the hit ratio. Among the parameter settings tested, 30 filter banks with a shift stride of 10 performed best, with HR@5 reaching 95.08%, HR@3 reaching 84.15%, and HR@1 reaching 60.11%.
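For readers unfamiliar with the metric, the sketch below shows one common way to compute a hit ratio at k. It assumes that a test case counts as a hit when the ground-truth position appears among the top k ranked candidates. The abstract does not state the exact matching criterion, so the exact-match rule and the toy data here are assumptions.

```python
def hit_ratio_at_k(ranked_candidates, targets, k):
    """Fraction of cases whose target appears in the top-k candidates."""
    hits = sum(
        1
        for ranked, target in zip(ranked_candidates, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)

# Toy data (hypothetical): ranked candidate start frames for four test
# recordings, and the true start frame for each recording.
ranked = [[4, 0, 2], [2, 4, 6], [0, 6, 2], [6, 2, 0]]
truth = [4, 4, 2, 4]

hr1 = hit_ratio_at_k(ranked, truth, 1)  # only the first case hits: 0.25
hr2 = hit_ratio_at_k(ranked, truth, 2)  # first and second cases hit: 0.5
hr3 = hit_ratio_at_k(ranked, truth, 3)  # first three cases hit: 0.75
print(hr1, hr2, hr3)
```

Under this definition the metric cannot decrease as k grows, which is consistent with the reported ordering HR@1 < HR@3 < HR@5.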
    Appears in collections: [企業管理研究所 (Graduate Institute of Business Administration)] Theses and Dissertations

    Files in this item:

    File          Description   Size   Format   Views
    index.html                  0Kb    HTML     31


    All items in NCUIR are protected by original copyright.
