Locating Satisfaction in Vocal Dialogue (於對話中定位特定發音之研究 – 以滿意為例)


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92242

Title:	Locating Satisfaction in Vocal Dialogue (於對話中定位特定發音之研究 – 以滿意為例)
Author:	Su, Hsiao-Ling (蘇筱凌)
Contributor:	Department of Business Administration
Keywords:	Keyword search; Customer satisfaction; Mel-frequency cepstral coefficients; Cross-attention mechanism; Speech recognition
Date:	2023-07-13
Uploaded:	2024-09-19 15:26:38 (UTC+8)
Publisher:	National Central University
Abstract:	For businesses, understanding customer satisfaction with products or services is crucial to improving repurchase rates and willingness to recommend. An efficient speech recognition method that can accurately analyze customer service recordings is therefore urgently needed. However, locating where customers express satisfaction within a long speech signal is a challenging task. This research combines keyword search with a cross-attention mechanism to locate specific utterances effectively. It uses a dataset of specific pronunciations recorded from different speakers together with a dataset of industry telephone interview recordings; by analyzing and cross-matching these data, the goal is to find the positions in long speech signals where positive or negative satisfaction is expressed. The data first undergo preprocessing and acoustic feature extraction. The processed features are then fed into a cross-attention model, which computes attention scores between the two sets of feature vectors and locates the satisfaction utterance at the position with the highest attention score. The experimental results show that the number of filter banks and the shift stride are important factors affecting the hit ratio. Among the parameter settings tested, 30 filter banks with a shift stride of 10 performed best, reaching an HR@5 of 95.08%, an HR@3 of 84.15%, and an HR@1 of 60.11%.
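The abstract does not give the model's details, so the following is only a minimal sketch of the general idea: score sliding windows of a long recording against a short query clip with scaled dot-product attention logits, rank the windows, and compute a hit ratio at k. Everything here is an assumption made for illustration: the function names, the max-then-mean aggregation of scores, the random arrays standing in for MFCC features, and the usual definition of HR@k as the share of queries whose true position appears among the top k candidates. The feature dimension of 30 and stride of 10 merely echo the numbers in the abstract; their units and exact role in the thesis are not stated.

```python
import numpy as np

def window_scores(query, dialogue, window, stride):
    """Score each sliding window of `dialogue` against `query`.

    query:    (Tq, d) features of the short keyword clip
    dialogue: (T, d)  features of the long recording
    Returns (starts, scores), one score per window start frame.
    """
    d = query.shape[1]
    # Scaled dot-product logits between every query frame and dialogue frame.
    logits = query @ dialogue.T / np.sqrt(d)          # shape (Tq, T)
    starts = np.arange(0, dialogue.shape[0] - window + 1, stride)
    # Assumed aggregation: best-matching frame per query frame, then the mean.
    scores = np.array(
        [logits[:, s:s + window].max(axis=1).mean() for s in starts]
    )
    return starts, scores

def top_k_starts(starts, scores, k):
    """Return the k window start frames with the highest scores."""
    order = np.argsort(scores)[::-1][:k]
    return starts[order]

def hit_ratio_at_k(ranked_per_query, true_starts, k, tolerance=0):
    """Fraction of queries whose true start is within `tolerance` frames
    of one of the top-k ranked candidates."""
    hits = 0
    for ranked, truth in zip(ranked_per_query, true_starts):
        if any(abs(int(s) - truth) <= tolerance for s in ranked[:k]):
            hits += 1
    return hits / len(true_starts)

# Synthetic stand-in for real features (hypothetical data, not from the thesis).
rng = np.random.default_rng(0)
d = 30                                                # illustrative feature size
dialogue = rng.normal(size=(500, d))
# The query is a noisy copy of frames 200..239 of the dialogue.
query = dialogue[200:240] + 0.1 * rng.normal(size=(40, d))

starts, scores = window_scores(query, dialogue, window=40, stride=10)
ranked = top_k_starts(starts, scores, k=5)
print("top-5 candidate start frames:", ranked)
print("HR@5 on this single query:", hit_ratio_at_k([ranked], [200], k=5))
```

With real data, the random arrays would be replaced by features extracted from the keyword clip and the call recording, and the hit ratio would be averaged over many labelled queries.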
Appears in Collections:	[Graduate Institute of Business Administration] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	31	View/Open

All items in NCUIR are protected by the original copyright.
