Facial expression recognition has become an important research direction in computer vision and pattern recognition, playing a key role in applications such as human-computer interaction, affective computing, mental health assessment, and intelligent surveillance. Nevertheless, the task faces significant challenges, including inter-class similarity, intra-class variation, and class imbalance; these challenges are especially severe in unconstrained, in-the-wild environments, where they degrade the accuracy and stability of recognition systems. To address these problems, we propose a novel multi-scale cross-modal fusion approach that integrates image features with facial landmark information for expression recognition. In addition, we employ targeted data augmentation and a specialized loss function to mitigate class imbalance. Experimental results show that our approach achieves significant improvements on the widely used AffectNet and RAF-DB benchmark datasets, surpassing existing state-of-the-art facial expression recognition models.
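
The abstract does not specify the form of the "specialized loss function" used against class imbalance. As a minimal, hedged sketch of one common choice in this setting, the following PyTorch snippet implements a class-weighted focal loss; the class name ClassBalancedFocalLoss, the gamma value, and the example weights are illustrative assumptions, not the authors' actual formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ClassBalancedFocalLoss(nn.Module):
        # Hypothetical illustration: combines per-class re-weighting with the
        # focal term (1 - p_t)^gamma so that rare classes and hard examples
        # contribute more to the gradient. Not the paper's confirmed loss.
        def __init__(self, class_weights, gamma=2.0):
            super().__init__()
            # class_weights: tensor of shape (num_classes,), e.g. derived
            # from inverse class frequencies in the training set.
            self.register_buffer("class_weights", class_weights)
            self.gamma = gamma

        def forward(self, logits, targets):
            # logits: (batch, num_classes); targets: (batch,) integer labels.
            log_probs = F.log_softmax(logits, dim=-1)
            probs = log_probs.exp()
            # Probability and log-probability assigned to the true class.
            pt = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
            log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
            w = self.class_weights[targets]
            # Down-weight easy examples, up-weight under-represented classes.
            loss = -w * (1.0 - pt) ** self.gamma * log_pt
            return loss.mean()

    # Example usage with an 8-class expression label set (AffectNet-style);
    # the weight values below are placeholders.
    weights = torch.tensor([0.5, 1.0, 1.5, 2.0, 2.0, 3.0, 1.0, 2.5])
    criterion = ClassBalancedFocalLoss(weights)
    logits = torch.randn(4, 8)
    targets = torch.tensor([0, 3, 5, 7])
    print(criterion(logits, targets))

Re-weighting of this kind directly targets the class-imbalance issue the abstract highlights: majority expressions (e.g., neutral, happy) dominate in-the-wild datasets, so an unweighted cross-entropy objective tends to under-fit minority expressions such as fear or disgust.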