Suitable Data Input for Deep-Learning-Based Sign Language Recognition with a Small Training Dataset (適用於少量訓練資料之深度學習手語辨識輸入組合)


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/90023

Title:	Suitable Data Input for Deep-Learning-Based Sign Language Recognition with a Small Training Dataset (適用於少量訓練資料之深度學習手語辨識輸入組合)
Author:	Chen, Yu-Jen (陳昱任)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Sign Language Recognition; Feature Extraction; Deep Learning
Date:	2022-09-19
Upload time:	2022-10-04 12:08:10 (UTC+8)
Publisher:	National Central University
Abstract:	Deep-learning-based sign language recognition usually requires a large number of videos to train a neural network model. This study considers the case in which sign language videos are scarce and uses feature extraction and training-data augmentation to generate effective training data for building a deep-learning recognition model. We use MediaPipe to extract hand skeletons from sign language videos, analyze several skeleton adjustment strategies and color arrangements, and generate hand masks from the skeletons to simulate the hand shapes of different signers. Because hand detection sometimes fails under the motion blur caused by rapid finger movement, we also incorporate optical flow so that hand-movement information is retained in every frame. The hand skeleton, hand shape, and frame optical flow serve as the three input channels of a 3D-ResNet model, and different spatial transformations and temporal sampling strategies are applied to simulate different hand sizes, filming angles, and signing speeds. Experimental results show that the proposed approach effectively improves recognition accuracy on an American Sign Language dataset. Index Terms - Sign Language Recognition, Feature Extraction, Deep Learning
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations
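
The input arrangement described in the abstract (three feature maps stacked as channels, with temporal sampling to vary signing speed) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names `sample_frame_indices` and `build_clip`, the `(T, H, W)` array layout, the floor-based index rounding, and the use of a single-channel optical-flow map are all assumptions made for illustration. Extracting the skeleton, mask, and flow maps themselves is outside the scope of the sketch.

```python
import numpy as np


def sample_frame_indices(num_frames, clip_len, speed=1.0, start=0):
    """Return clip_len frame indices taken with a temporal stride of `speed`.

    speed > 1 skips frames (faster signing), speed < 1 repeats frames
    (slower signing). Indices past the end are clamped to the last frame.
    Hypothetical helper; the thesis does not specify its sampling rule.
    """
    positions = start + np.arange(clip_len) * speed
    indices = np.floor(positions).astype(int)
    return np.clip(indices, 0, num_frames - 1)


def build_clip(skeleton, mask, flow, indices):
    """Stack three single-channel videos into one (3, clip_len, H, W) clip.

    Each input is assumed to be an array of shape (T, H, W):
    a rendered hand skeleton, a hand mask, and an optical-flow map.
    """
    if not (skeleton.shape == mask.shape == flow.shape):
        raise ValueError("skeleton, mask and flow must share the same shape")
    channels = [skeleton[indices], mask[indices], flow[indices]]
    return np.stack(channels, axis=0).astype(np.float32)


if __name__ == "__main__":
    T, H, W = 10, 4, 4
    skeleton = np.zeros((T, H, W))  # placeholder skeleton frames
    mask = np.ones((T, H, W))       # placeholder hand masks
    flow = np.arange(T)[:, None, None] * np.ones((T, H, W))  # frame id as value

    fast = sample_frame_indices(T, clip_len=4, speed=2.0)
    clip = build_clip(skeleton, mask, flow, fast)
    print(fast, clip.shape)
```

Sampling the same video with several `speed` values yields clips of identical length that cover different time spans, which is one simple way to approximate different signing speeds from a small set of recordings.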

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	174

All items in NCUIR are protected by original copyright.
