NCU Institutional Repository (中大機構典藏) — theses, past exam papers, journal articles, and research projects: Item 987654321/72287


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/72287


    Title: Action Recognition Based on a Long Short-Term Memory Deep Learning Method (基於長短期記憶深層學習方法之動作辨識)
    Author: Chiang, Chin-Chin (江金晉)
    Contributors: Department of Computer Science and Information Engineering (資訊工程學系)
    Keywords: Action recognition; Long short-term memory; Deep learning; Attention model; Convolutional neural network; Neural network
    Date: 2016-08-29
    Date uploaded: 2016-10-13 14:37:16 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: As quality of life and convenience keep improving, many functions and applications depend on the technology developed behind them. From images to video, and from poses to actions, advances in algorithms and hardware keep raising the level of functionality and performance we need.
    Building on a long short-term memory (LSTM) deep learning architecture, we propose an optical flow attention model that recognizes actions in video with the help of optical flow images. In the proposed architecture, each video is split into frames, features are extracted from every frame with a convolutional neural network (CNN), and the frame features are fed in temporal order into the optical flow attention model. The attention model is built around an LSTM; its distinguishing step is that each input feature is first weighted by a processed optical flow attention map, which raises the weight of the important parts of the feature. The re-weighted feature is then passed into the LSTM, which produces the recognition result for that time step.
    This thesis uses the optical flow image as a weight map to dynamically track the important regions of each frame and increase the weight carried by the important features. In the action recognition experiments, the proposed optical flow attention model is about 3.6% more accurate than an LSTM-only baseline and about 2.4% more accurate than the reference visual attention model. Combined with visual attention, the overall architecture is about 4.5% more accurate than LSTM alone and about 3.3% more accurate than the visual attention model alone. The results show that using the optical flow image as a weight map effectively captures the discriminative regions of an action in video and complements visual attention to yield better recognition.
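    The abstract describes the architecture only at a high level. The PyTorch sketch below illustrates one plausible reading of the optical-flow-attention step: CNN feature maps are re-weighted by a flow-derived spatial attention map, pooled, and fed to an LSTM that emits a recognition result at every time step. All module names, tensor shapes, layer sizes, and the softmax normalization of the flow map are assumptions made for illustration, not details taken from the thesis.

    ```python
    # Minimal sketch of an optical-flow-attention LSTM, assuming CNN feature
    # maps of shape (T, C, H, W) and flow magnitude maps resized to (T, H, W).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class OpticalFlowAttentionLSTM(nn.Module):
        def __init__(self, feat_channels=512, hidden_size=256, num_classes=101):
            super().__init__()
            # One LSTM cell stepped over the frame sequence.
            self.lstm_cell = nn.LSTMCell(feat_channels, hidden_size)
            self.classifier = nn.Linear(hidden_size, num_classes)

        def forward(self, frame_feats, flow_maps):
            """frame_feats: (T, C, H, W); flow_maps: (T, H, W).
            Returns per-time-step class scores of shape (T, num_classes)."""
            T, C, H, W = frame_feats.shape
            h = frame_feats.new_zeros(1, self.lstm_cell.hidden_size)
            c = frame_feats.new_zeros(1, self.lstm_cell.hidden_size)
            outputs = []
            for t in range(T):
                # Turn the flow map into spatial attention weights
                # (assumption: softmax over all H*W locations).
                attn = F.softmax(flow_maps[t].reshape(-1), dim=0).reshape(1, H, W)
                # Re-weight the feature map so high-motion regions dominate,
                # then pool over space to get one feature vector per frame.
                weighted = frame_feats[t] * attn          # (C, H, W)
                feat_vec = weighted.sum(dim=(1, 2))       # (C,)
                h, c = self.lstm_cell(feat_vec.unsqueeze(0), (h, c))
                outputs.append(self.classifier(h))        # result at step t
            return torch.cat(outputs, dim=0)
    ```

    The combination with visual attention reported in the abstract could, under the same assumptions, be obtained by multiplying (or averaging) a learned visual attention map with the flow-derived map before the re-weighting step.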
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses

    Files in This Item:

    File          Description    Size    Format    Views
    index.html                   0Kb     HTML      491


    All items in NCUIR are protected by original copyright.
