View-Independent Hand Gesture Recognition using Hierarchical Temporal Memory

Name

林仕庭 (Shih-ting Lin)

Department

Department of Electrical Engineering

Thesis Title

View-Independent Hand Gesture Recognition using Hierarchical Temporal Memory
(使用分級時序記憶實作視角無關手勢辨識問題)

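This electronic thesis is authorized for immediate open access.
The openly accessible electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.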

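Abstract (translated from Chinese)

Gesture recognition is an important technique in human-computer interaction (HCI) applications, and view-independent recognition is a difficult problem in machine vision. Temporal information is one clue for giving a machine learning model the ability to recognize objects seen from different viewpoints. However, most machine learning models are inherently discriminative models: computational complexity makes it hard for them to exploit temporal information, and they are considered to lack both generalization over the training data and incremental learning, that is, the ability to use past experience to help learn new things.
Hierarchical Temporal Memory (HTM) is a machine learning model developed in recent years. It builds a non-discriminative model based on a hypothesis of how the human cerebral cortex computes, the memory-prediction framework. HTM uses temporal information to perform unsupervised learning, which gives the model the ability to generalize over training data and to learn incrementally while still producing reliable recognition results. This thesis combines computer vision algorithms with HTM to implement two hand gesture recognition problems, and obtains recognition accuracies of 91% and 84%, respectively, for single-frame (snapshot) recognition on continuous image sequences with changing viewpoint.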

Abstract (English)

Gesture recognition is important in designing efficient Human Computer Interaction (HCI) applications, and view-independent recognition is one of the difficult problems in computer vision gesture recognition. Temporal information is a clue that can give a machine learning model the ability to recognize objects from varying viewpoints. However, most machine learning models are discriminative models. They face computational complexity problems when using temporal information and are inherently inadequate at training data generalization and incremental learning.
Hierarchical Temporal Memory is a novel machine learning model that has been studied in recent years. Following the memory-prediction framework hypothesis of the brain's neocortex, Hierarchical Temporal Memory builds a non-discriminative model that uses temporal information for unsupervised learning. It aims to achieve training data generalization and incremental learning without losing recognition reliability. Combining computer vision image processing algorithms with the Hierarchical Temporal Memory machine learning model, a hand gesture recognition system was built in this thesis. Two recognition problems with continuous viewpoint change were tested; the snapshot recognition accuracies on continuous image sequences were 91% and 84%, respectively.
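
The abstract does not spell out how temporal information drives unsupervised learning, so the following is only a rough illustration of the general idea, not the thesis's algorithm. It counts first-order transitions between already-quantized input patterns and then groups patterns that frequently follow one another, using plain connected components over thresholded counts as a stand-in for a real temporal clustering step. The function names, the `min_count` threshold, and the pattern labels are all hypothetical.

```python
from collections import defaultdict


def count_transitions(sequence):
    """Count how often pattern a is immediately followed by pattern b."""
    counts = defaultdict(int)
    for prev, curr in zip(sequence, sequence[1:]):
        counts[(prev, curr)] += 1
    return dict(counts)


def temporal_groups(sequence, min_count=2):
    """Group patterns linked by frequent transitions.

    Simplification (assumption): connected components over transitions seen
    at least `min_count` times, implemented with a small union-find.
    """
    parent = {p: p for p in sequence}

    def find(p):
        # Follow parent links to the root, compressing the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), n in count_transitions(sequence).items():
        if a != b and n >= min_count:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in parent:
        groups[find(p)].add(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical pattern labels: two views of gesture A, then two views of gesture B.
frames = ["A1", "A2", "A1", "A2", "A1", "B1", "B2", "B1", "B2", "B1"]
groups = temporal_groups(frames)
```

In this toy sequence, views that alternate in time end up in the same group while the single A1 to B1 transition falls below the threshold, which is the intuition behind using temporal adjacency as a cue for viewpoint invariance.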

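Keywords (Chinese)

★ 分級時序記憶 (Hierarchical Temporal Memory)
★ 手勢辨識 (Hand Gesture Recognition)
★ 視角無關辨識 (View-Independent Recognition)
★ 機器學習 (Machine Learning)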

Keywords (English)

★ Hierarchical Temporal Memory
★ Hand Gesture Recognition
★ View-Independent Recognition
★ Machine Learning

Table of Contents

Chapter 1 Introduction
1.1 Research Background and Objectives
1.2 Review of Previous Hand Gesture Recognition Research
1.3 Thesis Organization
Chapter 2 Recognition Problems and Hierarchical Temporal Memory
Chapter 3 Introduction to the Hierarchical Temporal Memory Algorithm
3.1 Hierarchical Temporal Memory Network Architecture
3.2 Node Operations
3.2.1 Learning-Stage Operations
3.2.1.1 Pattern Memorization
3.2.1.2 Transition Probability Learning
3.2.1.3 Temporal Grouping
3.2.2 Inference-Stage Operations
3.3 Hierarchy Operations
3.3.1 Hierarchy Operation Flow
3.3.2 Bayesian Message Propagation
3.4 Generalization and Incremental Learning in Hierarchical Temporal Memory
3.5 Non-Discriminative Models
Chapter 4 Image Processing Algorithms
4.1 Foreground Extraction Using a Statistical Model
4.2 Connected-Component Labeling and Noise Removal
4.3 Skin Color Detection
4.4 Edge Detection
4.5 Palm Region Detection
4.6 Image Dimension Normalization
4.7 Gabor Filter
4.8 Hierarchical Temporal Memory Machine Learning Model
4.9 Support Vector Machine Classifier
Chapter 5 Hand Gesture Recognition
5.1 Recognition Problem Statement
5.2 Recognition Environment
5.3 Hierarchical Temporal Memory Training Procedure
5.4 Recognition Performance Verification
5.4.1 Recognition Error Analysis
5.4.2 Rock-Paper-Scissors Gesture Recognition
5.4.2.1 Scissors Gesture
5.4.2.2 Rock Gesture
5.4.2.3 Paper Gesture
5.4.3 Finger-Count Gesture Recognition
5.4.3.1 Finger Count One
5.4.3.2 Finger Count Two
5.4.3.3 Finger Count Three
5.4.3.4 Finger Count Four
5.4.3.5 Finger Count Five
5.5 Generative Model Characteristics of Hierarchical Temporal Memory
5.6 Statistical Recognition over Temporal Inference Outputs
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References

Advisors

陳竹一 (Jwu E Chen), 魏慶隆 (Chin-Long Wey)

Approval Date

2009-8-16
