Abstract: | Human-computer interaction has grown rapidly in recent years, and controlling devices is no longer limited to pressing buttons on a remote. As gesture recognition research has begun to bear fruit, more and more research groups are working on in-air handwriting recognition. Beyond English letters and Arabic numerals, which are the most widely used worldwide, Chinese characters, which are used by a very large population, are also gradually receiving attention. Unlike conventional handwriting input, in-air handwriting carries no pen-lift information: every character is written as one continuous stroke, so real strokes and virtual (connecting) strokes are interleaved and the character's structure becomes more complex. Compared with the Latin alphabet, Chinese characters also exhibit more than a hundred times as much variation. In addition, users do not all follow the same stroke order when writing Traditional Chinese characters, which directly affects the number, position, and direction of the virtual strokes. In this work, we use Kinect to capture images and obtain depth information, extract the hand's movement trajectory by analyzing the human skeleton, and use start and end gestures to delimit the strokes of each character. After the complete character is normalized to a fixed size, the trajectory is reduced in dimension, and features such as turning points, shape context, and eight-direction ratios are extracted. Finally, the recognition module combines dynamic time warping with a purpose-designed loss function to display the top five candidate characters. |
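The pipeline summarized above (size normalization, eight-direction ratio, and DTW-based ranking) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not specify the loss function, so plain DTW distance with Euclidean point cost stands in for it, and all function names, the 64-unit target size, and the template dictionary are hypothetical choices made for illustration.

```python
import math

def normalize(points, size=64.0):
    """Translate and uniformly scale a trajectory into a size x size box.
    The 64-unit default is an assumption, not a value from the abstract."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # guard against a zero-size box
    scale = size / span
    return [((x - min_x) * scale, (y - min_y) * scale) for x, y in points]

def eight_direction_ratio(points):
    """Fraction of trajectory segments falling in each of eight 45-degree bins.
    Bin 0 is centered on the +x direction; bins advance counter-clockwise."""
    counts = [0] * 8
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # skip repeated points, which have no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        idx = int((angle + math.pi / 8) // (math.pi / 4)) % 8
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * 8

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 2-D point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point cost (assumed)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def top_candidates(query, templates, k=5):
    """Rank template labels by DTW distance to the query and return the best k.
    Plain DTW distance is a stand-in for the unspecified loss function."""
    q = normalize(query)
    scored = sorted((dtw_distance(q, normalize(t)), label)
                    for label, t in templates.items())
    return [label for _, label in scored[:k]]
```

For example, with `templates = {"horizontal": [(0, 0), (1, 0), (2, 0)], "vertical": [(0, 0), (0, 1), (0, 2)]}`, calling `top_candidates([(0, 0), (5, 0), (10, 0)], templates, k=1)` ranks the horizontal template first, because normalization removes the difference in scale before DTW compares the shapes.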