Abstract: | Reading and handwriting are among the most basic skills in human communication, and written characters play a central role in both. For automated handwriting recognition, Chinese characters differ greatly from the small class sets of English letters and digits, both in stroke complexity and in the number of commonly used characters. Chinese characters also vary in structure and stroke order. In in-air handwriting, every character contains both real strokes and virtual strokes. This differs from handwriting on paper or on a screen, where only the real strokes appear because the pen is lifted between them. When a user writes a Chinese character in the air, the path from the starting point to the end point is a single continuous stroke, so in-air handwriting has two defining properties: real and virtual strokes are connected, and the data form a time series.
Based on these properties, this paper proposes using the Long Short-Term Memory (LSTM) model, a member of the recurrent neural network (RNN) family, as the core recognition architecture. Deep learning requires a large amount of training data. Many institutions in mainland China have built Simplified Chinese datasets, but these do not match writing habits in Taiwan, and no comparable open dataset exists for Traditional Chinese. We therefore collected our own dataset of 492 Traditional Chinese characters, more than 20,000 samples in total. Preprocessing extracts the turning points of each stroke. Because the LSTM requires a fixed number of time steps, the strokes are divided into several fixed numbers of segments, and shape context, which summarizes the spatial distribution of points, supplies the input features to the recognition model. Experiments test accuracy and stability while varying the number of strokes in the characters and the dimensionality of the shape context. The proposed in-air Chinese handwriting recognition reaches an accuracy of 98.6%. |
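The feature pipeline summarized in the abstract (fixed-length sequence, then shape context per time step) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that the abstract does not state: plain arc-length resampling stands in for turning-point extraction, the sequence length of 32 is hypothetical, and the 5 radial by 12 angular log-polar bins follow the common shape-context formulation rather than the dimensions tested in the paper. The function names `resample` and `shape_context` are illustrative only.

```python
import numpy as np


def resample(points, n):
    """Resample a 2-D trajectory to n points evenly spaced along its arc length.

    Assumption: this replaces the paper's turning-point extraction, which the
    abstract does not describe in detail.
    """
    pts = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # segment lengths
    cum = np.concatenate(([0.0], np.cumsum(seg)))        # cumulative arc length
    targets = np.linspace(0.0, cum[-1], n)
    x = np.interp(targets, cum, pts[:, 0])
    y = np.interp(targets, cum, pts[:, 1])
    return np.stack([x, y], axis=1)


def shape_context(pts, n_r=5, n_theta=12):
    """Return one log-polar histogram per point, shape (len(pts), n_r * n_theta).

    Each histogram counts where the other points lie relative to the
    reference point. Bin counts are assumed defaults, not the paper's values.
    Requires at least two points.
    """
    pts = np.asarray(pts, dtype=float)
    n = len(pts)
    off_diag = ~np.eye(n, dtype=bool)

    diff = pts[None, :, :] - pts[:, None, :]             # diff[i, j] = pts[j] - pts[i]
    dist = np.linalg.norm(diff, axis=2)
    angle = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Normalise distances by their mean so the descriptor is scale invariant.
    mean_dist = max(dist[off_diag].mean(), 1e-12)
    dist = dist / mean_dist

    # Upper edges of the radial bins; anything beyond the last edge is dropped.
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r)
    r_bin = np.searchsorted(r_edges, dist)
    theta_bin = np.floor(angle / (2 * np.pi / n_theta)).astype(int) % n_theta

    hist = np.zeros((n, n_r, n_theta))
    valid = (r_bin < n_r) & off_diag
    i_idx, j_idx = np.nonzero(valid)
    np.add.at(hist, (i_idx, r_bin[i_idx, j_idx], theta_bin[i_idx, j_idx]), 1.0)
    return hist.reshape(n, n_r * n_theta)


# Usage: a toy trajectory becomes a (time steps, features) array.
trajectory = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
sequence = resample(trajectory, 32)
features = shape_context(sequence)
```

The resulting array has one row per time step, which is the layout a recurrent model such as an LSTM typically expects for a single sample. Changing `n_r` and `n_theta` changes the feature dimensionality, which corresponds loosely to the "different dimensions of shape context" varied in the experiments.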