In-air Handwriting Chinese Character Recognition Based on LSTM


Name

Hsin-Ying Wang (王昕盈)

Department

Department of Computer Science and Information Engineering

Thesis Title

In-air Handwriting Chinese Character Recognition Based on LSTM
(基於LSTM之中文空中手寫辨識)

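Related Theses

★ In-air Handwritten Chinese Character Recognition Based on RGB Images without Depth (基於RGB無深度影像之中文空中手寫辨識)
★ An Association Strategy for Multi-Object Tracking Based on Object Detection (基於物件偵測之多物件追蹤關聯策略)
★ A Lightweight Classification Network Combining Cross-Scale Self-Attention and Split Mixing Layers (結合跨尺度自注意力與分割混合層之輕量化分類網路)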


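The author has granted immediate open access to this electronic thesis.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.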

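Abstract (translated from the Chinese)

Reading and handwriting are among the most basic abilities in human communication, and written characters play a central role in both. For automatic handwriting recognition, Chinese characters differ greatly from English letters and digits, which form only a small number of classes, in both stroke complexity and the number of commonly used characters. Chinese characters also vary in structure and stroke order. In in-air handwriting, every character contains both real strokes and virtual strokes. This differs from handwriting on paper, which shows only the real strokes: the virtual strokes correspond to pen lifts and therefore never appear on paper or on a screen. When a user writes a Chinese character in the air, however, the path from the starting point to the end point is a single continuous stroke, so in-air handwriting has two defining characteristics: real and virtual strokes are connected, and the data form a time series.
Based on these characteristics, this thesis proposes using the Long Short-Term Memory (LSTM) model, a member of the recurrent neural network (RNN) family, as the core recognition architecture. Deep learning requires a large amount of training data. Many institutions in mainland China have built Simplified Chinese datasets, but these do not match the writing habits of users in Taiwan, and no open dataset of Traditional Chinese characters is available. This thesis therefore collected its own data: 492 Traditional Chinese characters, more than 20,000 samples in total. Preprocessing extracts the turning points of each stroke trajectory. Because the LSTM requires a fixed number of time steps, the trajectory is cut into several fixed numbers of segments, and Shape Context is used to compute spatial-distribution features that serve as the model input. Experiments that vary the number of stroke segments and the dimensionality of the shape context evaluate accuracy and stability, and the proposed in-air handwritten Chinese character recognition reaches an accuracy of 98.6%.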
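The fixed-time-step requirement described above can be illustrated with a small sketch. This is not the procedure used in the thesis, since the abstract does not give the segmentation details. It only shows one common way to bring a variable-length 2-D pen trajectory to a fixed number of points, by resampling at equal arc-length intervals. The function name `resample_trajectory` and the use of linear interpolation are assumptions made for illustration.

```python
import math


def resample_trajectory(points, n_steps):
    """Resample a polyline to n_steps points equally spaced along its arc length.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if n_steps < 2 or len(points) < 2:
        raise ValueError("need at least two input points and two output steps")

    # Cumulative arc length at every input point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        # Degenerate trajectory: all points coincide.
        return [points[0]] * n_steps

    resampled = []
    seg = 0  # index of the segment that contains the current target length
    for i in range(n_steps):
        target = total * i / (n_steps - 1)
        # Advance to the segment whose end lies at or beyond the target.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0.0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        # Linear interpolation inside the segment.
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled
```

For example, `resample_trajectory([(0, 0), (4, 0), (4, 4)], 5)` places five points two units apart along the L-shaped path. Every resampled trajectory then has the same length regardless of how many raw points the sensor recorded.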

Abstract (English)

In human communication, reading and writing are the most basic skills, and characters are an important part of both. In automatic handwriting recognition, Chinese characters are more complex and far more numerous than English letters and numerals. Chinese characters also differ in structure and stroke order. In the in-air handwriting scenario, each Chinese character contains both real strokes and virtual strokes, so its appearance differs from handwriting on paper. Handwriting on paper shows only the real strokes; the virtual strokes do not appear on paper or on a screen because the pen is lifted. When users write in the air, however, the whole character is written as one continuous stroke, which gives in-air handwriting two characteristics: connected real and virtual strokes, and a time-series nature.
Based on these characteristics, this thesis proposes using the Long Short-Term Memory (LSTM) model as the core recognition model. Deep learning requires a large amount of training data. Although many institutions in China have built Simplified Chinese datasets, these do not fit Taiwanese writing habits. We therefore collected 492 Traditional Chinese characters, more than 20,000 samples in total. Preprocessing extracts the turning points of the strokes. To satisfy the fixed number of time steps required by the LSTM, the strokes are cut into several fixed numbers of segments, and shape context features, which describe the statistical spatial distribution of points, are used as the input to the recognition model. Accuracy and stability are tested by varying the number of strokes and the dimensionality of the shape context. In the experiments, the recognition accuracy for Chinese characters reaches 98.6%.
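As a rough illustration of the shape context feature mentioned above, the sketch below builds a log-polar histogram of the positions of all other points relative to one reference point, in the spirit of the shape context descriptor. It is a simplified sketch under stated assumptions, not the thesis's implementation: the function name `shape_context`, the bin counts, the radius limits `r_min` and `r_max`, the normalisation by the mean distance from the reference point, and the clamping of far or near points into the outermost or innermost ring are all choices made here for illustration.

```python
import math


def shape_context(points, index, n_radial=3, n_angular=8, r_min=0.125, r_max=2.0):
    """Log-polar histogram of the other points as seen from points[index].

    Returns an n_radial x n_angular list of counts.
    Hypothetical helper for illustration; not taken from the thesis.
    """
    px, py = points[index]
    offsets = [(x - px, y - py) for i, (x, y) in enumerate(points) if i != index]
    # Points that coincide with the reference point have no direction; skip them.
    offsets = [(dx, dy) for dx, dy in offsets if dx != 0 or dy != 0]

    hist = [[0] * n_angular for _ in range(n_radial)]
    if not offsets:
        return hist

    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    # Assumption: normalise by the mean distance for scale invariance.
    mean_dist = sum(dists) / len(dists)
    log_span = math.log(r_max / r_min)

    for (dx, dy), d in zip(offsets, dists):
        # Clamp the normalised radius into [r_min, r_max] (an assumption).
        r = min(max(d / mean_dist, r_min), r_max)
        # Radial bins are equally wide in log(r).
        r_bin = min(int(n_radial * math.log(r / r_min) / log_span), n_radial - 1)
        # Angular bins are equally wide over [0, 2*pi).
        theta = math.atan2(dy, dx) % (2.0 * math.pi)
        a_bin = int(theta / (2.0 * math.pi) * n_angular) % n_angular
        hist[r_bin][a_bin] += 1
    return hist
```

Flattening the histogram gives a vector of `n_radial * n_angular` values for each reference point. A sequence of such vectors, one per time step, is the kind of fixed-dimension input a recurrent model such as an LSTM can consume.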

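Keywords (translated from the Chinese)

★ In-air handwritten Chinese characters
★ Long short-term memory network
★ Character recognition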

Keywords (English)

★ In-air Handwriting Chinese Character
★ Long Short-Term Memory
★ Character Recognition

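Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Handwriting Recognition
2.2 Human-Computer Interaction (HCI)
2.3 Shape Context
2.4 Recurrent Neural Network (RNN)
2.5 Long Short-Term Memory (LSTM)
2.6 Activation Functions
2.7 Batch Normalization
2.8 Dropout
Chapter 3 Chinese Handwriting Recognition System
3.1 System Architecture
3.2 Data Collection
3.3 Preprocessing
3.3.1 Removal of Overlapping Points and Corner Detection
3.3.2 Trajectory Normalization and Stroke Segmentation
3.3.3 Shape Context
3.4 Recognition Model
Chapter 4 Experimental Results and Discussion
4.1 Experimental Environment
4.2 Training and Test Datasets
4.3 Experimental Design and Result Analysis
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References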


Advisors

Kuo-Chin Fan (范國清), Jun-Wei Hsieh (謝君偉)

Approval Date

2019-7-24
