基於漸進式修正網路與拼寫錯誤修正語言模型之場景文字辨識

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	彭明正	zh_TW
DC.creator	MING-CHENG PENG	en_US
dc.date.accessioned	2022-9-23T07:39:07Z
dc.date.available	2022-9-23T07:39:07Z
dc.date.issued	2022
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=109522143
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	場景文字辨識因為擁有廣大的應用領域而快速地成為了一個熱門的研究主題，不同於一般的文本辨識，複雜的背景、不規則方向、字元遮擋、影像模糊等等情況經常出現在場景文字之中，因此場景文字辨識必須要比一般的文本辨識更具備處理影像多樣化和影像品質下降的能力。近年來隨著深度學習技術的發展，已經有不少方法嘗試著解決場景文字辨識任務，然而對於人類來說，文字辨識這項任務不僅只從眼睛看到所判斷，同時還會考慮語意知識而給出更合理的辨識結果，為了使深度學習模型更接近於人類閱讀文字的過程，近年來越來越多方法開始轉往如何使模型學會更豐富的語義資訊，然而在現有文獻中，大部分都使用了英文資料集做研究，若直接將這些研究用在中文資料集上可能並不適合。有鑑於此，本論文提出了一個更適合中文文字辨識的深度學習模型，我們加入了語言模型並且使用額外的文本資料做拼寫錯誤修正預訓練，這樣能使我們的場景文字辨識模型架構具有更好的語意推理能力，此外我們還提出了漸進式修正網路，取代了現有文獻方法中最常使用的修正網路[1]，漸進式修正網路能夠使模型更好地處理不規則方向的字。在實驗中我們展現了本論文所提出的方法優於[1, 2]這兩種經典的場景文字辨識架構，這兩種架構也經常被其他文獻拿來比較，本論文的方法也優於[3, 4]這兩種近年所提出的方法，另外在消融實驗中我們還探討了模型中各個部分的有效性，我們相信本論文是一個更適合中文文字辨識任務的方法。	zh_TW
dc.description.abstract	Scene text recognition has quickly become a popular research topic because of its wide range of applications. Unlike general text recognition, scene text often involves complex backgrounds, irregular orientations, occluded characters, and blurred images, so a scene text recognizer must cope with greater image diversity and lower image quality. With the development of deep learning, many methods have attempted to solve the scene text recognition task. Humans, however, do not recognize text from visual appearance alone; they also draw on semantic knowledge to arrive at more plausible readings. To bring deep learning models closer to the way humans read, a growing number of recent methods focus on enabling the model to learn richer semantic information. Most of the existing literature, however, is based on English datasets, and applying these approaches directly to Chinese datasets may not be appropriate. This thesis therefore proposes a deep learning model better suited to Chinese scene text recognition. We add a language model and pre-train it for spelling error correction on additional text data, which gives our scene text recognition architecture better semantic reasoning ability. We also propose a progressive rectification network to replace the rectification network most commonly used in the existing literature [1]; it enables the model to better handle irregularly oriented text. In our experiments, the proposed method outperforms two classic scene text recognition architectures [1, 2], which are frequently used for comparison in other work, as well as two recently proposed methods [3, 4]. In the ablation study, we also examine the effectiveness of each component of the model. We believe the proposed method is better suited to the Chinese scene text recognition task.	en_US
DC.subject	深度學習	zh_TW
DC.subject	文字辨識	zh_TW
DC.subject	語言模型	zh_TW
DC.subject	拼寫錯誤修正	zh_TW
DC.subject	Deep Learning	en_US
DC.subject	Text Recognition	en_US
DC.subject	Language Model	en_US
DC.subject	Spelling Error Correction	en_US
DC.title	基於漸進式修正網路與拼寫錯誤修正語言模型之場景文字辨識	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Progressive Rectification Network and Spelling Error Correction Language Model Based Scene Text Recognition	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Full metadata record for thesis 109522143