Character Spotting and Language Recognition for Multilingual Scene Texts based on Image Segmentation

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	林佳穎	zh_TW
DC.creator	Chia-Yin Lin	en_US
dc.date.accessioned	2022-09-19T07:39:07Z
dc.date.available	2022-09-19T07:39:07Z
dc.date.issued	2022
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=109522054
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	基於深度學習的自然場景文字分析相關研究在近年來十分盛行，文字區域偵測更是其中的重要環節。現今文字偵測大多以字串為標記單位，然而字串中可能包含不同語言的文字，標記時較不易確認該字串文字所屬語言。本研究提出以字元為單位的偵測方式，不僅能準確標記所屬語言，也讓辨識時能採用相對應語言模型以達到更好的效果。對於辨識模型而言，字串需要考量不規則的文字走向，且字串辨識模型通常需要較大量的訓練資料與訓練時間。反觀字元辨識則不太需要考慮文字走向，訓練模型相對簡單省時，且面對多語言自然場景文字時能更有彈性地根據語言特性，選擇適合的辨識單位與方法。本研究使用高解析度網路架構，以字元為偵測單位，標記字元區域並點出字元中心，且利用多個通道進行語言分類。由於真實資料集字元標記的缺乏，我們提出針對字元的弱監督式學習方法，使得網路在缺乏字元標記的情況下也能在偵測字元的表現有明顯的效果提升。在多語言分類上，不管是偵測後用個別分類器亦或是在偵測的同時進行語言辨識皆有一定的效果，驗證了字元辨識的可行性。我們實驗以拉丁文(英數字)、中文、日文、韓文為範例，分析本設計的可行性與合理性。	zh_TW
dc.description.abstract	In recent years, scene text analysis based on deep learning has drawn considerable research attention. Text detection in natural scenes is an important step of scene text analysis, and most existing text detection designs are based on string detection. However, a string may contain words of different languages, so it is not easy to accurately mark the language to which the string belongs. Scene text recognition using string-level annotations needs to account for irregular text orientations and requires a large amount of training data and training time. In contrast, character-based recognition methods do not need to consider orientation, which simplifies the training process. Multilingual natural scene text recognition may benefit from the flexibility of selecting suitable recognition models according to the characteristics of each language. In this research, we use a high-resolution network architecture to label word regions and point out the centers of characters, and we employ multiple channels for substring language classification. Because real datasets lack character-level annotations, we propose a weakly supervised learning approach for characters, which enables the network to improve character detection significantly. Multi-language recognition performance is verified both by using individual classifiers after detection and by performing language recognition during detection. The feasibility of the proposed design is verified by showing character detection results for different languages, including Latin, Chinese, Japanese, and Korean, as examples.	en_US
DC.subject	深度學習	zh_TW
DC.subject	街景文字定位	zh_TW
DC.subject	多語言文本辨識	zh_TW
DC.subject	弱監督式學習	zh_TW
DC.subject	Deep learning	en_US
DC.subject	Scene text spotting	en_US
DC.subject	semantic segmentation	en_US
DC.subject	weakly supervised learning	en_US
DC.title	基於影像分割之多語言場景文字字元偵測與語言辨識	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Character Spotting and Language Recognition for Multilingual Scene Texts based on Image Segmentation	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 109522054: full metadata record