Character Segmentation in Scene-Text Images Based on Weakly Supervised Learning

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	陳莉筑	zh_TW
DC.creator	Li-Zhu Chen	en_US
dc.date.accessioned	2023-07-25T07:39:07Z
dc.date.available	2023-07-25T07:39:07Z
dc.date.issued	2023
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110522083
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	近年來基於深度學習於自然場景文字檢測的相關研究盛行，普遍以偵測字詞(word)為主要目標，並取得不錯的效果。然而，文字字體型態多變，且待測影像背景趨於複雜，文字可能受到遮蔽物阻擋，特別是當自然場景文字走向多元時，準確的字詞偵測並不容易達成，也影響下一階段文字辨識的準確度。本研究提出像素級字元(character)偵測網路，透過偵測字元的方式嘗試解決不規則走向字詞不易偵測的問題。字元偵測能讓偵測框更緊貼文字邊緣，降低複雜背景對於偵測網路所造成的影響，後續的文字辨識或可使用較輕量的辨識網路，減少訓練所需的資源與時間。字元偵測的主要挑戰在於現有自然場景文字檢測資料集皆採用字詞標記，因為針對字元的人工標記相當耗時費力。我們藉由生成大量貼近真實場景的合成資料來解決訓練集缺少字元標記的問題，並結合弱監督式學習在含有字詞標記的真實影像進行模型訓練。對於這些沒有字元標記的真實資料，我們以迭代更新結果的方式使網路自動學習偵測更可靠的字元位置，提升模型表現。另外，因應缺少字元標記的測試資料，我們提出新的字元偵測評估方式。實驗結果顯示我們的方法在ICDAR2017、TotalText和CTW-1500資料集上皆優於其他字元偵測模型，我們也將同樣的方式運用於訓練中文字元偵測以驗證所提出方法在其他語言內容的可行性。	zh_TW
dc.description.abstract	In recent years, deep learning-based research on natural scene-text detection has flourished. Most work targets word-level detection and has achieved promising results. However, text fonts vary widely, the backgrounds of test images tend to be complex, and text may be partially occluded. Particularly when scene text appears in diverse orientations, accurate word-level detection is difficult to achieve, which in turn degrades the accuracy of the subsequent text recognition stage. To address the difficulty of detecting irregularly oriented words, this study proposes a pixel-level character detection network. Detecting individual characters lets the detection boxes adhere more closely to the text boundaries, reducing the influence of complex backgrounds on the detection network. Lighter-weight recognition networks may then be used for subsequent text recognition, reducing the resources and time required for training. The main challenge in character detection is that existing natural scene-text detection datasets provide only word-level annotations, because character-level annotation is laborious and time-consuming. To compensate for the missing character annotations, we generate a large volume of synthetic data that closely resembles real-world scenes and combine it with weakly supervised learning on real images that carry word-level annotations. For these real images without character-level annotations, we iteratively update the results so that the network automatically learns to detect more reliable character positions, improving model performance. In addition, because character-annotated test data are lacking, we propose a new evaluation method for character detection. Experimental results show that our method outperforms other character detection models on the ICDAR2017, TotalText, and CTW-1500 datasets. We also apply the same approach to train Chinese character detection, validating the feasibility of the proposed method for other languages.	en_US
DC.subject	深度學習	zh_TW
DC.subject	語意分割	zh_TW
DC.subject	任意走向文字定位	zh_TW
DC.subject	弱監督式學習	zh_TW
DC.subject	Deep learning	en_US
DC.subject	semantic segmentation	en_US
DC.subject	arbitrarily oriented text localization	en_US
DC.subject	weakly supervised learning	en_US
DC.title	基於弱監督式學習之自然場景文字字元分割	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Character Segmentation in Scene-Text Images Based on Weakly Supervised Learning	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 110522083: full metadata record