Master's/Doctoral Thesis 111522030: Full Metadata Record

DC Field  Value  Language
DC.contributor  Department of Computer Science and Information Engineering  zh_TW
DC.creator  李孟潔  zh_TW
DC.creator  Meng-Chieh Lee  en_US
dc.date.accessioned  2024-8-7T07:39:07Z
dc.date.available  2024-8-7T07:39:07Z
dc.date.issued  2024
dc.identifier.uri  http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=111522030
dc.contributor.department  Department of Computer Science and Information Engineering  zh_TW
DC.description  National Central University  zh_TW
DC.description  National Central University  en_US
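dc.description.abstract  Text in natural scenes includes street signs, shop signs, notice boards, and product packaging; reliably detecting and recognizing such text enables a variety of promising applications. Scene text may appear in complex street views or on non-flat backgrounds and is easily affected by lighting changes, reflections, perspective distortion, or occluding objects, so accurately detecting and recognizing text in natural scene images is not easy. Common current approaches use deep learning models with word-level annotations to support subsequent word segmentation, text detection, and recognition, and they usually require more data and larger deep learning models to cope with the diversity of words. In addition, the frequent presence of text in different languages makes annotation and recognition more difficult. Considering model training cost and the need for multilingual text detection, this study proposes a text detection model that targets character gaps to help locate multilingual characters in natural scenes: character centers are determined from the character gaps, a nearest-neighbor algorithm is then used to draw the character box regions, and a lighter model can be used for character recognition within those regions. The challenge in detecting character gaps, however, is that most existing datasets are annotated at the word level. Lacking character or character-gap annotations, this study first generates a synthetic dataset that approximates natural scenes and contains both character bounding boxes and character-gap bounding boxes, and then adjusts the model through weakly supervised learning on real datasets with word-level annotations, so that through fine-tuning and iterative updating the model locates character gaps more accurately and thereby finds character positions. Experimental results show that, for text datasets containing multiple languages, the proposed character-gap detection method is feasible for locating character centers.  zh_TW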
dc.description.abstract  Scene text refers to text appearing on street signs, shop signs, notices, product packaging, and similar surfaces; detecting and recognizing it reliably benefits a variety of potential applications. Text in natural scenes may appear in complex street views or on uneven backgrounds, and its detection and recognition are easily affected by changes in lighting, reflections, angle distortions, or occlusions. Common research methods today adopt deep learning models trained on word-level labels to support subsequent word segmentation, text detection, and recognition. These methods usually require more data and larger deep learning models to handle the diversity of words. In addition, multilingual text appears quite often, and labeling it in a unified manner is not a trivial task. Considering the cost of model training and the need to detect multilingual text, this study proposes using character gaps, or spacings, as detection targets to assist in the segmentation of multilingual characters. Character gaps are detected to locate character centers, and a nearest-neighbor algorithm then draws character bounding boxes, so that a lighter model can be used for single-character recognition. The challenge in detecting character gaps is that most current datasets are labeled at the word level and lack labels for characters or character gaps. We therefore build a synthetic image dataset that mimics natural scenes and contains both character bounding boxes and character-gap boxes. Combined with weakly supervised learning on real datasets with word labels, this approach allows the model to be fine-tuned and iteratively updated to locate character gaps more accurately. Experimental results show that detecting character gaps to locate characters is feasible on multilingual datasets.  en_US
DC.subject  Deep learning  zh_TW
DC.subject  Semantic segmentation  zh_TW
DC.subject  Natural scene text localization  zh_TW
DC.subject  Multilingual text localization  zh_TW
DC.subject  Character recognition  zh_TW
DC.subject  Weakly supervised learning  zh_TW
DC.subject  Deep learning  en_US
DC.subject  semantic segmentation  en_US
DC.subject  scene text localization  en_US
DC.subject  multilingual text localization  en_US
DC.subject  character recognition  en_US
DC.subject  weakly supervised learning  en_US
DC.title  Natural Scene Text Segmentation and Recognition Based on Character Gap Detection  zh_TW
dc.language.iso  zh-TW  zh-TW
DC.title  Scene-Text Segmentation and Recognition via Character Spacing Detection  en_US
DC.type  Master's/doctoral thesis  zh_TW
DC.type  thesis  en_US
DC.publisher  National Central University  en_US
