Traditional Chinese Scene Text Recognition Strategies based on Deep Learning Networks


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/86585

Title:	Traditional Chinese Scene Text Recognition Strategies based on Deep Learning Networks (original title: 基於深度學習網路之繁體中文場景文字辨識策略)
Author:	許馨文 (Syu, Sin-Wun)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Deep Learning; Optical Character Recognition; Traditional Chinese Character Recognition; Scene Text Recognition; Text-line Correction
Date:	2021-07-30
Upload time:	2021-12-07 13:00:03 (UTC+8)
Publisher:	National Central University
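Abstract (translated from the Chinese):	Text recognition is an image recognition task that extracts text features from images, with many applications such as printed-document text recognition, handwriting recognition, and license plate recognition. Compared with text recognition on scanned documents, text in natural scenes is harder to recognize because of diverse fonts, viewing angles, lighting changes, and occlusion by obstacles. Research on Traditional Chinese scene text recognition remains scarce, mainly because Traditional Chinese characters are widely used only in Taiwan and because, compared with alphanumeric characters, the number of Chinese character classes is very large, which makes it difficult to collect enough street-view text images and makes image labeling very time-consuming. This research uses multiple font files to generate a synthetic dataset and designs several data augmentation methods for street-view text, including adjustments to text size, skew angle, background texture, and text outline borders. These methods are invoked in a strategically randomized manner during training so that the synthetic dataset simulates real street-view images, which not only improves the reliability of the data but also resolves class imbalance and possible labeling errors. This research proposes a Traditional Chinese character recognition strategy based on deep learning networks and designs a text-line correction mechanism: when a small number of characters in a text line are misrecognized, the correction method is applied to raise the overall recognition accuracy of the text line. Experimental results show that the approach effectively recognizes Traditional Chinese characters in natural scenes and achieves better accuracy than existing methods.

Abstract (author's English):	Text recognition is an important task for extracting information from imagery data. Scene text recognition is one of its challenging scenarios, since text appearing in natural scenes may have diverse fonts or sizes, be occluded by other objects, and be captured from varying angles or under different lighting conditions. In contrast to alphanumeric characters, Traditional Chinese Characters (TCC) receive less attention, and the large number of TCC makes it difficult to collect and label enough scene-text images. This research aims at developing a set of strategies for TCC recognition. We develop a synthetic dataset using a variety of data augmentation methods, including text deformations, added noise, and background changes, which appear often in natural scenes. A segmentation-based text spotting scheme is used to locate the areas of text lines and characters so that the characters can be recognized by the trained model and then linked into meaningful text lines. The text lines can be corrected via network search, which will further boost the model performance after re-training. The experimental results show that the proposed strategies work better in recognizing TCC in natural scenes when compared with existing publicly available tools.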
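
The text-line correction idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract states that correction is done via network search, whereas this sketch substitutes a small local lexicon and generic string similarity from Python's standard `difflib` module. The lexicon contents, the function name `correct_text_line`, and the `min_ratio` threshold are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical lexicon of known text lines (assumption for illustration only).
# The abstract says the thesis obtains candidates via network search instead.
LEXICON = ["國立中央大學", "資訊工程學系", "中央大學圖書館"]


def correct_text_line(recognized, lexicon=LEXICON, min_ratio=0.6):
    """Return the lexicon entry most similar to `recognized`.

    If no entry reaches `min_ratio` (an assumed threshold), the recognized
    string is returned unchanged, so unknown text is not overwritten.
    """
    best, best_ratio = recognized, 0.0
    for candidate in lexicon:
        # ratio() is 2 * matches / total length, a value in [0, 1].
        ratio = SequenceMatcher(None, recognized, candidate).ratio()
        if ratio > best_ratio:
            best, best_ratio = candidate, ratio
    return best if best_ratio >= min_ratio else recognized


if __name__ == "__main__":
    # One misrecognized character (夬 instead of 央) in an otherwise known line.
    print(correct_text_line("國立中夬大學"))
    # A line unrelated to the lexicon is left as recognized.
    print(correct_text_line("便利商店"))
```

The threshold keeps the correction conservative: a line is replaced only when most of its characters already agree with a known candidate, which matches the abstract's stated case of a few misrecognized characters within a text line.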
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	49

All items in NCUIR are protected by the original copyright.
