Chinese Ancient Calligraphy Word Detection Based on CRAFT Model

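Name

郭彥明 (Yan-Ming Guo)

Department

Department of Computer Science and Information Engineering, In-service Master's Program

Thesis Title

Chinese Ancient Calligraphy Word Detection Based on CRAFT Model
(original title: 使用CRAFT模型於古文書文字切割)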

File

Full text available for browsing in the system after 2027-7-21.

Abstract

Building full-text databases of ancient Chinese books is a key application of digital archives, and character segmentation and recognition are essential steps in that process. Optical Character Recognition (OCR) must first locate the coordinates of each character and then recognize it; when a character is segmented incorrectly, the recognition result cannot be correct. Accurate character segmentation therefore has a major effect on recognition accuracy. In this research, we design a character segmentation system for ancient Chinese texts. The system takes a color scanned image as input, applies the CRAFT text detection model to find the pixel region of each character, and crops each region out as a character block image.
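The last step of the abstract, going from detected text pixels to cropped character blocks, can be illustrated with a small sketch. This record does not describe the thesis's actual post-processing, so everything here is an assumption made for illustration: the detector output is taken to be a 2-D grid of region scores in [0, 1], characters are separated by thresholding followed by 4-connected flood fill, and the names `character_boxes`, `crop`, and `threshold` are hypothetical.

```python
from collections import deque


def character_boxes(region_score, threshold=0.5):
    """Return one (top, left, bottom, right) box, inclusive, per connected
    blob of pixels whose score is at least `threshold` (assumed value)."""
    height = len(region_score)
    width = len(region_score[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or region_score[y][x] < threshold:
                continue
            # Flood-fill one blob (4-connectivity) and track its extent.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                neighbours = ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1))
                for ny, nx in neighbours:
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and region_score[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the rows and columns covered by `box` out of a 2-D image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


if __name__ == "__main__":
    # Toy score map with two separate high-score blobs.
    score = [
        [0.0, 0.9, 0.8, 0.0, 0.0],
        [0.0, 0.7, 0.1, 0.0, 0.6],
        [0.0, 0.0, 0.0, 0.0, 0.9],
    ]
    for box in character_boxes(score):
        print(box)
```

In practice the score map would come from the trained detector and the crop would be taken from the scanned page at the matching resolution; a library routine for connected-component labeling would normally replace the hand-written flood fill.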

Keywords

★ Text segmentation
★ Ancient calligraphy
★ CRAFT
★ OCR

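Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2. Review of Methods
2.1 Optical Character Recognition (OCR)
2.1.1 The Tesseract Recognition Engine
2.2 CRAFT Text Detection
2.2.1 Neural Network Architecture
2.2.2 Heatmap Generation (Character Region Score and Affinity Score)
2.2.3 Weakly Supervised Learning
2.3 Transfer Learning
2.3.1 Small New Training Set, High Similarity to the Pre-training Set
2.3.2 Small New Training Set, Low Similarity to the Pre-training Set
2.3.3 Large New Training Set, High Similarity to the Pre-training Set
2.3.4 Large New Training Set, Low Similarity to the Pre-training Set
Chapter 3. System Architecture and Methods
3.1 MIAT System Design Methodology
3.2 Model Architecture of the Text Segmentation System
3.3 Model Architecture of the Transfer Learning Training System
3.3.1 Training Dataset Collection
3.3.2 Labeling ROI Character Boxes
3.3.3 Heatmap Generation
3.3.4 Model Training Experience
Chapter 4. Experimental Results and Analysis
4.1 Experimental Environment
4.2 Evaluation Methods
4.2.1 Detection Rate Evaluation Method
4.2.2 Counting TP/FP/FN/TN Pixels per Class
4.2.3 Computing Average Recall and Precision
4.2.4 Computing the F1 Score
4.3 Detection Performance Experiments
4.3.1 Text Detection Rate
4.3.2 IoU Evaluation of Text Detection
Chapter 5. Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References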
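
Sections 4.2.2 through 4.2.4 and 4.3.2 of the outline name pixel-level TP/FP/FN/TN counts, recall, precision, the F1 score, and IoU. The thesis's exact formulas are not reproduced in this record, so the sketch below assumes the standard definitions over two binary masks of equal size; the function names `pixel_confusion` and `scores` are hypothetical.

```python
def pixel_confusion(predicted, truth):
    """Count (TP, FP, FN, TN) pixels over two same-sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # text predicted, text present
            elif p:
                fp += 1      # text predicted, background present
            elif t:
                fn += 1      # text missed
            else:
                tn += 1      # background correctly left alone
    return tp, fp, fn, tn


def scores(tp, fp, fn):
    """Return (precision, recall, f1, iou) using the standard definitions.
    Each ratio falls back to 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, iou


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    tp, fp, fn, tn = pixel_confusion(predicted, truth)
    print(tp, fp, fn, tn)
    print(scores(tp, fp, fn))
```

Averaging recall and precision over a test set, as section 4.2.3 suggests, would apply these functions per image and take the mean; whether the thesis averages per image or pools pixels first is not stated here.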

Advisor

陳慶瀚 (Ching-Han Chen)

Approval Date

2022-7-21
