Abstract (English) |
This study investigates a rapid recognition method that uses morphological operations and optical character recognition (OCR) to extract and record the text in the cells of a specific type of engineering chart image. The method targets that specific type of chart; to apply it to other types of engineering chart images, the parameters that encode the chart's rules can be modified accordingly.
The research is implemented in Python. In the preprocessing stage, Otsu thresholding binarizes the image, and morphological operations locate the cells in the engineering chart image. Text recognition and extraction use the Tesseract-OCR package in three stages: (1) automatic page segmentation with the pre-trained English model, (2) word segmentation with a retrained English model, and (3) character segmentation with a retrained English model. Finally, regular expressions combined with an exhaustive enumeration of known errors are used to correct errors and content that deviate from the chart's rules.
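The binarization step of the preprocessing stage can be illustrated with a minimal sketch of Otsu's method in plain Python. The study itself relies on OpenCV's thresholding; the function names `otsu_threshold` and `binarize` below are hypothetical and only illustrate the idea of choosing the gray level that maximizes the between-class variance.

```python
def otsu_threshold(pixels):
    """Return the 8-bit gray level that maximizes the between-class variance."""
    # Build the gray-level histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of gray levels at or below the candidate threshold
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 if it is brighter than the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Hypothetical flattened image: three dark pixels and three bright pixels.
sample = [10, 11, 12, 198, 200, 205]
threshold = otsu_threshold(sample)
binary = binarize(sample, threshold)
```

On a real chart image the same computation runs over every pixel of the grayscale image, and the resulting binary image is what the morphological operations would then process to find the cell boundaries.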
The experimental results indicate that although the pre-trained English model provided with the Tesseract-OCR package recognizes long strings very well, it tends to make errors on the words and individual characters found within cells. With the three-stage approach and the pre-trained English model, the recognition accuracy is only 14.65%. Retraining the English model on a dataset created from the specific engineering chart images improves recognition of words and individual characters within cells and raises the accuracy to 58.04%. In the post-processing stage, listing all errors and content that deviate from the specific engineering chart rules and replacing them with the correct characters raises the accuracy to 100%. |
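The rule-based post-processing described above can be sketched as follows. This is only an illustration under stated assumptions: the cell rule (two capital letters, a hyphen, four digits) and the two correction tables are hypothetical and do not come from the study, which derives its own rules and error list from the specific engineering charts.

```python
import re

# Hypothetical cell rule: two capital letters, a hyphen, four digits.
CELL_RULE = re.compile(r"[A-Z]{2}-\d{4}")

# Hypothetical enumerated misreadings and their corrections.
DIGIT_FIXES = {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}
LETTER_FIXES = {"0": "O", "1": "I", "5": "S", "8": "B"}


def correct_cell(text):
    """Return the corrected cell text, or None if it still violates the rule."""
    text = re.sub(r"\s+", "", text)  # drop stray whitespace left by OCR
    if CELL_RULE.fullmatch(text):
        return text                  # already valid, nothing to fix
    m = re.fullmatch(r"(..)-(....)", text)
    if m is None:
        return None                  # wrong shape, cannot be repaired here
    # Apply the letter table to the letter field and the digit table to the digit field.
    letters = "".join(LETTER_FIXES.get(c, c) for c in m.group(1))
    digits = "".join(DIGIT_FIXES.get(c, c) for c in m.group(2))
    fixed = f"{letters}-{digits}"
    return fixed if CELL_RULE.fullmatch(fixed) else None


results = [correct_cell(s) for s in ["AB-1234", "A8-I2O4", "AB- 12S4", "??-12"]]
```

Because the expected format of each field is known in advance, a character that is illegal in its position can be replaced unambiguously. Returning `None` for strings that still fail the rule keeps unrepaired cells visible for manual review instead of passing them through silently.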