Abstract (English)
Historical maps are essential resources for understanding the geography, culture, and socio-political landscapes of the past. However, the manual interpretation of these maps presents significant challenges for researchers due to its labor-intensive and time-consuming nature. This difficulty is compounded by the lack of annotated datasets of geographical names for Chinese historical maps, making it even more challenging to extract and analyze the information contained within these historical maps.
Current methods often fall short in effectively handling the complexities of historical maps. Our experiments show that manual annotation can take 1–2 days per Chinese historical map, which not only hampers research productivity but also leads to errors stemming from fatigue and the irregular spacing of geographical names. Additionally, existing Optical Character Recognition (OCR) systems are typically optimized for contemporary texts and struggle with the unique characteristics of historical maps, such as handwritten annotations and grayscale imagery.
To address these shortcomings, this study introduces a five-stage automated process for extracting and recognizing geographical names from Chinese historical maps. The method encompasses character detection, character recognition, character reintegration, and character grouping into geographical names, leveraging OCR to enhance both accuracy and efficiency. To expand the training dataset for character detection and recognition, data augmentation techniques, specifically HSV (Hue, Saturation, Value) transformations, are employed. These augmentations improve the model's ability to handle the distinctive features of historical maps.
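The HSV augmentation mentioned above can be illustrated with a minimal sketch. This is not the study's implementation (which would typically operate on whole map images with a library such as OpenCV); it is a standard-library-only illustration of the idea, with the jitter parameters (`d_hue`, `sat_scale`, `val_scale`) chosen as hypothetical defaults:

```python
import colorsys

def hsv_augment(pixels, d_hue=0.05, sat_scale=1.2, val_scale=0.9):
    """Jitter hue, saturation, and value of RGB pixels (components in [0, 1]).

    Shifting colors this way synthesizes extra training variants of a map
    image, helping a detector cope with faded ink and varied scan tones.
    """
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        h = (h + d_hue) % 1.0                    # rotate hue on the color wheel
        s = min(1.0, max(0.0, s * sat_scale))    # scale saturation, clamp to [0, 1]
        v = min(1.0, max(0.0, v * val_scale))    # scale brightness, clamp to [0, 1]
        out.append(colorsys.hsv_to_rgb(h, s, v))
    return out
```

Applying several such parameter combinations to each training map multiplies the effective dataset size without new annotation effort.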
One major challenge tackled in this research is the irregular spacing of geographical names, which complicates automatic grouping. To resolve this, Delaunay triangulation is utilized to group spatially related characters into complete geographical names. We used topographic maps of Hebei, Liaoning, and Shanxi provinces from the 1930s as training datasets. In the overall system evaluation, using topographic maps of Hebei Province from the 1930s as a test dataset, our system achieved 70% accuracy in extracting correct geographical names while also detecting additional scattered character boxes.
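The Delaunay-based grouping can be sketched as follows. This is an assumed formulation, not the thesis's exact algorithm: it triangulates the centroids of detected character boxes, prunes triangulation edges longer than a threshold `max_edge` (a hypothetical parameter), and treats the remaining connected components as candidate geographical names:

```python
import numpy as np
from scipy.spatial import Delaunay

def group_characters(centroids, max_edge):
    """Group character-box centroids into candidate place names.

    Builds a Delaunay triangulation of the centroids, keeps only edges
    no longer than `max_edge`, and returns the connected components as
    lists of point indices.
    """
    pts = np.asarray(centroids, dtype=float)
    tri = Delaunay(pts)

    # Union-find over the surviving triangulation edges.
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for s in tri.simplices:
        for a, b in ((s[0], s[1]), (s[1], s[2]), (s[0], s[2])):
            if np.linalg.norm(pts[a] - pts[b]) <= max_edge:
                parent[find(a)] = find(b)

    groups = {}
    for i in range(len(pts)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Because Delaunay edges connect each point to its natural neighbors regardless of direction, this tolerates the irregular, non-linear spacing of characters on historical maps better than fixed-distance line scanning would.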
In contrast to manual annotation, our proposed Chinese Historical MapOCR system completes the extraction process in just 7–10 minutes, significantly reducing both time and labor costs. This substantial improvement in efficiency provides an invaluable tool for historians and scholars working with large collections of historical maps.