Thesis Record 105522014 — Detailed Information




Author: Po-Wei Hsieh (謝柏維)    Department: Computer Science and Information Engineering
Title: Chinese Character Segmentation via Fully Convolutional Neural Network (基於全卷積神經網路之中文字分割機制)
Related Theses
★ Implementation of a Qt-Based Cross-Platform Wireless Heart-Rate Analysis System
★ A Side-Information Transmission Mechanism for VoIP
★ Detection of Transition Effects Associated with Highlights in Sports Videos
★ Vector-Quantization-Based Video/Image Content Authentication
★ A Baseball Highlight Extraction System Based on Transition-Effect Detection and Content Analysis
★ Image and Video Content Authentication Based on Visual Feature Extraction
★ Foreground Detection and Tracking in Moving Surveillance Video Using Dynamic Background Compensation
★ Adaptive Digital Watermarking for H.264/AVC Video Content Authentication
★ A Baseball Highlight Extraction and Classification System
★ A Real-Time Multi-Camera Tracking System Using H.264/AVC Features
★ Preceding Vehicle Detection on Highways Using Implicit Shape Models
★ Video Copy Detection Based on Temporal and Spatial Feature Extraction
★ Vehicular Video Coding Combining Digital Watermarking and Region-of-Interest Bit-Rate Control
★ H.264/AVC Video Encryption/Decryption and Digital Watermarking for Digital Rights Management
★ A News Video Analysis System Based on Text and Anchorperson Detection
★ Digital-Watermarking-Based H.264/AVC Video Content Verification
Full text: available in the system after 2022-9-1
Abstract (Chinese): Text and man-made symbols in natural scenes convey important information, so extracting text from images has many potential applications. However, most current methods are built around processing alphabetic scripts, and there is still room for improvement for logographic scripts such as Chinese. This study takes the individual Chinese character as the unit of annotation and proposes a Chinese text detection mechanism for natural scenes that incorporates semantic segmentation. The proposed method consists of two stages. In the first stage, a Fully Convolutional Network (FCN) is trained as a Chinese text detection model for natural scenes; in addition to a real-scene training set, synthetic data are added during training to compensate for gaps in the dataset and strengthen the model's detection ability. The second stage separates the text regions and groups the character boxes by their regional distribution relationships, so that the combined text strings remain valid across different writing directions and layouts, increasing the method's practical value. Experimental results show that the proposed method detects Chinese text effectively, and the influence of each step on the detection results is examined.
Abstract (English): Text and artificial symbols in natural scenes convey important information, so extracting text from images has many potential applications. However, current methods are mostly designed for phonetic scripts, and methods for logographic scripts such as Chinese still leave room for improvement. This study proposes a semantic-segmentation-based Chinese character detection mechanism for natural scene images, in which each individual Chinese character is labeled. The proposed method is divided into two stages. In the first stage, we train a Fully Convolutional Network (FCN) as the Chinese text detection model for natural scenes; real natural-scene images serve as the training dataset, and synthetic data are added to enhance the detection ability of the model. The second stage separates the text areas and groups the character boxes by their regional distribution relationships, combining character information across different writing directions and layouts to improve the method's applicability. The experimental results show that the proposed method can effectively detect Chinese text in natural scenes, and the impact of each step on the detection results is explored.
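The second stage described in the abstract — extracting candidate character regions from the FCN's binary mask and grouping them by their spatial relationships — can be sketched in plain Python. This is a minimal illustration, not the thesis's actual implementation: the function names, the 4-connectivity choice, and the simple gap-plus-overlap grouping rule are all assumptions standing in for the "regional distribution relationship" the thesis defines.

```python
# Illustrative sketch of mask post-processing: connected components as
# candidate character boxes, then greedy grouping into text lines.
from collections import deque

def connected_components(mask):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # BFS flood fill to collect one component's extent
                q = deque([(y, x)])
                seen[y][x] = True
                x0 = x1 = x
                y0 = y1 = y
                while q:
                    cy, cx = q.popleft()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return boxes

def group_boxes(boxes, max_gap=2):
    """Greedily append each box to a line whose last box is horizontally close
    and vertically overlapping -- a toy stand-in for grouping character boxes
    by their regional distribution."""
    lines = []
    for box in sorted(boxes):
        for line in lines:
            lx0, ly0, lx1, ly1 = line[-1]
            x0, y0, x1, y1 = box
            if x0 - lx1 <= max_gap and min(y1, ly1) >= max(y0, ly0):
                line.append(box)
                break
        else:
            lines.append([box])
    return lines
```

A vertical-writing variant would swap the roles of the x and y gaps, which is one way a single grouping rule can be made to serve multiple writing directions and layouts.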
Keywords (Chinese) ★ text detection (文字偵測)
★ natural scenes (自然場景)
★ fully convolutional neural networks (全卷積神經網路)
Keywords (English) ★ text detection
★ natural scenes
★ fully convolutional neural networks
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
1.1 Background and Motivation
1.2 Contributions
1.3 Thesis Organization
Chapter 2  Related Work
2.1 Deep-Learning-Based Text Detection
2.2 Use of Synthetic Data
Chapter 3  Proposed Method
3.1 Chinese Character Localization and Detection
3.1.1 Overview of DeepLab v3+
3.1.2 Real-Scene Datasets
3.1.3 Synthetic Data Generation
3.1.4 Data Labeling and Class Balancing
3.2 Post-Processing of Candidate Text Regions
3.2.1 Mask Processing
3.2.2 Candidate Text Region Extraction
3.2.3 Candidate Text Region Arrangement
3.2.4 Character Shape Rectification
Chapter 4  Experimental Results
4.1 Development Environment and Training Setup
4.2 Arrangement Comparison
4.3 Related Application Workflows
Chapter 5  Conclusions and Future Work
References
Advisor: Po-Chyi Su (蘇柏齊)    Review Date: 2019-8-21