Graduate Thesis 110522090: Detailed Record




Author: 王譽鈞 (Yuh-Jiun Wang)    Department: Computer Science and Information Engineering
Thesis Title: A Deep Learning-based Algorithm for Shadow Detection and Removal from Document Images
(Chinese title: 基於深度學習之文件影像陰影偵測及去除演算法)
Related Theses (Chinese titles retained)
★ 以Q-學習法為基礎之群體智慧演算法及其應用
★ 發展遲緩兒童之復健系統研製
★ 從認知風格角度比較教師評量與同儕互評之差異:從英語寫作到遊戲製作
★ 基於檢驗數值的糖尿病腎病變預測模型
★ 模糊類神經網路為架構之遙測影像分類器設計
★ 複合式群聚演算法
★ 身心障礙者輔具之研製
★ 指紋分類器之研究
★ 背光影像補償及色彩減量之研究
★ 類神經網路於營利事業所得稅選案之應用
★ 一個新的線上學習系統及其於稅務選案上之應用
★ 人眼追蹤系統及其於人機介面之應用
★ 結合群體智慧與自我組織映射圖的資料視覺化研究
★ 追瞳系統之研發於身障者之人機介面應用
★ 以類免疫系統為基礎之線上學習類神經模糊系統及其應用
★ 基因演算法於語音聲紋解攪拌之應用
  1. This electronic thesis is approved for immediate open access.
  2. The open-access full text is licensed only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese) As technology continues to advance, almost everyone owns a smartphone and frequently uses it to photograph documents to record important information. During capture, however, the light is often blocked by objects such as the photographer's hand or the phone itself, leaving unwanted shadows in the image. Besides degrading the photo's appearance, these shadows can even make the text hard to read. To avoid them, the photographer must shoot from particular angles, or afterwards use photo-editing software to select the shadowed regions and adjust their brightness and tone; these steps are time-consuming and the results are often unsatisfactory. Document images are harder still, because shadow removal must also preserve the legibility of the text.
This thesis proposes a deep learning-based algorithm that detects and removes shadows from document images. First, a conditional generative adversarial network (cGAN) is trained to locate the shadow regions in an image and produce a shadow mask. The dominant background colors of the shadow and non-shadow regions are then estimated and combined with the brightness information of the input image and the shadow mask from the previous stage; a second cGAN uses this input to generate the restored image, achieving shadow removal. Experimental results show that the proposed method removes shadows while keeping the text readable: compared with the unprocessed input images, both the PSNR and SSIM metrics improve, and the accuracy of optical character recognition rises substantially.
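The background-color estimation step described above (K-means clustering over the shadow and non-shadow regions, Sections 3.5.1–3.5.2 of the thesis) can be sketched as follows. This is an illustrative NumPy implementation, not the thesis's actual code; it assumes the dominant cluster of pixels in a document region corresponds to the paper background.

```python
import numpy as np

def dominant_color(pixels, k=2, iters=20, seed=0):
    """Estimate the dominant (background) color of a region with k-means.

    pixels: (N, 3) array of RGB values.
    Returns the centroid of the largest cluster, assuming most pixels
    in a document region belong to the paper background.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        # Assign every pixel to its nearest centroid ...
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # ... then move each centroid to the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    counts = np.bincount(labels, minlength=k)
    return centers[counts.argmax()]

# Toy patch: 90 near-white "paper" pixels and 10 dark "ink" pixels.
rng = np.random.default_rng(1)
paper = 240.0 + rng.normal(0.0, 3.0, size=(90, 3))
ink = np.full((10, 3), 30.0)
background = dominant_color(np.vstack([paper, ink]))  # close to (240, 240, 240)
```

Running this once per region (shadow mask on, shadow mask off) yields the two background colors that, per the abstract, condition the second cGAN.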
Abstract (English) With the constant development of technology, almost everyone has a smartphone, which is often used to take pictures or photograph documents to record important information. Nevertheless, unwanted shadows may appear in the captured picture when the light is blocked by the user's hand or the phone itself. This not only results in poor visual quality, but can also make the text unreadable. To prevent shadows, users must capture images under well-controlled lighting conditions or use an image-editing tool to remove shadows by selecting the shadow areas and adjusting the brightness or hue. However, these processes take a lot of time and do not always produce the result users want. Correcting illumination distortion in document images is an even greater challenge, because it requires not only removing shadows but also ensuring the legibility of the text.
This thesis proposes a deep learning-based algorithm to detect and remove shadows from document images. The algorithm starts with a conditional Generative Adversarial Network (cGAN) whose generator finds the shadow areas in an image and creates a shadow detection mask. Next, the main background colors of the shadow and non-shadow areas are estimated and combined with the brightness information of the original image and its shadow detection mask to form the input to a second cGAN, whose generator produces a shadow-free image. According to the experimental results, the proposed method both corrects the illumination and makes the text more legible: compared to the original images, both PSNR and SSIM increase, and the accuracy of Optical Character Recognition (OCR) is also greatly improved.
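The PSNR figure cited in the evaluation follows the standard definition; the sketch below (plain NumPy, with illustrative values rather than the thesis's test data) shows why a restored document scores far higher than a shadowed one. SSIM [39] is computed analogously but compares local luminance, contrast, and structure instead of raw pixel error.

```python
import numpy as np

def psnr(ref, img, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-shape images."""
    mse = np.mean((np.asarray(ref, dtype=float) - np.asarray(img, dtype=float)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# A flat "page", the same page with a dark shadow band, and a near-perfect restoration.
clean = np.full((64, 64), 200.0)
shadowed = clean.copy()
shadowed[:, 32:] -= 80.0     # right half darkened by a simulated shadow
restored = clean + 2.0       # residual error of 2 gray levels everywhere

print(round(psnr(clean, shadowed), 1))  # 13.1 dB
print(round(psnr(clean, restored), 1))  # 42.1 dB
```

Higher is better: halving the mean squared error gains about 3 dB, so even modest residual error after shadow removal translates into a large PSNR improvement over the shadowed input.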
Keywords (Chinese) ★ 深度學習
★ 陰影偵測
★ 陰影去除
★ 條件式生成對抗網路
★ 光學字元辨識
Keywords (English) ★ Deep Learning
★ Shadow Detection
★ Shadow Removal
★ cGAN
★ OCR
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
Chapter 1: Introduction
  1.1 Research Motivation
  1.2 Research Objectives
  1.3 Thesis Organization
Chapter 2: Literature Review
  2.1 Outdoor Images
    2.1.1 Datasets
    2.1.2 Shadow Detection and Removal Methods
  2.2 Document Images
    2.2.1 Datasets
    2.2.2 Shadow Detection and Removal Methods
  2.3 Summary of Related Work
Chapter 3: Methodology
  3.1 Algorithm Pipeline
  3.2 System Overview
  3.3 Conditional Generative Adversarial Network
    3.3.1 Model Overview
    3.3.2 Generator
    3.3.3 Discriminator
  3.4 Shadow Detection Algorithm
    3.4.1 Color Spaces
    3.4.2 Network Architecture and Training
    3.4.3 Shadow Mask Refinement
  3.5 Shadow Removal Algorithm
    3.5.1 K-means Clustering
    3.5.2 Background Color Estimation for Shadow and Non-shadow Regions
    3.5.3 Image Brightness Map
    3.5.4 Network Architecture and Training
Chapter 4: Experimental Design and Results
  4.1 Datasets
  4.2 Shadow Detection Experiments and Evaluation
  4.3 Shadow Removal Experiments and Evaluation
  4.4 OCR Accuracy Experiments
Chapter 5: Conclusion
  5.1 Conclusions
  5.2 Future Work
References
References
[1] R. Smith, "An overview of the Tesseract OCR engine," in Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), IEEE, vol. 2, 2007, pp. 629–633.
[2] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[3] M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, 2014.
[4] A. Ecins, C. Fermüller, and Y. Aloimonos, "Shadow free segmentation in still images using local density measure," in 2014 IEEE International Conference on Computational Photography (ICCP), 2014, pp. 1–8.
[5] M. Zhang, W. Zhao, X. Li, and D. Wang, "Shadow detection of moving objects in traffic monitoring video," in 2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), vol. 9, 2020, pp. 1983–1987.
[6] J. Wang, X. Li, and J. Yang, "Stacked conditional generative adversarial networks for jointly learning shadow detection and shadow removal," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1788–1797.
[7] L. Qu, J. Tian, S. He, Y. Tang, and R. W. Lau, "DeshadowNet: A multi-context embedding deep network for shadow removal," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4067–4075.
[8] G. Finlayson, S. Hordley, C. Lu, and M. Drew, "On the removal of shadows from images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 1, pp. 59–68, 2006.
[9] R. Guo, Q. Dai, and D. Hoiem, "Single-image shadow detection and removal using paired regions," in CVPR 2011, IEEE, 2011, pp. 2033–2040.
[10] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
[11] V. Nguyen, T. F. Yago Vicente, M. Zhao, M. Hoai, and D. Samaras, "Shadow detection with conditional generative adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 4510–4518.
[12] X. Hu, Y. Jiang, C.-W. Fu, and P.-A. Heng, "Mask-ShadowGAN: Learning to remove shadows from unpaired data," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 2472–2481.
[13] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2223–2232.
[14] Z. Liu, H. Yin, Y. Mi, M. Pu, and S. Wang, "Shadow removal by a lightness-guided network with training on unpaired data," IEEE Transactions on Image Processing, vol. 30, pp. 1853–1865, 2021.
[15] Y. Jin, A. Sharma, and R. T. Tan, "DC-ShadowNet: Single-image hard and soft shadow removal using unsupervised domain-classifier guided network," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 5027–5036.
[16] L. Guo, S. Huang, D. Liu, H. Cheng, and B. Wen, "ShadowFormer: Global context helps image shadow removal," arXiv preprint arXiv:2302.01650, 2023.
[17] A. Vaswani, N. Shazeer, N. Parmar, et al., "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, 2017.
[18] S. Jung, M. A. Hasan, and C. Kim, "Water-filling: An efficient algorithm for digitized document shadow removal," in Computer Vision–ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, December 2–6, 2018, Revised Selected Papers, Part I 14, Springer, 2019, pp. 398–414.
[19] B. Wang and C. L. P. Chen, "Local water-filling algorithm for shadow detection and removal of document images," Sensors, vol. 20, no. 23, 2020.
[20] S. Bako, S. Darabi, E. Shechtman, J. Wang, K. Sunkavalli, and P. Sen, "Removing shadows from images of documents," in Asian Conference on Computer Vision (ACCV 2016), 2016.
[21] N. Kligler, S. Katz, and A. Tal, "Document enhancement using visibility detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2374–2382.
[22] J.-R. Wang and Y.-Y. Chuang, "Shadow removal of text document images by estimating local and global background colors," in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 1534–1538.
[23] K. Nazeri, E. Ng, T. Joseph, F. Qureshi, and M. Ebrahimi, "EdgeConnect: Structure guided image inpainting using edge prediction," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Oct. 2019.
[24] I. Goodfellow, J. Pouget-Abadie, M. Mirza, et al., "Generative adversarial nets," in Advances in Neural Information Processing Systems, Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Weinberger, Eds., vol. 27, Curran Associates, Inc., 2014.
[25] J. Gauthier, "Conditional generative adversarial nets for convolutional face generation," class project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition, Winter semester, vol. 2014, no. 5, p. 2, 2014.
[26] D. Michelsanti and Z.-H. Tan, "Conditional generative adversarial networks for speech enhancement and noise-robust speaker verification," arXiv preprint arXiv:1709.01703, 2017.
[27] H. Park, Y. Yoo, and N. Kwak, "MC-GAN: Multi-conditional generative adversarial network for image synthesis," arXiv preprint arXiv:1805.01123, 2018.
[28] H. Zhang, V. Sindagi, and V. M. Patel, "Image de-raining using a conditional generative adversarial network," IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 11, pp. 3943–3956, 2019.
[29] S. Murali, M. R. Rajati, and S. Suryadevara, "Image generation and style transfer using conditional generative adversarial networks," in 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), 2019, pp. 1415–1419.
[30] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1125–1134.
[31] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, Springer, 2015, pp. 234–241.
[32] A. Aghabiglou and E. M. Eksioglu, "Projection-based cascaded U-Net model for MR image reconstruction," Computer Methods and Programs in Biomedicine, vol. 207, p. 106151, 2021.
[33] N. Siddique, S. Paheding, C. P. Elkin, and V. Devabhaktuni, "U-Net and its variants for medical image segmentation: A review of theory and applications," IEEE Access, vol. 9, pp. 82031–82057, 2021.
[34] A. Kar and K. Deb, "Moving cast shadow detection and removal from video based on HSV color space," in 2015 International Conference on Electrical Engineering and Information Communication Technology (ICEEICT), IEEE, 2015, pp. 1–6.
[35] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning, PMLR, 2015, pp. 448–456.
[36] J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A k-means clustering algorithm," Journal of the Royal Statistical Society, Series C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979.
[37] H. Kim and S. Kim, "Automated target detection using k-means based on per-norm for invariant illumination in hyperspectral image," in 2015 12th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), 2015, pp. 570–572.
[38] C. Clausner, A. Antonacopoulos, and S. Pletschacher, "ICDAR2017 competition on recognition of documents with complex layouts - RDCL2017," in 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, 2017, pp. 1404–1410.
[39] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
Advisor: 蘇木春 (Mu-Chun Su)    Date of Approval: 2023-08-01
