Abstract (English) |
License plate images captured by a dashcam or a surveillance camera may be blurred because of distance, lack of focus, or the high speed of the vehicle, so a license plate recognition system cannot accurately identify the plate. Although many studies have applied Generative Adversarial Networks (GANs) to this problem, their results vary: some achieve low reconstruction success rates, and some are effective only for specific kinds of blur. A survey of the literature shows that different GAN architectures produce markedly different reconstructed images. This thesis therefore examines and modifies existing GAN architectures, pairing different generator architectures, discriminator architectures, and loss functions, and compares the reconstructed images under each combination to find the one with the best reconstruction quality. We also investigate experimentally whether increasing the number of reconstruction passes improves the result, and present the corresponding analysis.
We define two success criteria: "the license plate is reconstructed completely correctly" and "the license plate is recognizable after reconstruction". The first applies when the plate number in the blurred image can barely be read by the human eye and the number in the reconstructed image is entirely correct. The second applies when the plate is too blurred to be read at all, and the reconstructed image is legible enough to serve as a reference, even if not entirely correct. Both criteria are evaluated with SSIM (structural similarity). The results of this thesis show that, under either criterion, DeblurGAN gives the best reconstruction, with a generator using ResNet with global skip connections and a discriminator using multi-scale PatchGAN; without classification, the overall avgSSIM is 0.8036. Regarding the number of reconstruction passes: for plates of the first type, a single pass already reaches an avgSSIM of 0.8536, meaning the reconstructed image is clear enough and completely correct, so a second pass is unnecessary; a second pass drops avgSSIM to 0.7647 because the background or a few blocks diverge further from the original image. For plates of the second type, the reconstructed plate is generally still blurred or not entirely correct; if it is not clear enough, reconstruction can be repeated until the plate is legible enough to serve as a reference or no longer becomes clearer. Although the avgSSIM after one pass is higher than after two, being able to recognize the reconstructed plate matters more than the score.
Therefore, for the second type, the optimal number of reconstruction passes must be determined from the condition of the reconstructed plate. |
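As a rough illustration of the SSIM metric used above, the single-window (global) form of SSIM from Wang et al. [31] can be sketched in NumPy. Note this is a simplification for illustration: the standard metric (and presumably the thesis' avgSSIM) averages SSIM over sliding local windows rather than computing one global value.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM: means, variances, and covariance are taken
    over the whole image instead of an 11x11 sliding window, so the
    value only approximates the standard SSIM of Wang et al. [31]."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for contrast/structure
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
shifted = (img + 30).astype(np.uint8)  # same structure, brighter
```

Identical images score exactly 1.0; the brightness shift leaves the contrast and structure terms untouched and lowers only the luminance term, giving a score strictly between 0 and 1.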
References |
[1] V. Moslemi, “De-blurring methodology of license plate using sparse representation,” in Proc. Int. Conf. Comput. Knowl. Eng. (ICCKE), 2012, pp. 34-38.
[2] A. H. Yu, H. Bai, Q. R. Jiang, Z. H. Zhu, C. G. Huang and B. P. Hou, “Blurred license plate recognition via sparse representations,” in Proc. IEEE Conf. Ind. Electron. Appl., 2014, pp. 1657-1661.
[3] J. Fang, Y. Yuan, W. Ji, P. Tang and Y. Zhao, “Licence plate images deblurring with binarization threshold,” in Proc. IEEE Int. Conf. on Imaging Syst. and Tech. (IST), 2015, pp. 1-6.
[4] Y. Kataoka, T. Matsubara and K. Uehara, “Image generation using generative adversarial networks and attention mechanism,” in Proc. IEEE/ACIS Int. Conf. on Comput. and Inf. Science (ICIS), 2016, pp. 1-6.
[5] D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell and A. A. Efros, “Context Encoders: feature learning by inpainting,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 2536-2544.
[6] C. Yang, X. Lu, Z. Lin, E. Shechtman, O. Wang and H. Li, “High-Resolution image inpainting using multi-scale neural patch synthesis,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 4076-4084.
[7] P. Isola, J. Zhu, T. Zhou and A. A. Efros, “Image-to-Image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 5967-5976.
[8] J. Zhu, T. Park, P. Isola and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 2017, pp. 2242-2251.
[9] C. Li and M. Wand, “Precomputed real-time texture synthesis with Markovian generative adversarial networks,” in Proc. Eur. Conf. Comput. Vis. (ECCV), 2016.
[10] T. Wang, M. Liu, J. Zhu, A. Tao, J. Kautz and B. Catanzaro, “High-Resolution image synthesis and semantic manipulation with conditional GANs,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2018, pp. 8798-8807.
[11] 程義凱, Reconstruction and Recognition of Blurred License Plate Images Based on Generative Adversarial Networks, Master's thesis, Department of Electrical Engineering, National Taiwan Ocean University, 2021.
[12] O. Kupyn, V. Budzan, M. Mykhailych, D. Mishkin and J. Matas, “DeblurGAN: blind motion deblurring using conditional adversarial networks,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2018, pp. 8183-8192.
[13] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin and A. Courville, “Improved training of wasserstein GANs,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2017.
[14] Z.-M. Chen and L.-W. Chang, “Blind motion deblurring via InceptionResDenseNet by using GAN model,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2019, pp. 1463-1467.
[15] J. Fan, L. Wu and C. Wen, “Sharp processing of blur image based on generative adversarial network,” in Proc. Int. Conf. Adv. Robot. Mechatron. (ICARM), 2020, pp. 437-441.
[16] L. Zhou, W. Min, D. Lin, Q. Han and R. Liu, “Detecting motion blurred vehicle logo in IoV using filter-DeblurGAN and VL-YOLO,” IEEE Trans. Veh. Technol., vol. 69, no. 4, pp. 3604-3614, 2020.
[17] G. Gong and K. Zhang, “Local blurred natural image restoration based on self-reference deblurring generative adversarial networks,” in Proc. IEEE Int. Conf. Signal Image Process Appl. (ICSIPA), 2019, pp. 231-235.
[18] G.-S. Hsu, J.-C. Chen and Y.-Z. Chung, “Application-Oriented license plate recognition,” IEEE Trans. Veh. Technol., vol. 62, no. 2, pp. 552-561, 2013.
[19] G.-S. Hsu, A. Ambikapathi, S.-L. Chung and C.-P. Su, “Robust license plate detection in the wild,” in Proc. IEEE Int. Conf. on Adv. Video Signal Based Surveill. (AVSS), 2017, pp. 1-6.
[20] Z. Liang, B. Yang and H. Xiao, “Using motion deblurring algorithm to improve vehicle recognition via DeblurGAN,” in Proc. Int. Conf. Virtual Real. Intell. Syst. (ICVRIS), 2020, pp. 486-489.
[21] S. Gonwirat and O. Surinta, “DeblurGAN-CNN: effective image denoising and recognition for noisy handwritten characters,” IEEE Access, vol. 10, pp. 90133-90148, 2022.
[22] NVIDIA Tesla V100 SXM2 32 GB Specs – TechPowerUp
https://www.techpowerup.com/gpu-specs/tesla-v100-sxm2-32-gb.c3185
[23] GitHub - junyanz/pytorch-CycleGAN-and-pix2pix: Image-to-Image Translation in PyTorch
https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
[24] GitHub - NVIDIA/pix2pixHD: Synthesizing and manipulating 2048x1024 images with conditional GANs
https://github.com/NVIDIA/pix2pixHD
[25] GitHub - KupynOrest/DeblurGAN: Image Deblurring using Generative Adversarial Networks
https://github.com/KupynOrest/DeblurGAN
[26] WoWtchout – map-based dashcam video sharing platform, YouTube
https://www.youtube.com/@WoWtchout
[27] Image blurring – OpenCV tutorial (Python), STEAM 教育學習網
https://steam.oxxostudio.tw/category/python/ai/opencv-blur.html
[28] OpenCV | Motion Blur in Python – GeeksforGeeks
https://www.geeksforgeeks.org/opencv-motion-blur-in-python/
[29] GitHub - eastmountyxz/ImageProcessing-Python
https://github.com/eastmountyxz/ImageProcessing-Python
[30] Peak signal-to-noise ratio – Wikipedia, the free encyclopedia
https://zh.wikipedia.org/zh-tw/%E5%B3%B0%E5%80%BC%E4%BF%A1%E5%99%AA%E6%AF%94
[31] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600-612, 2004.
[32] Y. Jia, R. Song, S. Chen, G. Wang and C. Yan, “Preliminary results of multipath ghost suppression based on generative adversarial nets in TWRI,” in Proc. IEEE Int. Conf. Signal Image Process. (ICSIP), 2019, pp. 208-212.
[33] controlling patch size · Issue #11 · yenchenlin/pix2pix-tensorflow · GitHub
https://github.com/yenchenlin/pix2pix-tensorflow/issues/11
[34] O. Ronneberger, P. Fischer and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in Proc. Med. Image Comput. Comput.-Assist. Interv. (MICCAI), 2015, pp. 234-241.
[35] S. Xie, R. Girshick, P. Dollár, Z. Tu and K. He, “Aggregated residual transformations for deep neural networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 5987-5995.
[36] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang and S. P. Smolley, “Least squares generative adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 2017, pp. 2813-2821. |