Master's/Doctoral Thesis 108523057: Detailed Record




Author: Wei-Hsiang Hsu (許位祥)   Department: Communication Engineering
Title: Scale-recurrent Network Based Generative Adversarial Network for Image Deblurring
(基於尺度遞迴網路的生成對抗網路之影像去模糊)
Related theses
★ Design of an Illumination-Adaptive Video Encoder for In-Vehicle Video
★ An Improved Head Tracking System Based on Particle Filtering
★ Fast Mode Decision Algorithms for Spatial and CGS Scalable Video Encoders
★ A Robust Active Appearance Model Search Algorithm for Facial Expression Recognition
★ Multi-View Video Coding Combining Epipolar-Geometry-Based Inter-View Prediction and Fast Inter Prediction Direction Decision
★ A Stereo Matching Algorithm for Homogeneous Regions Based on Improved Belief Propagation
★ Baseball Trajectory Recognition Based on Hierarchical Boosting
★ Fast Reference Frame Direction Decision for Multi-View Video Coding
★ Online-Statistics-Based Fast Mode Decision for CGS Scalable Encoders
★ An Improved Active Shape Model Matching Algorithm for Lip-Shape Recognition
★ Object Tracking on Mobile Platforms Based on Motion Compensation Models
★ Occlusion Detection for Asymmetric Stereo Matching Based on Matching Cost
★ Momentum-Based Fast Mode Decision for Multi-View Video Coding
★ Fast Local L-SVMs Ensemble Classifier for Place Image Recognition
★ Fast Depth Video Coding Mode Decision Oriented Toward High-Quality Synthesized Views
★ Multi-Object Tracking with a Moving Camera Based on Motion Compensation Models
  1. The full text of this electronic thesis is authorized for immediate open access.
  2. The open-access electronic full text is authorized only for personal, non-profit retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese)  During capture, shake of either the camera or the photographed object easily produces motion blur in the image, which severely degrades the viewing experience and lowers the performance of tasks such as visual tracking and object detection. Existing deep-learning-based solutions often pay a high cost in network parameters or memory to obtain high-quality deblurred images. Among existing methods, SRN^+ is a deep-learning-based image deblurring network with a relatively low parameter count and very good performance. This thesis therefore adopts the SRN^+ architecture as the generator and, during training, adds a discriminator assisted by pseudo labels to raise the quality of the generator's deblurred images. Unlike a standard GAN (generative adversarial network), the pseudo-label-assisted GAN is given both the deblurred image and the corresponding sharp image, so the discriminator can provide a more accurate loss for optimizing the generator and improve the recovery of image details. Funnel soft labelling replaces binary labels to lower the discriminator's learning ability, so the generator is less prone to vanishing gradients and GAN training is stabilized. In addition, this thesis assigns different weights to the loss functions at different scales, giving larger weights to the losses of deblurred images at larger-scale stages, and replaces mean absolute error (MAE) with mean squared error (MSE) in the loss of the largest scale, making the deblurred images sharper. At the test stage only the generator is needed to output the deblurred image, so the network parameters and computational complexity of the proposed scheme are identical to those of SRN^+. On the GoPro dataset, the peak signal-to-noise ratio (PSNR) is 0.51 dB higher than SRN^+ and the structural similarity index measure (SSIM) is 0.005 higher; compared with the lightest 1-stage version of the state-of-the-art MPRNet, the PSNR is more than 1 dB higher while the parameter count is 7/10 that of MPRNet (1-stage).
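The record does not specify the exact funnel soft labelling schedule or the discriminator's input format, so the following is a minimal NumPy sketch under stated assumptions: the pseudo-label pairing is modeled as channel-concatenating the deblurred image with the corresponding sharp image, and the soft targets are drawn from an interval that narrows ("funnels") toward the hard label as training progresses. The function names and the `max_width` schedule are hypothetical illustrations, not the thesis's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def pair_for_discriminator(deblurred, sharp):
    """Pseudo-label pairing (assumed form): show the discriminator the
    generator output together with the corresponding sharp image,
    concatenated along the channel axis (axis -1 for HWC batches)."""
    return np.concatenate([deblurred, sharp], axis=-1)

def funnel_soft_labels(batch_size, real, progress, max_width=0.3):
    """Hypothetical "funnel" soft labels: targets are sampled from an
    interval around the hard label (1 for real, 0 for fake) whose width
    shrinks linearly as training progresses (progress in [0, 1])."""
    width = max_width * (1.0 - progress)
    if real:
        return rng.uniform(1.0 - width, 1.0, size=batch_size)
    return rng.uniform(0.0, width, size=batch_size)
```

Softened targets of this kind are one common way to weaken a discriminator so that its gradients stay informative for the generator, which matches the stabilization goal described in the abstract.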
Abstract (English)  Camera shake or moving objects cause blurred images, which lead to a poor visual experience and decreased accuracy of visual tracking and object detection. Existing deep-learning-based approaches usually require more network parameters or memory to generate high-quality deblurred images. SRN^+ is an existing deep-learning-based single-image deblurring network with a low number of network parameters and good performance. Therefore, this thesis proposes to adopt SRN^+ as the generator and, at the training stage, to feed training samples with pseudo labels to the discriminator to improve the quality of the deblurred images from the generator. Different from a standard GAN (generative adversarial network), the proposed generative adversarial network with pseudo labels is provided with the deblurred image and the corresponding sharp image at the same time. Accordingly, the discriminator gives a more accurate loss to guide the optimization of the generator and restore details of deblurred images. Funnel soft labelling is used instead of binary labels to reduce the learning ability of the discriminator, so that the generator avoids gradient vanishing and the training of the generative adversarial network is stabilized. In addition, this thesis proposes to assign different weights to the loss functions of different scales, where a larger weight is assigned to the loss of the deblurred image at the large-scale stage. The loss function of the largest scale adopts mean squared error (MSE) instead of mean absolute error (MAE) to make the deblurred image sharper. At the test stage, only the generator is used to output the deblurred image, so the number of network parameters and the computational complexity of the proposed scheme are the same as those of SRN^+. On the GoPro dataset, the proposed scheme is 0.51 dB higher than SRN^+ in peak signal-to-noise ratio (PSNR) and 0.005 higher in structural similarity index measure (SSIM). Compared with the lightest version (i.e., 1-stage) of the state-of-the-art deblurring network MPRNet, the proposed scheme is over 1 dB higher in PSNR.
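As a concrete reading of the loss design described above, here is a minimal NumPy sketch: MAE at the smaller scales, MSE at the largest scale, and larger weights on larger scales. The weight values and the function name are illustrative assumptions; the record does not state them.

```python
import numpy as np

def multiscale_deblur_loss(preds, targets, weights=(0.25, 0.5, 1.0)):
    """Weighted multi-scale reconstruction loss.

    preds/targets: lists of arrays ordered from smallest to largest
    scale. MAE is used at all but the last (largest) scale, where MSE
    replaces it; larger scales receive larger weights (assumed values)."""
    assert len(preds) == len(targets) == len(weights)
    total = 0.0
    last = len(preds) - 1
    for i, (p, t, w) in enumerate(zip(preds, targets, weights)):
        err = p - t
        # MSE penalizes large residuals more strongly at the full scale,
        # which matches the stated goal of sharper final output.
        term = np.mean(err ** 2) if i == last else np.mean(np.abs(err))
        total += w * term
    return total
```

Since MSE grows quadratically with the residual, applying it only at the full-resolution stage pushes the generator hardest on the output the user actually sees, while the MAE terms at coarser scales remain robust to outliers.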
Keywords (Chinese) ★ single image deblurring (單影像去模糊)
★ generative adversarial network (生成對抗網路)
★ scale-recurrent network (尺度遞迴網路)
★ pseudo label (虛擬標籤)
Keywords (English) ★ single image deblurring
★ generative adversarial network
★ scale-recurrent network
★ pseudo label
Table of contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Chapter 1: Introduction
1.1 Preface
1.2 Research Motivation
1.3 Research Methods
1.4 Thesis Organization
Chapter 2: Non-GAN-Based Single-Image Motion Deblurring Techniques
2.1 Single-Scale-Network-Based Single-Image Motion Deblurring Schemes
2.2 Multi-Scale-Network-Based Single-Image Motion Deblurring Techniques
2.3 Summary
Chapter 3: GAN-Based Single-Image Motion Deblurring Techniques
3.1 Motion Deblurring with Generative Adversarial Networks
3.2 Overview of GAN-Based Image Motion Deblurring Schemes
3.3 Summary
Chapter 4: The Proposed GAN-Based Image Deblurring Scheme
4.1 System Architecture
4.2 The Proposed Supervised GAN Scheme for Improving Deblurred Images
4.3 Training Details
4.4 Summary
Chapter 5: Experimental Results and Analysis
5.1 Test Datasets and Test Environment
5.2 Objective Quality Measurement
5.3 Network Parameter Analysis
5.4 Visual Evaluation on the GoPro and RealBlur-J Test Datasets
5.5 Summary
Chapter 6: Conclusion and Future Work
References
List of Symbols
References [1] P. Hsu and B. Y. Chen, “Blurred image detection and classification,” in Proc. International Conference on Multimedia Modeling, pp. 277-286, Jan. 2008.
[2] S. Lee and S. Cho, “Recent advances in image deblurring,” in Proc. SIGGRAPH Asia 2013 Courses, pp.1-108, Nov. 2013.
[3] E. O. Brigham and R. E. Morrow, “The fast Fourier transform,” IEEE Spectrum, Vol. 4, No. 12, pp. 63-70, Dec. 1967.
[4] L. Lucy, “An iterative technique for the rectification of observed distributions,” Astronomical Journal, Vol. 79, pp. 745-754, 1974.
[5] J. Sun, W. Cao, Z. Xu, and J. Ponce, “Learning a convolutional neural network for non-uniform motion blur removal,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 769-777, June 2015.
[6] A. Chakrabarti, “A neural approach to blind motion deblurring,” in European Conference on Computer Vision, pp. 221–235, Oct. 2016.
[7] X. Tao, H. Gao, X. Shen, J. Wang, and J. Jia, “Scale-recurrent network for deep image deblurring,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 8174-8182, Dec. 2018.
[8] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, “Convolutional LSTM network: a machine learning approach for precipitation nowcasting,” in Proc. 28th International Conference on Neural Information Processing Systems, Vol. 1, pp. 802-810, Dec. 2015.
[9] H. Gao, X. Tao, X. Shen, and J. Jia, “Dynamic scene deblurring with parameter selective sharing and nested skip connections,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3848-3856, June 2019.
[10] O. Kupyn, T. Martyniuk, J. Wu, and Z. Wang, “DeblurGAN-v2: deblurring (orders-of-magnitude) faster and better,” in Proc. IEEE International Conference on Computer Vision, pp. 8878-8887, Aug. 2019.
[11] T. Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” in Proc. IEEE conference on Computer Vision and Pattern Recognition, pp. 2117-2125, July 2017.
[12] J. U. Yun, B. Jo, and I. K. Park, “Joint face super-resolution and deblurring using generative adversarial network,” IEEE Access, Vol. 8, pp. 159661-159671, Aug. 2020.
[13] S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, M. H. Yang, and L. Shao, “Multi-stage progressive image restoration,” in Proc. IEEE conference on Computer Vision and Pattern Recognition, June 2021.
[14] S. Nah, T. H. Kim, and K. M. Lee, “Deep multi-scale convolutional neural network for dynamic scene deblurring,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3883-3891, July 2017.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Communications of the ACM, Vol. 60, pp. 84-90, June 2017.
[16] M. Suin, K. Purohit, and A. N. Rajagopalan, “Spatially-attentive patch-hierarchical network for adaptive motion deblurring,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3606-3615, Aug. 2020.
[17] H. Zhang, Y. Dai, H. Li, and P. Koniusz, “Deep stacked hierarchical multi-patch network for image deblurring,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5978-5986, June 2019.
[18] K. Purohit and A. N. Rajagopalan, “Region-adaptive dense network for efficient motion deblurring,” in Proc. AAAI Conference on Artificial Intelligence, Vol. 34, No. 07, pp. 11882-11889, Apr. 2020.
[19] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in Proc. IEEE conference on Computer Vision and Pattern Recognition, pp. 4700-4708, July 2017.
[20] Y. Yuan, W. Su, and D. Ma, “Efficient dynamic scene deblurring using spatially variant deconvolution network with optical flow guided training,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 3555-3564, June 2020.
[21] F. J. Tsai, Y. T. Peng, Y. Y. Lin, C. C. Tsai, and C. W. Lin, “BANet: Blur-aware attention networks for dynamic scene deblurring,” arXiv preprint arXiv:2101.07518, Jan. 2021.
[22] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. C. Courville, and Y. Bengio, “Generative adversarial nets,” in Proc. Neural Information Processing Systems, pp. 2672-2680, Dec. 2014.
[23] H. Thanh-Tung and T. Tran, “Catastrophic forgetting and mode collapse in GANs,” in Proc. International Joint Conference on Neural Networks (IJCNN), pp. 1-10, July 2020.
[24] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein generative adversarial networks,” in Proc. International Conference on Machine Learning (ICML), pp. 214-223, July 2017.
[25] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville, “Improved training of Wasserstein GANs,” in Proc. International Conference on Neural Information Processing Systems (NIPS), pp. 5769-5779, Dec. 2017.
[26] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” in Proc. International Conference on Learning Representations, Feb. 2018.
[27] J. Heinonen, “Lectures on Lipschitz analysis,” in University of Jyväskylä, 2005.
[28] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. Paul Smolley, “Least squares generative adversarial networks,” in Proc. IEEE International Conference on Computer Vision (ICCV), pp. 2794-2802, Oct. 2017.
[29] S. Ramakrishnan, S. Pachori, A. Gangopadhyay, and S. Raman, “Deep generative filter for motion deblurring,” in Proc. IEEE International Conference on Computer Vision Workshops, pp. 2993-3000, Sep. 2017.
[30] M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv preprint arXiv:1411.1784, Nov. 2014.
[31] S. Zheng, Z. Zhu, J. Cheng, Y. Guo, and Y. Zhao, “Edge heuristic GAN for non-uniform blind deblurring,” IEEE Signal Processing Letters, pp. 1546-1550, July 2019.
[32] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in Proc. European Conference on Computer Vision, pp. 694-711, March 2016.
[33] J. Pan, D. Sun, H. Pfister, and M.-H. Yang, “Blind image deblurring using dark channel prior,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1628-1636, June 2016.
[34] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. International Conference on Learning Representations (ICLR), pp. 1-14, May 2015.
[35] H. Tomosada, T. Kudo, T. Fujisawa, and M. Ikehara, “GAN-Based Image Deblurring Using DCT Discriminator,” in Proc. 25th IEEE International Conference on Pattern Recognition, pp. 3675-3681, Jan. 2021.
[36] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, inception-ResNet and the impact of residual connections on learning,” in 35th AAAI Conference on Artificial Intelligence, pp. 4278-4284, Feb. 2017.
[37] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. C. Chen, “MobileNetV2: inverted residuals and linear bottlenecks,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510-4520, June 2018.
[38] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251-1258, July 2017.
[39] J. Rim, H. Lee, J. Won, and S. Cho, “Real-world blur dataset for learning and benchmarking deblurring algorithms,” in Proc. European Conference on Computer Vision, pp. 184-201, Aug. 2020.
[40] A. Horé and D. Ziou, “Image quality metrics: PSNR vs. SSIM,” in Proc. International Conference on Pattern Recognition, pp. 2366-2369, Aug. 2010.
[41] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, Vol. 13, No. 4, pp. 600-612, April 2004.
Advisor: Chih-Wei Tang (唐之瑋)   Date of approval: 2021-7-19

For questions about this thesis, please contact the Promotion Services Division of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail. - Privacy Policy Statement