Master's/Doctoral Thesis 110521020 Detailed Record




Author: 林羿恒 (Yz-Heng Lin)    Department: Department of Electrical Engineering
Thesis Title: 具全局及局部特徵自注意力機制之高效通用型影像去模糊神經網路
(An Efficient Universal Image Deblurring Neural Network with Global and Local Feature Self-Attention)
Related Theses
★ Low-Memory Hardware Design for Real-Time SIFT Feature Extraction
★ An Access Control System with Real-Time Face Detection and Recognition
★ An Autonomous Vehicle with Real-Time Automatic Following
★ Lossless Compression Algorithm and Implementation for Multi-Lead ECG Signals
★ Offline Custom Voice and Speaker Wake-Word System with Embedded Implementation
★ Wafer Map Defect Classification and Embedded System Implementation
★ Densely Connected Convolutional Networks for Small-Footprint Keyword Spotting
★ G2LGAN: Data Augmentation for Imbalanced Datasets in Wafer Map Defect Classification
★ Algorithm Design Techniques for Compensating the Finite Precision of Multiplierless Digital Filters
★ Design and Implementation of a Programmable Viterbi Decoder
★ Low-Cost Vector Rotator IP Design Based on Extended Elementary-Angle CORDIC
★ Analysis and Architecture Design of a JPEG2000 Still-Image Coding System
★ Low-Power Turbo Code Decoder for Communication Systems
★ Platform-Based Design for Multimedia Communications
★ Design and Implementation of a Digital Watermarking System for MPEG Encoders
★ Algorithm Development for Video Error Concealment with Data-Reuse Considerations
Files: Full text viewable in the system after 2027-04-01 (currently under embargo)
Abstract (Chinese): Image deblurring has long been an important computer vision task, in everyday as well as medical applications; its goal is to restore an image corrupted by degradation to its original clear and sharp state. When an image suffers from complex disturbances such as camera shake or defocus blur, traditional deblurring algorithms cannot effectively reconstruct the details of the original image. Inspired by the success of Vision Transformer models across a variety of tasks, this thesis proposes an innovative architecture that uses a feature self-attention mechanism to capture and remove blur features at both global and local scales. To lighten the computational load, many Vision Transformer techniques adopt the strategy of splitting the image into multiple windows and then modeling the relationships within each window independently. However, these methods restrict the exchange of information between windows and thus degrade overall performance. We therefore propose a transformer module that splits the image into horizontal and vertical stripes to capture long-range blur features and uses windows to capture short-range blur features. To further enlarge the transformer's receptive field, we additionally propose an efficient algorithm that computes the correlations between different windows. To evaluate the model's effectiveness and generality, we conducted experiments on several natural-image datasets as well as MRI datasets. The results show that, compared with several state-of-the-art image deblurring methods, the proposed architecture is competitive in restoration capability and achieves higher PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity).
Abstract (English): Image deblurring has long been an important low-level computer vision task, aiming to restore degraded images to their original clear and sharp state. Traditional deblurring algorithms struggle to reconstruct the details of the original image when it is affected by complex degradations such as motion blur and defocus. Inspired by the success of Vision Transformer models on a wide range of tasks, this thesis proposes an innovative framework that utilizes a feature self-attention mechanism to simultaneously capture and eliminate blur features in both global and local contexts within the image. To alleviate the computational burden, many Vision Transformer techniques adopt a strategy of dividing images into multiple windows and then modeling relationships within each independent window. However, these methods limit the exchange of information between windows, thereby hurting overall performance. We therefore propose a transformer module that segments the image into horizontal and vertical stripes to capture long-range blur patterns and uses windows to capture short-range blur features. To further expand the visual receptive field of the transformer, we additionally introduce an efficient algorithm that computes correlations between different windows. To assess the model's effectiveness and generalization, we conducted experiments on several natural-image and MRI datasets. The results indicate that, compared with several state-of-the-art methods in the image restoration domain, our proposed architecture is competitive in restoration capability, achieving higher Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM).
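To make the described design concrete, here is a minimal PyTorch sketch of the core idea: self-attention computed within horizontal stripes and vertical stripes of a feature map (long-range blur cues) and within square windows (short-range blur cues). The full text is embargoed until 2027, so everything below is an illustrative assumption rather than the thesis's actual implementation: the names StripWindowAttention, window_partition, and window_reverse are hypothetical, a single shared attention layer stands in for the separate global and local modules, and the inter-window correlation algorithm mentioned in the abstract is omitted.

import torch
import torch.nn as nn

def window_partition(x, win_h, win_w):
    # Split a (B, C, H, W) feature map into non-overlapping win_h x win_w
    # regions, flattened to (B * num_regions, win_h * win_w, C) token batches.
    B, C, H, W = x.shape
    x = x.view(B, C, H // win_h, win_h, W // win_w, win_w)
    return x.permute(0, 2, 4, 3, 5, 1).reshape(-1, win_h * win_w, C)

def window_reverse(tokens, win_h, win_w, B, C, H, W):
    # Inverse of window_partition: reassemble tokens into a (B, C, H, W) map.
    x = tokens.view(B, H // win_h, W // win_w, win_h, win_w, C)
    return x.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)

class StripWindowAttention(nn.Module):
    # Hypothetical module: attends within horizontal stripes, vertical
    # stripes, and local square windows, then averages the three branches.
    def __init__(self, dim, num_heads=4, win=8):
        super().__init__()
        self.win = win
        # One shared attention layer for brevity; the thesis describes
        # separate global and local self-attention modules.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):
        B, C, H, W = x.shape  # H and W must be multiples of self.win
        branches = [(self.win, W),         # horizontal stripes, full width
                    (H, self.win),         # vertical stripes, full height
                    (self.win, self.win)]  # local square windows
        out = torch.zeros_like(x)
        for win_h, win_w in branches:
            t = window_partition(x, win_h, win_w)
            t, _ = self.attn(t, t, t)  # self-attention within each region
            out = out + window_reverse(t, win_h, win_w, B, C, H, W)
        return out / len(branches)

# Usage: a 64x64 feature map with 32 channels keeps its shape.
feat = torch.randn(1, 32, 64, 64)
out = StripWindowAttention(dim=32, num_heads=4, win=8)(feat)
assert out.shape == feat.shape

For the metrics named above, PSNR is the standard 10 * log10(MAX^2 / MSE), with MAX the peak pixel value and MSE the mean squared error between the restored and ground-truth images; SSIM compares local luminance, contrast, and structure statistics.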
Keywords (Chinese) ★ Image Deblurring
★ Deep Learning
★ Feature Self-Attention Mechanism
Keywords (English) ★ Image Deblurring
★ Self-Attention
★ Transformer
Table of Contents
Abstract (Chinese)
Abstract (English)
List of Tables
List of Figures
1. Introduction
1.1 Research Background and Motivation
2. Literature Review
2.1 Image Clarity Enhancement
2.2 Traditional Image Clarity Enhancement Algorithms
2.3 Deep-Learning-Based Image Clarity Enhancement Algorithms
2.4 Image Clarity Enhancement for MRI Motion Artifact Correction
3. Image Deblurring Neural Network Architecture
3.1 Design Motivation and Concept
3.2 Overview of the Network Architecture
3.3 Local Feature Self-Attention Module
3.4 Global Feature Self-Attention Module
4. MRI Motion Artifact Correction
4.1 Causes of MRI Motion Artifacts
4.2 Artifact Removal Approaches
4.3 MRI Motion Artifact Correction Network Based on a Generative Adversarial Training Strategy
4.4 Clinical MRI Artifact Removal Practice
5. Experimental Results and Discussion
5.1 Datasets
5.2 Training and Validation Details
5.3 Quantitative Comparison Results
5.4 Ablation Study
5.5 Discussion
6. Conclusion
References
Advisor: 蔡宗漢 (Tsung-Han Tsai)    Approval Date: 2024-03-14