Thesis/Dissertation 110522116 Detailed Record




Name  Hsing-Wei Chang (常興唯)    Department  Computer Science and Information Engineering
Thesis Title  Establishment and Evaluation of a Semantic Segmentation Dataset for Infrared and Visible Image Fusion
(紅外線與可見光影像融合之語意分割資料集建立及其對融合效果的影響評估)
Related Theses
★ Implementation of a Cross-Platform Wireless Heart Rate Analysis System Based on Qt
★ A Mechanism for Transmitting Additional Messages over VoIP
★ Detection of Transition Effects Associated with Sports Highlights
★ Video/Image Content Authentication Based on Vector Quantization
★ A Baseball Highlight Extraction System Based on Transition Effect Detection and Content Analysis
★ Image/Video Content Authentication Based on Visual Feature Extraction
★ Foreground Object Detection and Tracking in Moving Surveillance Video Using Dynamic Background Compensation
★ Adaptive Digital Watermarking for H.264/AVC Video Content Authentication
★ A Baseball Highlight Extraction and Classification System
★ A Real-Time Multi-Camera Tracking System Using H.264/AVC Features
★ Preceding Vehicle Detection on Highways Using Implicit Shape Models
★ Video Copy Detection Based on Temporal and Spatial Feature Extraction
★ In-Vehicle Video Coding Combining Digital Watermarking and Region-of-Interest Bit-Rate Control
★ H.264/AVC Video Encryption/Decryption and Digital Watermarking for Digital Rights Management
★ A News Video Analysis System Based on Text and Anchor Detection
★ H.264/AVC Video Content Authentication Based on Digital Watermarking
File  View the thesis in the system (available after 2025-7-25)
Abstract (Chinese)  The goal of image fusion is to integrate different types of input images and, by exploiting the complementary information between them, generate an image with a more complete scene representation and better visual perception, in order to support subsequent high-level vision tasks such as object detection and semantic segmentation. Infrared and visible image fusion has attracted wide research attention, but training models with deep learning methods usually requires a large amount of labeled data, whereas existing infrared and visible image fusion datasets provide only the images, without precise object annotations or semantic segmentation labels; this affects the quality of fusion results and limits further development of the field. This study proposes a method for building an infrared and visible image fusion dataset with semantic segmentation information: visible images from an existing semantic segmentation dataset are converted into corresponding infrared images through style transfer, yielding a labeled fusion dataset in which every infrared and visible image pair carries a matching semantic segmentation label. Building the dataset in this way improves fusion performance. It also addresses the resolution differences and content misalignment that can occur between infrared and visible images captured in practice, by providing an alignment method based on semantic segmentation masks that resamples and aligns the infrared and visible images, saving considerable time and manual effort in the alignment preprocessing that is common in this line of research.
Abstract (English)  The purpose of image fusion is to integrate different types of input images and generate a more complete image with improved scene representation and visual perception, supporting advanced vision tasks such as object detection and semantic segmentation. Infrared and visible image fusion is a widely studied research area, but training fusion models using deep learning methods often requires a large amount of annotated data. Existing infrared and visible image fusion datasets provide only images, without precise object annotations or semantic segmentation labels, which affects the quality of fusion results and limits the further development of related fields. In this study, we propose a method to create a dataset for infrared and visible image fusion with semantic segmentation information. We utilize general images from existing semantic segmentation datasets and generate corresponding infrared images using style transfer techniques. This allows us to establish a labeled fusion image dataset in which each pair of infrared and visible images is accompanied by its respective semantic segmentation label. This dataset creation method improves image fusion performance and also provides an alignment method, based on semantic segmentation masks, for handling the resolution differences and misalignment found in real-world infrared and visible images, which saves significant time and resources in the common alignment preprocessing step.
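As a concrete illustration of the mask-based alignment idea described in the abstract, below is a minimal Python sketch, assuming OpenCV and NumPy are available. The centroid-matching strategy, the partial affine motion model, and the function names (matched_centroids, align_ir_to_visible) are illustrative assumptions, not the exact procedure used in the thesis.

# Illustrative sketch (not the thesis implementation): align an infrared image
# to a visible image using their semantic segmentation masks, then resample it
# to the visible image's resolution.
import cv2
import numpy as np

def matched_centroids(mask_a, mask_b, class_ids):
    # Collect centroids of classes that appear in both segmentation masks,
    # so the two point lists stay in correspondence.
    src, dst = [], []
    for c in class_ids:
        ya, xa = np.nonzero(mask_a == c)
        yb, xb = np.nonzero(mask_b == c)
        if len(xa) > 0 and len(xb) > 0:
            src.append([xa.mean(), ya.mean()])
            dst.append([xb.mean(), yb.mean()])
    return np.float32(src), np.float32(dst)

def align_ir_to_visible(ir_img, ir_mask, vis_img, vis_mask, class_ids):
    # Resample the infrared image and its mask to the visible resolution first,
    # so both modalities share a common pixel grid.
    h, w = vis_img.shape[:2]
    ir_img = cv2.resize(ir_img, (w, h), interpolation=cv2.INTER_LINEAR)
    ir_mask = cv2.resize(ir_mask, (w, h), interpolation=cv2.INTER_NEAREST)

    # Use centroids of classes present in both masks as correspondences and
    # estimate a partial affine transform (rotation, translation, scale).
    src, dst = matched_centroids(ir_mask, vis_mask, class_ids)
    if len(src) < 3:
        return ir_img  # too few correspondences; keep the resized image
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    if M is None:
        return ir_img
    return cv2.warpAffine(ir_img, M, (w, h), flags=cv2.INTER_LINEAR)

Because each pseudo-infrared image generated by style transfer inherits the segmentation label of its visible counterpart, masks for both modalities are available by construction, which is what makes this kind of mask-driven resampling and alignment feasible.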
Keywords  ★ Image fusion
★ Image alignment
★ Semantic segmentation
★ Deep learning
★ Style transfer
Table of Contents  Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
Table of Contents IV
List of Figures VI
List of Tables VIII
Chapter 1. Introduction 1
1.1. Research Motivation 1
1.2. Research Contributions 4
1.3. Thesis Organization 5
Chapter 2. Related Work 6
2.1. Typical Infrared and Visible Image Fusion Methods 6
2.1.1. Autoencoder (AE)-Based Fusion Methods 6
2.1.2. Convolutional Neural Network (CNN)-Based Fusion Methods 8
2.1.3. Generative Adversarial Network (GAN)-Based Fusion Methods 9
2.2. High-Level Vision Task-Oriented Infrared and Visible Image Fusion Methods 10
2.3. Infrared and Visible Image Fusion Datasets 12
Chapter 3. Proposed Method 14
3.1. Dataset Construction 14
3.1.1. The Cityscapes Dataset 15
3.1.2. Cross-Modality Perceptual Style Transfer Network (CPSTN) 15
3.1.3. Training CPSTN on Different Datasets 16
3.2. Image Alignment 17
3.3. Image Fusion 19
3.3.1. Network Architecture 19
3.3.2. Loss Functions 20
3.3.3. Training Strategy 24
Chapter 4. Experimental Results 25
4.1. Development Environment 25
4.2. Test Datasets 25
4.3. Pseudo-Infrared Image Generation Results 25
4.4. Infrared Image Semantic Segmentation Results 26
4.5. Infrared and Visible Image Alignment Results 27
4.6. Infrared and Visible Image Fusion Results 30
4.6.1. Training Details 30
4.6.2. Metric Evaluation 31
4.6.3. Fusion Results 36
Chapter 5. Conclusion and Future Work 40
5.1. Conclusion 40
5.2. Future Work 40
References 41
Advisor  Po-Chyi Su (蘇柏齊)    Date of Approval  2023-7-28
