Graduate Thesis 111022606: Detailed Record




Name: Juan Felipe Giraldo Cardenas (柯浩飛)    Department: Master Program in Remote Sensing Science and Technology
Thesis title: 監督性變換器模型對變遷偵測應用的預訓練與微調策略
(Supervised transformer-based models pre-training and fine-tuning strategies for change detection)
Related theses
★ River channel detection in remote sensing images and blood vessel detection in medical images using image processing
★ An improved adjustable Doppler active radar calibrator
★ Change detection of remote sensing images based on color correction
★ Hyperspectral image classification using hierarchical affinity propagation
★ Removal of clouds and their shadows from remote sensing images and cloud height estimation
★ The relationship between hydrothermal activity and earthquakes in the waters around Guishan Island
★ Micro-Doppler analysis of human gait using through-wall continuous-wave radar
★ A novel hybrid corner-reflector method for fully polarimetric SAR calibration
★ Linear and nonlinear bathymetry inversion models using multispectral remote sensing images
★ Nonlinear spectral unmixing considering multiple reflections for hyperspectral images
★ Detecting land-surface temperature anomalies with MODIS: the correlation between thermal anomalies and earthquakes
★ Automatic detection of urban roads in multispectral remote sensing images
★ Cloud-pixel identification in geostationary satellite observations
★ Retrieval of atmospheric temperature and humidity profiles combining radio-occultation refractivity and hyperspectral infrared observations
★ Bathymetric correction of remote sensing imagery for seagrass habitat change at Dongsha Atoll
★ Qualitative and quantitative analysis of SAR-based digital elevation models
Files: full text available in the thesis system after 2025-08-01.
Abstract (Chinese): Image change detection is an important application of remote sensing; its goal is to automatically detect the changes between two or more images of the same scene acquired at different times. However, for machine-learning algorithms most available datasets contain few samples, which leads models to overfit. To address this challenge, we apply transfer-learning strategies that carry the knowledge obtained on one dataset into the training on a new one, followed by fine-tuning, so that the new model learns from both datasets and can generalize across them. Current state-of-the-art methods rely on deep learning and Transformer architectures. This study examines how Transformer-based models, specifically BIT and ChangeFormer, perform on different datasets when transfer learning is applied. The objective is to exploit the Transformer's ability to model global context in order to improve change-detection accuracy. The models are evaluated on three datasets, LEVIR-CD, WHU-CD, and DSIFN-CD, including their adaptability and robustness in various scenarios. The evaluation metrics are Overall Accuracy, Intersection-over-Union, F1-score, Precision, and Recall. Fine-tuned models that transfer knowledge from one dataset to another show improved change-detection metrics, demonstrating that Transformer and transfer-learning pipelines are an effective strategy for change-detection tasks.
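All of the evaluation metrics named above can be derived from the per-pixel confusion counts of a binary change map. Below is a minimal NumPy sketch, assuming 0/1 prediction and ground-truth arrays; the function name and the random example inputs are illustrative placeholders, not code from the thesis.

```python
import numpy as np

def change_detection_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    # Per-pixel confusion counts for binary change maps (1 = change).
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0   # IoU of the change class
    oa = (tp + tn) / pred.size                           # Overall Accuracy
    return {"OA": oa, "IoU": iou, "F1": f1, "Precision": precision, "Recall": recall}

# Example with random maps:
rng = np.random.default_rng(0)
print(change_detection_metrics(rng.integers(0, 2, (256, 256)),
                               rng.integers(0, 2, (256, 256))))
```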
Abstract (English): Image change detection is an important task in remote sensing, aiming to automatically detect changes between two or more images of the same scene taken at different times. However, most of the available datasets are small, which leads models to overfit. To deal with this challenge, we use transfer-learning strategies that carry the knowledge obtained on one dataset into a new training run (fine-tuning), so that the new model learns from both datasets and generalizes, modeling global context across datasets. State-of-the-art approaches rely on deep learning and transformer architectures. This research investigates the effectiveness of transformer-based models, specifically the Bitemporal Image Transformer (BIT) and ChangeFormer, in detecting changes across different datasets using transfer learning. The study aims to leverage the transformers' ability to model global context to enhance change-detection accuracy. By evaluating these models on the LEVIR-CD, WHU-CD, and DSIFN-CD datasets, we assess their adaptability and robustness in various scenarios. The metrics used to evaluate our pipelines are Overall Accuracy (OA), Intersection-over-Union (IoU), F1-score, Precision, and Recall. When knowledge is transferred from one dataset to a model fine-tuned on another, the metrics show improved change detection, demonstrating that transformer and transfer-learning pipelines can help with change-detection tasks.
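To make the transfer-learning pipeline concrete, here is a minimal PyTorch sketch of the fine-tuning step the abstract describes: weights pre-trained on a source dataset (e.g., LEVIR-CD) are loaded and then adapted on a target dataset (e.g., WHU-CD) with a small learning rate. The tiny convolutional stand-in network, the checkpoint filename, and the dummy target batch are hypothetical placeholders for the actual BIT or ChangeFormer models and data.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def build_model() -> nn.Module:
    # Placeholder standing in for the real BIT / ChangeFormer constructor:
    # input is a bi-temporal pair stacked on channels (3 + 3), output is a
    # 2-class (change / no-change) per-pixel logit map.
    return nn.Sequential(
        nn.Conv2d(6, 16, 3, padding=1),
        nn.ReLU(),
        nn.Conv2d(16, 2, 1),
    )

# Simulate a checkpoint pre-trained on the source dataset (in practice this
# would come from training on LEVIR-CD). The filename is hypothetical.
torch.save(build_model().state_dict(), "bit_levir_cd.pt")

# 1) Load the source-dataset weights into a fresh model; strict=False lets
#    the transfer tolerate a replaced prediction head.
model = build_model()
model.load_state_dict(torch.load("bit_levir_cd.pt", map_location="cpu"),
                      strict=False)

# Dummy target-dataset batch standing in for WHU-CD bi-temporal pairs + masks.
pairs = torch.randn(8, 6, 64, 64)
masks = torch.randint(0, 2, (8, 64, 64))
target_loader = DataLoader(TensorDataset(pairs, masks), batch_size=4)

# 2) Fine-tune with a small learning rate so the transferred representation
#    is adapted to the target dataset rather than overwritten.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
criterion = nn.CrossEntropyLoss()
for x, y in target_loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```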
Keywords (Chinese): ★ 變換器 (Transformer)    Keywords (English): ★ Transformer
Table of Contents
摘要 (Chinese abstract)
Abstract
Contents
List of Figures
List of Tables
Explanation of symbols
Chapter 1 Introduction
  1.1 Motivation
  1.2 Objectives
  1.3 Overview
  1.4 Thesis organization
Chapter 2 Literature review
  2.1 Transformer-based architecture with the Bitemporal Image Transformer (BIT)
    CNN Backbone
    Bitemporal Image Transformer (BIT)
    Prediction head
  2.2 ChangeFormer
    Hierarchical Transformer Encoder
    Difference module
    MLP decoder
Chapter 3 Methodology
  3.1 Datasets
  3.2 Training and testing strategies
  3.3 Metrics
  3.4 Implementation details
Chapter 4 Results and discussion
  4.1 Testing BIT
    Without fine-tuning
    With fine-tuning
  4.2 Testing ChangeFormer
    Without fine-tuning
    With fine-tuning
Chapter 5 Conclusions and future work
  5.1 Conclusions
  5.2 Future work
References
Advisor: Hsuan Ren (任玄)    Date of approval: 2024-07-26