孿生變化檢測網路結合注意力機制及多尺度特徵之空拍和遙測影像檢測模型;Siamese Networks with Attention Mechanism and Multiscale Features for Aerial and Remote Sensing Images Change Detection

Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/89787

Title:	孿生變化檢測網路結合注意力機制及多尺度特徵之空拍和遙測影像檢測模型;Siamese Networks with Attention Mechanism and Multiscale Features for Aerial and Remote Sensing Images Change Detection
Author:	叢伯蘭;Tsung, Po-Lan
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Change Detection;Siamese Network;Attention Mechanism;Multiscale Features Fusion
Date:	2022-07-21
Upload time:	2022-10-04 11:59:45 (UTC+8)
Publisher:	National Central University
Abstract:	With advances in satellite and aerial camera technology, high-resolution remote sensing and aerial images are increasingly easy to obtain, which has spurred a wide range of research and applications. Change detection is one of the important topics among them. Earlier methods fall roughly into two types, pixel-based and object-based, and rely on thresholding algorithms, statistical analysis such as PCA, or machine learning classifiers; they are easily affected by background noise and by the size of the target objects, which leads to unsatisfactory results. In recent years, deep learning has been widely applied to change detection. This thesis proposes a Siamese change detection network for buildings in remote sensing and aerial images, with the goal of identifying whether a building has been newly constructed or demolished. The model uses an encoder-decoder architecture combined with channel attention, spatial attention, and self and cross attention mechanisms, and applies multiscale feature fusion in the encoder's feature-extraction backbone. The network is trained end to end: it takes two images captured at different times as input and outputs a binary change map. The experiments use the LEVIR-CD, WHU, and CDD datasets, which contain remote sensing and aerial images of different regions, for training and testing, with precision, recall, F1 score, overall accuracy, and IoU as evaluation metrics. On these metrics, the proposed model scores higher than the other methods it is compared against.
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses
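
The evaluation metrics named in the abstract (precision, recall, F1 score, overall accuracy, and IoU) can all be derived from a pixel-wise comparison of a predicted binary change map with its ground truth. The following is a minimal sketch under stated assumptions, not the thesis's evaluation code: the function name `change_detection_metrics`, the nested-list input format, and the choice to return 0.0 when a denominator is zero are all illustrative.

```python
def change_detection_metrics(pred, truth):
    """Compute binary change-detection metrics from two same-sized 0/1 maps.

    `pred` and `truth` are lists of rows; 1 marks a changed pixel, 0 unchanged.
    (Hypothetical helper for illustration; not taken from the thesis.)
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # changed pixel correctly detected
            elif p and not t:
                fp += 1      # false alarm
            elif t:
                fn += 1      # missed change
            else:
                tn += 1      # unchanged pixel correctly left alone

    def ratio(num, den):
        # Assumption: report 0.0 instead of raising when a metric is undefined.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "overall_accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "iou": ratio(tp, tp + fp + fn),  # intersection over union of the "changed" class
    }


# Toy 2x3 change maps: 2 true positives, 1 false positive, 1 false negative, 2 true negatives.
metrics = change_detection_metrics([[1, 1, 0], [0, 1, 0]],
                                   [[1, 0, 0], [0, 1, 1]])
print(metrics["iou"])  # 0.5
```

IoU is computed here for the "changed" class only, which is a common convention for binary change maps; whether the thesis averages over both classes is not stated in the abstract.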

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	32

All items in NCUIR are protected by the original copyright.
