結合前景感知與多尺度注意力機制之語意分割模型應用於土石流偵測;Foreground-Aware and Multi-scale Convolutional Attention Mechanism for Remote Sensing Images Semantic Segmentation in Landslide Detection

Please use this permanent URL to cite or link to this document: https://ir.lib.ncu.edu.tw/handle/987654321/95273

Title:	結合前景感知與多尺度注意力機制之語意分割模型應用於土石流偵測;Foreground-Aware and Multi-scale Convolutional Attention Mechanism for Remote Sensing Images Semantic Segmentation in Landslide Detection
Author:	陳元娣;Chen, Yuan-Di
Contributor:	Institute of Software Engineering
Keywords:	Remote Sensing;Semantic Segmentation;Feature Pyramid Network;Convolutional Attention Mechanism;Multi-scale Feature Fusion
Date:	2024-07-22
Uploaded:	2024-10-09 16:36:58 (UTC+8)
Publisher:	National Central University
Abstract:	As satellite and unmanned aerial vehicle (UAV) technology advances, high-resolution remote sensing images have become easier to acquire, leading to widespread research and applications in many fields. Remote sensing image semantic segmentation is a crucial task that provides semantic and localization information for target objects. Besides the large scale variation common to most semantic segmentation datasets, remote sensing images present two distinctive challenges: an extremely imbalanced foreground-background distribution, and many small objects coexisting with complex backgrounds. However, general semantic segmentation methods primarily address scale variation in natural scenes and neglect these problems specific to remote sensing images; in particular, they lack foreground modeling. This thesis presents a foreground-aware remote sensing semantic segmentation model. The model introduces a multi-scale convolutional attention mechanism and uses a Feature Pyramid Network (FPN) architecture to extract multi-scale features, addressing the multi-scale problem. A foreground-scene relation module models the relationship between the foreground and the scene to enhance foreground features and thereby suppress false alarms. For the loss function, a normalized focal loss (Soft Focal Loss) focuses training on foreground samples, alleviating the foreground-background imbalance. Experimental results indicate that the proposed method surpasses state-of-the-art general semantic segmentation methods and Transformer-based methods on the LS dataset benchmark while balancing speed and accuracy.
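The abstract does not define its Soft Focal Loss, so the following is only a minimal sketch of the standard binary focal loss that such losses build on, not the thesis's formulation. The function name `binary_focal_loss` and the default values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration. It shows how the modulating factor shrinks the contribution of easy, well-classified pixels (typically the abundant background) so that hard pixels carry more of the training signal.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over per-pixel foreground probabilities.

    probs  : predicted foreground probabilities in [0, 1]
    labels : ground-truth labels, 1 = foreground, 0 = background
    gamma  : focusing parameter (assumed value); 0 gives weighted cross-entropy
    alpha  : weight for the foreground class (assumed value)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class of this pixel.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights pixels that are already classified well.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Toy imbalanced example: one hard foreground pixel, nine easy background pixels.
probs = [0.2] + [0.1] * 9
labels = [1] + [0] * 9
plain = binary_focal_loss(probs, labels, gamma=0.0)    # weighted cross-entropy
focused = binary_focal_loss(probs, labels, gamma=2.0)  # focal loss
print(plain, focused)
```

With `gamma=0.0` the function reduces to class-weighted cross-entropy; raising `gamma` lowers the total loss mainly by suppressing the nine easy background pixels, which is the general idea behind focusing training on foreground samples.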
Appears in Collections:	[Institute of Software Engineering] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	310

All items in NCUIR are protected by original copyright.
