As satellite and aerial camera technology advances, acquiring high-resolution remote sensing imagery has become increasingly practical, spurring widespread research and applications across many fields. Remote sensing image semantic segmentation is a crucial task that provides semantic and localization information for target objects. Beyond the large-scale variation common to most semantic segmentation datasets, aerial images pose two unique challenges: an extreme foreground-background imbalance, and many small objects coexisting with highly complex backgrounds. However, general semantic segmentation methods primarily address scale variation in natural scenes and neglect these remote-sensing-specific issues, lacking explicit foreground modeling. In this paper, we present a foreground-aware remote sensing semantic segmentation model. The model introduces a multi-scale convolutional attention mechanism and adopts a Feature Pyramid Network (FPN) architecture to extract multi-scale features, addressing the multi-scale problem. A foreground-scene relation module then models the relationship between the foreground and the scene, enhancing foreground features and thereby suppressing false alarms. For the loss function, a Soft Focal Loss concentrates training on foreground samples, alleviating the foreground-background imbalance.
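The exact form of the Soft Focal Loss is not given in the abstract; as an illustration only, the sketch below assumes the standard focal-loss formulation, FL = -α_t (1 - p_t)^γ log(p_t), which down-weights easy samples so that training gradient concentrates on hard (typically foreground) pixels:

```python
import numpy as np

def soft_focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Focal-loss-style sketch (hypothetical form of the paper's Soft Focal Loss).

    probs:   predicted foreground probabilities in (0, 1)
    targets: binary labels, 1 = foreground, 0 = background
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    # p_t is the probability assigned to the true class at each pixel
    p_t = np.where(targets == 1, probs, 1 - probs)
    # alpha_t re-balances foreground vs. background contributions
    alpha_t = np.where(targets == 1, alpha, 1 - alpha)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) pixels,
    # focusing training on hard foreground samples
    return np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t))
```

With γ = 0 and α = 0.5 this reduces to (half of) plain binary cross-entropy; larger γ pushes the loss further toward hard, misclassified foreground pixels.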
Experimental results indicate that the proposed method surpasses current state-of-the-art general semantic segmentation methods as well as Transformer-based methods on the LS dataset benchmark, achieving a trade-off between speed and accuracy.
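The abstract describes the foreground-scene relation module only at a high level. A minimal NumPy sketch of the general idea, assuming one plausible design (all names and operations here are illustrative, not the paper's actual module): pool a global scene embedding, score each position's similarity to it, and use that relation to gate foreground features.

```python
import numpy as np

def foreground_scene_relation(features, fg_mask):
    """Illustrative foreground-scene relation gating (hypothetical design).

    features: (C, H, W) feature map from the backbone
    fg_mask:  (H, W) soft foreground mask in [0, 1]
    """
    C, H, W = features.shape
    flat = features.reshape(C, -1)                      # (C, HW)
    # Scene embedding: global average pool over all spatial positions
    scene = flat.mean(axis=1, keepdims=True)            # (C, 1)
    # Relation map: cosine similarity between each position and the scene
    norm = np.linalg.norm(flat, axis=0) * np.linalg.norm(scene) + 1e-7
    relation = (scene.T @ flat).ravel() / norm          # (HW,)
    relation = 1.0 / (1.0 + np.exp(-relation))          # squash to (0, 1)
    # Amplify foreground positions in proportion to their scene relation;
    # background positions (mask = 0) pass through unchanged
    gate = 1.0 + relation.reshape(H, W) * fg_mask
    return features * gate[None, :, :]
```

The gating keeps background features intact while boosting foreground responses, which matches the stated goal of enhancing foreground features to suppress false alarms.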