Abstract: | Hyperspectral images contain rich spectral information and are widely used in remote sensing. However, despite their high spectral resolution, they often suffer from insufficient spatial resolution compared with multispectral images. Using deep learning, we reconstruct high-spatial-resolution hyperspectral images (HR-HSI) by fusing high-spatial-resolution multispectral images (HR-MSI) with low-spatial-resolution hyperspectral images (LR-HSI). ARC-NET integrates three attention mechanisms: (1) channel attention, which extracts spectral features from the hyperspectral image; (2) self-attention, which captures spatial features from the multispectral image; and (3) fusion attention, which adaptively balances spectral and spatial information during fusion. These mechanisms enable ARC-NET to model pixel-to-pixel and band-to-band relationships effectively, emphasizing informative features while suppressing irrelevant ones. Additionally, residual connections are incorporated to preserve the integrity of the original data during training, ensuring stable reconstruction. 
We evaluate ARC-NET against TFNET, ResTFNET, SSR-NET, Spat-CNN, Spec-CNN, SSFCNN, ConSSFCNN, and MSDCNN on five benchmark datasets (Urban, Indian Pines, Botswana, Pavia Center, and Pavia University) using four standard metrics: Peak Signal-to-Noise Ratio (PSNR), Root Mean Squared Error (RMSE), Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS), and Spectral Angle Mapper (SAM). On the Botswana dataset, ARC-NET achieves an RMSE of 0.308, a PSNR of 41.47 dB, an ERGAS of 1.987, and a SAM of 1.573, outperforming the second-best model by 0.104, 2.55 dB, 0.392, and 0.456, respectively. ARC-NET delivers similarly leading results on the other datasets, demonstrating its effectiveness for hyperspectral and multispectral image fusion and for improving the quality of the recovered spatial and spectral information. |
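
For readers unfamiliar with the four metrics named in the abstract, the following is a minimal sketch of their common textbook definitions, not the evaluation code used for ARC-NET. Several details are assumptions that the abstract does not state: an image is represented as a list of pixels, each a list of band values; `max_val` (the peak value used for PSNR) defaults to 1.0; `ratio` (the spatial scale factor between HR-MSI and LR-HSI used by ERGAS) defaults to 4; PSNR is computed globally rather than per band; and SAM is reported in degrees. The function names are illustrative.

```python
import math


def rmse(ref, est):
    """Root mean squared error over every pixel and band."""
    sq = [(r - e) ** 2 for pr, pe in zip(ref, est) for r, e in zip(pr, pe)]
    return math.sqrt(sum(sq) / len(sq))


def psnr(ref, est, max_val=1.0):
    """Global PSNR in dB; max_val is an assumed peak data value."""
    err = rmse(ref, est)
    return math.inf if err == 0 else 20 * math.log10(max_val / err)


def sam(ref, est):
    """Mean spectral angle (degrees) between matching pixel spectra.

    Assumes no pixel spectrum is all zeros.
    """
    angles = []
    for pr, pe in zip(ref, est):
        dot = sum(r * e for r, e in zip(pr, pe))
        norm = math.sqrt(sum(r * r for r in pr)) * math.sqrt(sum(e * e for e in pe))
        cos = max(-1.0, min(1.0, dot / norm))  # clamp rounding overshoot
        angles.append(math.degrees(math.acos(cos)))
    return sum(angles) / len(angles)


def ergas(ref, est, ratio=4):
    """ERGAS: 100/ratio * sqrt(mean over bands of MSE_b / mean_b^2).

    ratio is the assumed spatial scale factor; mean_b is the mean of
    reference band b and is assumed to be nonzero.
    """
    n_bands = len(ref[0])
    total = 0.0
    for b in range(n_bands):
        ref_b = [p[b] for p in ref]
        est_b = [p[b] for p in est]
        mse_b = sum((r - e) ** 2 for r, e in zip(ref_b, est_b)) / len(ref_b)
        mean_b = sum(ref_b) / len(ref_b)
        total += mse_b / mean_b ** 2
    return 100.0 / ratio * math.sqrt(total / n_bands)
```

Lower RMSE, ERGAS, and SAM and higher PSNR indicate a closer reconstruction. The absolute values depend on the data scaling and on the choices of `max_val` and `ratio`, so numbers produced by this sketch are not directly comparable to the figures reported above unless those settings match.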