This study addresses the challenges posed by cloud occlusion in optical satellite imagery, which obscures surface details and degrades the accuracy of multi-temporal analyses and quantitative retrievals. To overcome this, we propose a conditional diffusion-based method for single-image cloud removal. A binary cloud mask is first generated by applying pixel-level differencing and K-means clustering to multi-spectral optical (Bands 2–4) and auxiliary infrared (Bands 9–11) images; the images are then geometrically corrected and co-registered using GDAL and Rasterio. The aligned data are partitioned into 128×128-pixel patches, which are oversampled and filtered based on cloud-coverage ratio and invalid-pixel thresholds, then augmented with flips and brightness adjustments to form a balanced and diverse training dataset. The model architecture comprises three modules: (1) a time-embedding unit employing sinusoidal encoding and an MLP, (2) a conditional encoder extracting multi-scale cloud representations through stacked convolutions and Time-Condition Fusion Blocks, and (3) a denoising autoencoder built upon a U-Net backbone integrated with Time-Condition Fusion Blocks.
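The time-embedding unit above can be illustrated with the standard transformer-style sinusoidal encoding. This is a minimal NumPy sketch, not the thesis code: the function name and the frequency base of 10000 are assumptions, and the subsequent MLP projection is omitted.

```python
import numpy as np

def sinusoidal_embedding(t, dim=128):
    """Map timesteps t to a dim-dimensional sinusoidal encoding.

    Each of the dim/2 frequency channels gets a sin and a cos component,
    with frequencies spaced geometrically from 1 down to 1/10000.
    """
    half = dim // 2
    # Geometric frequency ladder: exp(-ln(10000) * k / half) for k = 0..half-1.
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    # Broadcast timesteps against frequencies: shape (len(t), half).
    args = np.asarray(t, dtype=np.float64)[..., None] * freqs
    # Concatenate sin and cos parts into the final (len(t), dim) embedding.
    return np.concatenate([np.sin(args), np.cos(args)], axis=-1)
```

In the full model, this fixed encoding would typically be passed through a small MLP before being fused with image features in the Time-Condition Fusion Blocks.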
During training, we adopt a sigmoid β schedule and a curriculum-t sampling strategy, optimizing a dynamically weighted hybrid loss that combines an ε-loss, a ŷ₀-loss, and a weighted MS-SSIM term. Automatic mixed precision (AMP), exponential moving average (EMA), and gradient accumulation are employed to balance denoising performance, detail preservation, and structural fidelity. For inference, a simplified DDIM sampler (5–10 steps) is used, followed by overlap-averaging reconstruction and cloud-mask fusion to produce the final cloud-free output. The resulting images are quantitatively evaluated using the PSNR and SSIM metrics.
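The sigmoid β schedule mentioned above can be sketched as follows. This is an illustrative NumPy version under assumed parameters: the logit range of [-6, 6] and the β endpoints (1e-4, 2e-2) are common defaults, not values taken from the thesis.

```python
import numpy as np

def sigmoid_beta_schedule(T, beta_start=1e-4, beta_end=2e-2):
    """Sigmoid-shaped diffusion noise schedule over T steps.

    Betas rise slowly at the start of the trajectory, steeply in the
    middle, and flatten again near step T, unlike a plain linear ramp.
    """
    # Evaluate the logistic sigmoid on an assumed logit range [-6, 6].
    x = np.linspace(-6.0, 6.0, T)
    sig = 1.0 / (1.0 + np.exp(-x))
    # Rescale from (0, 1) into the [beta_start, beta_end] interval.
    return beta_start + (beta_end - beta_start) * sig
```

From these betas, a sampler would derive the cumulative ᾱ_t = ∏(1 − β_s) products that both the forward noising process and the DDIM update rule consume.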