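Abstract: | Cloud detection is an important issue in the preprocessing of optical satellite imagery. In practice, however, traditional expert systems (decision trees) often fail to provide a general-purpose, high-accuracy model. Because such models also require a relatively large number of spectral bands, they cannot be applied to the currently mainstream commercial satellites, which offer high spatial resolution but carry only 3 to 5 bands. This report therefore attempts cloud detection using only the visible and near-infrared bands together with a convolutional neural network. Satellite imagery, however, differs greatly from the images used in conventional computer vision: the image size is an order of magnitude larger, the gray-level distribution is extreme, and labeled data are scarce, all of which make training harder. Accordingly, besides presenting and analyzing the results, this report offers a strategy for designing such neural network models, examines network normalization, image preprocessing, and image size, and also attempts to combine a traditional expert system to correct the predictions. For the datasets, this report uses two satellites, Landsat-8 and Sentinel-2: the model is trained on the Landsat-8 Cloud Cover Assessment (CCA) dataset and tested on the Landsat-8 Spatial Procedures for Automated Removal of Cloud and Shadow (SPARCS) dataset and a Sentinel-2 dataset. According to the test results, the estimated accuracy on Landsat-8 reaches 95.7% with a commission error of only 16%. After correction with short-wave infrared, the accuracy reaches 97.4% and the commission error drops to 7.5%. For Sentinel-2, the omission error for non-cirrus clouds is only 2% to 3%, and the commission error is far below 1% over both land and water. More importantly, the model also performs reasonably well on original, non-downsampled Sentinel-2 imagery, which suggests that it is feasible for other higher-resolution commercial satellites such as SPOT. ;Cloud detection is an important yet difficult task for optical satellites, especially for images with a limited number of bands in the visible to near-infrared (VNIR) range. Approaches based on expert systems that exploit the optical properties of clouds are commonly adopted. However, these approaches still need sufficient spectral bands to recognize a variety of cloud types, and most high-spatial-resolution missions provide only 4 bands in VNIR. Thanks to the rapid development of computing, machine learning (ML) is now another option for cloud detection. This study aims to build a model based on Convolutional Neural Networks (CNN) and treats the problem as binary semantic segmentation. Unlike traditional computer vision applications of CNNs, the major challenges here are the huge image size, a less favorable histogram distribution, insufficient data with poor labeling quality, and the large variety of cloud formations. Thus, in addition to accuracy analysis, we focus on the relationship between input image size, the choice of normalization method, image preprocessing, and the tone-mapping strategy.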
In this research, we first split multiple satellite images and validation datasets into training and testing subsets. To assess the transferability of the CNN model, especially to other satellites with higher spatial resolution but no cloud flags, a workflow is designed that first trains the CNN on the Landsat-8 Cloud Cover Assessment (CCA) dataset and then tests its detection ability on both the Landsat-8 Spatial Procedures for Automated Removal of Cloud and Shadow (SPARCS) dataset and a Sentinel-2 dataset. Based on our results, the current workflow is stable for Landsat-8 and transferable to Sentinel-2 data products. The overall accuracy for Landsat-8 is 95.7% with a commission error of only 16%. After calibration with short-wave infrared (SWIR), the accuracy reaches 97.4% with a commission error of only 7.5%. For Sentinel-2, the omission error for the cloud class, excluding thin cirrus, can be as low as 3%, with a commission error below 1% over land and water. |
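The accuracy figures quoted above (overall accuracy, commission error, omission error) can be illustrated with a minimal sketch. It assumes the conventional remote-sensing definitions for a binary cloud mask, which the abstract does not spell out: commission error is the share of pixels predicted as cloud that are not cloud in the reference, and omission error is the share of reference cloud pixels that the prediction missed. The function name `cloud_mask_errors` and the toy masks are hypothetical and are not taken from the study.

```python
def cloud_mask_errors(predicted, reference):
    """Compare two flattened binary cloud masks (1 = cloud, 0 = clear).

    Returns overall accuracy plus commission and omission error for the
    cloud class, using the conventional definitions (an assumption here).
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")

    tp = fp = fn = tn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1          # cloud correctly detected
        elif p and not r:
            fp += 1          # clear pixel flagged as cloud (commission)
        elif r:
            fn += 1          # cloud pixel missed (omission)
        else:
            tn += 1          # clear pixel correctly left clear

    total = tp + fp + fn + tn
    nan = float("nan")
    return {
        "overall_accuracy": (tp + tn) / total if total else nan,
        "commission_error": fp / (tp + fp) if (tp + fp) else nan,
        "omission_error": fn / (tp + fn) if (tp + fn) else nan,
    }


# Toy 10-pixel example (made-up values, for illustration only).
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
reference = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
scores = cloud_mask_errors(predicted, reference)
```

In this toy case one clear pixel is flagged as cloud and one cloud pixel is missed, so both error rates are 1 in 4 while 8 of the 10 pixels agree overall.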