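摘要: | Machine learning is a powerful tool for assessing air quality: it can provide real-time and forecast information so that the public can take measures to mitigate deteriorating air quality. This study uses a Long Short-Term Memory (LSTM) model for short-term air quality forecasting, focusing on the PM2.5 concentration in Douliu, Yunlin. Douliu's air quality problem may stem from complex emission sources combined with the influence of local circulation and topography, making Douliu one of the most polluted areas in Taiwan. This study therefore uses continuous monitoring data from the EPA Douliu air quality station as the primary training data, together with aerosol optical depth (AOD) from AERONET, pressure data from weather stations, and PM2.5 concentrations from three EPA air quality stations near Douliu as auxiliary inputs to improve prediction performance. Model sensitivity was tested under different model settings to find the best configuration. Forecasts for Douliu in 2021 at lead times of 1 to 24 hours were run as 24 cases to evaluate model sensitivity. The RMSE of the cases ranged from 6.4 to 13.1 µg/m3, and the correlation decreased gradually from 0.92 at the 1-hour lead time to 0.58 at the 24-hour lead time; the 1-hour forecast of PM2.5 concentration followed the trend of the observed values. The model can also predict high-pollution events, such as PM2.5 concentrations reaching 100 µg/m3, so this method can overcome the underestimation seen in other models. However, as the lead time lengthens the model still underestimates: at the 24-hour lead time, the highest predicted PM2.5 concentration reaches only 50-60 µg/m3. The study also found that a deep learning-based LSTM, i.e. a two-layer model, significantly improves the PM2.5 forecast, and that adding auxiliary data such as the observed PM2.5 concentrations from the three nearby stations improves RMSE by nearly 10%. In addition to the 1- to 24-hour forecasts, the study evaluated seasonal and regional forecast results. Seasonally, observations show that winter is the most polluted season in Douliu, with an error between observations and forecasts of about 16.2 µg/m3, while summer shows the lowest error, 5.5 µg/m3, probably because summer is relatively unpolluted and PM2.5 concentrations are low, which also lowers the correlation. Regionally, the model was evaluated with data from ten EPA air quality stations along Taiwan's western coast, at lead times of 1, 12, and 24 hours. Stations in central and southern Taiwan showed trends closest to the Douliu station, while northern stations performed worst, with higher RMSE and lower correlation. In summary, this study takes Taiwan's climate conditions into account and incorporates diverse observational data to build a sound AI model framework that offers a good reference for operational air quality forecasting; the model can be applied to air quality assessment in urban areas of Taiwan and provide early warnings so that public health agencies can take response measures. ;Machine learning has become a powerful tool in air quality assessment: it can provide timely, predictive information to alert the public and support prompt measures against deteriorating air quality. This study used an LSTM algorithm to predict short-term air quality. We focused on the PM2.5 concentration in Douliu, one of the most polluted sites in Taiwan. Douliu's air quality problem may be due to complicated emission sources and the effects of local circulation and topography. EPA air quality data from the Douliu station were used as the primary input for model training. A sensitivity test of different model setups was performed to identify the best combination. Auxiliary features such as AOD from the AERONET database, pressure from CWB open-source data, time indicators, and PM2.5 concentrations from three nearby stations were also considered to improve prediction performance.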
Twenty-four cases, representing 1- to 24-hour predictions in Douliu for 2021, were run to assess model sensitivity. The optimal setup was the one with the best performance; its RMSE varied from 6.4 to 13.1 µg/m3 over the 24 cases. The highest correlation was 0.92 for the 1-hour prediction, and the lowest was 0.58 for the 24-hour forecast. At the 1-hour lead time, the predicted PM2.5 concentrations follow the variation of the observed PM2.5. Additionally, the model can predict high PM2.5 events of nearly 100 µg/m3. This result indicates that the LSTM algorithm could overcome the underestimation problem that affects other algorithms in practice. However, at longer prediction times the model still underestimated: the 24-hour prediction model predicted high PM2.5 events at only 50-60 µg/m3. Using the deep learning-based LSTM (two LSTM layers) improved the predicted PM2.5 concentration. Among the auxiliary features, adding PM2.5 from the three nearby stations performed best, improving RMSE by nearly 10%. Seasonal and regional testing was conducted to assess the performance of the proposed model. Seasonally, the highest error, about 16.2 µg/m3, was observed during winter, the most polluted season in the area. The lowest error, 5.5 µg/m3, was observed during summer; however, summer also had the lowest correlation, because it is not a polluted season and PM2.5 concentrations are low. For regional testing, ten stations in the western coastal region of Taiwan were selected to assess the model's performance, and the 1-, 12-, and 24-hour prediction models were chosen for comparison.
The central and southern Taiwan stations present a similar trend to the Douliu station. On the other hand, the northern Taiwan stations perform the worst, with higher RMSE and lower correlation. Overall, this study provides a good reference for the best settings of a deep learning-based AI model suited to Taiwan's climate conditions and data resources. The model can be implemented for routine air quality monitoring in urban areas and for air-quality alarms associated with public health. |
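
The evaluation described in the abstract, one case per lead time scored by RMSE and correlation, can be illustrated with a minimal sketch. The abstract does not state the input window length or how the metrics were computed, so the `lookback` parameter, the function names `make_windows`, `rmse`, and `pearson_r`, and the use of Pearson correlation are all assumptions made for illustration. The persistence forecast in the usage lines is only a stand-in for a trained model, not the study's LSTM.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Pair each `lookback`-hour input window with the value `horizon` hours after its last hour.

    `lookback` is a hypothetical parameter; the abstract does not give the window length.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback - horizon + 1
    if n <= 0:
        raise ValueError("series too short for this lookback and horizon")
    X = np.stack([series[i:i + lookback] for i in range(n)])
    first_target = lookback + horizon - 1
    y = series[first_target:first_target + n]
    return X, y

def rmse(obs, pred):
    """Root-mean-square error, in the units of the inputs (e.g. µg/m3)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    return float(np.corrcoef(obs, pred)[0, 1])

# Usage: score a persistence baseline (repeat the last observed hour) at three
# lead times on synthetic hourly data. The numbers carry no physical meaning.
rng = np.random.default_rng(0)
pm25 = 20.0 + np.cumsum(rng.normal(0.0, 1.0, size=500))
for horizon in (1, 12, 24):
    X, y = make_windows(pm25, lookback=24, horizon=horizon)
    persistence = X[:, -1]
    print(horizon, round(rmse(y, persistence), 2), round(pearson_r(y, persistence), 2))
```

Building one target array per lead time mirrors the 24-case setup, so each lead time can be trained and scored independently and the decline in skill with lead time can be read directly from the per-case metrics.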