Accurate PM2.5 concentration forecasting is essential for air-quality monitoring and management decisions, yet it remains challenging because PM2.5 series are strongly non-stationary, vary considerably across space, and exhibit long-range temporal dependencies. In this work, we propose a Transformer-based model that addresses these challenges through a hybrid trend–residual forecasting mechanism. The model combines three components: spatial–temporal embeddings that capture spatiotemporal correlations; a series decomposition module that separates long-term trends from high-frequency fluctuations; and a De-Stationary Attention mechanism that lets the model adapt to data distributions that shift over time. We evaluate the approach on real PM2.5 observations provided by Taiwan’s Ministry of Environment. Across multiple forecast horizons, the model is consistently more accurate and more stable than recent Transformer variants and conventional linear baselines. Ablation studies further verify the contributions of the hybrid design and the non-stationarity handling. These findings indicate that the proposed method provides an effective framework for medium- to long-term PM2.5 forecasting.
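
As an illustration of the trend–residual idea, the sketch below splits a series into a smooth trend and a high-frequency residual using a centered moving average with edge padding. This is an assumption made for illustration: the abstract does not specify how the decomposition module is implemented, and the function name `decompose` and the `window` parameter are hypothetical.

```python
def decompose(series, window=25):
    """Split a 1-D series into a moving-average trend and a residual.

    Illustrative only: a centered moving average is one common way to
    separate a slow trend from fast fluctuations; it is not claimed to be
    the decomposition used by the model described above.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    values = [float(v) for v in series]
    if not values:
        return [], []
    half = window // 2
    # Repeat the first and last values so the trend keeps the input length.
    padded = [values[0]] * half + values + [values[-1]] * half
    # Each trend point is the mean of the window centered on that position.
    trend = [sum(padded[i:i + window]) / window for i in range(len(values))]
    # The residual is whatever the smooth trend does not explain.
    residual = [v - t for v, t in zip(values, trend)]
    return trend, residual


if __name__ == "__main__":
    # Made-up toy readings, not real observations.
    toy = [12, 15, 11, 30, 14, 13, 16]
    trend, residual = decompose(toy, window=3)
    print(trend)
    print(residual)
```

In a hybrid trend–residual setup, the two components would then be forecast separately and summed to give the final prediction. A larger `window` yields a smoother trend and pushes more of the variation into the residual.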