This thesis investigates strategies for improving attention mechanisms in audio processing, applied to music classification, and addresses the attention-bias problem observed in existing methods. We first propose a Learnable Counterfactual Attention (LCA) mechanism, which introduces a trainable counterfactual branch during training to locate regions that appear meaningful on the surface but lack discriminative power; the main attention branch is then steered away from these regions and toward acoustic features genuinely relevant to the task. Experiments on the artist20, GTZAN, and FMA benchmark datasets show that LCA consistently improves performance on both singer identification and music genre classification, and the counterfactual branch is removed at inference to preserve efficiency. To overcome a limitation of one-shot counterfactual learning, namely the residual attention that can remain on biased regions, we further propose Progressive Learnable Counterfactual Attention (PLCA). PLCA reformulates counterfactual learning as a multistage attention-refinement process in which attention is gradually decoupled from bias-prone regions through repeated conditioning, reducing the model's reliance on biased features and improving semantic alignment and classification stability. Experimental results confirm that PLCA further improves classification accuracy and the quality of learned representations with minimal additional parameter overhead, demonstrating its potential for bias suppression and attention-supervision design. In summary, this thesis validates the effectiveness of counterfactual attention learning for music classification and, through PLCA, extends its learning depth and structural flexibility, laying a foundation for future research on attention debiasing.
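To make the LCA training setup concrete, below is a minimal PyTorch-style sketch, not the thesis implementation: the names (`LCABlock`, `main_attn`, `cf_attn`, `classifier`), the 1x1-conv attention maps, and the sigmoid-mask pooling are all illustrative assumptions. Only the overall pattern follows the abstract: a trainable counterfactual branch active during training, whose logits are subtracted from the main branch's to form an "effect" signal, and which is skipped at inference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LCABlock(nn.Module):
    """Sketch of learnable counterfactual attention over a feature map.

    All layer names and shapes here are hypothetical; the thesis
    architecture may differ.
    """

    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        # Main branch: attention over task-relevant regions.
        self.main_attn = nn.Conv2d(channels, 1, kernel_size=1)
        # Counterfactual branch: attention over seemingly meaningful but
        # non-discriminative regions (used in training, dropped at test time).
        self.cf_attn = nn.Conv2d(channels, 1, kernel_size=1)
        self.classifier = nn.Linear(channels, num_classes)

    def _attend(self, feats: torch.Tensor, attn_logits: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(attn_logits)          # (B, 1, F, T) soft mask
        pooled = (feats * attn).mean(dim=(2, 3))   # attention-weighted pooling
        return self.classifier(pooled)             # (B, num_classes)

    def forward(self, feats: torch.Tensor):
        logits_main = self._attend(feats, self.main_attn(feats))
        if not self.training:
            return logits_main                     # CF branch removed at inference
        logits_cf = self._attend(feats, self.cf_attn(feats))
        # "Effect" logits: what the main attention explains beyond the
        # counterfactual (bias-prone) attention.
        return logits_main, logits_cf, logits_main - logits_cf
```

A hypothetical training step then supervises both the main and the effect logits, which pushes the main branch away from regions the counterfactual branch can already exploit:

```python
# `feats` are backbone features of a spectrogram batch, `y` the labels.
logits_main, logits_cf, logits_effect = block(feats)
loss = F.cross_entropy(logits_main, y) + F.cross_entropy(logits_effect, y)
```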
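The progressive variant can then be read as stacking such stages, each conditioned on a feature map from which the previous stage's counterfactual regions have been suppressed. The sketch below is again assumption-laden (the stage count, the `1 - mask` suppression rule, and applying the masks at inference are all design guesses); it reuses the hypothetical `LCABlock` above.

```python
class PLCAHead(nn.Module):
    """Sketch of progressive refinement: successive LCA stages, each fed
    features with the previous stage's bias-prone regions down-weighted.
    The number of stages and the suppression rule are assumptions."""

    def __init__(self, channels: int, num_classes: int, num_stages: int = 3):
        super().__init__()
        # Each stage adds only two 1x1 attention convs and a linear head,
        # keeping the extra parameter count small.
        self.stages = nn.ModuleList(
            [LCABlock(channels, num_classes) for _ in range(num_stages)]
        )

    def forward(self, feats: torch.Tensor):
        outputs, x = [], feats
        for stage in self.stages:
            outputs.append(stage(x))               # per-stage logits
            # Suppress regions flagged by this stage's counterfactual
            # branch before conditioning the next stage on the features.
            cf_mask = torch.sigmoid(stage.cf_attn(x))
            x = x * (1.0 - cf_mask)
        return outputs
```

In this reading, training could average the per-stage losses while inference takes the final stage's main logits as the prediction; both choices are illustrative rather than prescribed by the abstract.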