| Abstract: | In recent years, stock markets have become increasingly sensitive to unexpected events such as the COVID-19 pandemic and tariff policy shocks, so forecasting methods that rely solely on historical prices and financial statements struggle to capture rapid market shifts. Although prior studies have incorporated news sentiment into market analysis, most existing methods are limited by insufficient contextual understanding, poor model generalization, or reliance on large annotated datasets. To address these limitations, this study proposes an innovative financial sentiment analysis framework that combines two large language models, FinBERT and ModernBERT, to classify the sentiment of stock news titles and subtitles into three classes (positive, negative, and neutral).
 
 To strengthen the models' understanding of financial language, we apply domain-adaptive pre-training (DAPT) with a masked language modeling (MLM) objective on a corpus of 197,000 unlabeled stock market news articles. The models are then fine-tuned on only 649 manually annotated articles and still perform strongly, indicating that the framework is effective when labeled data are scarce. To further improve prediction stability, we adopt a stacking ensemble in which multinomial logistic regression serves as the meta-learner that combines the predictions of FinBERT and ModernBERT. The resulting ensemble achieves an accuracy of 0.97 and a macro F1-score of 0.97 on the test set, significantly outperforming each individual model.
 
 Additionally, cross-correlation analysis between daily sentiment scores and the stock returns of Apple (AAPL) and Intel (INTC) reveals a lag of approximately one day between sentiment and returns. This suggests that the market's response to financial news may be delayed rather than instantaneous, highlighting the potential of sentiment signals as short-term leading indicators for investment decisions. Overall, the proposed framework offers a scalable and robust approach to real-time financial market analysis and risk assessment, and a practical foundation for future cross-domain and event-driven modeling applications.
 |
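
The lagged cross-correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the function names (`pearson`, `lagged_correlations`, `best_lag`), the `max_lag` parameter, and the data are all hypothetical, and the series are synthetic values built so that returns follow sentiment by one day.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlations(sentiment, returns, max_lag=3):
    """Correlate sentiment on day t with returns on day t + lag, for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        # Drop the last `lag` sentiment values and the first `lag` returns
        # so that sentiment[t] is paired with returns[t + lag].
        s = sentiment[: len(sentiment) - lag]
        r = returns[lag:]
        result[lag] = pearson(s, r)
    return result


def best_lag(correlations):
    """Return the lag with the largest absolute correlation."""
    return max(correlations, key=lambda lag: abs(correlations[lag]))


# Synthetic illustration only (not data from the study):
# each day's return is a scaled copy of the previous day's sentiment.
sentiment = [0.2, -0.5, 0.7, 0.1, -0.3, 0.6, -0.8, 0.4, 0.0, -0.2]
returns = [0.0] + [0.01 * s for s in sentiment[:-1]]

correlations = lagged_correlations(sentiment, returns)
print(best_lag(correlations))
```

On this constructed example the strongest correlation appears at a lag of one day, by design. With real daily sentiment scores and returns, the peak lag would have to be read off the estimated correlations and checked for statistical significance.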