We examine in detail how Reinforcement Learning (RL) can be combined with Graph Attention Networks (GAT) to optimize asset allocation. The main goal of the study is to improve the management efficiency and performance of stock portfolios with these techniques. The work improves strategy performance and also strengthens the model's responsiveness to market changes and its predictive accuracy.

First, we introduce the Graph Attention Network (GAT), a model that can effectively capture the dynamic relationships between stocks. By using GAT to analyze the interactions between stocks, the stock-picking agent can better capture hidden opportunities in the market and effectively predict asset performance under various economic conditions.

Second, we use a Temporal Convolutional Network Autoencoder (TCN-AE) to compress features from daily trading data. This step is crucial for improving the model's efficiency in processing trading information and its predictive accuracy.
Through compression, TCN-AE reduces large volumes of trading data to more refined feature representations, which improves the efficiency and effectiveness of subsequent model training.

Finally, for the reinforcement learning component, the paper adopts the Proximal Policy Optimization (PPO) algorithm. PPO is an advanced reinforcement learning method used here to train the stock-picking agent and the trading agent, automating and optimizing the trading decision process. The agents learn to make optimal decisions under different market conditions and, through continued learning, keep adjusting their strategies to adapt to market changes.

Combining these techniques, the paper proposes an innovative asset allocation system that aims to optimize portfolio management through machine learning. Experimental results show that, compared with the largest ETFs worldwide by assets under management, the system exhibits lower risk and higher return potential. This demonstrates that combining reinforcement learning with graph attention mechanisms is both effective and forward-looking for asset allocation.