Master's/Doctoral Thesis 108552020 — Detailed Record




Author: 黃俊傑 (Chun-Chieh Huang)    Department: Computer Science and Information Engineering (in-service master's program)
Thesis title: Portfolio Management via Reinforcement Learning and Graph Attention Network
(基於強化式學習與圖注意力機制於資產配置之研究)
Related theses
★ Identifying itinerary-invitation emails and extracting irregular time expressions
★ Design of the NCUFree campus wireless network platform and development of application services
★ Design and implementation of a web semi-structured data extraction system
★ Mining and applications of non-simple browsing paths
★ Improved mining of association rules over incremental data
★ Applying the chi-square test of independence to associative classification
★ Design and study of a Chinese data extraction system
★ Visualization of non-numerical data and clustering with both subjective and objective criteria
★ Associated word groups in document summarization
★ Cleaning web pages: page segmentation and data-region extraction
★ Design and study of sentence classification and ranking in question-answering systems
★ Efficient mining of compact frequent serial episodes in temporal databases
★ Axis arrangement in star coordinates for cluster visualization
★ Automatic generation of web-scraping programs from browsing histories
★ Template and data analysis of dynamic web pages
★ Automated integration of data from homogeneous web pages
Files: full text available in the system after 2024-12-31
Abstract (Chinese, translated): We examine in detail how to combine Reinforcement Learning (RL) with Graph Attention Networks (GAT) to optimize the asset allocation process. The main goal of this study is to use these techniques to improve the management efficiency and performance of stock portfolios. The research not only improves the strategy's performance but also strengthens the model's responsiveness to market changes and its predictive accuracy.
We first introduce Graph Attention Networks (GAT), a model that effectively captures dynamic relationships between stocks. By analyzing inter-stock interactions with GAT, the stock-picking agent can better capture hidden opportunities in the market and effectively predict asset performance under various economic conditions.
Second, we use a Temporal Convolutional Network Autoencoder (TCN-AE) to compress features from daily trading data. This step is crucial for improving the model's efficiency in processing trading information and its predictive precision. Through compression, the TCN-AE reduces large volumes of trading data to a more refined feature representation, which improves the efficiency and effectiveness of subsequent model training.
Finally, for the reinforcement learning component, the thesis adopts the Proximal Policy Optimization (PPO) algorithm. PPO is an advanced reinforcement learning method used to train the stock-picking and trading agents, automating and optimizing the trading decision process. The agents learn to make optimal decisions under different market conditions and, through continuous learning, keep adjusting their strategies to adapt to market changes.
Combining these techniques, the thesis proposes an innovative asset allocation system that optimizes portfolio management through machine learning. Experimental results show that, compared with the ETF with the largest assets under management worldwide, the system exhibits lower risk and higher return potential, demonstrating the effectiveness and promise of combining reinforcement learning with graph attention mechanisms for asset allocation.
Abstract (English): We investigate in detail how to integrate Reinforcement Learning (RL) and Graph Attention Networks (GAT) to optimize the asset allocation process. The primary aim of this study is to enhance the management efficiency and performance of stock portfolios through these advanced technologies. This research not only improves the performance of the strategy but also enhances the model's responsiveness to market fluctuations and its predictive accuracy.
We first introduce Graph Attention Networks (GAT), which we use to analyze interactions between stocks. Through this analysis, the stock-picking agent can better capture hidden opportunities in the market and effectively predict asset performance under various economic conditions.
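The GAT update can be sketched as a minimal single-head attention layer: each stock's features are projected, attention coefficients are computed over its neighbours, and neighbour features are aggregated with those weights. The graph, feature sizes, and random weights below are illustrative placeholders, not the thesis' actual configuration.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(H, A, W, a):
    """One single-head GAT layer.
    H: (N, F) node features, A: (N, N) adjacency,
    W: (F, F') shared projection, a: (2F',) attention vector."""
    Z = H @ W                                   # project node features
    N = Z.shape[0]
    out = np.zeros_like(Z)
    for i in range(N):
        nbrs = np.where(A[i] > 0)[0]            # neighbours of stock i
        # unnormalised logits e_ij = LeakyReLU(a^T [z_i || z_j])
        e = np.array([leaky_relu(a @ np.concatenate([Z[i], Z[j]]))
                      for j in nbrs])
        alpha = softmax(e)                      # attention coefficients
        out[i] = (alpha[:, None] * Z[nbrs]).sum(axis=0)
    return out

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))    # 4 stocks, 3 features each (toy data)
A = np.ones((4, 4))            # fully connected graph with self-loops
W = rng.normal(size=(3, 2))
a = rng.normal(size=(4,))
print(gat_layer(H, A, W, a).shape)   # (4, 2)
```

In a multi-head variant, several such layers run in parallel and their outputs are concatenated or averaged.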
Additionally, we utilized a Temporal Convolutional Network Autoencoder
(TCN-AE) to compress features from daily trading data. This step is crucial
for enhancing the model’s efficiency in processing transaction information and
improving predictive accuracy. Through the compression process, TCN-AE simplifies
large volumes of trading data into more refined feature representations,
which aids in enhancing the efficiency and effectiveness of subsequent model
training.
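The compression idea behind a TCN-AE can be illustrated with a toy encoder: stacked causal dilated 1-D convolutions followed by downsampling, so a long daily series is reduced to a short code. The filter weights, dilation schedule, and compression ratio below are illustrative assumptions, not the thesis' architecture.

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution with dilation: output[t] depends only on
    x[t], x[t-d], x[t-2d], ... (series is left-padded with zeros)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

# Toy "encoder": two dilated conv layers, then downsample to a short code.
x = np.sin(np.linspace(0, 6, 64))                       # stand-in daily series
w = np.array([0.5, 0.3, 0.2])                           # toy filter weights
h = causal_dilated_conv(x, w, dilation=1)
h = causal_dilated_conv(h, w, dilation=2)               # widen receptive field
code = h[::8]                                           # compressed: 64 -> 8
print(code.shape)                                       # (8,)
```

A matching decoder (upsampling plus further convolutions) would reconstruct the series, and the autoencoder would be trained on reconstruction error; the 8-value code is the refined feature representation fed to later models.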
Finally, the thesis applies reinforcement learning, specifically the Proximal Policy Optimization (PPO) algorithm. PPO is an advanced reinforcement learning method employed to train stock-picking and trading agents, automating and optimizing the decision-making process in trading. The agents learn to make optimal decisions under varying market conditions and continuously adjust their strategies through an ongoing learning process to adapt to market fluctuations.
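The heart of PPO is its clipped surrogate objective (Schulman et al., 2017), which limits how far each policy update can move from the old policy. The sketch below shows that objective on toy numbers; the log-probabilities and advantages are made up for illustration, and the thesis' actual networks and hyper-parameters are not reproduced here.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """L^CLIP = E[min(r_t * A_t, clip(r_t, 1-eps, 1+eps) * A_t)],
    with probability ratio r_t = pi_new(a|s) / pi_old(a|s).
    Returned negated, since optimizers minimize."""
    ratio = np.exp(new_logp - old_logp)              # r_t
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))

# Toy batch of three actions under the old and updated policies.
old_logp = np.log(np.array([0.2, 0.5, 0.3]))
new_logp = np.log(np.array([0.4, 0.4, 0.2]))
adv = np.array([1.0, -0.5, 0.3])
print(ppo_clip_loss(new_logp, old_logp, adv))        # ≈ -0.3333
```

The clip keeps the first action's ratio of 2.0 from contributing more than 1.2 × its advantage, which is exactly the mechanism that stabilizes the agents' updates across changing market regimes.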
Integrating these technologies, the thesis proposes an innovative asset allocation system designed to optimize portfolio management through machine learning techniques. Experiments show that, compared with the largest ETF by assets under management worldwide, the system achieves lower risk and higher return potential. This demonstrates the effectiveness and forward-looking nature of combining reinforcement learning and graph attention mechanisms in asset allocation.
Keywords (Chinese, translated)
★ Reinforcement learning
★ Graph attention mechanism
★ Asset allocation
★ Feature compression
Keywords (English)
★ Reinforcement Learning
★ Graph Attention Networks
★ Portfolio Allocation
★ TCN-AE
Table of contents
Abstract (Chinese)
Abstract (English)
Table of contents
List of figures
List of tables
1. Introduction
1-1 Objectives
1-2 Challenges
1-3 Contributions
2. Related work
2-1 Reinforcement learning
2-2 Graph Neural Networks
2-3 TCN-AE
3. Method
3-1 Introduction to Graph Attention Networks
3-1-1 GAT architecture
3-1-2 GAT training procedure
3-1-3 Data dimensions
3-1-4 Example of applying financial-statement data to GAT
3-2 Introduction to the Temporal Convolutional Autoencoder
3-2-1 TCN-AE architecture
3-2-2 TCN-AE training procedure
3-2-3 TCN-AE data dimensions
3-3 Introduction to reinforcement learning
3-3-1 Proximal Policy Optimization
3-4 Algorithm
3-4-1 PPO data dimensions
3-4-2 PPO model parameters
4. Experiments
4-1 Financial statement data
4-1-1 Financial statement release dates
4-1-2 Financial statement data for quarterly stock selection
4-2 Technical indicators for daily trading
4-2-1 Acquisition times of daily trading data
4-2-2 Daily trading data fields
4-3 Experimental design
4-4 Benchmarks and evaluation methods
4-5 Experimental results
4-5-1 Quarterly experiments
4-5-2 Daily experiments
4-5-3 Three- and six-month investment horizons
4-5-4 Investment comparison in bear and bull markets
4-5-5 Comparison with the 1/N strategy
4-6 Ablation study
5. Conclusion and future work
References
A.1 List of stocks used in experiments
Advisor: 張嘉惠 (Chia-Hui Chang)    Review date: 2024-07-25
