Abstract (English)
Financial technology (FinTech) has emerged as one of the key application areas for artificial intelligence (AI), including, but not limited to, the prediction of stock market movements and asset allocation. However, relying solely on stock price forecasting does not guarantee maximal investment returns: an investor must also consider asset allocation strategies to maximize returns or minimize losses.
In such a scenario, where rewards are reaped through interaction with the environment, reinforcement learning (RL) is an ideal fit. Consequently, in this study we propose a stock investment strategy that employs the Actor-Critic techniques of RL.
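The Actor-Critic idea can be illustrated with a toy sketch. This is not the thesis model: the two-asset "bandit" market, its return distributions, and all hyperparameters below are assumptions made up for illustration.

```python
import numpy as np

# Toy Actor-Critic sketch (illustrative only, not the thesis model):
# a one-step "bandit" market with two synthetic assets. The actor is
# a softmax policy over assets; the critic is a scalar reward baseline.
rng = np.random.default_rng(0)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

theta = np.zeros(2)              # actor parameters (softmax logits)
v = 0.0                          # critic: running estimate of mean reward
lr_actor, lr_critic = 0.05, 0.05

for _ in range(5000):
    pi = softmax(theta)
    a = rng.choice(2, p=pi)                  # sample an asset to hold
    mean = 0.01 if a == 0 else -0.01         # asset 0 pays more on average
    reward = rng.normal(mean, 0.02)          # synthetic daily return
    delta = reward - v                       # advantage (TD error)
    grad_log_pi = -pi
    grad_log_pi[a] += 1.0                    # gradient of log pi(a | theta)
    theta += lr_actor * delta * grad_log_pi  # actor update
    v += lr_critic * delta                   # critic update

print("learned policy:", softmax(theta))    # concentrates on asset 0
```

The critic's baseline reduces the variance of the actor's policy-gradient update, which is the core reason Actor-Critic methods learn more stably than plain policy gradient.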
To enhance the effectiveness of investment decisions, we employed an AutoEncoder (AE) to learn features from various stock technical indicators; these features then inform decisions on stock allocation and return estimation.
However, portfolio management still faces challenges in optimizing allocation strategies and accurately forecasting returns, especially during periods of market volatility. Traditional strategies often focus on either short-term or long-term investment, so the market lacks a model that can adapt flexibly to varied situations. To address this problem, we introduce a novel method that combines reinforcement learning and autoencoders to fill this gap.
We conducted ablation experiments to explore how the AutoEncoder's encoding dimension and the length of historical data affect state encoding. The results indicate that compressing the past 30 days of historical data into five dimensions achieves the best state encoding. We also found that incorporating the predictions of the AutoEncoder Predictor improves cumulative earnings. Furthermore, we investigated three investment strategies: RL+AE Predictor, RL Only, and AE Predictor. Through performance analysis, correlation with the broader market, and error-rate analysis, we evaluated these three strategies in various market environments.
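The 30-day-to-5-dimension compression can be sketched in closed form. The thesis trains a (nonlinear) AutoEncoder; as a self-contained stand-in, the sketch below uses its linear analogue, truncated SVD/PCA, on synthetic random-walk windows, so the data and all names here are illustrative assumptions, not the thesis dataset or architecture.

```python
import numpy as np

# Sketch of the state-compression step: compress a 30-day window of a
# technical indicator into a 5-dimensional code. Truncated SVD is the
# closed-form optimal *linear* autoencoder; the thesis model is nonlinear.
rng = np.random.default_rng(1)

WINDOW, CODE_DIM = 30, 5
# 500 synthetic samples: random walks as stand-ins for 30-day histories.
X = rng.normal(0, 1, (500, WINDOW)).cumsum(axis=1)
X = X - X.mean(axis=0)                  # center each feature

U, S, Vt = np.linalg.svd(X, full_matrices=False)
encode = Vt[:CODE_DIM].T                # (30, 5) encoder weights
Z = X @ encode                          # 5-dim state codes
X_hat = Z @ encode.T                    # linear "decoder" reconstruction

err = np.linalg.norm(X - X_hat) ** 2 / np.linalg.norm(X) ** 2
print(f"code shape: {Z.shape}, relative reconstruction error: {err:.3f}")
```

Because random walks have a rapidly decaying covariance spectrum, a handful of components reconstructs the window almost exactly, which is the same intuition behind compressing 30 days into five dimensions.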
Experimental results reveal that, as a constrained investment strategy, RL+AE Predictor performs best at maximizing assets and exhibits a stable learning process. Especially during significant market changes, this strategy shows superior risk resistance and maintains stable investment returns. Moreover, it has a lower correlation coefficient with the broader market, indicating its independence from market index volatility. In the error-rate analysis, the RL+AE Predictor model has a False Positive Rate (FPR) of 6.46%, lower than AE Predictor at 38.38% and RL Only at 36.09%; it thus achieves the lowest error rate in predicting stock asset allocation.
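The FPR figures above follow the standard definition FPR = FP / (FP + TN). A minimal sketch, using made-up confusion-matrix counts (not the thesis's actual counts):

```python
# False Positive Rate from confusion-matrix counts.
# fp = false positives, tn = true negatives (hypothetical values).
fp, tn = 16, 232
fpr = fp / (fp + tn)
print(f"FPR = {fpr:.2%}")  # prints "FPR = 6.45%" for these counts
```

A false positive here corresponds to allocating capital to a stock that did not in fact rise, so a low FPR directly limits misallocated capital.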
We validated this method in Taiwan's stock market, conducting experiments with Taiwan stock data from 2019 to 2021 and comparing against the TW50 Index, traditional portfolio theory (mean-variance optimization, MVO), and Jiang's research, which uses the reinforcement learning Policy Gradient technique. The experimental results show that the win rate of this study over short-term (3-month), mid-to-long-term (6- to 9-month), and long-term (1- to 2-year) investment periods is superior to the TW50, Jiang's, and MVO benchmarks, reaching the highest total return in the 12- and 24-month long-term investment periods. Even in the two-year fixed-horizon comparison, starting from two different entry points, the bull market of 2019 and the bear market of 2020, the method proposed in this thesis still outperforms the TW50 Index, MVO, and Jiang's.
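The MVO baseline can be sketched with its closed-form minimum-variance special case, w = Σ⁻¹1 / (1ᵀΣ⁻¹1). The return series below is synthetic, not the thesis's Taiwan stock data, and the asset count is an arbitrary assumption.

```python
import numpy as np

# Minimal mean-variance optimization (Markowitz) sketch: the
# closed-form minimum-variance portfolio over fully invested weights.
rng = np.random.default_rng(2)

R = rng.normal(0.0005, 0.01, (250, 4))   # 250 trading days, 4 assets
cov = np.cov(R, rowvar=False)            # sample covariance matrix Σ
ones = np.ones(4)
w = np.linalg.solve(cov, ones)           # Σ⁻¹ 1
w = w / w.sum()                          # normalize: weights sum to 1
print("weights:", np.round(w, 3), "portfolio variance:", w @ cov @ w)
```

By construction these weights have variance no greater than any other fully invested portfolio, e.g. equal weighting, which is the property the MVO benchmark optimizes for.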
In summary, this research provides empirical evidence that combining reinforcement learning and autoencoders for portfolio management outperforms the traditional MVO, the TW50 Index, and Jiang's hybrid deep learning method in both cumulative return rate and Sharpe ratio. It highlights the potential of AI in complex financial decisions and points out the need for a more flexible, universal model to bridge the gap between short-term and long-term investment strategies. These findings provide significant reference value for the development and improvement of investment strategies.
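The Sharpe ratio used in the comparison can be computed from a daily return series as mean excess return over its standard deviation, annualized by √252. The return series and risk-free rate below are assumptions for illustration:

```python
import numpy as np

# Annualized Sharpe ratio from a synthetic daily return series.
rng = np.random.default_rng(3)

daily = rng.normal(0.0008, 0.012, 252)   # one year of daily returns
rf_daily = 0.01 / 252                    # hypothetical 1% annual risk-free rate
excess = daily - rf_daily
sharpe = excess.mean() / excess.std(ddof=1) * np.sqrt(252)
print(f"annualized Sharpe ratio: {sharpe:.2f}")
```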
References
1. Harry Markowitz. Portfolio selection. The Journal of Finance, 7(1):77–91, 1952.
2. Zhengyao Jiang and Jinjun Liang. Cryptocurrency portfolio management with deep
reinforcement learning. In 2017 Intelligent Systems Conference (IntelliSys), pages
905–913, New York, NY, USA, 2017. IEEE.
3. Yuan Qi and Jing Xiao. Fintech: AI powers financial services to improve people's lives. Communications of the ACM, 61(11):65–69, 2018.
4. Ahmet Murat Ozbayoglu, Mehmet Ugur Gudelek, and Omer Berat Sezer. Deep
learning for financial applications: A survey. Applied Soft Computing, 93:106384,
2020.
5. Dmitry Sizykh. Performance indicators comparative analysis of stocks investment
portfolios with various approaches to their formation. In 2020 13th International
Conference "Management of large-scale system development" (MLSD), pages 1–5,
New York, NY, USA, 2020. IEEE.
6. Yash S. Asawa. Modern machine learning solutions for portfolio selection. IEEE
Engineering Management Review, 50(1):94–112, 2021.
7. Weimin Ma, Yingying Wang, and Ningfang Dong. Study on stock price prediction
based on BP neural network. In 2010 IEEE International Conference on Emergency
Management and Management Sciences, pages 57–60, New York, NY, USA, 2010.
IEEE.
8. Timothée Lesort, Natalia Díaz-Rodríguez, Jean-François Goudou, and David Filliat.
State representation learning for control: An overview. Neural Networks, 108:379–
392, 2018.
9. Yue Deng, Feng Bao, Youyong Kong, Zhiquan Ren, and Qionghai Dai. Deep direct reinforcement learning for financial signal representation and trading. IEEE
Transactions on Neural Networks and Learning Systems, 28(3):653–664, 2017.
10. Bo An, Shuo Sun, and Rundong Wang. Deep reinforcement learning for quantitative
trading: Challenges and opportunities. IEEE Intelligent Systems, 37(2):23–26, 2022.
11. Amirhosein Mosavi, Yaser Faghan, Pedram Ghamisi, Puhong Duan, Sina Faizollahzadeh Ardabili, Ely Salwana, and Shahab S. Band. Comprehensive review of
deep reinforcement learning methods and applications in economics. Mathematics,
8(10), 2020.
12. Akhil Raj Azhikodan, Anvitha G. K. Bhat, and Mamatha V. Jadhav. Stock trading
bot using deep reinforcement learning. In H. S. Saini, Rishi Sayal, A. Govardhan,
and Rajkumar Buyya, editors, Innovations in Computer Science and Engineering,
pages 41–49, Singapore, 2019. Springer Singapore.
13. Tarrin Skeepers, Terence L. van Zyl, and Andrew Paskaramoorthy. MA-FDRNN: Multi-asset fuzzy deep recurrent neural network reinforcement learning for portfolio management. In 2021 8th International Conference on Soft Computing & Machine Intelligence (ISCMI), pages 32–37, New York, NY, USA, 2021. IEEE.
14. Qinma Kang, Huizhuo Zhou, and Yunfan Kang. An asynchronous advantage actor-critic reinforcement learning method for stock selection and portfolio management. In Proceedings of the 2nd International Conference on Big Data Research, ICBDR 2018, pages 141–145, New York, NY, USA, 2018. Association for Computing Machinery.
15. Mao Guan and Xiao-Yang Liu. Explainable deep reinforcement learning for portfolio management: An empirical approach. In Proceedings of the Second ACM
International Conference on AI in Finance, ICAIF ’21, New York, NY, USA, 2022.
Association for Computing Machinery.
16. John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov.
Proximal policy optimization algorithms. CoRR, abs/1707.06347, 2017.
17. Richard S. Sutton and Andrew G. Barto. Reinforcement learning: An introduction.
Adaptive computation and machine learning. MIT Press, 1998.
18. Xiao-Yang Liu, Zechu Li, Zhaoran Wang, and Jiahao Zheng. ElegantRL: Massively parallel framework for cloud-native deep reinforcement learning. https://github.com/AI4Finance-Foundation/ElegantRL, 2021.
19. Wei Bao, Jun Yue, and Yulei Rao. A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLOS ONE, 12(7):1–24, 2017.
20. Zhipeng Liang, Kangkang Jiang, Hao Chen, Junhao Zhu, and Yanran Li. Deep
reinforcement learning in portfolio management. CoRR, abs/1808.09940, 2018.
21. Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg
Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin A. Riedmiller,
and David Silver. Emergence of locomotion behaviours in rich environments. CoRR,
abs/1707.02286, 2017.
22. Herman Kahn and Theodore E Harris. Estimation of particle transmission by random sampling. National Bureau of Standards applied mathematics series, 12:27–30,
1951.
23. Taiwan Stock Exchange. Taiwan stock exchange. https://www.twse.com.tw/, 2022.
24. Min-Syue Chang. Application of learning to rank and autoencoder hybrid technology in portfolio strategy. Master’s thesis, National Central University, Taoyuan,
Taiwan, 2021.
25. Investopedia. Investopedia. https://www.investopedia.com/terms/t/technicalindicator.asp, 2022.
26. Luciano Floridi and Massimo Chiriatti. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30:681–694, 2020.
27. Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, and Igor Mordatch.
Multi-agent actor-critic for mixed cooperative-competitive environments. Neural
Information Processing Systems (NIPS), 2017.
28. Antonio C. Briza and Prospero C. Naval. Stock trading system based on the multi-objective particle swarm optimization of technical indicators on end-of-day market data. Applied Soft Computing, 11(1):1191–1201, 2011.