Abstract (English) |
Asset allocation is an important problem in the field of financial investment. Through a sound and effective asset allocation strategy, investors can adjust the proportion of funds allocated to different financial products, maximizing returns while suppressing risk. Traditional asset allocation research often relies on the Mean-Variance Model proposed by Markowitz. However, when applied to the time series data of financial products, the model lacks nonlinear modeling capacity and handles the dynamics of financial markets poorly.
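As a point of reference, the mean-variance idea can be illustrated with the closed-form global minimum-variance portfolio. This is a minimal sketch, not the study's method; the two-asset covariance matrix below is hypothetical, used only to show the computation.

```python
import numpy as np

# Minimal sketch of the mean-variance idea: the global minimum-variance
# portfolio has closed-form weights w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
# (fully invested, shorting allowed). The covariance matrix is hypothetical.
def min_variance_weights(cov):
    inv = np.linalg.inv(cov)
    ones = np.ones(cov.shape[0])
    w = inv @ ones
    return w / (ones @ w)

cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])   # two assets: variances 0.04 and 0.09
w = min_variance_weights(cov)
print(w)   # the lower-variance asset receives the larger weight
```

Note that these weights are a static, one-shot solution; the nonlinear, dynamic market behavior mentioned above is exactly what this closed form cannot capture.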
In recent years, deep reinforcement learning has emerged, and some researchers have begun to apply it to asset allocation. In most of the existing work, however, the reward function simply measures the change in investment return, and risk is not considered. Risk is nevertheless an important aspect of any asset allocation strategy.
This study uses deep reinforcement learning models to optimize asset allocation, proceeding in two main stages. The first stage tunes the parameters of the CNN neural network inside the deep reinforcement learning models; the second stage compares how seven reward functions and four rebalancing frequencies affect the models' trading performance.
In the first stage, this study designs two models that combine a CNN with the DDPG and PPO algorithms, respectively. By testing various parameter combinations, we examine how the CNN parameters in deep reinforcement learning affect the trading performance of the models.
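The stage-one search can be pictured as a simple grid over the CNN hyperparameters. The candidate values below are illustrative assumptions, not the grids actually used in the study; each combination would be trained with DDPG and PPO and compared on trading performance.

```python
import itertools

# Hypothetical CNN hyperparameter grid for the stage-one tuning
# (illustrative values only, not the study's actual grid).
conv_layers  = [1, 2, 3]     # number of convolutional layers
conv_kernels = [8, 16, 32]   # convolution kernels per layer
fc_layers    = [1, 2]        # number of fully connected layers
fc_neurons   = [32, 64]      # neurons per fully connected layer

grid = list(itertools.product(conv_layers, conv_kernels, fc_layers, fc_neurons))
print(len(grid))   # number of combinations to train per model
```

Even this modest grid already yields dozens of combinations per model, which is why the study fixes the CNN parameters before comparing reward functions.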
In the second stage, the best parameter combination found in the first stage is used to compare the models' trading performance across seven reward functions and four rebalancing frequencies, in order to identify the reward-function factors best suited to asset allocation.
The DDPG and PPO models in this study involve a large number of parameters to tune. Although the reward function and rebalancing frequency may also affect which CNN parameters are best, tuning the CNN parameters separately for every combination of reward function and rebalancing frequency would require an enormous number of experiments. To simplify the experimental process, this study therefore adopts the staged approach described above.
This study finds that both models perform better with shallow convolutional layers. The DDPG model benefits from more convolution kernels per convolutional layer, whereas the PPO model benefits from fewer. In addition, both models perform better with shallow fully connected layers and fewer neurons.
This study uses the average rate of return, volatility, Sharpe ratio, maximum drawdown, and compound annual growth rate of investment returns as reward-function factors, and compares the trading performance of the models under each. For both the DDPG and PPO models, the most suitable reward function is found to be the rate of change of the average investment return. The study also finds that even with the most suitable reward function, an inappropriate combination of CNN parameters in the DDPG and PPO models still prevents good trading performance. This study therefore concludes that the CNN parameters and the reward function both have an important influence on the trading performance of deep reinforcement learning models.
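For concreteness, all of the candidate reward-function factors named above can be computed from a portfolio-value series. The sketch below uses hypothetical daily values and assumes the common convention of 252 trading days per year; it illustrates the quantities compared, not the study's exact implementation.

```python
import numpy as np

# Candidate reward-function factors computed from a (hypothetical)
# daily portfolio-value series; 252 trading days/year is assumed.
values = np.array([100.0, 101.0, 100.5, 102.0, 103.5])
returns = np.diff(values) / values[:-1]          # simple period returns

mean_return = returns.mean()                     # average rate of return
volatility  = returns.std(ddof=1)                # return volatility (risk)
sharpe      = mean_return / volatility * np.sqrt(252)   # annualized Sharpe ratio
running_max = np.maximum.accumulate(values)
max_drawdown = ((running_max - values) / running_max).max()
years = (len(values) - 1) / 252
cagr = (values[-1] / values[0]) ** (1 / years) - 1      # compound annual growth rate

print(round(mean_return, 6), round(max_drawdown, 6))
```

The "rate of change of the average investment return" reward would track how `mean_return` evolves between rebalancing steps, whereas the other four quantities summarize the whole trajectory.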
While volatility, the Sharpe ratio, maximum drawdown, and compound annual growth rate are important and appropriate indicators for measuring the overall performance and risk of a trading strategy, they do not enable deep reinforcement learning models to learn well when used as reward functions. Therefore, when using deep reinforcement learning models to optimize asset allocation, the factors included in the reward function must be weighed carefully and searched for through well-designed experiments. |
References |
[1]. Markowitz, H. M., 1968, Portfolio Selection: Efficient Diversification of Investments, Yale University Press, Vol. 16
[2]. Akita, R., 2016, Deep learning for stock prediction using numerical and textual information, IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS), pp. 1-6
[3]. Jiang, Z., 2017, Cryptocurrency Portfolio Management with Deep Reinforcement Learning, from https://arxiv.org/abs/1612.01277
[4]. Liang, Z., 2018, Adversarial Deep Reinforcement Learning in Portfolio Management, from https://arxiv.org/abs/1808.09940
[5]. Narang, R. K., 2009, Inside the Black Box: The Simple Truth about Quantitative Trading, John Wiley & Sons
[6]. Xiao, L., 2017, A Secure Mobile Crowdsensing Game with Deep Reinforcement Learning, IEEE Transactions on Information Forensics and Security, Vol. 13, pp. 35-47
[7]. Liu, Z., 2019, Towards Understanding Chinese Checkers with Heuristics, Monte Carlo Tree Search, and Deep Reinforcement Learning, from https://arxiv.org/abs/1903.01747
[8]. Sutton, R. S., and Barto, A. G., 2014, Reinforcement Learning: An Introduction, pp. 2-5
[9]. Sharpe, W. F., 1992, Asset Allocation: Management Style and Performance Measurement, Journal of Portfolio Management, pp. 7-19
[10]. Meucci, A., 2009, Risk and Asset Allocation, Springer
[11]. Chollet, F., 2017, The Fundamentals of Deep Learning, Deep Learning with Python
[12]. Cortes, C., 2012, L2 Regularization for Learning Kernels, from https://arxiv.org/abs/1205.2653
[13]. Mahmood, H., 2019, Gradient Descent, from https://towardsdatascience.com/gradient-descent-3a7db7520711
[14]. Pai, A., 2020, Analyzing 3 Types of Neural Networks in Deep Learning, from https://www.analyticsvidhya.com/blog/2020/02/cnn-vs-rnn-vs-mlp-analyzing-3-types-of-neural-networks-in-deep-learning
[15]. Zhan, R., 2017, CS221 Project Final Report Deep Reinforcement Learning in Portfolio Management, from https://pdfs.semanticscholar.org/ec54/b8edf44070bc3166084f59ac9372176d7d86.pdf
[16]. Saha, S., 2018, A Comprehensive Guide to Convolutional Neural Networks, from https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
[17]. Shih, M., 2019, Convolutional Neural Networks, from https://shihs.github.io/blog/machine%20learning/2019/02/25/Machine-Learning-Covolutional-Neural-Networks(CNN)
[18]. Krizhevsky, A., Sutskever, I., and Hinton, G. E., 2012, ImageNet Classification with Deep Convolutional Neural Networks, Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS'12), Vol. 1, pp. 1097-1105
[19]. Perera, S., 2019, An introduction to Reinforcement Learning, from https://towardsdatascience.com/an-introduction-to-reinforcement-learning-1e7825c60bbe
[20]. Mnih, V., 2015, Human-level control through deep reinforcement learning, Nature, Vol. 518, pp. 529-533
[21]. Hasselt, H., Guez, A., and Silver, D., 2015, Deep Reinforcement Learning with Double Q-learning, from https://arxiv.org/abs/1509.06461
[22]. Wang, Z., 2016, Dueling Network Architectures for Deep Reinforcement Learning, from https://arxiv.org/abs/1511.06581
[23]. Sutton, R., 2000, Policy Gradient Methods for Reinforcement Learning with Function Approximation, Proceedings of the 12th International Conference on Neural Information Processing Systems, pp. 1057-1063
[24]. Rosenstein, M., 2004, Supervised Actor-Critic Reinforcement Learning, from https://www-anw.cs.umass.edu/pubs/2004/rosenstein_b_ADP04.pdf
[25]. Silver, D., 2014, Deterministic Policy Gradient Algorithms, Proceedings of the International Conference on Machine Learning, Vol. 32
[26]. Schulman, J., 2017, Proximal Policy Optimization Algorithms, from https://arxiv.org/abs/1707.06347
[27]. Jiang, Z., 2017, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem, from https://arxiv.org/abs/1706.10059
[28]. TensorLayer, https://github.com/tensorlayer/tensorlayer/tree/master/examples/reinforcement_learning
[29]. Szegedy, C., 2015, Going deeper with convolutions, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-9 |