dc.description.abstract | Asset allocation is an important issue in the field of financial investment. Through a reasonable and effective asset allocation strategy, investors can adjust the proportion of funds between different financial commodities, maximizing returns while suppressing risks. In traditional asset allocation research, the Mean-Variance Model proposed by Markowitz is often used. However, when dealing with time series data of financial products, it lacks sufficient nonlinear performance and is not good at handling the dynamics of financial markets.
In recent years, deep reinforcement learning has emerged, and some researchers have begun to use deep reinforcement learning to deal with asset allocation issues. In the current related research, the reward function in the model is mostly simply calculated the amount of change in investment returns, and the risk has not been considered. However, risk is an important aspect that needs to be considered in asset allocation strategies.
This study uses a deep reinforcement learning model to study the optimization of asset allocation, which is mainly divided into two major stages. The first stage is the CNN neural network parameter adjustment in the deep reinforcement learning model, and the second stage is to compare the seven sets of reward functions and four The impact of this rebalancing frequency on model transaction performance.
In the first stage, this study designed two models with CNN and DDPG and PPO algorithms respectively. Through the test of various parameter combinations, we discussed how the parameters of CNN neural network in deep reinforcement learning affect the transaction performance of the model.
In the second stage, the best parameter combination obtained after the experiment is used to compare the transaction performance of the model through seven sets of reward functions and four rebalance frequency tests to find a reward function factor that is more suitable for asset allocation.
The DDPG and PPO models in this study need to be adjusted because of the large number of parameters that need to be adjusted. Although the reward function and rebalance frequency may also affect the selection of the best parameters of the CNN neural network, if each group of reward function and rebalance frequency are adjusted by the CNN neural network parameters, the overall number of experiments will be huge a lot of. Therefore, in order to simplify the experimental process, this study will carry out the experiment through the above-mentioned phased approach.
This study found that both models are more suitable for shallow convolutional layers, while the DDPG model is suitable for using more convolution kernels in the convolutional layer, and the PPO model is suitable for using fewer convolution kernels in the convolutional layer. In addition, both models It is more suitable for shallow fully connected layers and fewer neurons.
In this study, the average rate of return on investment returns, volatility, Sharpe rate, maximum drawdown and average annual compound growth rate are used as the factors of the reward function, and the transaction performance of the model is compared. The most appropriate reward functions in the DDPG and PPO models are found. It is the rate of change of the average investment return. In addition, this study also found that despite the use of the most appropriate reward function, if the combination of CNN neural network parameters in the DDPG and PPO models is not appropriate, good transaction performance cannot be obtained. Therefore, this study judges the CNN neural network parameters and reward function, which have an important influence on the transaction performance of the deep reinforcement learning model.
While volatility, Sharpe rate, max drawdown and compound annual growth rate are important and appropriate indicators when measuring the overall trading strategy performance and risk, they cannot make deep reinforcement learning models obtain good learning when used as a reward function. Therefore, when using deep reinforcement learning models to optimize asset allocation, the factors considered by the reward function must be carefully considered and carefully searched through reasonable experiments. | en_US |