Abstract (English) |
Money management, or asset allocation, has always been central to trading. Since Markowitz proposed modern portfolio theory in 1952, this fascinating problem has attracted a great deal of talent. Among the approaches introduced since then, the Kelly criterion is one of the most prominent. It provides an elegant way to give players and investors an optimal bidding fraction that maximizes their logarithmic wealth in the long run. However, it ignores the reality that each investor usually has their own risk tolerance, and the fraction produced by the Kelly criterion disregards downside risk. In this study, we not only use a probability-based approach to model the risk but also revise the reward function of deep reinforcement learning to account for downside risk. In summary, the revised deep reinforcement learning can take an investor’s risk tolerance into account, rather than relying on a naive reward function that only maximizes return. Finally, we use DXY, GBP/USD, and EUR/USD as the underlying assets for the training and validation data sets, and consider only the single-asset case. The results show that our revision of the reward function indeed yields encouraging performance. When the desired maximum drawdown (MDD) is above 3%, the corresponding probability averages above 70%. |
References |
[1] Herbert G Grubel. “Internationally Diversified Portfolios: Welfare Gains and Capital Flows”. In: The American Economic Review 58.5 (1968), pp. 1299–1314.
[2] Mu-En Wu, Sheng-Hao Lin, and Jia-Ching Wang. “Embedded draw-down constraint using ensemble learning for stock trading”. In: Journal of Intelligent & Fuzzy Systems Preprint (), pp. 1–9.
[3] Edward O Thorp. “Portfolio choice and the Kelly criterion”. In: Stochastic optimization models in finance. Elsevier, 1975, pp. 599–619.
[4] Leonard C MacLean, Edward O Thorp, and William T Ziemba. The Kelly capital growth investment criterion: Theory and practice. Vol. 3. World Scientific, 2011.
[5] Leonard C MacLean, Edward O Thorp, and William T Ziemba. “Good and bad properties of the Kelly criterion”. In: Risk 20.2 (2010), p. 1.
[6] Jigar Patel et al. “Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques”. In: Expert systems with applications 42.1 (2015), pp. 259–268.
[7] Rohit Choudhry and Kumkum Garg. “A hybrid machine learning system for stock market forecasting”. In: World Academy of Science, Engineering and Technology 39.3 (2008), pp. 315–318.
[8] CF Tsai and SP Wang. “Stock price forecasting by hybrid machine learning techniques”. In: Proceedings of the international multiconference of engineers and computer scientists. Vol. 1. 755. 2009, p. 60.
[9] Grant McQueen and Steven Thorley. “Are stock returns predictable? A test using Markov chains”. In: The Journal of Finance 46.1 (1991), pp. 239–263.
[10] Michael J Dueker. “Markov switching in GARCH processes and mean-reverting stock-market volatility”. In: Journal of Business & Economic Statistics 15.1 (1997), pp. 26–34.
[11] Christopher M Turner, Richard Startz, and Charles R Nelson. “A Markov model of heteroskedasticity, risk, and learning in the stock market”. In: Journal of Financial Economics 25.1 (1989), pp. 3–22.
[12] Md Rafiul Hassan and Baikunth Nath. “Stock market forecasting using hidden Markov model: a new approach”. In: 5th International Conference on Intelligent Systems Design and Applications (ISDA’05). IEEE. 2005, pp. 192–196.
[13] Richard S Sutton, Andrew G Barto, et al. Introduction to reinforcement learning. Vol. 135. MIT press Cambridge, 1998.
[14] Yue Deng et al. “Deep direct reinforcement learning for financial signal representation and trading”. In: IEEE transactions on neural networks and learning systems 28.3 (2016), pp. 653–664.
[15] Zhengyao Jiang and Jinjun Liang. “Cryptocurrency portfolio management with deep reinforcement learning”. In: 2017 Intelligent Systems Conference (IntelliSys). IEEE. 2017, pp. 905–913.
[16] O Jangmin et al. “Adaptive stock trading with dynamic asset allocation using reinforcement learning”. In: Information Sciences 176.15 (2006), pp. 2121–2147.
[17] John Moody et al. “Performance functions and reinforcement learning for trading systems and portfolios”. In: Journal of Forecasting 17.5-6 (1998), pp. 441–470.
[18] Lorenzo Pascual, Juan Romo, and Esther Ruiz. “Bootstrap prediction for returns and volatilities in GARCH models”. In: Computational Statistics & Data Analysis 50.9 (2006), pp. 2293–2312.
[19] Olivier Ledoit, Pedro Santa-Clara, and Michael Wolf. “Flexible multivariate GARCH modeling with an application to international stock markets”. In: Review of Economics and Statistics 85.3 (2003), pp. 735–747.
[20] Juri Marcucci. “Forecasting stock market volatility with regime-switching GARCH models”. In: Studies in Nonlinear Dynamics & Econometrics 9.4 (2005).
[21] Eric Jondeau and Michael Rockinger. “The copula-garch model of conditional dependencies: An international stock market application”. In: Journal of international money and finance 25.5 (2006), pp. 827–853.
[22] Philip Hans Franses and Dick Van Dijk. “Forecasting stock market volatility using (non-linear) Garch models”. In: Journal of Forecasting 15.3 (1996), pp. 229–235.
[23] Ray Yeutien Chou. “Volatility persistence and stock valuations: Some empirical evidence using GARCH”. In: Journal of Applied Econometrics 3.4 (1988), pp. 279–294.
[24] Richard M Levich. “Empirical studies of exchange rates: price behavior, rate determination and market efficiency”. In: Handbook of international economics 2 (1985), pp. 979–1040.
[25] William Poundstone. Fortune’s Formula: The untold story of the scientific betting system that beat the casinos and wall street. Hill and Wang, 2010.
[26] Enzo Busseti, Ernest K Ryu, and Stephen Boyd. “Risk-constrained Kelly gambling”. In: The Journal of Investing 25.3 (2016), pp. 118–134.
[27] Volodymyr Mnih et al. “Human-level control through deep reinforcement learning”. In: Nature 518.7540 (2015), pp. 529–533.
[28] Long-Ji Lin. “Self-improving reactive agents based on reinforcement learning, planning and teaching”. In: Machine learning 8.3-4 (1992), pp. 293–321. |