Embedded Draw-down Constraint by Deep Reinforcement Learning for Foreign Exchange Trading


Name	Sheng-Hao Lin (林聖皓)	Department	Computer Science and Information Engineering
Thesis title	Embedded Draw-down Constraint by Deep Reinforcement Learning for Foreign Exchange Trading
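Access rights	The author has authorized this electronic thesis for immediate open access. The open full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.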
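Abstract (translated from the Chinese)	Money management, or asset allocation, has always been a research focus in trading. Since Markowitz proposed modern portfolio theory in 1952, many talented researchers have been drawn to this fascinating problem. Among the more recently introduced methods, the Kelly criterion is one of the most prominent. It offers a concise way to give gamblers and financial-market investors an optimal investment fraction; in the long run, the Kelly criterion maximizes a participant's expected logarithmic return. The Kelly criterion has a shortcoming, however: each investor usually has an individual risk tolerance, and the optimal fraction derived from the Kelly criterion ignores downside risk. In this study, we not only use a probability-based approach to capture risk but also modify the reward function of deep reinforcement learning to account for downside risk. In summary, the improved deep reinforcement learning can incorporate an investor's risk tolerance rather than simply maximizing the investor's long-term wealth. Finally, we use DXY, GBP/USD, and EUR/USD as the underlying assets for the training and validation data sets, and consider only the single-asset case. The results show that our modification of the reward function does produce exciting results. When the required MDD is above 3%, the probability averages above 70%.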
Abstract (English)	Money management, or asset allocation, has always been central to trading. Since Markowitz proposed modern portfolio theory in 1952, this fascinating problem has attracted many talented researchers. Among the more recently introduced approaches, the Kelly criterion is one of the most prominent. It provides an elegant way to give players and investors an optimal bidding fraction that maximizes their logarithmic wealth in the long run. However, it ignores the fact that each investor usually has an individual risk tolerance, and the fraction derived from the Kelly criterion disregards downside risk. In this study, we not only use a probability-based approach to model the risk but also revise the reward function of deep reinforcement learning to account for downside risk. In summary, the revised deep reinforcement learning can take an investor's risk tolerance into account, in contrast to a naive reward function that only maximizes the return. Finally, we use DXY, GBP/USD, and EUR/USD as the underlying assets for the training and validation data sets, and consider only the single-asset case. The results show that our revision of the reward function yields an exciting performance. When the desired maximum drawdown (MDD) is above 3%, the probability is on average above 70%.
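The three quantities the abstract relies on (the Kelly fraction, the maximum drawdown, and a reward that accounts for downside risk) can be illustrated with a minimal Python sketch. This record does not give the thesis's actual reward function, so `constrained_reward`, its `penalty` weight, and the helper `wealth_path` are hypothetical names and an assumed penalty form chosen only for illustration. The Kelly formula shown is the standard one for a binary bet, and the 3% limit in the usage lines echoes the figure quoted in the abstract.

```python
import math


def kelly_fraction(p: float, b: float) -> float:
    """Standard Kelly fraction for a binary bet.

    p: probability of winning; b: net odds (win b per unit staked, lose 1).
    """
    return p - (1.0 - p) / b


def max_drawdown(wealth: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = wealth[0]
    mdd = 0.0
    for w in wealth:
        peak = max(peak, w)
        mdd = max(mdd, (peak - w) / peak)
    return mdd


def wealth_path(returns: list[float], fraction: float, start: float = 1.0) -> list[float]:
    """Wealth after betting a fixed fraction on each single-asset return.

    Assumes fraction * r > -1 for every return r, so wealth stays positive.
    """
    path = [start]
    for r in returns:
        path.append(path[-1] * (1.0 + fraction * r))
    return path


def constrained_reward(returns: list[float], fraction: float,
                       mdd_limit: float, penalty: float = 1.0) -> float:
    """Hypothetical reward: log growth minus a penalty on drawdown above the limit.

    This penalty form is an assumption for illustration, not the thesis's definition.
    """
    path = wealth_path(returns, fraction)
    log_growth = math.log(path[-1] / path[0])
    excess = max(0.0, max_drawdown(path) - mdd_limit)
    return log_growth - penalty * excess


# Usage: compare a naive log-growth reward with a drawdown-aware one.
sample_returns = [0.02, -0.05, 0.03, -0.01]  # made-up per-period returns
naive = constrained_reward(sample_returns, fraction=1.0, mdd_limit=1.0)
aware = constrained_reward(sample_returns, fraction=1.0, mdd_limit=0.03)
```

With `mdd_limit=1.0` the penalty never applies, so `naive` is plain log growth; with a 3% limit the same path is scored lower because its drawdown exceeds the investor's tolerance. A learning agent trained on such a reward would be pushed toward smaller fractions when drawdowns become too deep.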
Keywords	★ Deep Reinforcement Learning ★ Money Management ★ Kelly Criterion ★ GARCH Model
Table of contents	Chinese Abstract
	English Abstract
	Acknowledgements
	Table of Contents
	List of Figures
	List of Tables
	1. Introduction
	2. Preliminaries
	2.1 Kelly Criterion
	2.2 Maximum Draw Down
	2.3 Reinforcement Learning with Q Learning
	2.4 Deep Q Network
	3. Risk constrained bidding based on deep reinforcement learning
	3.1 Methodology
	3.2 Reward Function
	3.3 Deep Reinforcement Learning
	3.4 Bootstrap sampling with GARCH(1,1) model
	3.5 Algorithm
	4. Simulation and Experiments
	4.1 Validation with Gt
	4.2 Validation with Qt+1
	5. Conclusion
Advisors	Jia-Ching Wang (王家慶), Mu-En Wu (吳牧恩)	Approval date	2020-7-28

Master's/Doctoral Thesis 107522112: Detailed Record