Portfolio Selection Based on I-Score and Q-Learning

Name

Zong-Yu Li (李宗祐)

Department

Graduate Institute of Statistics

Thesis title

Portfolio Selection Based on I-Score and Q-Learning
(基於I-score和Q-learning的投資組合)

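Related theses

★ Application of Q-Learning Combined with Supervised Learning in the Stock Market
★ Trading Strategies Based on Q-Learning and Unsupervised Learning
★ Visualizing State Changes in the Stock Market
★ Theoretical Explanation of the SNF Effect and Identification of High-Influence Clustering Features
★ Exploring Participant Strategies in Renewable Energy Trading Markets Using Reinforcement Learning
★ A Lagged Multivariate Bayesian Structural GARCH Model with Soft Information and Its Applications
★ Portfolio Optimization Based on Dynamic Networks and Vine Copula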

Files

Full text viewable in the system (open access after 2029-8-1)

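Abstract (Chinese)

In both academia and the financial industry, formulating profitable strategies that reflect different economic states has attracted wide attention. With the rapid development of artificial intelligence, many new investment strategies have emerged recently. Among them, Q-learning is a reinforcement learning method built on the mathematical framework of Markov decision processes; it can help investors predict market trends and provide the best strategy under current conditions. The importance of defining states in Q-learning is self-evident. This study proposes a method based on unsupervised learning and the I-score for defining the states used in Q-learning. Given the defined states, a Q-learning investment strategy can be constructed for each asset, and a portfolio of multiple underlying assets can then be built. Empirical results show that the proposed method yields satisfactory investment returns.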

Abstract (English)

Formulating profitable strategies that reflect different economic states has attracted much attention in academia and the financial industry. With the rapid development of artificial intelligence, many new investment strategies have been created recently. Q-learning is a reinforcement learning method built upon the mathematical framework of Markov decision processes; it aids investors in predicting market trends and provides the best strategy in the current scenario. In Q-learning, the importance of defining states is self-evident. This study proposes an approach based on unsupervised learning and the I-score to define the states when conducting Q-learning. Accordingly, we construct a Q-learning investment strategy for each asset and propose an investment portfolio of multiple stocks. Empirical results demonstrate that the proposed method produces promising investment returns.
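
The abstract describes Q-learning as a reinforcement learning method built on Markov decision processes, with a separate strategy learned once the states are defined. The following is a minimal sketch of standard tabular Q-learning on a toy problem, not the thesis's implementation. The two-state "market", its rewards, the hyperparameter values, and the names `q_learning`, `greedy_policy`, and `TOY` are all hypothetical and chosen only for illustration; in the thesis the states come from unsupervised learning and the I-score, which this sketch does not attempt to reproduce.

```python
import random


def q_learning(transitions, n_states, n_actions, alpha=0.1, gamma=0.9,
               epsilon=0.1, episodes=500, steps=50, seed=0):
    """Tabular Q-learning on a small deterministic MDP.

    transitions[s][a] must return (next_state, reward). This structure is
    an assumption made for the toy example, not taken from the thesis.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(steps):
            # Epsilon-greedy action selection: explore with probability epsilon.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda k: q[s][k])
            s_next, r = transitions[s][a]
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the action with the highest Q-value in each state."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in q]


# Hypothetical toy market: state 0 = "down", state 1 = "up".
# Action 0 = stay out (reward 0), action 1 = hold the asset
# (reward -1 in the down state, +1 in the up state).
# The state alternates regardless of the action taken.
TOY = {
    0: {0: (1, 0.0), 1: (1, -1.0)},
    1: {0: (0, 0.0), 1: (0, 1.0)},
}

if __name__ == "__main__":
    q_table = q_learning(TOY, n_states=2, n_actions=2)
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should stay out in the "down" state and hold in the "up" state. How well a real strategy performs depends heavily on how the states are defined, which is the issue the abstract emphasizes.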

Keywords (Chinese)

★ I-score
★ portfolio
★ Q-learning

Keywords (English)

★ I-score
★ portfolio selection
★ Q-learning

Table of Contents

1 Introduction
2 Literature Review
2.1 I-score
2.2 Markov decision process
2.3 Q-learning
2.4 Time series models
2.4.1 ARMA-GARCH model
2.4.2 Forecasting of an ARMA-GARCH model
3 Methodology
3.1 Data pretreatment
3.2 States clustering for Q-learning
3.2.1 Influence measure
3.2.2 A backward dropping algorithm
3.3 Using ARMA-GARCH model for prediction as conditional constraints
3.4 Portfolio construction based on Q-learning
4 Empirical Study
4.1 Datasets
4.2 Experiment setting
4.3 Experimental results
5 Conclusion and Discussion
A Datasets
B I-score for the stocks
C Q-learning performances for the stocks
D Empirical Study in 2020
E Empirical Study in 2021
F Empirical Study in 2023

Advisor

Shih-Feng Huang (黃士峰)

Approval date

2024-7-11