Predicting Individual Stock Price Trends with Sentence Vectors and Deep Learning: The Case of the U.S. Stock Market

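Name

王孝庭 (Shiaw-Tying Wang)

Department

In-service Master's Program, Department of Information Management

Thesis title

Predicting Individual Stock Price Trends with Sentence Vectors and Deep Learning: The Case of the U.S. Stock Market
(Predicting Stock Prices Using Doc2Vec with Deep Learning. Using U.S. Stock Data for Verification)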

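The author has authorized immediate open access to this electronic thesis.
Full text that has reached its open-access date is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, so as to avoid breaking the law.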

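Abstract (translated from Chinese)

Natural language processing has advanced rapidly and is applied in many ways, such as sentiment interpretation, bankruptcy prediction, and even predicting stock prices from tweets. Stock price prediction is one of its research focuses.
Many earlier studies that predict stock prices from news rely on sentiment analysis or TF-IDF, and more recent work uses word vectors for preprocessing. This raises the question of whether sentence vectors, which strengthen the relationship between a word and its context, are a better preprocessing choice. This study preprocesses European and American financial news with both sentence vectors and word vectors, trains different neural network models on the results, and compares the effect on prediction accuracy. It further trains on single news sources and compares mixed against unmixed sources. It also uses these techniques to re-examine the earlier conclusion that predicting from headlines works better than predicting from article bodies. Finally, word vector pipelines remove common words, whereas sentence vectors do not need this step because word order and grammatical meaning matter; the study tests whether this holds for the prediction model and what effect removal has. The experimental results show that, combined with a CNN, sentence vectors perform slightly better than word vectors. The news source affects prediction through its diffusion and reach: mixed news sources are more useful for long-term prediction, while widely read news websites perform better for short-term prediction. In the comparison of headlines and article bodies, the results differ from earlier studies and show that article bodies perform better. Lastly, removing common words before building sentence vectors improves training efficiency but slightly lowers prediction accuracy.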
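The contrast between word vectors and sentence vectors in the abstract can be illustrated with a small sketch. It is not the thesis's code: the toy embeddings, the stop-word list, and the function names are all hypothetical, and plain averaging stands in for whatever word-vector pooling the thesis actually used. It only shows why an averaged word-vector representation, built after common words are removed, cannot tell apart two headlines that differ in word order, which is the gap a sentence-vector method such as Doc2Vec is meant to address.

```python
import math

# Toy 2-dimensional embeddings; the values are made up for illustration only.
TOY_EMBEDDINGS = {
    "the": [0.1, 0.1],
    "profit": [0.9, 0.2],
    "beats": [0.4, 0.6],
    "loss": [-0.8, 0.3],
}

# Hypothetical stop-word list; real pipelines use a much larger one.
STOP_WORDS = {"the", "a", "an", "of"}


def remove_stop_words(tokens):
    """Drop common words, as is usual before averaging word vectors."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def average_word_vector(tokens, embeddings=TOY_EMBEDDINGS):
    """Average the vectors of known tokens; word order plays no role here."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def same_vector(u, v, tol=1e-9):
    """Compare two vectors element-wise with a small tolerance."""
    return len(u) == len(v) and all(
        math.isclose(a, b, abs_tol=tol) for a, b in zip(u, v)
    )


# Two headlines with opposite meanings but the same words.
headline_a = remove_stop_words("the profit beats the loss".split())
headline_b = remove_stop_words("the loss beats the profit".split())

vec_a = average_word_vector(headline_a)
vec_b = average_word_vector(headline_b)
print(same_vector(vec_a, vec_b))
```

Both headlines collapse to the same averaged vector, so a downstream classifier receives identical input for opposite statements. A sentence-vector model is trained on the token sequence as a unit, which is one reason the abstract treats stop-word removal as optional for it.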

Abstract (English)

Natural language processing (NLP) is developing rapidly and is used for many purposes, such as sentiment interpretation and bankruptcy prediction. Tweets have even been used to predict stock prices, and stock price prediction is one of the main research focuses. Earlier studies that used financial news to predict stock prices relied mostly on sentiment analysis or TF-IDF, and more recent work has used word vectors for preprocessing. This raises the question of whether sentence vectors, which strengthen the contextual relationships within an article, are a better preprocessing choice.
This research extracts sentence vectors and word vectors from European and American financial news and compares the prediction accuracy of different types of deep learning models trained on them. The models are also trained on single and on mixed news sources, and the results are compared. In addition, feature representations built from news headlines are compared with those built from news content, to check the earlier finding that headlines predict better. Finally, word vector pipelines remove commonly used words, whereas sentence vectors do not need this step because word order and syntactic meaning matter, so prediction models based on sentence vectors with and without commonly used words are also compared. The experimental results show that, with a CNN, sentence vectors perform slightly better than word vectors. News sources affect prediction performance through their spread and exposure: mixed news sources are more useful for long-term forecasts, while widely read news websites perform better for short-term forecasts. In the comparison of headlines and contents, models trained on news contents perform better, which differs from the findings of previous research. Finally, removing commonly used words before building sentence vectors increases training efficiency but slightly decreases prediction accuracy.
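The abstract distinguishes short-term from long-term forecasts and compares models by prediction accuracy, but this record does not state how articles were labeled. As a minimal sketch under that caveat, the following assumes a simple rule: an article is labeled 1 (up) if the closing price a given number of trading days after publication is higher than the close on the publication day, and 0 otherwise. The function names, the horizon values, and the price series are hypothetical.

```python
def label_trend(closes, news_index, horizon):
    """Label an article by the price move `horizon` trading days later.

    Returns 1 if the later close is higher than the close on the news day,
    0 if it is not, and None if the price series ends before the horizon.
    This labeling rule is an assumption, not taken from the thesis.
    """
    target = news_index + horizon
    if news_index < 0 or horizon <= 0 or target >= len(closes):
        return None
    return 1 if closes[target] > closes[news_index] else 0


def accuracy(predictions, labels):
    """Share of predictions that match their labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and aligned")
    hits = sum(1 for p, y in zip(predictions, labels) if p == y)
    return hits / len(labels)


# Made-up daily closing prices for one stock.
closes = [100.0, 101.5, 99.0, 98.0, 103.0, 104.0]

short_term = label_trend(closes, news_index=1, horizon=1)  # next trading day
long_term = label_trend(closes, news_index=1, horizon=4)   # four days later
print(short_term, long_term)
```

The same article can carry different labels at different horizons, which is why a model or news source that does well on short-term forecasts need not do well on long-term ones.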

Keywords (translated from Chinese)

★ Sentence vectors
★ Stock market prediction
★ Deep learning
★ News classification

Keywords (English)

★ Doc2Vec
★ Predicting Stock
★ Deep learning
★ News Classification

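Table of contents

Abstract (Chinese)
Abstract (English)
List of Tables
List of Figures
Table of Contents
Chapter 1. Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Research Structure
Chapter 2. Literature Review
2.1 Text Mining Preprocessing
2.2 Deep Learning
2.2.1 Convolutional Neural Networks
2.2.2 Recurrent Neural Networks
2.2.3 Attention-Based Recurrent Neural Networks
2.2.4 Region-Based Convolutional Neural Networks
2.3 Review of Text Mining in Stock Price Prediction
Chapter 3. Research Method
3.1 Research Design and Framework
3.2 Data Sources and Processing
3.2.1 Company and Stock Price Data
3.2.2 News Sources
3.3 Data Preprocessing and Labeling
3.4 Model Evaluation
3.5 Execution Environment
Chapter 4. Results and Analysis
4.1 Comparing Word Vectors and Sentence Vectors as Preprocessing for Prediction Accuracy across Deep Neural Networks
4.2 Comparing Predictions by Separated Data Sources
4.3 Comparing the Effect of Headlines and Article Bodies on Prediction Accuracy
4.4 Effect of Removing Common Words from Sentence Vectors
4.5 Summary
Chapter 5. Conclusions and Recommendations
5.1 Conclusions
5.2 Suggestions for Future Research
References
Appendix
1. Source Code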

Advisor

蔡志豐

Approval date

2021-7-12
