Name 王孝庭 (Shiaw-Tying Wang)
Thesis Title 以句向量與深度學習預測個股漲跌趨勢,以美國股市為例
(Predicting Stock Prices Using Doc2Vec with Deep Learning. Using U.S. Stock Data for Verification)
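Abstract (Chinese) Natural language processing has advanced rapidly and is applied in diverse areas such as sentiment interpretation and bankruptcy prediction, and tweets have even been used to predict stock prices. Stock price prediction is one of the key research topics in this area.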
Abstract (English) Natural language processing is developing rapidly and is used for many purposes, such as sentiment interpretation and bankruptcy prediction. Tweets have even been used to predict stock prices, and stock price prediction is one of the main research focuses. Previous studies predicted stock prices from financial news using sentiment analysis or TF-IDF. More recently, word vectors have been used as a preprocessing method. This study uses sentence vectors to strengthen the contextual relevance of articles.
This research extracts sentence vectors and word vectors from European and American financial news, then compares the prediction accuracy of different types of deep learning models. In particular, the models are trained separately on single and multiple news sources, and feature representations built from news headlines and from news content are also compared. Preprocessing for word vectors typically removes commonly used words, whereas sentence vectors depend on word order and syntactic meaning, so this removal is not required for them. Prediction models based on sentence vectors with and without commonly used words are therefore also compared. The experimental results show that sentence vectors perform slightly better than word vectors under a CNN. News sources also affect prediction performance because of their differing reach and exposure: mixed news sources are more useful for long-term forecasts, while short-term forecasts benefit from news websites with wide exposure. Comparing news headlines with news content, models trained on news content perform better, which differs from the findings of previous studies. Finally, removing commonly used words from the sentence vectors improves training efficiency but slightly decreases prediction accuracy.
Keywords (Chinese) ★ Sentence vector
★ Stock market prediction
★ Deep learning
★ News classification
Keywords (English) ★ Doc2Vec
★ Predicting Stock
★ Deep learning
★ News Classification
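Table of Contents Abstract (Chinese) I
Abstract II
List of Tables III
List of Figures VI
Contents VII
Chapter 1. Introduction 1
1.1 Research Background 1
1.2 Research Motivation 2
1.3 Research Objectives 3
1.4 Research Structure 4
Chapter 2. Literature Review 5
2.1 Text Mining Preprocessing 5
2.2 Deep Learning 8
2.2.1 Convolutional Neural Networks 8
2.2.2 Recurrent Neural Networks 9
2.2.3 Attention-Based Recurrent Neural Networks 9
2.2.4 Region-Based Convolutional Neural Networks 9
2.3 Literature Review of Text Mining in Stock Price Prediction 10
Chapter 3. Research Methodology 13
3.1 Research Design and Framework 13
3.2 Data Sources and Processing 15
3.2.1 Company and Stock Price Data 15
3.2.2 News Sources 15
3.3 Data Preprocessing and Labeling 19
3.4 Model Evaluation 21
3.5 Execution Environment 21
Chapter 4. Research Results and Analysis 22
4.1 Comparison of Prediction Accuracy with Word Vectors and Sentence Vectors as Preprocessing under Different Deep Neural Networks 22
4.2 Comparison of Predictions by Separate Data Sources 27
4.3 Comparison of the Effect of Headlines and Body Text on Prediction Accuracy 32
4.4 Effect of Removing Commonly Used Words from Sentence Vectors 35
4.5 Summary 37
Chapter 5. Conclusions and Suggestions 38
5.1 Conclusions 38
5.2 Suggestions for Future Research 38
References 39
Appendix 41
1. Source Code 41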
Advisor 蔡志豐 Review Date 2021-7-12
