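Applying Text Mining Techniques to Stock Price Prediction: Exploring the Relationship Between Traditional Machine Learning and Deep Learning Techniques and Different Financial News Sources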

DC field	Value	Language
DC.contributor	Department of Information Management	zh_TW
DC.creator	陳瑄	zh_TW
DC.creator	Hsuan Chen	en_US
dc.date.accessioned	2021-07-07T07:39:07Z
dc.date.available	2021-07-07T07:39:07Z
dc.date.issued	2021
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=108423019
dc.contributor.department	Department of Information Management	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
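dc.description.abstract	Stock price prediction is an important research topic in finance, economics, and information technology, but stock prices are affected by many factors, which makes accurate prediction difficult. Many past studies have therefore used key indicators derived from historical stock prices, or time-series modeling algorithms, to predict whether a stock price will rise or fall. In recent years, some studies have also analyzed text from social media or financial news with text mining techniques, combined with machine learning and deep learning techniques, to improve prediction performance. Although existing studies have compared traditional text feature representations, newer natural language processing techniques have rarely been compared comprehensively with the common traditional techniques in the stock prediction domain, and few studies have examined data from different financial news sources. This study therefore uses financial news to compare which of these text techniques performs better for stock prediction and how each technique affects machine learning and deep learning classifiers. It also examines whether different news sources affect the prediction results, and further examines how different ratios of training data affect prediction performance. The experimental results show that the combination with the best AUC is CNN + Word2vec, with most results falling between about 0.53 and 0.56. For Apple, the news source Reuters performs better, indicating that its news better reflects the rise and fall of that company's stock price; for Bank of America, The Motley Fool performs best, which suggests that The Motley Fool is also a good news source for stock prediction. The results also show that the company whose average stock price has changed less in recent years performs better across the different news sources than the company whose average stock price has changed more. In the experiments on training data ratios, AUC and prediction performance improve as the ratio of training data increases; the best performance occurs when the training data ratio is 70% or 50%, which indicates that four to six years of collected data performs well.	zh_TW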
dc.description.abstract	Stock prediction has long been regarded as an interesting and important research problem in finance, economics, information technology, and related fields. Accurately predicting stock prices is difficult because many factors affect them. In the past, many studies predicted stock prices using technical indicators and time series forecasting algorithms. In recent years, some studies have used financial news to predict stock trends with text mining and machine learning techniques. Although many different text feature representation methods have been used for stock prediction, there is no comprehensive study comparing different kinds of text mining techniques. Therefore, one major research objective of this thesis is to develop effective prediction models with different text representations and compare their performance. Moreover, the impacts of different news sources and different ratios of training data on the prediction models are also examined. The experimental results demonstrate that the combination of a CNN deep learning model with the Word2vec text representation achieves the best results, with most of the average AUC results falling between 0.53 and 0.56. Moreover, news articles collected from The Motley Fool and Reuters are better choices for predicting stock trends than those from CNBC. The results also show that the company with smaller stock price changes is predicted better than the company with larger stock price changes. We also find that higher training data ratios generally yield higher prediction performance. In particular, using either 70% or 50% of the data from the eight-year period as training data allows the prediction models to reach relatively higher performance.	en_US
DC.subject	Stock price prediction	zh_TW
DC.subject	Text mining	zh_TW
DC.subject	Natural language processing	zh_TW
DC.subject	Machine learning	zh_TW
DC.subject	Deep learning	zh_TW
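DC.title	Applying Text Mining Techniques to Stock Price Prediction: Exploring the Relationship Between Traditional Machine Learning and Deep Learning Techniques and Different Financial News Sources	zh_TW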
dc.language.iso	zh-TW	zh-TW
DC.title	Text Mining in Stock Prediction by Traditional Machine Learning and Deep Learning Techniques with Different Financial News	en_US
DC.type	Master's/doctoral thesis	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Full metadata record for thesis 108423019