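Combining attention mechanism with layer normalized neural network in stock price forecasting: a case study of electronics industry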

Name

Bo-Ruei Ke (柯伯叡)

Department

Department of Information Management

Thesis title

Combining attention mechanism with layer normalized neural network in stock price forecasting: a case study of electronics industry
(結合注意力機制與層標準化的神經網路於股價預測之研究)

Files

Full text available for browsing in the system after 2026-1-1.

Abstract

Previous studies that use news to predict future stock price trends have mostly relied on static word embeddings for natural language processing. To determine whether dynamic word embeddings are suitable for news-based stock price prediction, this study collected news from two publishers (Barron, Reuters) and used Apple (AAPL) and Microsoft (MSFT) as the prediction targets. It compares two dynamic word embedding methods (Sentence-BERT, BERT) with three static word embedding methods (Paragraph Vector, Word2Vec, TF-IDF) to examine how the choice of embedding affects prediction performance. In addition, because news events differ in their impact on stock prices, this study proposes an attention mechanism and layer normalization-based LSTM (AL_LSTM) that focuses attention on the news events contributing most to price rises and falls, thereby helping the model capture key information. Averaged over all experiments, representing news with Sentence-BERT has a positive effect on accuracy, with a highest accuracy of 69.07%. Compared with the deep learning model LSTM and the machine learning model SVM, the proposed AL_LSTM improves accuracy by 4.27% and 6.32% on average, respectively, and can effectively predict changes in future stock price trends.
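
The two components named in the abstract, layer normalization and an attention weighting over news events, can be illustrated with a small sketch. This is not the thesis's AL_LSTM: the random vectors stand in for LSTM outputs, the `query` vector would be learned in a real model, scaled dot-product scoring is only one common choice of attention score, and applying normalization before attention is an assumption made here for illustration.

```python
import math
import random


def layer_norm(vec, eps=1e-5):
    """Normalize one vector to zero mean and (near) unit variance across its features."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each news item by its scaled dot product with `query`.

    Returns the attention weights and the weighted sum of the states.
    """
    dim = len(query)
    scores = [
        sum(h_i * q_i for h_i, q_i in zip(h, query)) / math.sqrt(dim)
        for h in hidden_states
    ]
    weights = softmax(scores)
    context = [
        sum(w * h[j] for w, h in zip(weights, hidden_states))
        for j in range(dim)
    ]
    return weights, context


# Hypothetical stand-in for LSTM outputs: 5 news items, 8 features each.
rng = random.Random(0)
states = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]
query = [rng.gauss(0.0, 1.0) for _ in range(8)]  # assumed; learned in practice

normed = [layer_norm(h) for h in states]          # per-item layer normalization
weights, context = attention_pool(normed, query)  # weights sum to 1
print([round(w, 3) for w in weights])
```

News items with larger weights contribute more to `context`, which a downstream classifier could then use to predict whether the price rises or falls.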

Keywords

★ Sentence-BERT
★ dynamic word embedding
★ attention mechanism
★ layer normalization
★ stock price prediction
★ attention

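Table of contents

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1　　Research Background
1-2　　Research Motivation
1-3　　Research Objectives
2. Literature Review
2-1　　Prior Work on Stock Price Prediction
2-2　　Word Embeddings
2-2-1　TF-IDF (Term Frequency-Inverse Document Frequency)
2-2-2　Word2Vec
2-2-3　PV (Paragraph Vector)
2-2-4　BERT (Bidirectional Encoder Representations from Transformers)
2-2-5　SBERT (Sentence-BERT)
2-3　　Machine Learning and Deep Learning Models
2-3-1　SVM (Support Vector Machine)
2-3-2　LSTM (Long Short-Term Memory)
2-3-3　Attention Mechanism
2-3-4　Layer Normalization
2-3-5　Residual Connections
3. Research Method
3-1　　Research Datasets
3-2　　Preprocessing of Unstructured Data
3-3　　Data Labeling
3-4　　Experimental Parameter Settings and Methods
3-5　　Evaluation Metrics
4. Experimental Results and Analysis
4-1　　Window Size in Machine Learning and Deep Learning
4-1-1　Effect of Window Size on Each Classifier
4-1-2　Summary
4-2　　Comparison of Word Embedding Methods
4-3　　Machine Learning and Deep Learning in Stock Prediction Results
4-4　　Influence of News Source and Company on the Stock Market
4-4-1　Differences between Barron and Reuters for the Same Company
4-4-2　Differences between AAPL and MSFT across News Sources
4-4-3　Summary
5. Conclusion
5-1　　Conclusions and Contributions
5-2　　Research Limitations
5-3　　Future Research and Suggestions
References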

Advisor

Kuen-Liang Su (蘇坤良)

Approval date

2021-8-11
