Abstract (English) |
Information today comes from diverse and often disorganized sources. Advances in computing hardware and software, together with the development of related application technologies, have made it much easier to extract and analyze large volumes of data, which in turn has drawn growing attention to data analysis and related research.
To deal with this unstructured information, statistical methods and algorithms (such as the Latent Dirichlet Allocation model and text sentiment analysis used in this paper) are employed to quantify text, converting it into meaningful numerical data that can serve as reference information for decision-making.
A text model is constructed with the Latent Dirichlet Allocation model, and the proportions of the underlying topics are extracted. Text sentiment analysis then uses this topic information to analyze the sentiment of the text, and the results are classified by polarity, that is, whether the expressed viewpoint is positive, negative, or neutral.
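The two steps described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the tiny word lists `POSITIVE` and `NEGATIVE`, and the toy documents are all assumptions made for illustration, and the sketch computes topic proportions and polarity separately because the abstract does not specify how the two are combined.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # total words per topic
    assign = []                                         # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]


# Hypothetical word lists; a real lexicon would be far larger.
POSITIVE = {"gain", "rally", "growth", "profit"}
NEGATIVE = {"loss", "drop", "crisis", "default"}


def polarity(tokens):
    """Classify a token list as positive, negative, or neutral by word counts."""
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy, already-tokenized headlines (made up for illustration).
docs = [
    ["bank", "profit", "growth", "rally"],
    ["chip", "export", "drop", "loss"],
    ["bank", "rate", "market"],
]
for doc, theta in zip(docs, lda_gibbs(docs)):
    print([round(p, 2) for p in theta], polarity(doc))
```

Each printed row pairs a document's estimated topic proportions with its polarity label. On a corpus this small the topics carry no real meaning; the point is only the shape of the output.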
The resulting classifications can serve different purposes depending on the domain of the articles. Applied to the financial market, for example, they can indicate current financial trends or the overall sentiment toward the general environment or specific industries. In the investment field, overcoming information disparity is a persistent challenge. By feeding financial news into the model, this work aims to give investors basic information that can strengthen their judgment or support predictions.
This paper uses financial news as the experimental subject. The content of a large number of news articles is accurately extracted through web crawling. The organized data is then processed with Latent Dirichlet Allocation modeling and text sentiment analysis to extract information from news headlines and content, assign sentiment levels, and quantify them. Finally, the results are compared with the financial market composite index over the same period to verify whether this experimental method is suitable for financial analysis. |
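As a rough illustration of the final comparison step, the sketch below correlates a daily sentiment score with the same-day return of an index. The choice of Pearson correlation, the helper names `daily_returns` and `pearson`, and all numbers are assumptions for illustration; the abstract does not state which comparison measure the paper uses.

```python
from math import sqrt


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing index values."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up values: mean news sentiment per day and index closes for the
# same days, plus one preceding close so the return series lines up.
sentiment = [0.2, -0.1, 0.4, -0.3]
closes = [100.0, 102.0, 101.0, 104.0, 102.0]

print(pearson(sentiment, daily_returns(closes)))
```

A coefficient near 1 or -1 on real data would suggest that the sentiment scores move with (or against) the index, while a value near 0 would suggest little linear relationship.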
References |
[1] Baker, Malcolm, and Jeffrey Wurgler. "Investor sentiment in the stock market." Journal of Economic Perspectives 21.2 (2007): 129-151.
[2] Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3.Jan (2003): 993-1022.
[3] Blei, David M., and John D. Lafferty. "A correlated topic model of science." (2007): 17-35.
[4] Lee Gillam, Khurshid Ahmad, et al. "Economic News and Stock Market Correlation: A Study of the UK Market." (2002).
[5] Wuthrich, B., Cho, V., Leung, S., Permunetilleke, D., Sankaran, K., & Zhang, J. "Daily stock market forecast from textual web data." SMC'98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No. 98CH36218). Vol. 3. IEEE, (1998).
[6] Salloum, S. A., Al-Emran, M., Monem, A. A., & Shaalan, K. "Using text mining techniques for extracting information from research articles." Intelligent Natural Language Processing: Trends and Applications, (2018): 373-397.
[7] Tong, Zhou, and Haiyi Zhang. "A text mining research based on LDA topic modelling." International Conference on Computer Science, Engineering and Information Technology, (2016): 201-210.
[8] Titov, Ivan, and Ryan McDonald. "Modeling online reviews with multi-grain topic models." Proceedings of the 17th International Conference on World Wide Web, (2008): 111-120.
[9] Pang, Bo, and Lillian Lee. "Opinion mining and sentiment analysis." Foundations and Trends in Information Retrieval 2.1–2 (2008): 1-135.
[10] Liu, Bing, and Lei Zhang. "A survey of opinion mining and sentiment analysis." Mining Text Data. Springer, Boston, MA, (2012): 415-463.
[11] Hatzivassiloglou, Vasileios, and Kathleen McKeown. "Predicting the semantic orientation of adjectives." 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics. (1997): 174-181.
[12] Hu, Minqing, and Bing Liu. "Mining opinion features in customer reviews." AAAI. Vol. 4. No. 4. (2004): 755-760.
[13] Wilson, Theresa, Janyce Wiebe, and Paul Hoffmann. "Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis." Computational Linguistics 35.3 (2009): 399-433.
[14] Nemeslaki, András, and Károly Pocsarovszky. "Web crawler research methodology." (2011).
[15] Lin, C., He, Y., Everson, R., & Rüger, S. "Weakly supervised joint sentiment-topic detection from text." IEEE Transactions on Knowledge and Data Engineering 24.6 (2011): 1134-1145.
[16] Osmani, Amjad, Jamshid Bagherzadeh Mohasefi, and Farhad Soleimanian Gharehchopogh. "Enriched latent Dirichlet allocation for sentiment analysis." Expert Systems 37.4 (2020): e12527.
[17] Li, Yue, Xutao Wang, and Pengjian Xu. "Chinese text classification model based on deep learning." Future Internet 10.11 (2018): 113.
[18] 游和正,「領域相關詞彙極性分析及文件情緒分類之研究」,國立臺灣大學,碩士論文,(2012)。
[19] 劉羿廷,「運用財經文本情感分析於台灣電子類股價指數趨勢預測之研究」,國立政治大學,碩士論文,(2016)。
[20] 蔡宇祥,「股市趨勢預測之研究:財經評論文本情感分析」,國立政治大學,碩士論文,(2016)。
[21] 張良杰,「巨量資料環境下之新聞主題暨輿情與股價關係之研究」,國立政治大學,碩士論文,(2014)。
[22] 吳靖東,「投資人情緒對股票報酬之影響── 馬可夫狀態轉換模式之應用」, Journal of Innovation and Management,Vol 10.4,(2014)。
[23] 赵妍妍,秦兵,刘挺,「文本情感分析」,软件学报,21(8),(2010)。
[24] 鍾任明,「運用文字探勘於日內股價漲跌趨勢預測之研究」,中原大學,碩士論文,(2005)。
[25] 周賓凰,張宇志,林美珍,「投資人情緒與股票報酬互動關係」,證券市場發展季刊: 行為財務學特別專刊,153,(2019)。 |