Thesis 111522052: Detailed Record




Name: Hsueh-Yi Liu (劉學逸)    Department: Computer Science and Information Engineering
Thesis Title: Disclosing Media Sentiment in ESG Through LLM-Enabled News Monitoring and Analytics
Full text: available in the system after 2029-06-30.
Abstract (Chinese) In recent years, Environmental, Social, and Governance (ESG) issues have received growing attention worldwide; for example, the European Union will formally begin levying a carbon tax in 2027. The related media coverage and public opinion will affect each company's image and even its market value, so a company's ESG performance and the actions it takes are important issues today.
This study uses methods based on Large Language Models (LLMs) to perform multi-stage summary generation and suggestion generation over a large volume of ESG-related news. Multi-stage summary generation addresses the problem that LLM input-length limits prevent directly summarizing large amounts of news. For summary generation, we tested BERT, Pegasus, GPT-3.5-Turbo, and Llama-2 for the initial filtering of each article's content; for prompt design, we applied simple prompts, complex prompts, and Directional Stimulus Prompting (DSP) to our multi-stage summary generation. We selected the most representative GPT-3.5-Turbo and the openly available Llama-2 as the final-stage summarization models and evaluated the different methods on the Multi-News dataset. For suggestion generation, we used a Distil-RoBERTa sentiment-analysis model and an ESG classification model, together with the summaries produced by multi-stage summary generation, as inputs that guide the direction of the LLM's generated content.
The results demonstrate the strengths and weaknesses of the different methods on the multi-stage summarization task and verify that these methods behave consistently across different summarization LLMs. In addition, the DSP portion of the prompt-design experiments and the suggestion-generation experiments show how smaller models can further strengthen LLM performance on different tasks. The automated tool proposed in this study also enables companies to quickly grasp ESG-related media sentiment and obtain suggestions so that they can make timely decisions.
Abstract (English) Issues related to Environmental, Social, and Governance (ESG) have gained increasing attention from countries worldwide in recent years. For instance, the European Union will officially start implementing carbon taxes in 2027. Media reports and public opinion surrounding ESG issues can significantly impact the image and market value of companies. Therefore, a company's performance and actions in ESG have become crucial topics in contemporary society.
This study applies methods based on Large Language Models (LLMs), using multi-stage summary generation and suggestion generation powered by LLMs on a vast amount of ESG-related news. Multi-stage summary generation addresses the input-length limitations of LLMs, which prevent direct summarization of large volumes of news. In the summary generation process, we tested BERT, Pegasus, GPT-3.5-Turbo, and Llama-2 for the initial filtering of the content within each article. For prompt design, we applied simple prompts, complex prompts, and Directional Stimulus Prompting (DSP) to our multi-stage summary generation. We selected the most representative models, GPT-3.5-Turbo and the publicly available Llama-2, as the final-stage models for summary generation and measured the performance of the different methods on the Multi-News dataset. For suggestion generation, we employed a Distil-RoBERTa sentiment model and an ESG classification model. These models, together with the summaries produced by multi-stage summary generation, guided the direction of the content generated by the LLM.
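The multi-stage idea described above can be sketched as a map-reduce loop: chunk the pooled articles to fit a context window, summarize each chunk, then summarize the concatenated partial summaries, recursing until the text fits in one call. The sketch below is illustrative only; `summarize` is a stand-in for a real LLM call (GPT-3.5-Turbo or Llama-2 in the thesis), and the 50-word context limit and 20-word summary cap are arbitrary assumptions.

```python
MAX_TOKENS = 50  # hypothetical context limit, counted in words here

def summarize(text: str) -> str:
    """Stand-in for an LLM summarization call: keeps the first sentence,
    truncated to 20 words, so each stage is guaranteed to shrink its input."""
    first = text.split(". ")[0]
    return " ".join(first.split()[:20]).rstrip(".") + "."

def chunk(words: list[str], size: int) -> list[str]:
    """Split a word list into space-joined chunks of at most `size` words."""
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def multi_stage_summary(articles: list[str]) -> str:
    """Map-reduce summarization: summarize chunks, then summarize the summaries."""
    words = " ".join(articles).split()
    partials = [summarize(c) for c in chunk(words, MAX_TOKENS)]  # map stage
    combined = " ".join(partials)
    if len(combined.split()) > MAX_TOKENS:                       # still too long?
        return multi_stage_summary([combined])                   # reduce recursively
    return summarize(combined)
```

In a real system the recursion depth stays small because each stage compresses its input by roughly the ratio of chunk size to summary length.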
The results of this study demonstrate the advantages and disadvantages of using different methods for multi-stage summary generation and validate the consistency of these methods across various summary-generating LLMs. Additionally, experiments involving DSP in prompt design and suggestion generation showcase how smaller models can further enhance the performance of LLMs in different tasks. The automated tools proposed in this study enable companies to quickly grasp ESG-related media sentiment and receive relevant suggestions to make timely and informed decisions.
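One way a small model can steer a larger LLM, in the spirit of Directional Stimulus Prompting, is to supply "stimulus" keywords that are appended to the summarization prompt. The sketch below uses a toy frequency heuristic in place of the trained policy model of Li et al. (2023); the prompt wording and stopword list are illustrative assumptions, not the thesis's actual templates.

```python
from collections import Counter

# Minimal stopword list; a real system would use a proper NLP stoplist.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "for"}

def stimulus_keywords(text: str, k: int = 3) -> list[str]:
    """Toy stand-in for the small policy model: pick the k most frequent
    non-stopword terms as the directional stimulus."""
    words = [w.strip(".,").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return [w for w, _ in counts.most_common(k)]

def dsp_prompt(article: str) -> str:
    """Fold the stimulus keywords into the prompt to steer the LLM's summary."""
    hints = "; ".join(stimulus_keywords(article))
    return (
        f"Summarize the article below.\nHint (keywords): {hints}\n\n"
        f"Article: {article}"
    )
```

The same pattern applies to suggestion generation: labels from the Distil-RoBERTa sentiment model and the ESG classifier can be folded into the prompt in place of (or alongside) the keyword hints.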
Keywords (Chinese) ★ ESG
★ large language model (大型語言模型)
★ automatic text summarization (自動文本摘要)
★ sentiment analysis (輿情分析)
Keywords (English) ★ ESG
★ LLM
★ automatic text summarization
★ sentiment analysis
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1. The Impact of ESG on Corporate Value
1.2. The Impact of Media Sentiment on Companies and Markets
1.3. The Rise of Large Language Models
1.4. Automated NLP Systems
2. Literature Review
2.1. Automatic Text Summarization
2.2. Sentiment Analysis
2.3. Evaluation Metrics
3. Methodology
3.1. Language Models
3.1.1. GPT
3.1.2. Llama-2
3.1.3. BERT-Extractive-Summarizer
3.1.4. Sentence-BERT
3.1.5. RoBERTa
3.1.6. T5
3.1.7. Pegasus
3.2. Prior Research Results
3.2.1. esgBERT
3.2.2. ESG News Dataset
3.3. Multi-Stage Text Summary Generation
3.3.1. Text Filtering Methods
3.3.2. Prompt Design for LLM Summary Generation
3.3.3. Summary Generation with Different LLMs
3.4. ESG News Sentiment Analysis
3.4.1. ESG News Focus
3.4.2. ESG News Suggestion Generation
4. Results
4.1. Comparison of Text Filtering Methods for Summarization
4.2. Comparison of Prompt Designs for Summarization
4.3. ESG News Summarization and Sentiment Analysis System
5. Discussion
6. Conclusion
7. Limitations and Future Work
References
Appendix
References
Alhelbawy, A., Lattimer, M., Kruschwitz, U., Fox, C., & Poesio, M. (2020). An NLP-powered human rights monitoring platform. Expert Systems with Applications, 153, 113365.
Alkaraan, F., Albitar, K., Hussainey, K., & Venkatesh, V. G. (2022). Corporate transformation toward Industry 4.0 and financial performance: The influence of environmental, social, and governance (ESG). Technological Forecasting and Social Change, 175, Article 121423. https://doi.org/10.1016/j.techfore.2021.121423
Amel-Zadeh, A., & Serafeim, G. (2018). Why and how investors use ESG information: Evidence from a global survey. Financial analysts journal, 74(3), 87-103.
Arbane, M., Benlamri, R., Brik, Y., & Alahmar, A. D. (2023). Social media-based COVID-19 sentiment classification model using Bi-LSTM. Expert Systems with Applications, 212, 118710.
Aydogmus, M., Gülay, G., & Ergun, K. (2022). Impact of ESG performance on firm value and profitability. Borsa Istanbul Review, 22, S119-S127. https://doi.org/10.1016/j.bir.2022.11.006
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., & Askell, A. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877-1901.
De Vincentiis, P. (2024). ESG news, stock volatility and tactical disclosure. Research in International Business and Finance, 68, Article 102187. https://doi.org/10.1016/j.ribaf.2023.102187
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
El-Kassas, W. S., Salama, C. R., Rafea, A. A., & Mohamed, H. K. (2021). Automatic text summarization: A comprehensive survey. Expert Systems with Applications, 165, 113679.
Fraiberger, S. P., Lee, D., Puy, D., & Ranciere, R. (2021). Media sentiment and international asset prices. Journal of International Economics, 133, 103526.
Gomez, M. J., Calderón, M., Sánchez, V., Clemente, F. J. G., & Ruipérez-Valiente, J. A. (2022). Large scale analysis of open MOOC reviews to support learners’ course selection. Expert Systems with Applications, 210, 118400.
Guo, M., Ainslie, J., Uthus, D., Ontanon, S., Ni, J., Sung, Y.-H., & Yang, Y. (2021). LongT5: Efficient text-to-text transformer for long sequences. arXiv preprint arXiv:2112.07916.
Jiang, A. Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D. S., Casas, D. d. l., Bressand, F., Lengyel, G., Lample, G., & Saulnier, L. (2023). Mistral 7B. arXiv preprint arXiv:2310.06825.
Lee, J., & Kim, M. (2023). ESG information extraction with cross-sectoral and multi-source adaptation based on domain-tuned language models. Expert Systems with Applications, 221, 119726.
Li, Z., Peng, B., He, P., Galley, M., Gao, J., & Yan, X. (2023). Guiding Large Language Models via Directional Stimulus Prompting. arXiv preprint arXiv:2302.11520.
Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. Text Summarization Branches Out.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692.
Lu, J., & Eirinaki, M. (2021). Can a machine win a Grammy? An evaluation of AI-generated song lyrics. 2021 IEEE International Conference on Big Data (Big Data 2021).
Ma, C., Zhang, W. E., Guo, M., Wang, H., & Sheng, Q. Z. (2022). Multi-document summarization via deep learning techniques: A survey. ACM Computing Surveys, 55(5), 1-37.
Mathur, A., & Suchithra, M. (2022). Application of abstractive summarization in multiple choice question generation. 1st International Conference on Computational Intelligence and Sustainable Engineering Solution (CISES 2022).
Medhat, W., Hassan, A., & Korashy, H. (2014). Sentiment analysis algorithms and applications: A survey. Ain Shams engineering journal, 5(4), 1093-1113.
Mehta, S., Sekhavat, M. H., Cao, Q., Horton, M., Jin, Y., Sun, C., Mirzadeh, I., Najibi, M., Belenko, D., & Zatloukal, P. (2024). OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework. arXiv preprint arXiv:2404.14619.
Miller, D. (2019). Leveraging BERT for extractive text summarization on lectures. arXiv preprint arXiv:1906.04165.
Oniani, D., & Wang, Y. (2020). A qualitative evaluation of language models on automatic question-answering for COVID-19. 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (BCB 2020).
Pedersen, L. H., Fitzgibbons, S., & Pomorski, L. (2021). Responsible investing: The ESG-efficient frontier. Journal of Financial Economics, 142(2), 572-597.
Penedo, G., Malartic, Q., Hesslow, D., Cojocaru, R., Cappelli, A., Alobeidli, H., Pannier, B., Almazrouei, E., & Launay, J. (2023). The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only. arXiv preprint arXiv:2306.01116.
Perez-Beltrachini, L., & Lapata, M. (2021). Multi-document summarization with determinantal point process attention. Journal of Artificial Intelligence Research, 71, 371-399.
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1), 5485-5551.
Ranade, P., Piplai, A., Mittal, S., Joshi, A., & Finin, T. (2021). Generating fake cyber threat intelligence using transformer-based models. 2021 International Joint Conference on Neural Networks (IJCNN 2021).
Rawat, R., Rawat, P., Elahi, V., & Elahi, A. (2021). Abstractive summarization on dynamically changing text. 5th International Conference on Computing Methodologies and Communication (ICCMC 2021).
Reimers, N., & Gurevych, I. (2019). Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084.
Salvador, J., Bansal, N., Akter, M., Sarkar, S., Das, A., & Karmaker, S. K. (2024). Benchmarking LLMs on the Semantic Overlap Summarization Task. arXiv preprint arXiv:2402.17008.
Santu, S. K. K., & Feng, D. (2023). TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks. arXiv preprint arXiv:2305.11430.
Seok, J., Lee, Y., & Kim, B. D. (2020). Impact of CSR news reports on firm value. Asia Pacific Journal of Marketing and Logistics, 32(3), 644-663. https://doi.org/10.1108/apjml-06-2019-0352
Team, G., Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Pathak, S., Sifre, L., Rivière, M., Kale, M. S., & Love, J. (2024). Gemma: Open models based on gemini research and technology. arXiv preprint arXiv:2403.08295.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., & Bhosale, S. (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
Wu, Z., & Ma, G. (2024). NLP-based approach for automated safety requirements information retrieval from project documents. Expert Systems with Applications, 239, 122401.
Xiao, W., Beltagy, I., Carenini, G., & Cohan, A. (2021). PRIMERA: Pyramid-based masked sentence pre-training for multi-document summarization. arXiv preprint arXiv:2110.08499.
Zhang, J., Zhao, Y., Saleh, M., & Liu, P. (2020). PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. International Conference on Machine Learning.
Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., & Artzi, Y. (2019). Bertscore: Evaluating text generation with bert. arXiv preprint arXiv:1904.09675.
Advisor: Stephen J.H. Yang (楊鎮華)    Date of Approval: 2024-7-10
