References
Amin, F., & Mahmoud, M. (2022). Confusion matrix in binary classification problems: A step-by-step tutorial. Journal of Engineering Research, 6(5).
Araci, D. (2019). FinBERT: Financial sentiment analysis with pre-trained language models. arXiv preprint arXiv:1908.10063.
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., & Askell, A. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Gilardi, F., Alizadeh, M., & Kubli, M. (2023). ChatGPT outperforms crowd-workers for text-annotation tasks. arXiv preprint arXiv:2303.15056.
He, P., Liu, X., Gao, J., & Chen, W. (2020). DeBERTa: Decoding-enhanced BERT with disentangled attention. arXiv preprint arXiv:2006.03654.
Hossin, M., & Sulaiman, M. N. (2015). A review on evaluation metrics for data classification evaluations. International Journal of Data Mining & Knowledge Management Process, 5(2), 1.
Hu, H., Lu, H., Zhang, H., Song, Y.-Z., Lam, W., & Zhang, Y. (2023). Chain-of-symbol prompting elicits planning in large language models. arXiv preprint arXiv:2305.10276.
Kuzman, T., Mozetič, I., & Ljubešić, N. (2023). ChatGPT: Beginning of an end of manual linguistic data annotation? Use case of automatic genre identification. arXiv e-prints, arXiv–2303.
Liu, C., Zhang, W., Chen, G., Wu, X., Luu, A. T., Chang, C. H., & Bing, L. (2023). Zero-shot text classification via self-supervised tuning. arXiv preprint arXiv:2305.11442.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276-282.
Petroni, F., Rocktäschel, T., Lewis, P., Bakhtin, A., Wu, Y., Miller, A. H., & Riedel, S. (2019). Language models as knowledge bases? arXiv preprint arXiv:1909.01066.
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI.
Reid, M., Savinov, N., Teplyashin, D., Lepikhin, D., Lillicrap, T., Alayrac, J.-B., Soricut, R., Lazaridou, A., Firat, O., & Schrittwieser, J. (2024). Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530.
Sahoo, P., Singh, A. K., Saha, S., Jain, V., Mondal, S., & Chadha, A. (2024). A systematic survey of prompt engineering in large language models: Techniques and applications. arXiv preprint arXiv:2402.07927.
Socher, R., Ganjoo, M., Manning, C. D., & Ng, A. (2013). Zero-shot learning through cross-modal transfer. Advances in Neural Information Processing Systems, 26.
Gemini Team, Anil, R., Borgeaud, S., Wu, Y., Alayrac, J.-B., Yu, J., Soricut, R., Schalkwyk, J., Dai, A. M., & Hauth, A. (2023). Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805.
Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., & Azhar, F. (2023). LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., & Bhosale, S. (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
Truby, J. (2020). Governing artificial intelligence to benefit the UN Sustainable Development Goals. Sustainable Development, 28(4), 946-959. https://doi.org/10.1002/sd.2048
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Vinuesa, R., Azizpour, H., Leite, I., Balaam, M., Dignum, V., Domisch, S., Felländer, A., Langhans, S. D., Tegmark, M., & Fuso Nerini, F. (2020). The role of artificial intelligence in achieving the Sustainable Development Goals. Nature Communications, 11(1), 233. https://doi.org/10.1038/s41467-019-14108-y
Wang, S., Liu, Y., Xu, Y., Zhu, C., & Zeng, M. (2021). Want to reduce labeling cost? GPT-3 can help. arXiv preprint arXiv:2108.13487.
Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171.
Wang, Z., Pang, Y., & Lin, Y. (2023). Large language models are zero-shot text classifiers. arXiv preprint arXiv:2312.01044.
Wei, J., Bosma, M., Zhao, V. Y., Guu, K., Yu, A. W., Lester, B., Du, N., Dai, A. M., & Le, Q. V. (2021). Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q. V., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824-24837.
Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T., Cao, Y., & Narasimhan, K. (2024). Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36.