Rephrasing Human Instructions for Instruction-tuned LLMs

Name

Jyun-Ji Lu (盧俊吉)

Department

Department of Information Management

Thesis Title

Rephrasing Human Instructions for Instruction-tuned LLMs

Browse the thesis in the system (available after 2026-8-1)

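Abstract (translated from Chinese)

Generative AI services (ChatGPT, Gemini, and Copilot) have attracted wide attention for their ability to follow human instructions and generate corresponding responses. The instruction-following ability of these large language models (LLMs) comes mainly from instruction tuning, which trains LLMs by supervised fine-tuning (SFT) on instruction-following datasets. However, studies show that instruction-tuned LLMs remain somewhat sensitive to perturbations in discrete text, which can lead to unpredictable and uncontrollable generation behavior and in turn degrade how well they follow human instructions. Given the large number of general-purpose generative AI services being released, we ask whether intuitive human instruction input can be improved to match the preferences of instruction-tuned LLMs, yielding stable, controllable, and high-quality responses while also relieving users of the difficulty of writing precise instructions.
The idea of optimizing discrete text to cater to LLMs’ preferences has been shown to be effective on traditional NLP tasks in research on discrete prompt engineering. However, unlike the data in traditional NLP tasks, human instructions originate from real-world human interactions and are highly user-friendly and complex, so directly applying earlier discrete prompt engineering methods to human instructions is impractical.
In our experiments, we show that our proposed method can improve the responses generated by instruction-tuned LLMs by automatically rephrasing human instructions. The improvement is more pronounced with more diverse training data. We also observe that the same rephrasing method generalizes to instruction-tuned LLMs that share the same backbone, whereas instruction-tuned LLMs with different backbones may have different preferences for discrete text. Our method demonstrates the feasibility of improving instruction-tuned LLMs at the discrete level and in a black-box setting while preserving the semantic consistency and interpretability of human instructions.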

Abstract (English)

Generative AI services like ChatGPT, Gemini, and Copilot have gained significant attention for their ability to follow human instructions and assist with real-world tasks. The core mechanism behind their effectiveness is instruction tuning, a process of supervised fine-tuning (SFT) on paired datasets of human instructions and responses. Although instruction-tuned large language models (LLMs) can follow human instructions, studies show that they remain sensitive to perturbations in discrete text, which can cause unpredictable, uncontrollable generation behavior and may degrade performance. Given the emergence of general-purpose generative AI services, we ask whether human instructions can be optimized to align with the preferences of instruction-tuned LLMs, so as to generate stable, controllable, and high-quality responses while also addressing users’ concerns about crafting precise instructions.
The idea of enhancing LLMs’ performance by optimizing discrete text to cater to their preferences has already proven effective in discrete prompt engineering, which improves the performance of LLMs on traditional NLP tasks by finding optimal discrete templates or texts. However, unlike traditional NLP tasks, human instructions are user-friendly, highly variable, and derived from real-world interactions, making direct application of previous discrete prompt methods to human instructions impractical.
In our experiments, we demonstrate that our proposed method enhances the response quality of instruction-tuned LLMs simply by rephrasing human instructions. This enhancement is more pronounced with a richer variety of training data. Additionally, we observe that the same optimization approach applies across instruction-tuned LLMs sharing the same backbone, whereas instruction-tuned LLMs with different backbones may have different preferences for discrete text. Our method showcases the feasibility of improving instruction-tuned LLMs at the discrete level and in a black-box scenario, while maintaining the semantic consistency and explainability of human instructions.
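
As a rough illustration of the black-box setting described in the abstract, the sketch below scores candidate rephrasings of an instruction by the ROUGE-L overlap between the responder's output and a reference response, and keeps the best one. It is not the thesis's training procedure, which the table of contents describes as reinforcement learning followed by supervised fine-tuning. The names `rouge_l_f1`, `pick_best_rephrasing`, and `responder` are hypothetical, the responder is assumed to be any text-in, text-out callable, and the balanced F1 form of ROUGE-L with whitespace tokenization is a simplifying assumption.

```python
from typing import Callable, List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Balanced (F1) ROUGE-L on lowercased, whitespace-split tokens.

    Assumption: F1 weighting and naive tokenization, chosen for brevity.
    """
    cand: List[str] = candidate.lower().split()
    ref: List[str] = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def pick_best_rephrasing(
    candidates: Sequence[str],
    responder: Callable[[str], str],
    reference: str,
) -> str:
    """Return the candidate instruction whose response best matches the reference.

    The responder is treated as a black box: only its text output is used.
    """
    return max(candidates, key=lambda inst: rouge_l_f1(responder(inst), reference))
```

In use, `responder` would wrap a call to an instruction-tuned LLM, and `candidates` would come from whatever model proposes rephrasings. A semantic-similarity check between the original and rephrased instruction (the table of contents lists BERTScore for this) would be a separate step not shown here.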

Keywords

★ instruction following (指令跟隨)
★ discrete prompt (離散提示)
★ paraphrasing (改寫)
★ black-box optimization (黑盒優化)

Table of Contents

Chinese Abstract
Abstract
Table of Contents
1. Introduction
1.1. Overview
1.2. Motivation
1.3. Objectives
1.4. Paper Organization
2. Related Works
2.1. Discrete Prompt Engineering
2.1.1. Gradient-guided search
2.1.2. Black Box Optimization
2.1.3. Edit-based Gradient-free
2.1.4. Construct with generative LLMs
2.1.5. In Context Learning
2.2. Instruction Following Tasks
2.2.1. Instruction Tuning
2.2.2. Instruction Following Datasets
2.3. Discussion
3. Methodology
3.1. Model Overview
3.2. Optimizer
3.3. Responder
3.4. Training Phase
3.4.1. Step 1: Extracting paired data using reinforcement learning
3.4.2. Step 2: Supervised fine-tuning with the extracted data
3.5. Experiment design
3.5.1. Experiment 1 – The effectiveness of our proposed instruction optimizing method
3.5.2. Experiment 2 – The effectiveness of our method with or without context
3.6. Dataset
3.7. Experiment model setting
3.8. Evaluation metrics
3.8.1. Quality of the response: ROUGE-L (Lin, 2004)
3.8.2. Semantic Similarity of optimized instruction: BERTScore (Zhang et al., 2020)
3.8.3. Human Evaluation
4. Experiment Results
4.1. Experiment 1 – The effectiveness of our proposed instruction optimizing method
4.1.1. Experiment result
4.1.2. Generalization with the same backbone LLM
4.1.3. Case Study
4.2. Experiment 2 – The effectiveness of our method with or without context
4.3. Case Study
5. Conclusion
5.1. Overall summary
5.2. Contributions
5.3. Limitation
5.4. Future work
References

References

Bach, S.H., Sanh, V., Yong, Z.-X., Webson, A., Raffel, C., Nayak, N.V., Sharma, A., Kim, T., Bari, M.S., Fevry, T., Alyafeai, Z., Dey, M., Santilli, A., Sun, Z., Ben-David, S., Xu, C., Chhablani, G., Wang, H., Fries, J.A., Al-shaibani, M.S., Sharma, S., Thakker, U., Almubarak, K., Tang, X., Radev, D., Jiang, M.T.-J., Rush, A.M., 2022. PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts.
Ben-David, E., Oved, N., Reichart, R., 2022. PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains. https://doi.org/10.48550/arXiv.2102.12206
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., Amodei, D., 2020. Language Models are Few-Shot Learners, in: Advances in Neural Information Processing Systems. Curran Associates, Inc., pp. 1877–1901.
Chen, G., Qian, Y., Wang, B., Li, L., 2023. MPrompt: Exploring Multi-level Prompt Tuning for Machine Reading Comprehension.
Chen, L., Chen, J., Goldstein, T., Huang, H., Zhou, T., 2023. InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models. https://doi.org/10.48550/arXiv.2306.03082
Chiang, W.-L., Li, Z., Lin, Z., Sheng, Y., Wu, Z., 2023. Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality [WWW Document]. URL https://lmsys.org/blog/2023-03-30-vicuna
Christiano, P.F., Leike, J., Brown, T., Martic, M., Legg, S., Amodei, D., 2017. Deep Reinforcement Learning from Human Preferences, in: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Chung, H.W., Hou, L., Longpre, S., Zoph, B., Tay, Y., Fedus, W., Li, Y., Wang, X., Dehghani, M., Brahma, S., Webson, A., Gu, S.S., Dai, Z., Suzgun, M., Chen, X., Chowdhery, A., Castro-Ros, A., Pellat, M., Robinson, K., Valter, D., Narang, S., Mishra, G., Yu, A., Zhao, V., Huang, Y., Dai, A., Yu, H., Petrov, S., Chi, E.H., Dean, J., Devlin, J., Roberts, A., Zhou, D., Le, Q.V., Wei, J., 2022. Scaling Instruction-Finetuned Language Models. https://doi.org/10.48550/arXiv.2210.11416
Conover, M., Hayes, M., Mathur, A., 2023. Free Dolly: Introducing the World’s First Truly Open Instruction-Tuned LLM [WWW Document]. URL https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm
Deng, M., Wang, J., Hsieh, C.-P., Wang, Y., Guo, H., Shu, T., Song, M., Xing, E., Hu, Z., 2022. RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning, in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2022, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, pp. 3369–3391.
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K., 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs].
Diao, S., Huang, Z., Xu, R., Li, X., Lin, Y., Zhou, X., Zhang, T., 2023. Black-box Prompt Learning for Pre-trained Language Models. https://doi.org/10.48550/arXiv.2201.08531
Efrat, A., Levy, O., 2020. The Turking Test: Can Language Models Understand Instructions? https://doi.org/10.48550/arXiv.2010.11982
Fedus, W., Zoph, B., Shazeer, N., 2022. Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J Mach Learn Res 23, 120:5232-120:5270.
Xue, F., 2024. Instruction in the wild: A user-based instruction dataset.
Gao, T., Fisch, A., Chen, D., 2021. Making Pre-trained Language Models Better Few-shot Learners, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Presented at the ACL-IJCNLP 2021, Association for Computational Linguistics, Online, pp. 3816–3830. https://doi.org/10.18653/v1/2021.acl-long.295
Gemma Team, Mesnard, T., Hardin, C., Dadashi, R., Bhupatiraju, S., Pathak, S., Sifre, L., Rivière, M., Kale, M.S., Love, J., Tafti, P., Hussenot, L., Sessa, P.G., Chowdhery, A., Roberts, A., Barua, A., Botev, A., Castro-Ros, A., Slone, A., Héliou, A., Tacchetti, A., Bulanova, A., Paterson, A., Tsai, B., Shahriari, B., Lan, C.L., Choquette-Choo, C.A., Crepy, C., Cer, D., Ippolito, D., Reid, D., Buchatskaya, E., Ni, E., Noland, E., Yan, G., Tucker, G., Muraru, G.-C., Rozhdestvenskiy, G., Michalewski, H., Tenney, I., Grishchenko, I., Austin, J., Keeling, J., Labanowski, J., Lespiau, J.-B., Stanway, J., Brennan, J., Chen, J., Ferret, J., Chiu, J., Mao-Jones, J., Lee, K., Yu, K., Millican, K., Sjoesund, L.L., Lee, L., Dixon, L., Reid, M., Mikuła, M., Wirth, M., Sharman, M., Chinaev, N., Thain, N., Bachem, O., Chang, O., Wahltinez, O., Bailey, P., Michel, P., Yotov, P., Chaabouni, R., Comanescu, R., Jana, R., Anil, R., McIlroy, R., Liu, R., Mullins, R., Smith, S.L., Borgeaud, S., Girgin, S., Douglas, S., Pandya, S., Shakeri, S., De, S., Klimenko, T., Hennigan, T., Feinberg, V., Stokowiec, W., Chen, Y., Ahmed, Z., Gong, Z., Warkentin, T., Peran, L., Giang, M., Farabet, C., Vinyals, O., Dean, J., Kavukcuoglu, K., Hassabis, D., Ghahramani, Z., Eck, D., Barral, J., Pereira, F., Collins, E., Joulin, A., Fiedel, N., Senter, E., Andreev, A., Kenealy, K., 2024. Gemma: Open Models Based on Gemini Research and Technology. https://doi.org/10.48550/arXiv.2403.08295
Gonen, H., Iyer, S., Blevins, T., Smith, N.A., Zettlemoyer, L., 2022. Demystifying Prompts in Language Models via Perplexity Estimation. https://doi.org/10.48550/arXiv.2212.04037
Gu, J., Zhao, H., Xu, H., Nie, L., Mei, H., Yin, W., 2023. Robustness of Learning from Task Instructions, in: Rogers, A., Boyd-Graber, J., Okazaki, N. (Eds.), Findings of the Association for Computational Linguistics: ACL 2023. Presented at the Findings 2023, Association for Computational Linguistics, Toronto, Canada, pp. 13935–13948. https://doi.org/10.18653/v1/2023.findings-acl.875
Haviv, A., Berant, J., Globerson, A., 2021. BERTese: Learning to Speak to BERT, in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Presented at the EACL 2021, Association for Computational Linguistics, Online, pp. 3618–3623. https://doi.org/10.18653/v1/2021.eacl-main.316
Honovich, O., Scialom, T., Levy, O., Schick, T., 2023a. Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor, in: Rogers, A., Boyd-Graber, J., Okazaki, N. (Eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Presented at the ACL 2023, Association for Computational Linguistics, Toronto, Canada, pp. 14409–14428. https://doi.org/10.18653/v1/2023.acl-long.806
Honovich, O., Shaham, U., Bowman, S.R., Levy, O., 2023b. Instruction Induction: From Few Examples to Natural Language Task Descriptions, in: Rogers, A., Boyd-Graber, J., Okazaki, N. (Eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Presented at the ACL 2023, Association for Computational Linguistics, Toronto, Canada, pp. 1935–1952. https://doi.org/10.18653/v1/2023.acl-long.108
Iyer, S., Lin, X.V., Pasunuru, R., Mihaylov, T., Simig, D., Yu, P., Shuster, K., Wang, T., Liu, Q., Koura, P.S., Li, X., O’Horo, B., Pereyra, G., Wang, J., Dewan, C., Celikyilmaz, A., Zettlemoyer, L., Stoyanov, V., 2023. OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization. https://doi.org/10.48550/arXiv.2212.12017
Jang, J., Ye, S., Seo, M., 2022. Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts. https://doi.org/10.48550/arXiv.2209.12711
Jiang, Y., Yang, Hao, Lin, J., Zhao, H., Yang, A., Zhou, C., Yang, Hongxia, Yang, Z., Cui, B., 2022. Instance-wise Prompt Tuning for Pretrained Language Models. https://doi.org/10.48550/arXiv.2206.01958
Jiang, Z., Xu, F.F., Araki, J., Neubig, G., 2020. How Can We Know What Language Models Know? Trans. Assoc. Comput. Linguist. 8, 423–438. https://doi.org/10.1162/tacl_a_00324
Jin, F., Lu, J., Zhang, J., Zong, C., 2022. Instance-aware Prompt Learning for Language Understanding and Generation.
Khashabi, D., Lyu, X., Min, S., Qin, L., Richardson, K., Welleck, S., Hajishirzi, H., Khot, T., Sabharwal, A., Singh, S., Choi, Y., 2022. Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts, in: Carpuat, M., de Marneffe, M.-C., Meza Ruiz, I.V. (Eds.), Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Presented at the NAACL-HLT 2022, Association for Computational Linguistics, Seattle, United States, pp. 3631–3643. https://doi.org/10.18653/v1/2022.naacl-main.266
Kung, P.-N., Peng, N., 2023. Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning, in: Rogers, A., Boyd-Graber, J., Okazaki, N. (Eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Presented at the ACL 2023, Association for Computational Linguistics, Toronto, Canada, pp. 1317–1328. https://doi.org/10.18653/v1/2023.acl-short.113
Lester, B., Al-Rfou, R., Constant, N., 2021. The Power of Scale for Parameter-Efficient Prompt Tuning, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2021, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 3045–3059. https://doi.org/10.18653/v1/2021.emnlp-main.243
Li, H., Yang, L., Li, L., Xu, C., Xia, S.-T., Yuan, C., 2022. PTS: A Prompt-based Teacher-Student Network for Weakly Supervised Aspect Detection, in: 2022 International Joint Conference on Neural Networks (IJCNN). Presented at the 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. https://doi.org/10.1109/IJCNN55064.2022.9892147
Li, X.L., Liang, P., 2021. Prefix-Tuning: Optimizing Continuous Prompts for Generation, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Presented at the ACL-IJCNLP 2021, Association for Computational Linguistics, Online, pp. 4582–4597. https://doi.org/10.18653/v1/2021.acl-long.353
Lin, C.-Y., 2004. ROUGE: A Package for Automatic Evaluation of Summaries, in: Text Summarization Branches Out. Association for Computational Linguistics, Barcelona, Spain, pp. 74–81.
Liu, J., Chen, T., Liang, Z., Jiang, H., Xiao, Y., Wei, F., Qian, Y., Hao, Z., Han, B., 2023. Hierarchical Prompt Tuning for Few-Shot Multi-Task Learning, in: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, CIKM ’23. Association for Computing Machinery, New York, NY, USA, pp. 1556–1565. https://doi.org/10.1145/3583780.3614913
Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., Neubig, G., 2022. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Comput. Surv. https://doi.org/10.1145/3560815
Liu, X., Sun, T., Huang, X., Qiu, X., 2022. Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts, in: Goldberg, Y., Kozareva, Z., Zhang, Y. (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022. Presented at the Findings 2022, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, pp. 1325–1338. https://doi.org/10.18653/v1/2022.findings-emnlp.95
Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., Tang, J., 2021. GPT Understands, Too. https://doi.org/10.48550/arXiv.2103.10385
Longpre, S., Hou, L., Vu, T., Webson, A., Chung, H.W., Tay, Y., Zhou, D., Le, Q.V., Zoph, B., Wei, J., Roberts, A., 2023. The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. https://doi.org/10.48550/arXiv.2301.13688
Lou, R., Zhang, K., Yin, W., 2024. Large Language Model Instruction Following: A Survey of Progresses and Challenges.
Mitchell, T.M., n.d. The Need for Biases in Learning Generalizations.
Muennighoff, N., Wang, T., Sutawika, L., Roberts, A., Biderman, S., Le Scao, T., Bari, M.S., Shen, S., Yong, Z.X., Schoelkopf, H., Tang, X., Radev, D., Aji, A.F., Almubarak, K., Albanie, S., Alyafeai, Z., Webson, A., Raff, E., Raffel, C., 2023. Crosslingual Generalization through Multitask Finetuning, in: Rogers, A., Boyd-Graber, J., Okazaki, N. (Eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Presented at the ACL 2023, Association for Computational Linguistics, Toronto, Canada, pp. 15991–16111. https://doi.org/10.18653/v1/2023.acl-long.891
OpenAI, Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., Avila, R., Babuschkin, I., Balaji, S., Balcom, V., Baltescu, P., Bao, H., Bavarian, M., Belgum, J., Bello, I., Berdine, J., Bernadett-Shapiro, G., Berner, C., Bogdonoff, L., Boiko, O., Boyd, M., Brakman, A.-L., Brockman, G., Brooks, T., Brundage, M., Button, K., Cai, T., Campbell, R., Cann, A., Carey, B., Carlson, C., Carmichael, R., Chan, B., Chang, C., Chantzis, F., Chen, D., Chen, S., Chen, R., Chen, J., Chen, M., Chess, B., Cho, C., Chu, C., Chung, H.W., Cummings, D., Currier, J., Dai, Y., Decareaux, C., Degry, T., Deutsch, N., Deville, D., Dhar, A., Dohan, D., Dowling, S., Dunning, S., Ecoffet, A., Eleti, A., Eloundou, T., Farhi, D., Fedus, L., Felix, N., Fishman, S.P., Forte, J., Fulford, I., Gao, L., Georges, E., Gibson, C., Goel, V., Gogineni, T., Goh, G., Gontijo-Lopes, R., Gordon, J., Grafstein, M., Gray, S., Greene, R., Gross, J., Gu, S.S., Guo, Y., Hallacy, C., Han, J., Harris, J., He, Y., Heaton, M., Heidecke, J., Hesse, C., Hickey, A., Hickey, W., Hoeschele, P., Houghton, B., Hsu, K., Hu, S., Hu, X., Huizinga, J., Jain, Shantanu, Jain, Shawn, Jang, J., Jiang, A., Jiang, R., Jin, H., Jin, D., Jomoto, S., Jonn, B., Jun, H., Kaftan, T., Kaiser, Ł., Kamali, A., Kanitscheider, I., Keskar, N.S., Khan, T., Kilpatrick, L., Kim, J.W., Kim, C., Kim, Y., Kirchner, H., Kiros, J., Knight, M., Kokotajlo, D., Kondraciuk, Ł., Kondrich, A., Konstantinidis, A., Kosic, K., Krueger, G., Kuo, V., Lampe, M., Lan, I., Lee, T., Leike, J., Leung, J., Levy, D., Li, C.M., Lim, R., Lin, M., Lin, S., Litwin, M., Lopez, T., Lowe, R., Lue, P., Makanju, A., Malfacini, K., Manning, S., Markov, T., Markovski, Y., Martin, B., Mayer, K., Mayne, A., McGrew, B., McKinney, S.M., McLeavey, C., McMillan, P., McNeil, J., Medina, D., Mehta, A., Menick, J., Metz, L., Mishchenko, A., Mishkin, P., Monaco, V., Morikawa, E., Mossing, D., Mu, T., Murati, M., Murk, O., Mély, D., Nair, A., Nakano, R., Nayak, R., Neelakantan, A., Ngo, R., Noh, H., Ouyang, L., O’Keefe, C., Pachocki, J., Paino, A., Palermo, J., Pantuliano, A., Parascandolo, G., Parish, J., Parparita, E., Passos, A., Pavlov, M., Peng, A., Perelman, A., Peres, F. de A.B., Petrov, M., Pinto, H.P. de O., Pokorny, M., Pokrass, M., Pong, V., Powell, T., Power, A., Power, B., Proehl, E., Puri, R., Radford, A., Rae, J., Ramesh, A., Raymond, C., Real, F., Rimbach, K., Ross, C., Rotsted, B., Roussez, H., Ryder, N., Saltarelli, M., Sanders, T., Santurkar, S., Sastry, G., Schmidt, H., Schnurr, D., Schulman, J., Selsam, D., Sheppard, K., Sherbakov, T., Shieh, J., Shoker, S., Shyam, P., Sidor, S., Sigler, E., Simens, M., Sitkin, J., Slama, K., Sohl, I., Sokolowsky, B., Song, Y., Staudacher, N., Such, F.P., Summers, N., Sutskever, I., Tang, J., Tezak, N., Thompson, M., Tillet, P., Tootoonchian, A., Tseng, E., Tuggle, P., Turley, N., Tworek, J., Uribe, J.F.C., Vallone, A., Vijayvergiya, A., Voss, C., Wainwright, C., Wang, J.J., Wang, A., Wang, B., Ward, J., Wei, J., Weinmann, C.J., Welihinda, A., Welinder, P., Weng, J., Weng, L., Wiethoff, M., Willner, D., Winter, C., Wolrich, S., Wong, H., Workman, L., Wu, S., Wu, J., Wu, M., Xiao, K., Xu, T., Yoo, S., Yu, K., Yuan, Q., Zaremba, W., Zellers, R., Zhang, C., Zhang, M., Zhao, S., Zheng, T., Zhuang, J., Zhuk, W., Zoph, B., 2023. GPT-4 Technical Report. https://doi.org/10.48550/arXiv.2303.08774
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P.F., Leike, J., Lowe, R., 2022. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 35, 27730–27744.
Peng, B., Li, C., He, P., Galley, M., Gao, J., 2023. Instruction Tuning with GPT-4. https://doi.org/10.48550/arXiv.2304.03277
Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L., 2018. Deep Contextualized Word Representations, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Presented at the NAACL-HLT 2018, Association for Computational Linguistics, New Orleans, Louisiana, pp. 2227–2237. https://doi.org/10.18653/v1/N18-1202
Petroni, F., Rocktäschel, T., Riedel, S., Lewis, P., Bakhtin, A., Wu, Y., Miller, A., 2019. Language Models as Knowledge Bases?, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Presented at the EMNLP-IJCNLP 2019, Association for Computational Linguistics, Hong Kong, China, pp. 2463–2473. https://doi.org/10.18653/v1/D19-1250
Prasad, A., Hase, P., Zhou, X., Bansal, M., 2023. GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models, in: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. Presented at the EACL 2023, Association for Computational Linguistics, Dubrovnik, Croatia, pp. 3845–3864. https://doi.org/10.18653/v1/2023.eacl-main.277
Pryzant, R., Iter, D., Li, J., Lee, Y.T., Zhu, C., Zeng, M., 2023. Automatic Prompt Optimization with “Gradient Descent” and Beam Search. https://doi.org/10.48550/arXiv.2305.03495
Qi, Z., Tan, X., Shi, S., Qu, C., Xu, Y., Qi, Y., 2023. PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching, in: Wang, M., Zitouni, I. (Eds.), Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track. Presented at the EMNLP 2023, Association for Computational Linguistics, Singapore, pp. 471–482.
Qin, G., Eisner, J., 2021. Learning How to Ask: Querying LMs with Mixtures of Soft Prompts, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Presented at the NAACL-HLT 2021, Association for Computational Linguistics, Online, pp. 5203–5212. https://doi.org/10.18653/v1/2021.naacl-main.410
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 9.
Rae, J.W., Borgeaud, S., Cai, T., Millican, K., Hoffmann, J., Song, F., Aslanides, J., Henderson, S., Ring, R., Young, S., Rutherford, E., Hennigan, T., Menick, J., Cassirer, A., Powell, R., Driessche, G. van den, Hendricks, L.A., Rauh, M., Huang, P.-S., Glaese, A., Welbl, J., Dathathri, S., Huang, S., Uesato, J., Mellor, J., Higgins, I., Creswell, A., McAleese, N., Wu, A., Elsen, E., Jayakumar, S., Buchatskaya, E., Budden, D., Sutherland, E., Simonyan, K., Paganini, M., Sifre, L., Martens, L., Li, X.L., Kuncoro, A., Nematzadeh, A., Gribovskaya, E., Donato, D., Lazaridou, A., Mensch, A., Lespiau, J.-B., Tsimpoukelli, M., Grigorev, N., Fritz, D., Sottiaux, T., Pajarskas, M., Pohlen, T., Gong, Z., Toyama, D., d’Autume, C. de M., Li, Y., Terzi, T., Mikulik, V., Babuschkin, I., Clark, A., Casas, D. de L., Guy, A., Jones, C., Bradbury, J., Johnson, M., Hechtman, B., Weidinger, L., Gabriel, I., Isaac, W., Lockhart, E., Osindero, S., Rimell, L., Dyer, C., Vinyals, O., Ayoub, K., Stanway, J., Bennett, L., Hassabis, D., Kavukcuoglu, K., Irving, G., 2022. Scaling Language Models: Methods, Analysis & Insights from Training Gopher. https://doi.org/10.48550/arXiv.2112.11446
Reynolds, L., McDonell, K., 2021. Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm, in: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA ’21. Association for Computing Machinery, New York, NY, USA, pp. 1–7. https://doi.org/10.1145/3411763.3451760
Sanh, V., Webson, A., Raffel, C., Bach, S.H., Sutawika, L., Alyafeai, Z., Chaffin, A., Stiegler, A., Scao, T.L., Raja, A., Dey, M., Bari, M.S., Xu, C., Thakker, U., Sharma, S.S., Szczechla, E., Kim, T., Chhablani, G., Nayak, N., Datta, D., Chang, J., Jiang, M.T.-J., Wang, H., Manica, M., Shen, S., Yong, Z.X., Pandey, H., Bawden, R., Wang, T., Neeraj, T., Rozen, J., Sharma, A., Santilli, A., Fevry, T., Fries, J.A., Teehan, R., Bers, T., Biderman, S., Gao, L., Wolf, T., Rush, A.M., 2022. Multitask Prompted Training Enables Zero-Shot Task Generalization. https://doi.org/10.48550/arXiv.2110.08207
Schick, T., Schütze, H., 2021a. Few-Shot Text Generation with Natural Language Instructions, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2021, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 390–402. https://doi.org/10.18653/v1/2021.emnlp-main.32
Schick, T., Schütze, H., 2021b. Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference, in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Presented at the EACL 2021, Association for Computational Linguistics, Online, pp. 255–269. https://doi.org/10.18653/v1/2021.eacl-main.20
Schick, T., Schütze, H., 2021c. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners, in: Toutanova, K., Rumshisky, A., Zettlemoyer, L., Hakkani-Tur, D., Beltagy, I., Bethard, S., Cotterell, R., Chakraborty, T., Zhou, Y. (Eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Presented at the NAACL-HLT 2021, Association for Computational Linguistics, Online, pp. 2339–2352. https://doi.org/10.18653/v1/2021.naacl-main.185
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O., 2017. Proximal Policy Optimization Algorithms.
Shin, R., Lin, C., Thomson, S., Chen, C., Roy, S., Platanios, E.A., Pauls, A., Klein, D., Eisner, J., Van Durme, B., 2021. Constrained Language Models Yield Few-Shot Semantic Parsers, in: Moens, M.-F., Huang, X., Specia, L., Yih, S.W. (Eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2021, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 7699–7715. https://doi.org/10.18653/v1/2021.emnlp-main.608
Shin, T., Razeghi, Y., Logan IV, R.L., Wallace, E., Singh, S., 2020. AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Presented at the EMNLP 2020, Association for Computational Linguistics, Online, pp. 4222–4235. https://doi.org/10.18653/v1/2020.emnlp-main.346
Stiennon, N., Ouyang, L., Wu, J., Ziegler, D., Lowe, R., Voss, C., Radford, A., Amodei, D., Christiano, P.F., 2020. Learning to summarize with human feedback, in: Advances in Neural Information Processing Systems. Curran Associates, Inc., pp. 3008–3021.
Tang, T., Li, J., Zhao, W.X., Wen, J.-R., 2022. Context-Tuning: Learning Contextualized Prompts for Natural Language Generation, in: Proceedings of the 29th International Conference on Computational Linguistics. Presented at the COLING 2022, International Committee on Computational Linguistics, Gyeongju, Republic of Korea, pp. 6340–6354.
Taori, R., Gulrajani, I., Zhang, T., 2023. Alpaca: A strong, replicable instruction-following model. [WWW Document]. URL https://crfm.stanford.edu/2023/03/13/alpaca.html
Thoppilan, R., De Freitas, D., Hall, J., Shazeer, N., Kulshreshtha, A., Cheng, H.-T., Jin, A., Bos, T., Baker, L., Du, Y., Li, Y., Lee, H., Zheng, H.S., Ghafouri, A., Menegali, M., Huang, Y., Krikun, M., Lepikhin, D., Qin, J., Chen, D., Xu, Y., Chen, Z., Roberts, A., Bosma, M., Zhao, V., Zhou, Y., Chang, C.-C., Krivokon, I., Rusch, W., Pickett, M., Srinivasan, P., Man, L., Meier-Hellstern, K., Morris, M.R., Doshi, T., Santos, R.D., Duke, T., Soraker, J., Zevenbergen, B., Prabhakaran, V., Diaz, M., Hutchinson, B., Olson, K., Molina, A., Hoffman-John, E., Lee, J., Aroyo, L., Rajakumar, R., Butryna, A., Lamm, M., Kuzmina, V., Fenton, J., Cohen, A., Bernstein, R., Kurzweil, R., Aguera-Arcas, B., Cui, C., Croak, M., Chi, E., Le, Q., 2022. LaMDA: Language Models for Dialog Applications.
Wan, X., Sun, R., Dai, H., Arik, S., Pfister, T., 2023a. Better Zero-Shot Reasoning with Self-Adaptive Prompting, in: Findings of the Association for Computational Linguistics: ACL 2023. Presented at the Findings 2023, Association for Computational Linguistics, Toronto, Canada, pp. 3493–3514. https://doi.org/10.18653/v1/2023.findings-acl.216
Wan, X., Sun, R., Nakhost, H., Dai, H., Eisenschlos, J.M., Arik, S.O., Pfister, T., 2023b. Universal Self-adaptive Prompting. https://doi.org/10.48550/arXiv.2305.14926
Wang, Yizhong, Ivison, H., Dasigi, P., Hessel, J., Khot, T., Chandu, K.R., Wadden, D., MacMillan, K., Smith, N.A., Beltagy, I., Hajishirzi, H., 2023a. How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. https://doi.org/10.48550/arXiv.2306.04751
Wang, Yizhong, Kordi, Y., Mishra, S., Liu, A., Smith, N.A., Khashabi, D., Hajishirzi, H., 2023b. Self-Instruct: Aligning Language Models with Self-Generated Instructions, in: Rogers, A., Boyd-Graber, J., Okazaki, N. (Eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Presented at the ACL 2023, Association for Computational Linguistics, Toronto, Canada, pp. 13484–13508. https://doi.org/10.18653/v1/2023.acl-long.754
Wang, Y., Mishra, Swaroop, Alipoormolabashi, P., Kordi, Y., Mirzaei, A., Naik, A., Ashok, A., Dhanasekaran, A.S., Arunkumar, A., Stap, D., Pathak, E., Karamanolakis, G., Lai, H., Purohit, I., Mondal, I., Anderson, J., Kuznia, K., Doshi, K., Pal, K.K., Patel, M., Moradshahi, M., Parmar, M., Purohit, M., Varshney, N., Kaza, P.R., Verma, P., Puri, R.S., Karia, R., Doshi, S., Sampat, S.K., Mishra, Siddhartha, Reddy A, S., Patro, S., Dixit, T., Shen, X., 2022. Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks, in: Goldberg, Y., Kozareva, Z., Zhang, Y. (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2022, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, pp. 5085–5109. https://doi.org/10.18653/v1/2022.emnlp-main.340
Wang, Yufei, Zhong, W., Li, L., Mi, F., Zeng, X., Huang, W., Shang, L., Jiang, X., Liu, Q., 2023. Aligning Large Language Models with Human: A Survey.
Wei, J., Bosma, M., Zhao, V.Y., Guu, K., Yu, A.W., Lester, B., Du, N., Dai, A.M., Le, Q.V., 2022. Finetuned Language Models Are Zero-Shot Learners.
Weller, O., Lourie, N., Gardner, M., Peters, M.E., 2020. Learning from Task Descriptions, in: Webber, B., Cohn, T., He, Y., Liu, Y. (Eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Presented at the EMNLP 2020, Association for Computational Linguistics, Online, pp. 1361–1375. https://doi.org/10.18653/v1/2020.emnlp-main.105
Wen, Y., Jain, N., Kirchenbauer, J., Goldblum, M., Geiping, J., Goldstein, T., 2023. Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery. https://doi.org/10.48550/arXiv.2302.03668
Wu, Z., Wang, S., Gu, J., Hou, R., Dong, Y., Vydiswaran, V.G.V., Ma, H., 2022. IDPG: An Instance-Dependent Prompt Generation Method, in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Presented at the NAACL-HLT 2022, Association for Computational Linguistics, Seattle, United States, pp. 5507–5521. https://doi.org/10.18653/v1/2022.naacl-main.403
Xu, C., Sun, Q., Zheng, K., Geng, X., Zhao, P., Feng, J., Tao, C., Jiang, D., 2023. WizardLM: Empowering Large Language Models to Follow Complex Instructions. https://doi.org/10.48550/arXiv.2304.12244
Xu, H., Chen, Y., Du, Y., Shao, N., Yanggang, W., Li, H., Yang, Z., 2022. GPS: Genetic Prompt Search for Efficient Few-Shot Learning, in: Goldberg, Y., Kozareva, Z., Zhang, Y. (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2022, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, pp. 8162–8171. https://doi.org/10.18653/v1/2022.emnlp-main.559
Yang, C., Wang, X., Lu, Y., Liu, H., Le, Q.V., Zhou, D., Chen, X., 2023. Large Language Models as Optimizers.
Zhang, Shengyu, Dong, L., Li, X., Zhang, Sen, Sun, X., Wang, S., Li, J., Hu, R., Zhang, T., Wu, F., Wang, G., 2023. Instruction Tuning for Large Language Models: A Survey.
Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., Artzi, Y., 2020. BERTScore: Evaluating Text Generation with BERT. https://doi.org/10.48550/arXiv.1904.09675
Zhang, T., Wang, X., Zhou, D., Schuurmans, D., Gonzalez, J.E., 2022. TEMPERA: Test-Time Prompting via Reinforcement Learning. https://doi.org/10.48550/arXiv.2211.11890
Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., Levy, O., 2023. LIMA: Less Is More for Alignment.
Zhou, J., Bhat, S., 2021. Paraphrase Generation: A Survey of the State of the Art, in: Moens, M.-F., Huang, X., Specia, L., Yih, S.W. (Eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Presented at the EMNLP 2021, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 5075–5086. https://doi.org/10.18653/v1/2021.emnlp-main.414
Zhou, Y., Muresanu, A.I., Han, Z., Paster, K., Pitis, S., Chan, H., Ba, J., 2023. Large Language Models Are Human-Level Prompt Engineers. https://doi.org/10.48550/arXiv.2211.01910

Advisor

柯士文

Approval date

2024-7-26
