dc.description.abstract | Generative AI services such as ChatGPT, Gemini, and Copilot have gained significant attention for their ability to follow human instructions and assist with real-world tasks. The core mechanism behind their effectiveness is instruction tuning — supervised fine-tuning (SFT) on paired datasets of human instructions and responses. Although instruction-tuned large language models (LLMs) can follow human instructions, studies show that they remain sensitive to perturbations in discrete text, which can cause unpredictable, uncontrollable generation behavior and may degrade performance. Given the emergence of general-purpose generative AI services, a natural question arises: can human instructions be optimized to align with the preferences of instruction-tuned LLMs, yielding stable, controllable, and high-quality response generation while also relieving users of the burden of crafting precise instructions?
The idea of enhancing LLM performance by optimizing discrete text to cater to LLMs' preferences has already proven effective in discrete prompt engineering, which improves the performance of LLMs on traditional NLP tasks by searching for optimal discrete templates or texts. However, unlike inputs for traditional NLP tasks, human instructions are user-friendly, highly variable, and drawn from real-world interactions, making it impractical to apply previous discrete prompt methods to human instructions directly.
In our experiments, we demonstrate that our proposed method enhances the response quality of instruction-tuned LLMs simply by rephrasing human instructions, and that this enhancement becomes more pronounced with a richer variety of training data. Additionally, we observe that the same optimization approach generalizes across instruction-tuned LLMs that share the same backbone, whereas instruction-tuned LLMs with different backbones may prefer different discrete text. Our method demonstrates the feasibility of improving instruction-tuned LLMs at the discrete level and in a black-box scenario, while preserving the semantic consistency and explainability of human instructions. | en_US |