Self-Instruct: Aligning Language Models with Self-Generated Instructions
Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith + 2 more
TLDR
Self-Instruct is a method that improves language models' ability to follow instructions by having the model generate and filter its own instruction data, significantly enhancing performance with almost no reliance on human-annotated datasets.
Key contributions
- Introduces a self-bootstrapping pipeline where a language model generates instructions, inputs, and outputs to create training data for fine-tuning itself.
- Achieves a 33% absolute improvement on Super-NaturalInstructions using only synthetic data, on par with InstructGPT-001, which was trained with private user data and human annotations.
- Demonstrates superior results over existing public instruction datasets on novel, expert-written tasks through human evaluation, leaving only a 5% absolute gap behind InstructGPT-001.
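The bootstrapping loop in the first contribution can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generate_from_model` is a hypothetical stand-in for a call to the language model, and the novelty filter here uses simple word overlap rather than the ROUGE-L similarity the paper uses.

```python
import random

def word_overlap(a: str, b: str) -> float:
    """Jaccard word overlap -- a cheap stand-in for ROUGE-L similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, len(wa | wb))

def self_instruct(seed_tasks, generate_from_model, rounds=3, sim_threshold=0.7):
    """Grow a task pool by bootstrapping off the model's own generations.

    generate_from_model(prompt_tasks) should return candidate instruction
    strings; candidates too similar to anything already in the pool are
    filtered out, and the surviving pool becomes fine-tuning data.
    """
    pool = list(seed_tasks)
    for _ in range(rounds):
        # Sample existing tasks as in-context examples for generation.
        prompt_tasks = random.sample(pool, min(8, len(pool)))
        for cand in generate_from_model(prompt_tasks):
            # Keep only instructions that are sufficiently novel.
            if all(word_overlap(cand, t) < sim_threshold for t in pool):
                pool.append(cand)
    return pool
```

In the paper, the resulting pool (instructions plus generated input–output pairs) is then used to fine-tune the same model that produced it.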
Why it matters
This paper matters because it presents a scalable, low-cost approach to instruction tuning that reduces dependence on expensive human-labeled data, enabling broader and more diverse task generalization for large language models. By leveraging a model's own generative capabilities to create training data, Self-Instruct advances the field toward more autonomous and efficient alignment of language models with user instructions.
Original Abstract
Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model. Applying our method to the vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on Super-NaturalInstructions, on par with the performance of InstructGPT-001, which was trained with private user data and human annotations. For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin, leaving only a 5% absolute gap behind InstructGPT-001. Self-Instruct provides an almost annotation-free method for aligning pre-trained language models with instructions, and we release our large synthetic dataset to facilitate future studies on instruction tuning. Our code and data are available at https://github.com/yizhongw/self-instruct.