Instruction Tuning for Large Language Models: A Survey
This repository contains resources referenced in the paper Instruction Tuning for Large Language Models: A Survey.
If you find this repository helpful, please cite the following:
```bibtex
@article{zhang2023instruction,
  title={Instruction Tuning for Large Language Models: A Survey},
  author={Zhang, Shengyu and Dong, Linfeng and Li, Xiaoya and Zhang, Sen and Sun, Xiaofei and Wang, Shuhe and Li, Jiwei and Hu, Runyi and Zhang, Tianwei and Wu, Fei and others},
  journal={arXiv preprint arXiv:2308.10792},
  year={2023}
}
```
🥳 News
Stay tuned! More related work will be added!
- [12 Mar, 2024] We updated work (papers and projects) related to large multimodal models.
- [11 Mar, 2024] We updated work (papers and projects) related to synthetic data generation and image-text generation.
- [07 Sep, 2023] The repository was created.
- [21 Aug, 2023] We released the first version of the paper.
Table of Contents
- Overview
- Instruction Tuning
- Multi-modality Instruction Tuning
- Domain-specific Instruction Tuning
- Efficient Tuning Techniques
- References
- Contact
Overview
Instruction tuning (IT) refers to the process of further training large language models (LLMs) on a dataset consisting of (instruction, output) pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions. The general pipeline of instruction tuning is shown in the following figure:
In the paper, we make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains, and use cases, along with an analysis of the aspects that influence the outcome of IT (e.g., the generation of instruction outputs and the size of the instruction dataset). We also review the potential pitfalls of IT and the criticism against it, point out current deficiencies of existing strategies, and suggest some avenues for fruitful research. The typology of the paper is as follows:
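To make the objective concrete, below is a minimal sketch of one supervised step on a single (instruction, output) pair. It assumes the Hugging Face `transformers` library, uses GPT-2 purely as a small stand-in model, and the prompt template is a hypothetical example, not one prescribed by the survey:

```python
# Minimal sketch of the supervised instruction-tuning objective described above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

instruction = "Translate to French: Hello, world!"  # hypothetical example pair
output = "Bonjour, le monde !"

# Concatenate instruction and output into one training sequence.
prompt_ids = tokenizer(instruction + "\n", return_tensors="pt").input_ids
output_ids = tokenizer(output + tokenizer.eos_token, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, output_ids], dim=1)

# Standard next-word prediction loss, but with the instruction tokens masked
# (label -100 is ignored), so gradients come only from the desired output.
labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # one fine-tuning step (optimizer update omitted for brevity)
```

Masking the instruction tokens is a common choice rather than a universal one; some recipes train on the full sequence instead.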
Instruction Tuning
Datasets
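Most of the open-source corpora below share a simple (instruction, optional input, output) schema. As a hedged sketch (assuming the Hugging Face `datasets` library, and that Dolly [10] is still hosted under the hub id `databricks/databricks-dolly-15k`), one such dataset can be inspected like this:

```python
# Sketch: loading one of the open-source instruction datasets listed below.
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
print(len(dolly))                 # ~15K human-crafted examples
print(dolly[0]["instruction"])    # the instruction text
print(dolly[0]["response"])       # the target output
```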
| Type | Dataset Name | Paper | Project | # of Instructions | # of Langs | Construction | Open Source |
|---|---|---|---|---|---|---|---|
| Human-Crafted | UnifiedQA [1] | paper | project | 750K | En | human-crafted | Yes |
| | UnifiedSKG [2] | paper | project | 0.8M | En | human-crafted | Yes |
| | Natural Instructions [3] | paper | project | 193K | En | human-crafted | Yes |
| | Super-Natural Instructions [4] | paper | project | 5M | 55 Langs | human-crafted | Yes |
| | P3 [5] | paper | project | 12M | En | human-crafted | Yes |
| | xP3 [6] | paper | project | 81M | 46 Langs | human-crafted | Yes |
| | Flan 2021 [7] | paper | project | 4.4M | En | human-crafted | Yes |
| | COIG [8] | paper | project | - | - | - | Yes |
| | InstructGPT [9] | paper | - | 13K | Multi | human-crafted | No |
| | Dolly [10] | paper | project | 15K | En | human-crafted | Yes |
| | LIMA [11] | paper | project | 1K | En | human-crafted | Yes |
| | ChatGPT [12] | paper | - | - | Multi | human-crafted | No |
| | OpenAssistant [13] | paper | project | 161,443 | Multi | human-crafted | Yes |
| Synthetic Data (Distillation) | OIG [14] | - | project | 43M | En | ChatGPT (no technique report) | Yes |
| | Unnatural Instructions [3] | paper | project | 240K | En | InstructGPT-generated | Yes |
| | InstructWild [15] | - | project | 104K | - | ChatGPT-generated | Yes |
| | Evol-Instruct / WizardLM [16] | paper | project | 52K | En | ChatGPT-generated | Yes |
| | Alpaca [17] | - | project | 52K | En | InstructGPT-generated | Yes |
| | LogiCoT [18] | paper | project | - | En | GPT-4-generated | Yes |
| | GPT-4-LLM [19] | paper | project | 52K | En&Zh | GPT-4-generated | Yes |
| | Vicuna [20] | - | project | 70K | En | Real user-ChatGPT conversations | No |
| | Baize v1 [21] | paper | project | 111.5K | En | ChatGPT-generated | Yes |
| | UltraChat [