This repository provides a unified platform for instruction fine-tuning (IFT) of large language models (LLMs), focusing on instruction collection, parameter-efficient methods, and multi-LLM integration. It aims to lower the barrier for NLP researchers to experiment with and deploy LLMs, particularly for enhancing Chain-of-Thought (CoT) reasoning and Chinese instruction following.
How It Works
The platform unifies various LLMs (LLaMA, ChatGLM, Bloom, MOSS, InternLM) and parameter-efficient fine-tuning (PEFT) techniques (LoRA, P-tuning, AdaLoRA, Prefix Tuning) under a single interface. It leverages a comprehensive collection of instruction-tuning datasets, including English, Chinese, and CoT data, to improve model capabilities. The core advantage lies in its modular design, allowing researchers to easily mix and match LLMs, PEFT methods, and datasets for systematic empirical studies.
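For a sense of how this mix-and-match design works in practice, here is a minimal sketch using the standard Hugging Face `transformers` and `peft` APIs; the checkpoint name and hyperparameters are illustrative, not the repository's defaults:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Any supported backbone can be substituted here (LLaMA, Bloom, ChatGLM, ...).
base = "decapoda-research/llama-7b-hf"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Swapping this config (e.g., for AdaLoraConfig or PrefixTuningConfig)
# changes the PEFT method without touching the rest of the pipeline.
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only adapter weights require gradients
```

Because each PEFT method is expressed as a config object, comparing methods across backbones largely reduces to swapping one config and one model name, which is what makes systematic empirical studies practical.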
Quick Start & Requirements
- Install: `pip install -r requirements.txt` (ensure Python >= 3.9 for ChatGLM). For PEFT methods other than LoRA, install the project's bundled fork: `pip install -e ./peft`.
- Prerequisites: access to LLM weights (e.g., from Hugging Face); multiple GPUs may be needed for larger models or distributed training.
- Setup: download the datasets and model weights before training (a data-preparation sketch follows this list). Training can be resource-intensive; the repository's examples fine-tune LLaMA-7B on a single 80GB A100.
- Links: Official Quick Start, Data Collection, Empirical Study
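As a rough illustration of the data-preparation step mentioned above, the following sketch loads a local instruction file with the `datasets` library and renders it into an Alpaca-style prompt; the file path and template are placeholders rather than the repository's exact format:

```python
from datasets import load_dataset

# Placeholder path; the collection ships many instruction files in varying formats.
ds = load_dataset("json", data_files="data/alpaca_data.json")["train"]

def to_prompt(example):
    # Common Alpaca-style template: instruction, optional input, response.
    return {
        "text": (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example.get('input', '')}\n\n"
            f"### Response:\n{example['output']}"
        )
    }

ds = ds.map(to_prompt)
```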
Highlighted Details
- Extensive collection of instruction-tuning datasets (over 78M samples across various languages and tasks).
- Unified interface for multiple LLMs (LLaMA, ChatGLM, Bloom, MOSS, InternLM) and PEFT methods (LoRA, P-tuning, AdaLoRA, etc.).
- Significant improvements in CoT reasoning and Chinese instruction following demonstrated through empirical studies.
- Supports 4-bit quantization for QLoRA-style PEFT (see the 4-bit loading sketch after this list).
- Includes code for parameter merging (see the merging sketch after this list), local chatting, batch prediction, and web service building.
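A hedged sketch of the 4-bit loading path, using the stock `transformers`/`bitsandbytes`/`peft` APIs rather than any repo-specific wrapper (exact behavior depends on library versions):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

# NF4 with double quantization, as described in the QLoRA paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # illustrative checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # cast norms, enable checkpointing
model = get_peft_model(
    model,
    LoraConfig(task_type=TaskType.CAUSAL_LM, r=16, lora_alpha=32,
               target_modules=["q_proj", "v_proj"]),
)
```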
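And a sketch of parameter merging, which folds trained LoRA deltas back into the base weights so the model can be served without `peft` at inference time; the paths are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
model = model.merge_and_unload()  # W <- W + B @ A * (alpha / r)
model.save_pretrained("path/to/merged-model")
```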
Maintenance & Community
- Actively developed: support for LLMs such as InternLM and PEFT methods such as QLoRA was merged recently.
- Welcomes community contributions (PRs).
- WeChat group available for communication (contact author for invite).
Licensing & Compatibility
- The repository itself appears to be permissively licensed, but it integrates models and datasets that carry their own licenses. Users must ensure compatibility with the licenses of the underlying LLMs (e.g., LLaMA's license) and datasets.
Limitations & Caveats
- Some PEFT methods (e.g., P-tuning, prompt-tuning) showed lower performance in empirical studies compared to LoRA.
- Fine-tuning certain models (e.g., ChatGLM) may require smaller batch sizes due to `load_in_8bit` incompatibility.
- Performance can vary significantly based on the choice of LLM base, PEFT method, and instruction dataset, with some combinations showing performance drops.