Alpaca-CoT by PhoebusSi

IFT platform for instruction collection, parameter-efficient methods, and LLMs

Created 2 years ago
2,769 stars

Top 17.2% on SourcePulse

View on GitHub
Project Summary

This repository provides a unified platform for instruction fine-tuning (IFT) of large language models (LLMs), focusing on instruction collection, parameter-efficient methods, and multi-LLM integration. It aims to lower the barrier for NLP researchers to experiment with and deploy LLMs, particularly for enhancing Chain-of-Thought (CoT) reasoning and Chinese instruction following.

How It Works

The platform unifies various LLMs (LLaMA, ChatGLM, Bloom, MOSS, InternLM) and parameter-efficient fine-tuning (PEFT) techniques (LoRA, P-tuning, AdaLoRA, Prefix Tuning) under a single interface. It leverages a comprehensive collection of instruction-tuning datasets, including English, Chinese, and CoT data, to improve model capabilities. The core advantage lies in its modular design, allowing researchers to easily mix and match LLMs, PEFT methods, and datasets for systematic empirical studies.
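To illustrate why LoRA-style PEFT keeps fine-tuning cheap, here is a minimal pure-Python sketch of the LoRA idea (the names and toy dimensions are hypothetical, not taken from this repository; the repository itself builds on the `peft` library): the frozen pretrained weight W is augmented with a low-rank update B·A scaled by alpha/r, and only the small matrices A and B are trained.

```python
# Minimal LoRA forward pass: y = W x + (alpha / r) * B (A x).
# Pure-Python sketch with toy dimensions; real implementations
# (e.g. the peft library) do this with GPU tensors.

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=16, r=2):
    base = matvec(W, x)                  # frozen pretrained path
    update = matvec(B, matvec(A, x))     # low-rank trainable path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Toy example: 3x3 frozen weight, rank-2 adapters.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
A = [[0.1, 0.2, 0.3],     # r x d_in  (trainable)
     [0.0, 0.1, 0.0]]
B = [[0.0, 0.0],          # d_out x r (initialized to zero, so
     [0.0, 0.0],          #  training starts from the base model)
     [0.0, 0.0]]
x = [1.0, 2.0, 3.0]

# With B = 0 the adapted layer reproduces the frozen model exactly.
print(lora_forward(W, A, B, x))  # → [1.0, 2.0, 3.0]
```

Zero-initializing B is the standard LoRA trick: at step 0 the adapted model is identical to the pretrained one, and fine-tuning only ever stores the tiny A and B matrices per layer.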

Quick Start & Requirements

  • Install: pip install -r requirements.txt (ensure Python >= 3.9 for ChatGLM). For PEFT methods other than LoRA, install from the project: pip install -e ./peft.
  • Prerequisites: Access to LLM weights (e.g., from Hugging Face), potentially multiple GPUs for larger models or multi-GPU training.
  • Setup: Requires downloading datasets and model weights. Training can be resource-intensive; the provided examples fine-tune LLaMA-7B on a single 80 GB A100.
  • Links: Official Quick Start, Data Collection, Empirical Study

Highlighted Details

  • Extensive collection of instruction-tuning datasets (over 78M samples across various languages and tasks).
  • Unified interface for multiple LLMs (LLaMA, ChatGLM, Bloom, MOSS, InternLM) and PEFT methods (LoRA, P-tuning, AdaLoRA, etc.).
  • Significant improvements in CoT reasoning and Chinese instruction following demonstrated through empirical studies.
  • Supports 4-bit quantization for PEFT methods like QLoRA.
  • Includes code for parameter merging, local chatting, batch prediction, and web service building.
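The 4-bit support mentioned above follows the QLoRA recipe: quantize the frozen base weights and train LoRA adapters in higher precision. Below is a hedged pure-Python sketch of the core round-trip (absmax quantization to signed 4-bit integers and back). It is an illustration only, not the repository's implementation; QLoRA proper uses the NF4 data type via bitsandbytes.

```python
# Absmax 4-bit quantization sketch: map floats to signed ints in
# [-7, 7], keep one float scale, dequantize on use. QLoRA proper
# uses the NF4 data type and double quantization; this simplified
# version only shows why 4-bit storage preserves rough magnitudes.

def quantize_4bit(weights):
    # Scale so the largest-magnitude weight maps to +/-7;
    # fall back to 1.0 for an all-zero block.
    scale = max(abs(w) for w in weights) / 7.0 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    return [qi * scale for qi in q]

w = [0.42, -1.3, 0.07, 0.9]
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)

print(q)      # each value fits in 4 bits (range -7..7)
print(w_hat)  # approximate reconstruction of w
```

The reconstruction error is bounded by half a quantization step, which is why frozen 4-bit base weights plus full-precision LoRA adapters can recover fine-tuning quality at a fraction of the memory.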

Maintenance & Community

  • Active development with recent merges of LLMs like InternLM and PEFT methods like QLoRA.
  • Welcomes community contributions (PRs).
  • WeChat group available for communication (contact author for invite).

Licensing & Compatibility

  • The repository itself appears to be under a permissive license, but it relies on and integrates models and datasets with their own licenses. Users must ensure compatibility with the licenses of the underlying LLMs (e.g., LLaMA's license) and datasets.

Limitations & Caveats

  • Some PEFT methods (e.g., P-tuning, prompt-tuning) showed lower performance in empirical studies compared to LoRA.
  • Fine-tuning certain models (e.g., ChatGLM) may require smaller batch sizes because they are incompatible with load_in_8bit.
  • Performance can vary significantly based on the choice of LLM base, PEFT method, and instruction dataset, with some combinations showing performance drops.
Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Casper Hansen (Author of AutoAWQ), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 5 more.

xtuner by InternLM

Top 0.5% on SourcePulse · 5k stars
LLM fine-tuning toolkit for research
Created 2 years ago · Updated 1 day ago
Starred by Vincent Weisser (Cofounder of Prime Intellect), Ross Taylor (Cofounder of General Reasoning; Cocreator of Papers with Code), and 11 more.

open-instruct by allenai

Top 0.7% on SourcePulse · 3k stars
Training codebase for instruction-following language models
Created 2 years ago · Updated 15 hours ago
Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Vincent Weisser (Cofounder of Prime Intellect), and 25 more.

alpaca-lora by tloen

Top 0.0% on SourcePulse · 19k stars
LoRA fine-tuning for LLaMA
Created 2 years ago · Updated 1 year ago