open-instruct by allenai

Training codebase for instruction-following language models

Created 2 years ago
3,194 stars

Top 15.0% on SourcePulse

View on GitHub

Project Summary

This repository provides a comprehensive codebase for instruction-tuning and post-training large language models, targeting researchers and developers aiming to replicate and advance open-source LLM capabilities. It offers unified tools for supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR), enabling the creation of instruction-following models.
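
To make the preference-learning piece concrete, here is a minimal sketch of the standard DPO loss in plain PyTorch. It assumes per-example sequence log-probabilities have already been computed, and it is illustrative only, not open-instruct's actual implementation.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Margin by which the policy prefers the chosen response over the rejected one.
    policy_logratios = policy_chosen_logps - policy_rejected_logps
    # The same margin under the frozen reference model.
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    # Standard DPO objective: -log sigmoid(beta * (policy margin - reference margin)).
    return -F.logsigmoid(beta * (policy_logratios - ref_logratios)).mean()
```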

How It Works

The project implements state-of-the-art LLM post-training techniques, including SFT, DPO, and RLVR, within a unified framework. It leverages Hugging Face's transformers library and adapts code from established RLHF and DPO implementations. The codebase supports distributed training and integrates with libraries like FlashAttention-2 for performance, facilitating efficient experimentation with various instruction datasets and model architectures.
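
To illustrate the transformers-based workflow described above, the following is a minimal supervised fine-tuning sketch using the Hugging Face Trainer. The model name, toy dataset, and hyperparameters are placeholders, not the repo's actual recipe.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "gpt2"  # placeholder; swap in e.g. a Llama 3.1 or OLMo 2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy instruction-following example; real runs use large SFT mixtures.
examples = Dataset.from_dict({
    "text": ["<|user|>\nWhat is the capital of France?\n<|assistant|>\nParis."]
})
tokenized = examples.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out",
                           num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```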

Quick Start & Requirements

  • Installation: Local installation via pip install -r requirements.txt, pip install -e ., and python -m nltk.downloader punkt (collected in the snippet after this list).
  • Prerequisites: PyTorch 2.5.1 with CUDA 12.1, FlashAttention-2 (v2.7.2.post1), packaging, and setuptools<70.0.0. Docker installation is also supported.
  • Resources: Requires significant GPU resources for training, with example scripts for 1-GPU and 8-GPU setups.
  • Links: TÜLU 3 README, Tulu 1 & 2 READMEs, Models on Hugging Face
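
For convenience, here are the local installation commands from the list above gathered into one snippet; a CUDA-capable environment with the pinned PyTorch and FlashAttention versions is assumed.

```bash
# Install Python dependencies, the package itself, and NLTK tokenizer data.
pip install -r requirements.txt
pip install -e .
python -m nltk.downloader punkt
```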

Highlighted Details

  • Supports fine-tuning and evaluation of Llama 3.1 and OLMo 2 models.
  • Implements DPO and RLVR for preference learning and reward-based training.
  • Includes scripts for measuring dataset contamination.
  • Offers support for LoRA and QLoRA fine-tuning (see the sketch after this list).
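
As a sketch of what LoRA support typically looks like, here is a minimal adapter setup using Hugging Face's peft library; the base model and hyperparameters are hypothetical placeholders, not open-instruct's actual configuration.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

lora_config = LoraConfig(
    r=16,                       # rank of the low-rank update matrices
    lora_alpha=32,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection; model-specific
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints trainable vs. total parameter counts
```

QLoRA follows the same pattern, with the base model loaded in 4-bit precision (e.g. via bitsandbytes) before the adapter is attached.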

Maintenance & Community

The project is actively maintained by AllenAI, with recent updates in November 2024. It is associated with multiple research papers detailing its methodologies and results.

Licensing & Compatibility

The codebase is licensed under Apache 2.0. Released models have varying licenses: V1 models follow base model licenses and a custom tulu_license.txt, while V2 models use the AI2 ImpACT license. Compatibility for commercial use depends on the specific model's license.

Limitations & Caveats

The repository is a research codebase and does not guarantee backward compatibility. The evaluation scripts are no longer maintained; the maintainers recommend using OLMES instead.

Health Check

  • Last Commit: 15 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 98
  • Issues (30d): 3
  • Star History: 80 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Pawel Garbacki (Cofounder of Fireworks AI), and 4 more.

alpaca_farm by tatsu-lab

0.1%
826 stars
RLHF simulation framework for accessible instruction-following/alignment research
Created 2 years ago
Updated 1 year ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Junyang Lin (Core Maintainer at Alibaba Qwen), and 3 more.

Alpaca-CoT by PhoebusSi

0.1%
3k stars
IFT platform for instruction collection, parameter-efficient methods, and LLMs
Created 2 years ago
Updated 1 year ago