open-instruct by allenai

Training codebase for instruction-following language models

created 2 years ago
3,083 stars

Top 15.9% on sourcepulse

Project Summary

This repository provides a comprehensive codebase for instruction-tuning and post-training large language models, targeting researchers and developers aiming to replicate and advance open-source LLM capabilities. It offers unified tools for supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR), enabling the creation of instruction-following models.
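To make the RLVR idea concrete, here is a minimal sketch of a binary verifiable reward: the policy earns reward only when its answer can be checked against ground truth. The extract_answer helper is a hypothetical stand-in (assuming numeric, math-style answers), not the repository's actual parsing logic.

```python
import re

def extract_answer(completion: str) -> str:
    # Hypothetical parser: take the last number in the completion.
    # The real codebase uses task-specific extraction logic.
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else ""

def verifiable_reward(completion: str, ground_truth: str) -> float:
    # RLVR-style binary reward: 1.0 only when the extracted answer
    # exactly matches the verifiable ground truth, else 0.0.
    return 1.0 if extract_answer(completion) == ground_truth else 0.0
```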

How It Works

The project implements state-of-the-art LLM post-training techniques, including SFT, DPO, and RLVR, within a unified framework. It leverages Hugging Face's transformers library and adapts code from established RLHF and DPO implementations. The codebase supports distributed training and integrates with libraries like FlashAttention-2 for performance, facilitating efficient experimentation with various instruction datasets and model architectures.
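For readers new to these objectives, the sketch below shows the DPO loss in PyTorch. It assumes per-response summed log-probabilities have already been computed under the policy and a frozen reference model, and it mirrors the published DPO objective rather than this repository's exact implementation.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards: beta-scaled log-ratio of policy to reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```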

Quick Start & Requirements

  • Installation: install locally with pip install -r requirements.txt and pip install -e ., then download the tokenizer data with python -m nltk.downloader punkt.
  • Prerequisites: PyTorch 2.5.1 with CUDA 12.1, FlashAttention-2 (v2.7.2.post1), packaging, and setuptools<70.0.0. Docker installation is also supported.
  • Resources: Requires significant GPU resources for training, with example scripts for 1-GPU and 8-GPU setups.
  • Links: TÜLU 3 README, Tulu 1 & 2 READMEs, Models on HuggingFace (a minimal loading sketch follows this list).
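As a quick sanity check after installation, a released checkpoint can be loaded with standard transformers calls. The model ID below is assumed to be one of the Tülu 3 releases; confirm the exact name on the HuggingFace page linked above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID assumed from the Tulu 3 release; verify on HuggingFace.
model_id = "allenai/Llama-3.1-Tulu-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Explain direct preference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```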

Highlighted Details

  • Supports fine-tuning and evaluation of Llama 3.1 and OLMo 2 models.
  • Implements DPO and RLVR for preference learning and reward-based training.
  • Includes scripts for measuring dataset contamination.
  • Offers support for LoRA and QLoRA fine-tuning (see the configuration sketch after this list).
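The LoRA/QLoRA support noted above can be illustrated with the peft library. This is a generic configuration sketch, not the repository's training entry point; the base model ID, rank, alpha, and target modules are illustrative assumptions.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base model and hyperparameters are illustrative, not the repo's defaults.
base = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B")
lora_cfg = LoraConfig(
    r=16,                      # low-rank adapter dimension
    lora_alpha=32,             # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapters are trainable
```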

Maintenance & Community

The project is actively maintained by AllenAI, with recent updates in November 2024. It is associated with multiple research papers detailing its methodologies and results.

Licensing & Compatibility

The codebase is licensed under Apache 2.0. Released models have varying licenses: V1 models follow base model licenses and a custom tulu_license.txt, while V2 models use the AI2 ImpACT license. Compatibility for commercial use depends on the specific model's license.

Limitations & Caveats

The repository is a research codebase and does not guarantee backward compatibility. Its evaluation scripts are unmaintained; the maintainers recommend using OLMES instead.

Health Check

  • Last commit: 13 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 88
  • Issues (30d): 8
  • Star History: 155 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (author of the Machine Learning Engineering Open Book; Research Engineer at Snowflake).

HALOs by ContextualAI

Library for aligning LLMs using human-aware loss functions

Top 0.2% on sourcepulse
873 stars
created 1 year ago
updated 2 weeks ago