trl by huggingface

Library for transformer RL

Created 5 years ago
15,565 stars

Top 3.2% on SourcePulse

Project Summary

TRL (Transformer Reinforcement Learning) is a Python library for post-training foundation models using advanced techniques like Supervised Fine-Tuning (SFT), Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO). It targets researchers and engineers working with large language models, offering efficient scaling and integration with the Hugging Face ecosystem.
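
As a rough illustration of the preference-optimization workflow named above, the sketch below runs DPO on a pairwise-preference dataset. The model and dataset names are placeholders, and exact argument names may differ between TRL versions.

    from datasets import load_dataset
    from trl import DPOConfig, DPOTrainer

    # DPO expects pairwise-preference data: a prompt with a "chosen" and a
    # "rejected" response. The dataset name below is only a placeholder.
    dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

    trainer = DPOTrainer(
        model="Qwen/Qwen2.5-0.5B-Instruct",     # placeholder model id
        args=DPOConfig(output_dir="qwen-dpo"),  # standard training arguments
        train_dataset=dataset,
    )
    trainer.train()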

How It Works

TRL provides specialized trainer classes (SFTTrainer, GRPOTrainer, DPOTrainer, RewardTrainer) that wrap the 🤗 Transformers Trainer. This design allows seamless integration with distributed training (DDP, DeepSpeed ZeRO, FSDP) and efficient fine-tuning of large models on modest hardware via 🤗 PEFT (LoRA/QLoRA) and Unsloth's optimized kernels.
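
A minimal sketch of the trainer API, assuming placeholder model and dataset names (Qwen/Qwen2.5-0.5B, trl-lib/Capybara); argument names may vary slightly across TRL versions.

    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    # Any instruction/chat dataset in a supported format works;
    # the names below are illustrative only.
    dataset = load_dataset("trl-lib/Capybara", split="train")

    trainer = SFTTrainer(
        model="Qwen/Qwen2.5-0.5B",              # model id or preloaded model
        args=SFTConfig(output_dir="qwen-sft"),  # training configuration
        train_dataset=dataset,
    )
    trainer.train()  # runs the underlying 🤗 Transformers training loop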

Quick Start & Requirements

  • Install: pip install trl
  • Prerequisites: Python, PyTorch, 🤗 Transformers, and 🤗 Datasets; a GPU is recommended for practical use.
  • Docs: https://huggingface.co/docs/trl/index
  • CLI: trl sft --model_name_or_path ... or trl dpo --model_name_or_path ...
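
For instance, a single-GPU SFT run from the command line might look like the sketch below; the model, dataset, and output names are placeholders, and available flags depend on the installed TRL version.

    trl sft \
      --model_name_or_path Qwen/Qwen2.5-0.5B \
      --dataset_name trl-lib/Capybara \
      --output_dir qwen-sft-cli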

Highlighted Details

  • Supports SFT, PPO, GRPO, DPO, and Reward modeling.
  • Integrates with 🤗 PEFT for parameter-efficient fine-tuning (see the LoRA sketch after this list).
  • Leverages 🤗 Accelerate for distributed training.
  • Includes Unsloth integration for accelerated training.
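
A minimal sketch of the PEFT integration, assuming SFTTrainer's peft_config argument and illustrative LoRA hyperparameters; the model and dataset names are placeholders.

    from datasets import load_dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset("trl-lib/Capybara", split="train")

    # Illustrative LoRA settings; tune rank/alpha for your model and budget.
    peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

    trainer = SFTTrainer(
        model="Qwen/Qwen2.5-0.5B",
        args=SFTConfig(output_dir="qwen-sft-lora"),
        train_dataset=dataset,
        peft_config=peft_config,  # only the LoRA adapter weights are trained
    )
    trainer.train()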

Maintenance & Community

  • Developed by Hugging Face.
  • Active development and community support.

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: Permissive license allows commercial use and integration with closed-source projects.

Limitations & Caveats

The library focuses on transformer-based models and requires familiarity with the Hugging Face ecosystem for advanced customization.

Health Check

  • Last Commit: 13 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 145
  • Issues (30d): 59

Star History

  • 435 stars in the last 30 days

Explore Similar Projects

Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface

Top 0.3% on SourcePulse
9k stars
PyTorch training helper for distributed execution
Created 4 years ago, updated 1 day ago