RL by NVIDIA-NeMo

Scalable toolkit for efficient model reinforcement learning

Created 10 months ago
1,215 stars

Top 32.2% on SourcePulse

View on GitHub
Project Summary

NeMo RL is a post-training library for large language models, targeting researchers and engineers who need to fine-tune models efficiently across diverse hardware. It offers scalable post-training algorithms (GRPO for reinforcement learning, plus SFT and DPO), integrates with Hugging Face models, and uses Ray for distributed execution.

How It Works

NeMo RL utilizes a modular design built on Ray for scalable distributed training and inference. It supports advanced parallelism techniques (FSDP, Tensor Parallelism, Sequence Parallelism) via native PyTorch and Megatron Core for handling models exceeding 100 billion parameters and large context lengths. The library emphasizes worker and environment isolation for robust multi-node operations.
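
To make the worker-isolation pattern concrete, here is a minimal, generic Ray sketch of the idea: each worker is a separate actor process with its own state, coordinated by a driver. This is illustrative only; it does not use NeMo RL's actual worker classes, and the PolicyWorker name and generate method below are hypothetical.

    import ray

    ray.init()  # starts a local Ray runtime, or attaches to an existing cluster

    # Hypothetical worker class for illustration; NeMo RL defines its own worker
    # abstractions. Each Ray actor runs in its own process, which is what keeps
    # worker and environment state isolated across a multi-node job.
    @ray.remote
    class PolicyWorker:
        def __init__(self, rank: int):
            self.rank = rank  # a real worker would also load a model shard here

        def generate(self, prompt: str) -> str:
            return f"[worker {self.rank}] completion for: {prompt}"

    workers = [PolicyWorker.remote(r) for r in range(2)]
    print(ray.get([w.generate.remote("2 + 2 = ?") for w in workers]))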

Quick Start & Requirements

  • Install with uv: pip install uv, then run project commands via uv run <command>.
  • Prerequisites: CUDA drivers, a compatible PyTorch build, a Hugging Face CLI login, and the HF_HOME, WANDB_API_KEY, and HF_DATASETS_CACHE environment variables.
  • Setup involves cloning the repository and installing dependencies via uv (see the pre-flight sketch after this list).
  • Documentation: NeMo RL Features, Examples, Checkpointing, Evaluation, Cluster Setup.
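
As referenced above, a minimal pre-flight sketch in Python that checks the listed environment variables before launching a run through uv. The example script path is an assumption for illustration; substitute the entry point you actually intend to run.

    import os
    import subprocess

    # Environment variables called out in the prerequisites above.
    required = ["HF_HOME", "WANDB_API_KEY", "HF_DATASETS_CACHE"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise SystemExit(f"Set these environment variables first: {', '.join(missing)}")

    # Launch an example through uv so dependencies resolve from the project lockfile.
    # "examples/run_grpo_math.py" is an assumed path; replace it with your entry point.
    subprocess.run(["uv", "run", "python", "examples/run_grpo_math.py"], check=True)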

Highlighted Details

  • Supports GRPO, SFT, and DPO learning algorithms (a GRPO advantage sketch follows this list).
  • Integrates with Hugging Face for models from 1B to 32B parameters.
  • Offers multi-turn RL capabilities for tool use and games.
  • Provides fast generation via vLLM backend.
  • Enables distributed training with FSDP and Ray.
  • Supports advanced parallelism (TP, SP, activation checkpointing) for large models.
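
For context on the GRPO support listed above, this is a small sketch of the textbook group-relative advantage computation: rewards are normalized within a group of completions sampled from the same prompt. It is not NeMo RL's internal implementation.

    from statistics import mean, pstdev

    def grpo_advantages(group_rewards: list[float], eps: float = 1e-6) -> list[float]:
        # Each completion's advantage is its reward relative to the group baseline.
        mu = mean(group_rewards)
        sigma = pstdev(group_rewards)
        return [(r - mu) / (sigma + eps) for r in group_rewards]

    # Rewards for four completions sampled from one math prompt (1.0 = correct).
    print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # approximately [1, -1, 1, -1]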

Maintenance & Community

  • Developed by NVIDIA.
  • Contributions are welcomed via Contributing Guidelines.
  • Citation available in BibTeX format.

Licensing & Compatibility

  • Licensed under the Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The library is under active development, with several features still marked "coming soon", including improved native performance and support for MoE models. While it handles models up to 32B parameters natively, larger models require Megatron Core integration.

Health Check

  • Last Commit: 22 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 118
  • Issues (30d): 61
  • Star History: 144 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Wing Lian (Founder of Axolotl AI), and 3 more.

ROLL by alibaba

2.3%
3k
RL library for large language models
Created 7 months ago
Updated 21 hours ago
Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 1 more.

rl by pytorch

0.3%
3k
PyTorch library for reinforcement learning research
Created 4 years ago
Updated 14 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Will Brown (Research Lead at Prime Intellect), and 14 more.

verifiers by PrimeIntellect-ai

1.0%
4k
RL for LLMs in verifiable environments
Created 11 months ago
Updated 18 hours ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 19 more.

trlx by CarperAI

0.0%
5k
Distributed RLHF for LLMs
Created 3 years ago
Updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 8 more.

h2o-llmstudio by h2oai

0.1%
5k
LLM Studio: framework for LLM fine-tuning via GUI or CLI
Created 2 years ago
Updated 3 weeks ago