RL by NVIDIA-NeMo

Scalable toolkit for efficient model reinforcement learning

Created 6 months ago
869 stars

Top 41.4% on SourcePulse

Project Summary

NeMo RL is a post-training library for large language models, targeting researchers and engineers who need to fine-tune models efficiently across diverse hardware. It offers scalable post-training algorithms, including GRPO for reinforcement learning alongside SFT and DPO, integrates with Hugging Face models, and leverages Ray for distributed execution.

How It Works

NeMo RL utilizes a modular design built on Ray for scalable distributed training and inference. It supports advanced parallelism techniques (FSDP, Tensor Parallelism, Sequence Parallelism) via native PyTorch and Megatron Core for handling models exceeding 100 billion parameters and large context lengths. The library emphasizes worker and environment isolation for robust multi-node operations.
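As a rough illustration of the tensor-parallelism idea mentioned above (a toy sketch only; NeMo RL's actual implementation uses native PyTorch and Megatron Core on GPU tensors), each worker can hold one column shard of a layer's weight matrix, compute its slice of the output independently, and the slices are concatenated:

```python
# Toy column-wise tensor parallelism: each "worker" owns a column shard of
# the weight matrix W and computes its slice of y = x @ W. Pure Python
# lists stand in for GPU tensors; this is an illustration, not NeMo RL code.

def matvec(x, W):
    """y[j] = sum_i x[i] * W[i][j] for one weight (shard)."""
    cols = len(W[0])
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(cols)]

def shard_columns(W, n_workers):
    """Split W column-wise into n_workers equal shards."""
    step = len(W[0]) // n_workers
    return [[row[k * step:(k + 1) * step] for row in W] for k in range(n_workers)]

def tensor_parallel_matvec(x, W, n_workers):
    """Each worker multiplies against its shard; partial outputs are concatenated."""
    partials = [matvec(x, shard) for shard in shard_columns(W, n_workers)]
    return [v for p in partials for v in p]

x = [1.0, 2.0]
W = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 0.0, 2.0]]
# Sharded computation matches the unsharded one.
assert tensor_parallel_matvec(x, W, 2) == matvec(x, W)
```

In a real deployment each shard lives on a different GPU and the concatenation is a collective communication step; the arithmetic decomposition is the same.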

Quick Start & Requirements

  • Install the uv package manager with pip install uv, then run project commands via uv run <command>.
  • Prerequisites: CUDA drivers, a compatible PyTorch build, Hugging Face CLI login, and the HF_HOME, WANDB_API_KEY, and HF_DATASETS_CACHE environment variables.
  • Setup involves cloning the repository and installing dependencies via uv.
  • Documentation: NeMo RL Features, Examples, Checkpointing, Evaluation, Cluster Setup.
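The steps above can be sketched as the following shell session (the repository URL and cache paths are assumptions based on the project name; check the project README for the canonical commands):

```shell
# Assumed setup sketch; the repo URL and paths are placeholders.
pip install uv                                   # install the uv package manager
git clone https://github.com/NVIDIA-NeMo/RL.git  # URL assumed from the project name
cd RL

# Environment variables listed in the prerequisites
export HF_HOME=/path/to/hf_cache
export HF_DATASETS_CACHE=/path/to/hf_datasets_cache
export WANDB_API_KEY=<your-wandb-key>

huggingface-cli login                            # Hugging Face CLI login
uv run <command>                                 # run project commands through uv
```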

Highlighted Details

  • Supports the GRPO (reinforcement learning), SFT, and DPO post-training algorithms.
  • Integrates with Hugging Face for models from 1B to 32B parameters.
  • Offers multi-turn RL capabilities for tool use and games.
  • Provides fast generation via vLLM backend.
  • Enables distributed training with FSDP and Ray.
  • Supports advanced parallelism (TP, SP, activation checkpointing) for large models.
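GRPO's core idea, as described in the literature (a generic sketch, not NeMo RL's implementation), is to normalize each sampled completion's reward against the mean and standard deviation of its own group of completions, yielding a baseline-free advantage:

```python
# Group-relative advantages as in GRPO: for a group of completions sampled
# from the same prompt, advantage = (reward - group mean) / group std.
# Generic sketch; population std is used here for simplicity.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

adv = group_relative_advantages([1.0, 2.0, 3.0])
# The middle reward equals the group mean, so its advantage is zero,
# and advantages within a group sum to ~0.
assert abs(adv[1]) < 1e-6
assert abs(sum(adv)) < 1e-6
```

Because the baseline comes from the group itself, no separate value network is needed, which is part of what makes GRPO cheap to scale.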

Maintenance & Community

  • Developed by NVIDIA.
  • Contributions are welcomed via Contributing Guidelines.
  • Citation available in BibTeX format.

Licensing & Compatibility

  • Licensed under the Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The library is under active development, with several features marked "coming soon," including improved native performance and support for MoE models. Models up to 32B parameters are supported via the native backend; larger models require Megatron Core integration.

Health Check

  • Last commit: 16 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 131
  • Issues (30d): 121
  • Star history: 237 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Sebastian Raschka (Author of "Build a Large Language Model (From Scratch)"), and 14 more.

verifiers by willccbb

3.1%
3k
RL for LLMs in verifiable environments
Created 7 months ago
Updated 22 hours ago
Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 1 more.

rl by pytorch

0.4%
3k
PyTorch library for reinforcement learning research
Created 3 years ago
Updated 2 days ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 19 more.

trlx by CarperAI

0.0%
5k
Distributed RLHF for LLMs
Created 3 years ago
Updated 1 year ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 26 more.

ColossalAI by hpcaitech

0.1%
41k
AI system for large-scale parallel training
Created 3 years ago
Updated 15 hours ago