RL by NVIDIA-NeMo

Scalable toolkit for efficient model reinforcement learning

Created 10 months ago
1,215 stars

Top 32.2% on SourcePulse

View on GitHub
Project Summary

NeMo RL is a post-training library for large language models, targeting researchers and engineers who need to fine-tune models efficiently across diverse hardware. It offers scalable post-training algorithms (GRPO for reinforcement learning, plus SFT and DPO), integrates with Hugging Face models, and uses Ray for distributed execution.

How It Works

NeMo RL utilizes a modular design built on Ray for scalable distributed training and inference. It supports advanced parallelism techniques (FSDP, Tensor Parallelism, Sequence Parallelism) via native PyTorch and Megatron Core for handling models exceeding 100 billion parameters and large context lengths. The library emphasizes worker and environment isolation for robust multi-node operations.
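
To make the worker-isolation pattern concrete, here is a minimal, generic Ray sketch of the idea: each worker is a separate actor process with its own state, coordinated by a driver. This is illustrative only; it does not use NeMo RL's actual worker classes, and the PolicyWorker name and generate method below are hypothetical.

    import ray

    ray.init()  # starts a local Ray runtime, or attaches to an existing cluster

    # Hypothetical worker class for illustration; NeMo RL defines its own worker
    # abstractions. Each Ray actor runs in its own process, which is what keeps
    # worker and environment state isolated across a multi-node job.
    @ray.remote
    class PolicyWorker:
        def __init__(self, rank: int):
            self.rank = rank  # a real worker would also load a model shard here

        def generate(self, prompt: str) -> str:
            return f"[worker {self.rank}] completion for: {prompt}"

    workers = [PolicyWorker.remote(r) for r in range(2)]
    print(ray.get([w.generate.remote("2 + 2 = ?") for w in workers]))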

Quick Start & Requirements

  • Install with uv: pip install uv, then run project commands via uv run <command>.
  • Prerequisites: CUDA drivers, a compatible PyTorch build, a Hugging Face CLI login, and the HF_HOME, WANDB_API_KEY, and HF_DATASETS_CACHE environment variables.
  • Setup involves cloning the repository and installing dependencies via uv (see the pre-flight sketch after this list).
  • Documentation: NeMo RL Features, Examples, Checkpointing, Evaluation, Cluster Setup.
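
As referenced above, a minimal pre-flight sketch in Python that checks the listed environment variables before launching a run through uv. The example script path is an assumption for illustration; substitute the entry point you actually intend to run.

    import os
    import subprocess

    # Environment variables called out in the prerequisites above.
    required = ["HF_HOME", "WANDB_API_KEY", "HF_DATASETS_CACHE"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise SystemExit(f"Set these environment variables first: {', '.join(missing)}")

    # Launch an example through uv so dependencies resolve from the project lockfile.
    # "examples/run_grpo_math.py" is an assumed path; replace it with your entry point.
    subprocess.run(["uv", "run", "python", "examples/run_grpo_math.py"], check=True)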

Highlighted Details

  • Supports GRPO, SFT, and DPO learning algorithms (a GRPO advantage sketch follows this list).
  • Integrates with Hugging Face for models from 1B to 32B parameters.
  • Offers multi-turn RL capabilities for tool use and games.
  • Provides fast generation via vLLM backend.
  • Enables distributed training with FSDP and Ray.
  • Supports advanced parallelism (TP, SP, activation checkpointing) for large models.
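
For context on the GRPO support listed above, this is a small sketch of the textbook group-relative advantage computation: rewards are normalized within a group of completions sampled from the same prompt. It is not NeMo RL's internal implementation.

    from statistics import mean, pstdev

    def grpo_advantages(group_rewards: list[float], eps: float = 1e-6) -> list[float]:
        # Each completion's advantage is its reward relative to the group baseline.
        mu = mean(group_rewards)
        sigma = pstdev(group_rewards)
        return [(r - mu) / (sigma + eps) for r in group_rewards]

    # Rewards for four completions sampled from one math prompt (1.0 = correct).
    print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # approximately [1, -1, 1, -1]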

Maintenance & Community

  • Developed by NVIDIA.
  • Contributions are welcomed via Contributing Guidelines.
  • Citation available in BibTeX format.

Licensing & Compatibility

  • Licensed under the Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The library is under active development, with several features still marked "coming soon", including improved native performance and support for MoE models. While it handles models up to 32B parameters natively, larger models require Megatron Core integration.

Health Check

  • Last Commit: 22 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 118
  • Issues (30d): 61
  • Star History: 144 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Wing Lian (Founder of Axolotl AI), and 3 more.

ROLL by alibaba

2.3%
3k
RL library for large language models
Created 7 months ago
Updated 21 hours ago
Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 1 more.

rl by pytorch

0.3%
3k
PyTorch library for reinforcement learning research
Created 4 years ago
Updated 14 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Will Brown (Research Lead at Prime Intellect), and 14 more.

verifiers by PrimeIntellect-ai

1.0%
4k
RL for LLMs in verifiable environments
Created 11 months ago
Updated 18 hours ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 19 more.

trlx by CarperAI

0.0%
5k
Distributed RLHF for LLMs
Created 3 years ago
Updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 8 more.

h2o-llmstudio by h2oai

0.1%
5k
LLM Studio: framework for LLM fine-tuning via GUI or CLI
Created 2 years ago
Updated 3 weeks ago