RL by NVIDIA-NeMo

Scalable toolkit for efficient model reinforcement learning

created 4 months ago
560 stars

Top 58.2% on sourcepulse

Project Summary

NeMo RL is a post-training library for large language models, targeting researchers and engineers who need to fine-tune models efficiently across diverse hardware. It provides scalable post-training algorithms, including GRPO for reinforcement learning alongside SFT and DPO, integrates with Hugging Face models, and uses Ray for distributed execution.

How It Works

NeMo RL utilizes a modular design built on Ray for scalable distributed training and inference. It supports advanced parallelism techniques (FSDP, Tensor Parallelism, Sequence Parallelism) via native PyTorch and Megatron Core for handling models exceeding 100 billion parameters and large context lengths. The library emphasizes worker and environment isolation for robust multi-node operations.
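As a rough illustration of this controller-and-workers pattern, the sketch below uses plain Ray actors. PolicyWorker and its methods are hypothetical names chosen for illustration; they are not NeMo RL's actual API, and the snippet assumes a Ray installation with four GPUs available.

  # Illustrative only: a generic Ray-actor pattern for isolated training workers.
  # PolicyWorker is a hypothetical name, not part of NeMo RL's API.
  import ray

  ray.init()  # connects to an existing Ray cluster if one is running

  @ray.remote(num_gpus=1)  # assumes one GPU is available per worker
  class PolicyWorker:
      """Each actor runs in its own process, giving worker and environment isolation."""
      def __init__(self, rank: int, world_size: int):
          self.rank = rank
          self.world_size = world_size

      def train_step(self, batch):
          # Placeholder: a real worker would run an FSDP/Megatron forward-backward here.
          return {"rank": self.rank, "loss": 0.0}

  # Spawn one isolated worker per GPU and drive them from the controller process.
  workers = [PolicyWorker.remote(rank=i, world_size=4) for i in range(4)]
  results = ray.get([w.train_step.remote(batch=None) for w in workers])
  print(results)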

Quick Start & Requirements

  • Install uv with pip install uv, then launch project commands with uv run <command>.
  • Prerequisites: CUDA drivers, a compatible PyTorch build, a Hugging Face CLI login, and the HF_HOME, WANDB_API_KEY, and HF_DATASETS_CACHE environment variables (a small pre-flight check follows this list).
  • Setup involves cloning the repository and installing dependencies via uv.
  • Documentation: NeMo RL Features, Examples, Checkpointing, Evaluation, Cluster Setup.
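The environment variables above are easy to forget; a minimal pre-flight check like the one below (not part of NeMo RL, shown only as a sketch of a reasonable workflow) can confirm they are set before launching a run.

  # Illustrative pre-flight check (not part of NeMo RL): confirm the cache and
  # logging environment variables from the prerequisites list are set.
  import os
  import sys

  REQUIRED_VARS = ["HF_HOME", "WANDB_API_KEY", "HF_DATASETS_CACHE"]

  missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
  if missing:
      sys.exit(f"Missing environment variables: {', '.join(missing)}")
  print("Environment looks ready; launch training with uv run <command>.")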

Highlighted Details

  • Supports GRPO, SFT, and DPO post-training algorithms (a GRPO advantage sketch follows this list).
  • Integrates with Hugging Face for models from 1B to 32B parameters.
  • Offers multi-turn RL capabilities for tool use and games.
  • Provides fast generation via vLLM backend.
  • Enables distributed training with FSDP and Ray.
  • Supports advanced parallelism (TP, SP, activation checkpointing) for large models.
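For context on the GRPO bullet above: GRPO's group-relative advantage normalizes rewards across several sampled completions of the same prompt, which removes the need for a separate value model. The sketch below illustrates that computation and is not taken from NeMo RL's codebase.

  # Illustrative sketch of GRPO's group-relative advantage (not NeMo RL's code):
  # rewards for several sampled completions of the same prompt are normalized
  # within that group.
  import torch

  def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
      """rewards: [num_prompts, samples_per_prompt] scalar reward per completion."""
      mean = rewards.mean(dim=-1, keepdim=True)
      std = rewards.std(dim=-1, keepdim=True)
      return (rewards - mean) / (std + eps)

  # Example: 2 prompts, 4 sampled completions each.
  rewards = torch.tensor([[1.0, 0.0, 0.5, 0.5],
                          [0.2, 0.9, 0.9, 0.1]])
  print(group_relative_advantages(rewards))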

Maintenance & Community

  • Developed by NVIDIA.
  • Contributions are welcomed via Contributing Guidelines.
  • Citation available in BibTeX format.

Licensing & Compatibility

  • Licensed under the Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The library is under active development, with several features still marked as coming soon, including improved native performance and support for MoE models. Models up to roughly 32B parameters are supported natively; larger models require the Megatron Core backend.

Health Check

  • Last commit: 21 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 158
  • Issues (30d): 90
  • Star History: 537 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

InternEvo by InternLM

1.0%
402
Lightweight training framework for model pre-training
created 1 year ago
updated 1 week ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 9 more.

trlx by CarperAI

0.1%
5k
Distributed RLHF for LLMs
created 2 years ago
updated 1 year ago
Starred by Lewis Tunstall (Researcher at Hugging Face), Robert Nishihara (Cofounder of Anyscale; Author of Ray), and 4 more.

verl by volcengine

2.4%
12k
RL training library for LLMs
created 9 months ago
updated 17 hours ago