Scalable toolkit for efficient model reinforcement learning
NeMo RL is a post-training library for large language models, targeting researchers and engineers who need to fine-tune models efficiently across diverse hardware. It offers scalable post-training algorithms, including GRPO for reinforcement learning alongside SFT and DPO, integrates with Hugging Face models, and uses Ray for distributed execution.
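For orientation, the objective behind one of these algorithms fits in a few lines of PyTorch. The sketch below is the standard DPO loss over per-sequence log-probabilities, not NeMo RL's implementation; the function name, signature, and default `beta` are illustrative:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss; each input has shape (batch,) and holds the
    log-probability of a completion under the policy or frozen reference."""
    # Log-ratio of policy to reference for each completion.
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    # Push the chosen completion's log-ratio above the rejected one's.
    margin = beta * (chosen_logratios - rejected_logratios)
    return -F.logsigmoid(margin).mean()
```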
How It Works
NeMo RL uses a modular design built on Ray for scalable distributed training and inference. It supports advanced parallelism techniques (FSDP, tensor parallelism, sequence parallelism) via native PyTorch and Megatron Core, handling models exceeding 100 billion parameters and long context lengths. The library isolates workers and environments from one another for robust multi-node operation.
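NeMo RL's internal worker classes are not shown here, but the isolation pattern itself is plain Ray: each worker is an actor running in its own process, so one worker's failure cannot corrupt the others' state. A minimal sketch under that assumption (the class name and four-worker layout are illustrative, not NeMo RL's API):

```python
import ray

ray.init()  # connect to an existing cluster, or start one locally

# In a real GPU cluster you would request num_gpus=1 per actor;
# num_cpus=1 keeps this sketch runnable on any machine.
@ray.remote(num_cpus=1)
class TrainWorker:
    """Isolated worker: runs in its own process with its own resources."""

    def __init__(self, rank: int):
        self.rank = rank

    def step(self, batch):
        # Placeholder for a real FSDP / Megatron Core training step.
        return {"rank": self.rank, "loss": 0.0}

# Spawn four isolated workers and run one step on each in parallel.
workers = [TrainWorker.remote(rank) for rank in range(4)]
print(ray.get([w.step.remote(None) for w in workers]))
```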
Quick Start & Requirements
- Install uv with `pip install uv`, then launch commands with `uv run <command>`.
- Set the `HF_HOME`, `WANDB_API_KEY`, and `HF_DATASETS_CACHE` environment variables.
- Dependencies are managed through uv.
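These variables are typically exported in the shell before launching; for completeness, a Python equivalent that sets them before any library reads them (the paths are placeholders, not defaults the library requires):

```python
import os

# Configure caches and credentials before importing anything that reads
# them (Hugging Face libraries read HF_* at import time).
os.environ.setdefault("HF_HOME", "/data/hf_home")            # placeholder path
os.environ.setdefault("HF_DATASETS_CACHE", "/data/hf_datasets")  # placeholder path
os.environ.setdefault("WANDB_API_KEY", "<your-wandb-api-key>")
```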
Limitations & Caveats
The library is under active development, with features such as improved native-path performance and MoE model support marked "coming soon." Native PyTorch training supports models up to 32B parameters; larger models require the Megatron Core backend.