Gym by NVIDIA-NeMo

Library for building LLM RL training environments

Created 6 months ago

664 stars

Top 50.6% on SourcePulse

1 Expert Loves This Project

Jeff Hammerbacher

Cofounder of Cloudera

Project Summary

NeMo Gym is a library for building and scaling reinforcement learning (RL) environments for large language models (LLMs). It targets developers and researchers who need to speed up RL environment creation, testing, and integration with existing training frameworks, and it offers a structured approach to LLM RL development.

How It Works

Provides scaffolding for complex RL environments (multi-step, multi-turn, user modeling) and enables end-to-end environment testing independent of the RL training loop. Ensures interoperability with existing systems and frameworks, complemented by a growing collection of RLVR environments and datasets.
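As a rough illustration of what a multi-turn environment that can be tested without a training loop might look like, here is a minimal sketch. It does not use NeMo Gym's API: the names GuessNumberEnv, run_episode, and make_bisect_policy are hypothetical, and a scripted policy stands in for the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class GuessNumberEnv:
    """Hypothetical multi-turn environment: the agent guesses a hidden integer."""
    target: int
    max_turns: int = 8
    turns: int = field(default=0, init=False)

    def reset(self) -> str:
        self.turns = 0
        return "Guess an integer between 1 and 100."

    def step(self, action: str) -> tuple[str, float, bool]:
        """Return (observation, reward, done) for one agent message."""
        self.turns += 1
        out_of_turns = self.turns >= self.max_turns
        try:
            guess = int(action.strip())
        except ValueError:
            # Malformed action: no reward, episode ends only when turns run out.
            return "Please reply with an integer.", 0.0, out_of_turns
        if guess == self.target:
            return "Correct.", 1.0, True
        hint = "higher" if guess < self.target else "lower"
        return f"Try {hint}.", 0.0, out_of_turns


def run_episode(env, policy) -> tuple[float, int]:
    """Roll out one episode with any callable policy; no trainer is involved."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total, env.turns


def make_bisect_policy(low: int = 1, high: int = 100):
    """Scripted stand-in for an LLM: binary search driven by the env's hints."""
    state = {"low": low, "high": high, "last": None}

    def policy(obs: str) -> str:
        if "higher" in obs:
            state["low"] = state["last"] + 1
        elif "lower" in obs:
            state["high"] = state["last"] - 1
        state["last"] = (state["low"] + state["high"]) // 2
        return str(state["last"])

    return policy
```

Because run_episode accepts any callable, the same environment could later be driven by a model-backed policy, while the scripted one keeps end-to-end tests fast and deterministic.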

Quick Start & Requirements

Install: Clone the repo, install uv, create a Python 3.12 venv, then run uv sync --extra dev --group docs.
Prerequisites: Python 3.12+, Git, and internet access. An OpenAI API key is needed for the examples (vLLM and Azure OpenAI are also supported). Ray is installed automatically. A GPU is optional for the library but may be needed for servers or inference. OS: Linux, macOS, Windows (WSL2).
Links: Docs, Tutorials.
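The prerequisites above can be checked before installing with a small preflight script. This is a sketch, not part of the project: the preflight function is hypothetical, and the environment variable name OPENAI_API_KEY is an assumption based on the usual OpenAI convention (the project may read the key from a config file instead).

```python
import os
import shutil
import sys


def preflight(min_python=(3, 12), tools=("git", "uv"), env_vars=("OPENAI_API_KEY",)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # Compare only (major, minor) against the required minimum.
    if sys.version_info[:2] < tuple(min_python):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required, found {found}")
    # shutil.which returns None when an executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    # Assumed variable name; unset or empty values both count as missing.
    for name in env_vars:
        if not os.environ.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems
```

Calling preflight() with no arguments checks the defaults listed above; printing each returned string gives a quick checklist of what is still missing.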

Highlighted Details

Specialized for LLM RL environment development.
Infrastructure accelerates multi-step, multi-turn, and user-modeling scenarios.
Supports independent end-to-end environment testing.
Interoperable with diverse RL training frameworks and systems.
Features a growing collection of RLVR environments and datasets.
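RLVR refers to reinforcement learning with verifiable rewards, where a program rather than a learned judge scores each response. The sketch below shows the general idea for a math task. It is an illustration under assumptions, not code from the project: the function names are hypothetical, and the convention that the final answer appears inside \boxed{...} is assumed.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    r"""Return the content of the last \boxed{...} in the response, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1].strip() if matches else None


def verifiable_reward(response: str, reference: str) -> float:
    """Score 1.0 when the boxed answer matches the reference, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        return 0.0
    try:
        # Numeric comparison, so that "0.5" and "1/2" count as equal.
        return 1.0 if Fraction(answer) == Fraction(reference) else 0.0
    except (ValueError, ZeroDivisionError):
        # Non-numeric answers fall back to exact string comparison.
        return 1.0 if answer == reference.strip() else 0.0
```

The reward is deterministic and cheap to compute, which is what makes it practical to check in unit tests as well as during training.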

Maintenance & Community

In early development; expect evolving APIs, incomplete docs, and bugs. Contributions and feedback welcome; open an issue first. Links: Issues, Contributing Guide.

Licensing & Compatibility

The core library and many servers are primarily licensed under Apache 2.0, which permits commercial use. Mini Swe Agent is under the MIT license, which is also permissive. Some math environments use Creative Commons licenses: CC BY 4.0 requires attribution, and CC BY-SA 4.0 additionally requires share-alike terms for derivatives.

Limitations & Caveats

The project describes itself as being in "early development"; anticipate breaking API changes, incomplete documentation, and occasional bugs.

Explore Similar Projects

gem by axon-rl

Awesome-RL-based-LLM-Reasoning by bruno686

LLM-RL-Papers by WindyLab

OpenTinker by open-tinker

Awesome-AgenticLLM-RL-Papers by xhyumiracle

verl-tool by TIGER-AI-Lab

awesome-llm-agents by kaushikb11

Awesome-LLM-Post-training by mbzuai-oryx

LLM-Engineering-Essentials by Nebius-Academy

atropos by NousResearch

edu by wandb

verifiers by PrimeIntellect-ai