Gym by NVIDIA-NeMo

Library for building LLM RL training environments

Created 4 months ago
584 stars

Top 55.5% on SourcePulse

Project Summary

NeMo Gym is a library for building and scaling reinforcement learning (RL) environments for large language models (LLMs). It targets developers and researchers who need to create and test RL environments quickly and integrate them with existing training frameworks, and it offers a structured approach to LLM RL development.

How It Works

NeMo Gym provides scaffolding for complex RL environments (multi-step, multi-turn, and user-modeling scenarios) and lets an environment be tested end to end, independently of the RL training loop. It is designed to interoperate with existing systems and training frameworks, and it comes with a growing collection of RLVR (reinforcement learning with verifiable rewards) environments and datasets.
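To make the idea of testing an environment independently of the training loop concrete, here is a minimal sketch in plain Python. It does not use NeMo Gym's API: the names GuessNumberEnv, StepResult, rollout, and binary_search_policy are hypothetical, and a scripted policy stands in for an LLM. It only illustrates a multi-turn environment with a verifiable reward being exercised end to end without any trainer.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

# All names below are hypothetical illustrations, not NeMo Gym's API.


@dataclass
class StepResult:
    observation: str  # text shown to the policy on the next turn
    reward: float     # verifiable reward: 1.0 only for the exact answer
    done: bool        # True when the episode has ended


@dataclass
class GuessNumberEnv:
    """Toy multi-turn environment: guess an integer in [0, 100]."""

    target: int
    max_turns: int = 8
    turns: int = field(default=0, init=False)

    def reset(self) -> str:
        self.turns = 0
        return "Guess an integer between 0 and 100."

    def step(self, action: str) -> StepResult:
        self.turns += 1
        out_of_turns = self.turns >= self.max_turns
        try:
            guess = int(action.strip())
        except ValueError:
            # Malformed output earns no reward but still uses up a turn.
            return StepResult("Please answer with an integer.", 0.0, out_of_turns)
        if guess == self.target:
            return StepResult("Correct.", 1.0, True)
        hint = "higher" if guess < self.target else "lower"
        return StepResult(f"Try {hint}.", 0.0, out_of_turns)


def rollout(env: GuessNumberEnv, policy: Callable[[list[str]], str]) -> tuple[float, int]:
    """Run one episode and return (total reward, turns used)."""
    history = [env.reset()]
    total = 0.0
    while True:
        action = policy(history)
        result = env.step(action)
        history += [action, result.observation]
        total += result.reward
        if result.done:
            return total, env.turns


def binary_search_policy(history: list[str]) -> str:
    """Scripted stand-in for an LLM: binary search driven by the hints."""
    low, high = 0, 100
    # history is [obs0, action1, obs1, action2, obs2, ...]
    for action, obs in zip(history[1::2], history[2::2]):
        guess = int(action)
        if "higher" in obs:
            low = guess + 1
        elif "lower" in obs:
            high = guess - 1
    return str((low + high) // 2)


if __name__ == "__main__":
    reward, turns = rollout(GuessNumberEnv(target=37), binary_search_policy)
    print(f"reward={reward} turns={turns}")
```

Because the policy is just a callable, the same rollout function could later be driven by a real model client, while the environment logic stays covered by fast, deterministic tests.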

Quick Start & Requirements

  • Install: Clone the repo, install uv, create a Python 3.12 venv, then run uv sync --extra dev --group docs.
  • Prerequisites: Python 3.12+, Git, and internet access. The examples need an OpenAI API key (vLLM and Azure OpenAI are also supported). Ray is installed automatically. A GPU is optional for the library itself but may be needed for servers or inference. Supported OS: Linux, macOS, Windows (via WSL2).
  • Links: Docs, Tutorials.
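The prerequisites above can be checked before installing with a small script. This is a sketch under stated assumptions, not part of NeMo Gym: check_prerequisites is a hypothetical helper, and OPENAI_API_KEY is the environment variable name conventionally used by OpenAI clients, assumed here rather than confirmed by the project's docs.

```python
from __future__ import annotations

import os
import shutil
import sys
from typing import Callable, Mapping, Optional


def check_prerequisites(
    version: tuple[int, int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
    env: Mapping[str, str] = os.environ,
) -> dict[str, bool]:
    """Report which prerequisites appear to be satisfied.

    The arguments default to the real interpreter, PATH lookup, and
    environment, and can be replaced with fakes for testing.
    """
    return {
        "python>=3.12": tuple(version) >= (3, 12),
        "git": which("git") is not None,
        "uv": which("uv") is not None,
        # Assumed variable name; only needed for the OpenAI-based examples.
        "OPENAI_API_KEY": bool(env.get("OPENAI_API_KEY")),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok' if ok else 'missing':8}{name}")
```

A missing OPENAI_API_KEY is not necessarily a problem when the examples are pointed at vLLM or Azure OpenAI instead.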

Highlighted Details

  • Specialized for LLM RL environment development.
  • Infrastructure that speeds up building multi-step, multi-turn, and user-modeling scenarios.
  • Supports independent end-to-end environment testing.
  • Interoperable with diverse RL training frameworks and systems.
  • Features a growing collection of RLVR environments and datasets.

Maintenance & Community

The project is in early development; expect evolving APIs, incomplete docs, and bugs. Contributions and feedback are welcome; open an issue before contributing. Links: Issues, Contributing Guide.

Licensing & Compatibility

The core library and many servers are licensed under Apache 2.0, which permits commercial use. Mini Swe Agent is under MIT, which is also permissive. Some math environments use Creative Commons licenses (CC BY 4.0, CC BY-SA 4.0): both require attribution, and CC BY-SA additionally requires derivative works to be shared under the same terms.

Limitations & Caveats

The project describes itself as being in "early development": APIs are still evolving, documentation is incomplete, and occasional bugs are expected, so instability and breaking changes are possible.

Health Check

  • Last Commit: 21 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 42
  • Issues (30d): 53
  • Star History: 514 stars in the last 30 days
