gem by axon-rl

Agentic LLM training environment for interactive reinforcement learning

Created 4 months ago
305 stars

Top 87.7% on SourcePulse

Project Summary

GEM: A Gym for Agentic LLMs

GEM (General Experience Maker) is an open-source environment suite designed for training agentic Large Language Models (LLMs) via online reinforcement learning. It provides a standardized API, akin to OpenAI Gym, with a growing collection of diverse environments and seamless integration with popular RL training frameworks. GEM aims to accelerate LLM development by offering a high-throughput simulation platform for interactive learning.

How It Works

GEM offers a composable API: an environment combines a task with optional tools such as a Python executor or a search engine. Environments can run asynchronously in vectorized batches, which keeps simulation throughput high and supports training across several environments at once. The design is framework-agnostic and integrates with six RL training frameworks (listed under Highlighted Details), so agents can be developed and compared across training stacks.
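The Gym-style interaction described above can be sketched in a few lines. This is a minimal illustration, not GEM's actual API: the class GuessNumberEnv, the "tool_name: argument" action convention, and the helpers step_all and run_demo are all hypothetical names invented for this example, and only the Python standard library is used (Python 3.9+ for asyncio.to_thread).

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class GuessNumberEnv:
    """Hypothetical toy task: the agent must name a hidden integer."""
    target: int
    max_turns: int = 5
    # Optional tools, keyed by name (an assumption, not GEM's tool interface).
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    turns: int = 0

    def reset(self) -> Tuple[str, dict]:
        self.turns = 0
        return "Guess the number between 1 and 10.", {}

    def step(self, action: str) -> Tuple[str, float, bool, bool, dict]:
        """Return (observation, reward, terminated, truncated, info)."""
        self.turns += 1
        truncated = self.turns >= self.max_turns
        # Made-up convention: "tool_name: argument" routes to a tool.
        name, _, arg = action.partition(":")
        if name in self.tools:
            return self.tools[name](arg.strip()), 0.0, False, truncated, {}
        try:
            guess = int(action)
        except ValueError:
            return "Please answer with an integer.", 0.0, False, truncated, {}
        if guess == self.target:
            return "Correct!", 1.0, True, False, {}
        hint = "higher" if guess < self.target else "lower"
        return f"Try {hint}.", 0.0, False, truncated, {}


async def step_all(envs: List[GuessNumberEnv], actions: List[str]):
    """Step every environment concurrently; results keep the input order."""
    async def one(env, action):
        # A worker thread keeps one slow step (e.g. a tool call) from blocking the rest.
        return await asyncio.to_thread(env.step, action)
    return await asyncio.gather(*(one(e, a) for e, a in zip(envs, actions)))


def run_demo():
    envs = [GuessNumberEnv(target=3), GuessNumberEnv(target=7)]
    for env in envs:
        env.reset()
    results = asyncio.run(step_all(envs, ["3", "5"]))
    return [(obs, reward, terminated) for obs, reward, terminated, _, _ in results]
```

Calling run_demo() steps two independent environments in one batch. In a real training loop the actions would come from the LLM policy rather than fixed strings, and GEM's own interface should be taken from the official documentation.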

Quick Start & Requirements

Install with pip install -U gem-llm, or build from source. Key resources are the paper (arXiv:2510.01051), the Notion blog (https://axon-rl.notion.site/gem), and the official documentation (https://axon-rl.github.io/gem/). The repository provides examples for integrating with each supported framework.

Highlighted Details

  • Supports diverse task categories: Games, Math, Code, QA, and ReasoningGym.
  • Integrates essential tools: Python code execution, web search, and MCP API.
  • Framework-agnostic compatibility with Oat, Tinker, Verl, RL2, ROLL, and OpenRLHF.
  • Implements algorithms like REINFORCE, GRPO, PPO, and REINFORCE + ReBN.
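For readers unfamiliar with the algorithms named in the list above, two of their core quantities can be sketched as follows. This is a generic textbook-style illustration and not GEM's implementation: the function names are hypothetical, and using the population standard deviation with a small epsilon is an assumption (implementations differ on this detail).

```python
from statistics import mean, pstdev
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go G_t = r_t + gamma * G_{t+1}, the signal REINFORCE weights log-probs by."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def group_relative_advantages(group_rewards: List[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantage: normalize each reward within its group of sampled responses."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)  # assumption: population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in group_rewards]
```

For instance, discounted_returns([0.0, 0.0, 1.0], gamma=0.5) propagates a final reward back through earlier turns, and group_relative_advantages([1.0, 0.0, 1.0, 0.0]) assigns positive advantages to the successful responses and negative ones to the rest.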

Maintenance & Community

The project encourages community contributions and plans a collaborative technical report. A Discord server is available for discussion. Support is acknowledged from Sea AI Lab.

Licensing & Compatibility

The project's license is not specified in the README, which may impact commercial use or integration into closed-source applications.

Limitations & Caveats

The README does not detail any specific limitations or caveats; it presents the project as a comprehensive solution for agentic LLM training environments.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 19
Issues (30d): 4

Star History

190 stars in the last 30 days
