gem by axon-rl

Agentic LLM training environment for interactive reinforcement learning

Created 9 months ago

461 stars

Top 65.8% on SourcePulse

5 Experts Love This Project

Lewis Tunstall

Research Engineer at Hugging Face

Yaowei Zheng

Author of LLaMA-Factory

Vincent Weisser

Cofounder of Prime Intellect

Will Brown

Research Lead at Prime Intellect

and 1 more!

Project Summary

GEM: A Gym for Agentic LLMs

GEM (General Experience Maker) is an open-source environment suite for training agentic Large Language Models (LLMs) with online reinforcement learning. It provides a standardized API, akin to OpenAI Gym, a growing collection of environments, and integrations with popular RL training frameworks. GEM aims to accelerate LLM development by offering a high-throughput simulation platform for interactive learning.
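To make the Gym-style interaction pattern concrete, here is a minimal, self-contained sketch of a reset/step loop over a text environment. It does not use GEM itself: GuessNumberEnv, BinarySearchPolicy, and run_episode are hypothetical names invented for illustration, and the scripted policy stands in for an LLM that would normally produce the action text.

```python
import random
from typing import Any, Callable, Dict, Optional, Tuple


class GuessNumberEnv:
    """Toy text environment with a Gym-style reset/step interface.

    Hypothetical example class; it is not part of GEM.
    """

    def __init__(self, low: int = 1, high: int = 10, max_turns: int = 5,
                 seed: Optional[int] = None) -> None:
        self.low, self.high, self.max_turns = low, high, max_turns
        self._rng = random.Random(seed)
        self._target = low
        self._turn = 0

    def reset(self) -> Tuple[str, Dict[str, Any]]:
        # Start a new episode and return the first text observation.
        self._target = self._rng.randint(self.low, self.high)
        self._turn = 0
        return f"Guess a number between {self.low} and {self.high}.", {}

    def step(self, action: str) -> Tuple[str, float, bool, bool, Dict[str, Any]]:
        # Returns (observation, reward, terminated, truncated, info).
        self._turn += 1
        truncated = self._turn >= self.max_turns
        try:
            guess = int(action.strip())
        except ValueError:
            return "Please answer with an integer.", 0.0, False, truncated, {}
        if guess == self._target:
            return "Correct!", 1.0, True, False, {}
        hint = "higher" if guess < self._target else "lower"
        return f"The number is {hint}.", 0.0, False, truncated, {}


class BinarySearchPolicy:
    """Scripted stand-in for an LLM agent: reads the hint, narrows the range."""

    def __init__(self, low: int, high: int) -> None:
        self.low, self.high = low, high
        self.last = low

    def __call__(self, observation: str) -> str:
        if "higher" in observation:
            self.low = self.last + 1
        elif "lower" in observation:
            self.high = self.last - 1
        self.last = (self.low + self.high) // 2
        return str(self.last)


def run_episode(env: GuessNumberEnv, policy: Callable[[str], str]) -> float:
    """Roll out one episode and return the summed reward."""
    observation, _ = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, terminated, truncated, _ = env.step(action)
        total += reward
        done = terminated or truncated
    return total
```

In an RL setup, the policy call would be replaced by sampling from the model, and the collected observations, actions, and rewards would be handed to the training framework.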

How It Works

GEM offers a composable API for environments, which can include tasks and optional tools like Python executors or search engines. It supports asynchronous vectorized execution for efficient simulation and multi-environment training. The framework-agnostic design allows integration with six leading RL training libraries, facilitating flexible agent development and experimentation.
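The idea behind asynchronous vectorized execution can be sketched with Python's asyncio: when each environment step waits on a slow tool call, stepping all environments concurrently keeps total latency close to that of a single step. This is an illustration under stated assumptions, not GEM's implementation; EchoToolEnv, step_all, and run_batch are hypothetical names, and asyncio.sleep stands in for a tool such as a Python executor or search engine.

```python
import asyncio
from typing import List, Sequence, Tuple

StepResult = Tuple[str, float, bool]  # (observation, reward, terminated)


class EchoToolEnv:
    """Hypothetical environment whose step awaits a slow tool call."""

    def __init__(self, env_id: int, latency: float = 0.01) -> None:
        self.env_id = env_id
        self.latency = latency

    async def step(self, action: str) -> StepResult:
        # Simulated tool latency (assumption: real tools are I/O-bound).
        await asyncio.sleep(self.latency)
        terminated = action == "done"
        reward = 1.0 if terminated else 0.0
        return f"env {self.env_id} saw {action!r}", reward, terminated


async def step_all(envs: Sequence[EchoToolEnv],
                   actions: Sequence[str]) -> List[StepResult]:
    """Step every environment concurrently; results keep input order."""
    return await asyncio.gather(
        *(env.step(action) for env, action in zip(envs, actions))
    )


def run_batch(num_envs: int = 4) -> List[StepResult]:
    """Build a small batch of environments and step them once."""
    envs = [EchoToolEnv(i) for i in range(num_envs)]
    actions = ["done" if i % 2 == 0 else "search" for i in range(num_envs)]
    return asyncio.run(step_all(envs, actions))
```

Because asyncio.gather preserves the order of its inputs, each result can be matched back to the environment and trajectory that produced it.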

Quick Start & Requirements

Install GEM with pip install -U gem-llm, or build it from source. Key resources include the paper (arXiv:2510.01051), the Notion blog (https://axon-rl.notion.site/gem), and the official documentation (https://axon-rl.github.io/gem/). Examples are provided for quick integration with the supported frameworks.

Highlighted Details

Supports diverse task categories: Games, Math, Code, QA, and ReasoningGym.
Integrates essential tools: Python code execution, web search, and MCP API.
Framework-agnostic compatibility with Oat, Tinker, Verl, RL2, ROLL, and OpenRLHF.
Implements algorithms like REINFORCE, GRPO, PPO, and REINFORCE + ReBN.
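The listed algorithms differ mainly in how they turn rewards into learning signals. The sketch below contrasts two normalization schemes in plain Python: a GRPO-style group-relative advantage (normalizing within the set of responses to one prompt) and a batch-level normalization of per-step returns. The summary does not define ReBN; the code assumes it refers to normalizing returns across the whole batch, and all function names here are hypothetical rather than taken from GEM.

```python
from statistics import mean, pstdev
from typing import List

EPS = 1e-8  # guards against division by zero when all values are equal


def discounted_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Per-step return G_t = r_t + gamma * G_{t+1} for one trajectory."""
    returns: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    return returns[::-1]


def normalize(values: List[float]) -> List[float]:
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / (sigma + EPS) for v in values]


def group_relative_advantages(groups: List[List[float]]) -> List[List[float]]:
    """GRPO-style: normalize episode rewards within each prompt's group."""
    return [normalize(group) for group in groups]


def batch_normalized_returns(trajectories: List[List[float]],
                             gamma: float = 1.0) -> List[List[float]]:
    """Assumed ReBN-style: normalize per-step returns over the whole batch."""
    per_traj = [discounted_returns(t, gamma) for t in trajectories]
    flat = normalize([g for traj in per_traj for g in traj])
    out, start = [], 0
    for traj in per_traj:
        out.append(flat[start:start + len(traj)])
        start += len(traj)
    return out
```

The group-relative form assigns one advantage per episode, while the batch-level form keeps a separate value for every turn, which matters for multi-turn environments where rewards arrive at different steps.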

Maintenance & Community

The project encourages community contributions and plans a collaborative technical report. A Discord server is available for discussion. The authors acknowledge support from Sea AI Lab.

Licensing & Compatibility

The project's license is not specified in the README, which may impact commercial use or integration into closed-source applications.

Limitations & Caveats

The README does not detail any specific limitations or caveats; it presents the project as a comprehensive solution for agentic LLM training environments.

Health Check

Last Commit: 1 month ago

Responsiveness: Inactive

Star History

16 stars in the last 30 days