Train LLM agents with reinforcement learning in interactive environments
RAGEN provides a modular framework for training Large Language Model (LLM) agents in interactive, stochastic environments using reinforcement learning. It addresses challenges in multi-turn interactions and environmental uncertainty, enabling agents to learn complex reasoning and action strategies through trajectory-level optimization.
How It Works
RAGEN utilizes the StarPO (State-Thinking-Actions-Reward Policy Optimization) framework, which alternates between a rollout stage (LLM generates reasoning-guided actions and interacts with the environment) and an update stage (LLM is trained on entire interaction trajectories using importance sampling). This approach optimizes both reasoning and action strategies over multiple turns, offering flexibility in reward assignment and prompt-rollout structures.
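The rollout/update alternation can be illustrated with a toy numerical sketch. This is not RAGEN's actual implementation (which trains an LLM via a full RL stack); it uses a hypothetical tabular softmax policy and a dummy environment purely to show the shape of the loop: sample a whole trajectory under the current policy, then apply an importance-weighted, trajectory-level policy-gradient update.

```python
import math
import random

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rollout(policy_logits, env_step, horizon, rng):
    """Rollout stage: sample actions from the current policy and record
    (state, action, behaviour-policy prob, reward) over the whole trajectory."""
    traj, state = [], 0
    for _ in range(horizon):
        probs = softmax(policy_logits[state])
        action = rng.choices(range(len(probs)), weights=probs)[0]
        next_state, reward = env_step(state, action)
        traj.append((state, action, probs[action], reward))
        state = next_state
    return traj

def starpo_update(policy_logits, traj, lr=0.1):
    """Update stage: importance-weighted policy gradient, crediting every
    step with the trajectory-level return (the StarPO-style signal)."""
    ret = sum(r for (_, _, _, r) in traj)  # trajectory return
    for (s, a, old_p, _) in traj:
        probs = softmax(policy_logits[s])
        ratio = probs[a] / old_p  # importance-sampling ratio (new / old policy)
        for i in range(len(probs)):
            # d log pi(a|s) / d logit_i for a softmax policy
            grad = (1.0 if i == a else 0.0) - probs[i]
            policy_logits[s][i] += lr * ratio * ret * grad
    return policy_logits
```

In the real system, the "policy" is the LLM generating reasoning-guided actions and the trajectory spans multiple dialogue turns, but the same two-phase structure applies.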
Quick Start & Requirements
Set up the environment:

```bash
bash scripts/setup_ragen.sh
```

Manual setup instructions are available in `scripts/setup_ragen.md`.

Train an agent:

```bash
python train.py --config-name base
```

or, for parameter-efficient (LoRA) training:

```bash
python train.py --config-name base-lora
```

Evaluate a trained agent:

```bash
python -m ragen.llm_agent.agent_proxy --config-name <eval_config>
```

Highlighted Details
Maintenance & Community
The project is actively maintained, with recent refactoring aimed at improved co-development and stability. Feedback is welcomed via GitHub issues. Notable contributors include researchers from multiple institutions.
Licensing & Compatibility
The README does not explicitly state a license. Users should verify licensing for commercial or closed-source use.
Limitations & Caveats
The project is described as a "minimally viable leap forward," suggesting it is research-oriented rather than production-ready. Hardware requirements and compatible base LLMs are not detailed in the README.