Agent framework for LLM-based agent development and evaluation
AgentGym provides a unified framework for evaluating and developing generalist Large Language Model (LLM)-based agents across diverse interactive environments. It aims to facilitate research into LLM agents capable of self-evolution and broad task adaptation, offering a platform, a curated trajectory dataset (AgentTraj-L), and a benchmark suite (AgentEval).
How It Works
AgentGym decouples agents from its 14 interactive environment types (including web navigation, text games, and tool use) by exposing each environment as an encapsulated HTTP service. These services provide standardized APIs for environment creation, observation, querying available actions, stepping, and resetting. An AgentController component orchestrates agent interaction with these services for evaluation, data collection, and training. The framework uses the ReAct format for uniform interaction and supports real-time feedback and concurrent requests.
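The environment-service API described above can be sketched in Python. The stub below runs in-process rather than over HTTP, and the method names mirroring the endpoints (create, observation, available_actions, step, reset) are illustrative assumptions, not the real agentenv server interface.

```python
# Illustrative stand-in for an AgentGym-style environment service.
# Endpoint names and payload shapes are assumptions based on the
# description above, not the actual agentenv HTTP API.

class EnvServiceStub:
    """Mimics an encapsulated environment HTTP service in-process."""

    def __init__(self):
        self._envs = {}
        self._next_id = 0

    def create(self) -> int:
        """POST /create -> new environment id."""
        env_id = self._next_id
        self._next_id += 1
        self._envs[env_id] = {"steps": 0, "done": False}
        return env_id

    def observation(self, env_id: int) -> str:
        """GET /observation -> current textual observation."""
        return f"step {self._envs[env_id]['steps']}: a door is to the north"

    def available_actions(self, env_id: int) -> list:
        """GET /available_actions -> legal actions in this state."""
        return ["go north", "look"]

    def step(self, env_id: int, action: str):
        """POST /step -> (observation, reward, done)."""
        env = self._envs[env_id]
        env["steps"] += 1
        env["done"] = env["steps"] >= 3  # toy termination condition
        reward = 1.0 if env["done"] else 0.0
        return self.observation(env_id), reward, env["done"]

    def reset(self, env_id: int) -> str:
        """POST /reset -> initial observation."""
        self._envs[env_id] = {"steps": 0, "done": False}
        return self.observation(env_id)


def run_episode(service: EnvServiceStub, max_turns: int = 10) -> float:
    """Controller-style loop: observe, choose an action, step until done."""
    env_id = service.create()
    service.reset(env_id)
    total_reward = 0.0
    for _ in range(max_turns):
        action = service.available_actions(env_id)[0]  # trivial fixed policy
        _, reward, done = service.step(env_id, action)
        total_reward += reward
        if done:
            break
    return total_reward
```

In the real framework an AgentController would issue these calls over HTTP to each environment server, which is what allows many environments (and concurrent episodes) to sit behind one uniform client loop.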
Quick Start & Requirements
Install the agentenv package: pip install agentenv. Each environment's server lives in its own agentenv-* directory.
Highlighted Details
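The ReAct format the framework uses for uniform interaction interleaves a reasoning step with an action in each model turn. A minimal parser for such a turn is sketched below; the literal "Thought:" / "Action:" tags follow the ReAct convention and are an assumption about the wire format, not confirmed AgentGym behavior.

```python
import re

def parse_react_turn(completion: str):
    """Split one ReAct-style model completion into (thought, action).

    Assumes the conventional "Thought: ... Action: ..." layout; the
    exact tags used by AgentGym may differ.
    """
    match = re.search(
        r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<action>.*)",
        completion,
        flags=re.DOTALL,
    )
    if match is None:
        raise ValueError("completion is not in Thought/Action form")
    return match.group("thought").strip(), match.group("action").strip()
```

For example, parse_react_turn("Thought: The door is north.\nAction: go north") yields the thought and the action as separate strings, so the action can be forwarded to an environment server while the thought is kept in the trajectory.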
Maintenance & Community
The project is actively developed, with recent releases of the paper, model, and datasets in June 2024. Contributions for new environments are welcomed. Contact: zhxi22@m.fudan.edu.cn.
Licensing & Compatibility
The repository appears to be under a permissive license, but specific licensing for the datasets (AgentTraj-L, AgentEval) and the model (AgentEvol-7B) should be verified on their respective Hugging Face pages. Compatibility for commercial use is likely, but requires confirmation of underlying licenses.
Limitations & Caveats
The framework is newly released, and while it includes 14 environments, the breadth and depth of support for each may vary. The "AgentEvol" method is presented as a novel approach, and its performance and generalizability beyond the reported experiments would require further validation.