AgentGym by WooooDyy

Agent framework for LLM-based agent development and evaluation

created 1 year ago
512 stars

Top 61.9% on sourcepulse

Project Summary

AgentGym provides a unified framework for evaluating and developing generalist Large Language Model (LLM)-based agents across diverse interactive environments. It aims to facilitate research into LLM agents capable of self-evolution and broad task adaptation, offering a platform, a curated trajectory dataset (AgentTraj-L), and a benchmark suite (AgentEval).

How It Works

AgentGym decouples its interactive environments (14 types, including web navigation, text games, and tool use) from the agent code by exposing each one as an encapsulated HTTP service. Every service offers the same standardized API for creating an environment, reading the current observation, listing available actions, stepping, and resetting. An AgentController component orchestrates the agent's interaction with these services for evaluation, data collection, and training. The framework uses the ReAct format so that interaction is uniform across environments, and it supports real-time feedback and concurrency.
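The service-plus-controller pattern described above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not AgentGym's actual API: the route names (/create, /reset, /observation, /available_actions, /step), the JSON payload fields, the toy counting environment, and the helpers call and run_episode are all hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener

ENVS = {}    # env_id -> counter value (hypothetical toy environment state)
TARGET = 3   # the toy episode ends once the counter reaches this value
OPENER = build_opener(ProxyHandler({}))  # ignore any system proxy settings


class ToyEnvHandler(BaseHTTPRequestHandler):
    """Serves illustrative routes; these are NOT AgentGym's real endpoints."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/create":
            env_id = len(ENVS)
            ENVS[env_id] = 0
            reply = {"env_id": env_id}
        elif self.path == "/reset":
            ENVS[body["env_id"]] = 0
            reply = {"ok": True}
        elif self.path == "/observation":
            reply = {"observation": f"counter={ENVS[body['env_id']]}"}
        elif self.path == "/available_actions":
            reply = {"actions": ["increment", "noop"]}
        elif self.path == "/step":
            if body["action"] == "increment":
                ENVS[body["env_id"]] += 1
            value = ENVS[body["env_id"]]
            reply = {"observation": f"counter={value}",
                     "reward": float(value == TARGET),
                     "done": value >= TARGET}
        else:
            self.send_error(404)
            return
        data = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # keep the demo quiet
        pass


def call(base_url, route, **payload):
    """POST a JSON payload to the service and decode the JSON reply."""
    req = Request(base_url + route, data=json.dumps(payload).encode(),
                  headers={"Content-Type": "application/json"})
    with OPENER.open(req) as resp:
        return json.loads(resp.read())


def run_episode(base_url, policy, max_steps=10):
    """Controller-style loop: create, reset, observe, then step until done."""
    env_id = call(base_url, "/create")["env_id"]
    call(base_url, "/reset", env_id=env_id)
    obs = call(base_url, "/observation", env_id=env_id)["observation"]
    trajectory = []
    for _ in range(max_steps):
        actions = call(base_url, "/available_actions", env_id=env_id)["actions"]
        action = policy(obs, actions)
        result = call(base_url, "/step", env_id=env_id, action=action)
        trajectory.append((obs, action, result["reward"]))
        obs = result["observation"]
        if result["done"]:
            break
    return trajectory


def demo():
    """Start the toy service on a free local port and run one episode."""
    server = HTTPServer(("127.0.0.1", 0), ToyEnvHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base_url = f"http://127.0.0.1:{server.server_address[1]}"
        return run_episode(base_url, policy=lambda obs, actions: "increment")
    finally:
        server.shutdown()
        server.server_close()
```

Because the controller loop only talks to a base URL, swapping in a different environment means pointing it at a different service. That is the decoupling the summary describes; the real package's client classes and routes should be taken from the repository.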

Quick Start & Requirements

  • Install the agentenv package: pip install agentenv
  • For source installation and specific environments, clone the repository and follow instructions within individual agentenv-* directories.
  • Requires Python. Specific environment dependencies may vary.
  • Links: Project Page, AgentTraj-L, AgentEval, AgentEvol-7B

Highlighted Details

  • Features 14 diverse environments with unified ReAct format interaction.
  • Includes AgentTraj-L, a large-scale trajectory dataset for training and evaluation.
  • Introduces AgentEval, a benchmark suite for assessing agent capabilities.
  • Proposes AgentEvol, a method for agent self-evolution across tasks and environments.
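A unified ReAct-style interaction means each model response carries a reasoning part and an action part that the framework can separate. The sketch below assumes a "Thought: ... Action: ..." text layout; that layout and the function name parse_react are assumptions for illustration, and the exact prompt and response format used by AgentGym should be checked in the repository.

```python
import re

# Assumed layout: a free-text "Thought:" part followed by an "Action:" part.
_REACT_PATTERN = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<action>.*)", re.DOTALL
)


def parse_react(response: str) -> tuple[str, str]:
    """Split a model response into (thought, action); raise if malformed."""
    match = _REACT_PATTERN.search(response)
    if match is None:
        raise ValueError("response is not in the assumed Thought/Action layout")
    return match.group("thought").strip(), match.group("action").strip()
```

Only the action string would be sent to the environment service; the thought is kept for logging or for training trajectories.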

Maintenance & Community

The project is under active development; the paper, model, and datasets were released in June 2024. Contributions of new environments are welcome. Contact: zhxi22@m.fudan.edu.cn.

Licensing & Compatibility

The repository appears to be under a permissive license, but the licenses for the datasets (AgentTraj-L, AgentEval) and the model (AgentEvol-7B) should be verified on their respective Hugging Face pages. Commercial use is likely permitted, but confirm the underlying licenses before relying on it.

Limitations & Caveats

The framework is recently released, and while it includes 14 environments, the breadth and depth of support for each may vary. The "AgentEvol" method is presented as a novel approach, and its performance and generalizability beyond the reported experiments would require further validation.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 62 stars in the last 90 days
