AgentGym by WooooDyy

Agent framework for LLM-based agent development and evaluation

Created 1 year ago
577 stars

Top 56.1% on SourcePulse

Project Summary

AgentGym provides a unified framework for evaluating and developing generalist Large Language Model (LLM)-based agents across diverse interactive environments. It aims to facilitate research into LLM agents capable of self-evolution and broad task adaptation, offering a platform, a curated trajectory dataset (AgentTraj-L), and a benchmark suite (AgentEval).

How It Works

AgentGym decouples its interactive environments (14 types, including web navigation, text games, and tool use) from the agent code by exposing each one as an encapsulated HTTP service. Every service offers the same standardized API: create an environment, read the current observation, list the available actions, step, and reset. An AgentController component orchestrates the agent's interaction with these services for evaluation, data collection, and training. Interaction uses the ReAct format across all environments, and the framework supports real-time feedback and concurrent execution.
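As a rough illustration of this design, the following sketch wires a toy environment behind an HTTP service and drives it with a ReAct-style loop. It uses only the Python standard library and is not the agentenv API: the endpoint paths, the `CounterEnvHandler`, `EnvClient`, `parse_react`, `scripted_policy`, and `run_episode` names, and the JSON fields are all assumptions made for this example, and a scripted stub stands in for the LLM.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class CounterEnvHandler(BaseHTTPRequestHandler):
    """Hypothetical toy environment: send "inc" until the counter hits TARGET."""
    envs = {}   # env_id -> counter value (state lives on the server side)
    TARGET = 3

    def log_message(self, *args):  # silence per-request logging
        pass

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        envs, target = CounterEnvHandler.envs, CounterEnvHandler.TARGET
        if self.path == "/create":
            env_id = len(envs)
            envs[env_id] = 0
            reply = {"env_id": env_id}
        elif self.path == "/observation":
            reply = {"observation": f"counter={envs[body['env_id']]}"}
        elif self.path == "/available_actions":
            reply = {"actions": ["inc", "noop"]}
        elif self.path == "/step":
            if body["action"] == "inc":
                envs[body["env_id"]] += 1
            value = envs[body["env_id"]]
            done = value >= target
            reply = {"observation": f"counter={value}",
                     "reward": float(done), "done": done}
        elif self.path == "/reset":
            envs[body["env_id"]] = 0
            reply = {"observation": "counter=0"}
        else:
            self.send_error(404)
            return
        payload = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


class EnvClient:
    """Thin client: the agent side only ever sees this uniform interface."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.opener = build_opener(ProxyHandler({}))  # bypass any system proxy
        self.env_id = self._post("/create", {})["env_id"]

    def _post(self, path, data):
        req = Request(self.base_url + path, data=json.dumps(data).encode(),
                      headers={"Content-Type": "application/json"})
        with self.opener.open(req, timeout=5) as resp:
            return json.load(resp)

    def observe(self):
        return self._post("/observation", {"env_id": self.env_id})["observation"]

    def available_actions(self):
        return self._post("/available_actions", {"env_id": self.env_id})["actions"]

    def step(self, action):
        return self._post("/step", {"env_id": self.env_id, "action": action})

    def reset(self):
        return self._post("/reset", {"env_id": self.env_id})["observation"]


def parse_react(text):
    """Split a ReAct-style reply into (thought, action)."""
    thought, _, action = text.partition("Action:")
    return thought.replace("Thought:", "", 1).strip(), action.strip()


def scripted_policy(observation, actions):
    """Stand-in for an LLM call; always answers in ReAct format."""
    return "Thought: the counter is below the target.\nAction: inc"


def run_episode(client, policy, max_steps=10):
    """Controller loop: observe, ask the policy, step, record the trajectory."""
    obs = client.reset()
    trajectory = []
    for _ in range(max_steps):
        _, action = parse_react(policy(obs, client.available_actions()))
        result = client.step(action)
        trajectory.append((obs, action, result["reward"]))
        obs = result["observation"]
        if result["done"]:
            break
    return trajectory


def demo():
    server = HTTPServer(("127.0.0.1", 0), CounterEnvHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = EnvClient(f"http://127.0.0.1:{server.server_port}")
        return run_episode(client, scripted_policy)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the recorded (observation, action, reward) triples for one episode. Because the controller loop talks only to `EnvClient`, swapping in a different environment means pointing it at a different service URL, which is the kind of decoupling the paragraph above describes.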

Quick Start & Requirements

  • Install the agentenv package: pip install agentenv
  • For source installation and specific environments, clone the repository and follow instructions within individual agentenv-* directories.
  • Requires Python; additional dependencies vary by environment.
  • Links: Project Page, AgentTraj-L, AgentEval, AgentEvol-7B
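After running the pip command above, a quick way to confirm the install is to ask Python's packaging metadata for the distribution. This is a minimal sketch using only the standard library; `installed_version` is a helper name made up for this example, and the distribution name `agentenv` is taken from the install command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="agentenv"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"agentenv {found} is installed")
    else:
        print("agentenv is not installed; run: pip install agentenv")
```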

Highlighted Details

  • Features 14 diverse environments with unified ReAct format interaction.
  • Includes AgentTraj-L, a large-scale trajectory dataset for training and evaluation.
  • Introduces AgentEval, a benchmark suite for assessing agent capabilities.
  • Proposes AgentEvol, a method for agent self-evolution across tasks and environments.

Maintenance & Community

The project is actively developed; the paper, model, and datasets were released in June 2024. Contributions of new environments are welcome. Contact: zhxi22@m.fudan.edu.cn.

Licensing & Compatibility

The repository appears to be under a permissive license, but the licenses for the datasets (AgentTraj-L, AgentEval) and the model (AgentEvol-7B) should be verified on their respective Hugging Face pages. Commercial use is likely permitted, but this should be confirmed against the underlying licenses.

Limitations & Caveats

The framework is relatively new, and the breadth and depth of support may vary across its 14 environments. AgentEvol is presented as a novel method; its performance and generalizability beyond the reported experiments still require independent validation.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 4
  • Issues (30d): 2
  • Star history: 66 stars in the last 30 days
