AgentEvolver by modelscope

Efficient self-evolving agent system training

Created 2 weeks ago

727 stars

Top 47.4% on SourcePulse

Project Summary

AgentEvolver is an end-to-end, self-evolving training framework that lets AI agents autonomously improve their capabilities through integrated self-questioning, self-navigating, and self-attributing mechanisms. It targets researchers and developers who want efficient, cost-effective, continuous evolution of agent systems without costly manual dataset construction.

How It Works

AgentEvolver integrates three core self-evolving mechanisms:

  • Self-Questioning (automatic task generation): the agent autonomously creates diverse training tasks.
  • Self-Navigating (experience-guided exploration): cross-task experience is summarized and reused to improve exploration efficiency.
  • Self-Attributing (attribution-based credit assignment): causal contributions within long trajectories are uncovered for fine-grained policy optimization.

Its service-oriented dataflow architecture provides modular services for environment sandboxes, LLMs, and experience management, with standardized interfaces for broad environment compatibility and a flexible context manager for complex interaction logic.
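The self-attributing mechanism is described only at a high level above. As a toy illustration of the general idea (this is a hypothetical sketch, not AgentEvolver's actual algorithm), attribution-based credit assignment can be thought of as distributing a trajectory's final reward across individual steps in proportion to each step's estimated causal contribution; both `assign_credit` and the contribution scores below are invented for illustration:

```python
# Hypothetical sketch of attribution-based credit assignment.
# NOT AgentEvolver's implementation: it only illustrates the concept
# of turning a single trajectory-level reward into per-step credit.

def assign_credit(step_scores, final_reward):
    """Distribute final_reward across steps in proportion to their
    estimated contribution scores."""
    total = sum(step_scores)
    if total == 0:
        # No attribution signal: fall back to uniform credit per step.
        return [final_reward / len(step_scores)] * len(step_scores)
    return [final_reward * s / total for s in step_scores]

# Example: a 3-step trajectory with hypothetical contribution estimates.
scores = [0.1, 0.6, 0.3]
credits = assign_credit(scores, final_reward=1.0)
# The step with the largest estimated contribution receives the most credit.
```

Compared with giving every step the same reward, this kind of fine-grained signal is what lets a policy optimizer target the steps that actually mattered in a long trajectory.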

Quick Start & Requirements

  • Primary Install/Run: requires conda and the CUDA toolkit. Run bash install.sh, then the environment-specific setup script (e.g., bash env_service/environments/appworld/setup.sh); optional ReMe support is installed with bash external/reme/install_reme.sh. Launch training with python launcher.py and a YAML configuration (e.g., examples/basic.yaml or examples/overall.yaml).
  • Prerequisites: Conda, CUDA toolkit, API keys (for configuration).
  • Documentation: Links provided for Environment Service, Task Manager, Experience Manager, and Advantage Processor.

Highlighted Details

  • AgentEvolver demonstrates superior performance on AppWorld and BFCL-v3 benchmarks compared to baseline models of similar parameter counts.
  • A 7B parameter AgentEvolver model achieves an average benchmark score of 45.2%, significantly outperforming a 7B Qwen2.5 baseline (15.8%).
  • A 14B parameter AgentEvolver model reaches an average score of 57.6%, surpassing a 14B Qwen2.5 baseline (29.8%).

Maintenance & Community

The project acknowledges foundational work from ReMe, veRL, and mkdocs. Specific community channels (e.g., Discord, Slack) or active contributor information are not detailed in the provided text.

Licensing & Compatibility

The software license is not explicitly stated in the provided README content. Consequently, compatibility for commercial use or closed-source linking cannot be determined.

Limitations & Caveats

Future development plans include enhancements for multi-agent scenarios and cross-stage collaborative self-evolution. No specific current limitations, known bugs, or alpha status are mentioned for the released framework.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 7
  • Issues (30d): 7
  • Star History: 728 stars in the last 18 days
