AgentCPM by OpenBMB

End-to-end LLM agent infrastructure

Created 2 weeks ago

637 stars

Top 52.2% on SourcePulse

Project Summary

OpenBMB/AgentCPM provides end-to-end, open-source infrastructure for training and evaluating LLM agents. It targets researchers and developers, offering a competitive 4B-parameter model (AgentCPM-Explore) and a unified tool sandbox to accelerate agent development and benchmarking on long-horizon tasks.

How It Works

The project centers on AgentCPM-Explore, a 4B-parameter LLM agent built for deep exploration: it sustains 100+ interaction turns, adjusts its strategy dynamically, and validates findings across multiple sources. Its key advantage is state-of-the-art (SOTA) performance at its scale, rivaling larger models. The accompanying infrastructure comprises AgentDock (tool sandbox), AgentRL (asynchronous training), and AgentToLeaP (evaluation), which together form a complete ecosystem for agent research.
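The long-horizon behavior described above can be pictured as a loop that alternates between a policy and a set of tools under a fixed turn budget. The following is a minimal sketch under stated assumptions, not AgentCPM's actual implementation: `run_agent`, `Trajectory`, `toy_policy`, and the tool names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration; not AgentCPM's actual interfaces.
Action = Tuple[str, str]  # (tool_name, tool_input) or ("final", answer)


@dataclass
class Trajectory:
    # Each step is (tool_name, tool_input, observation).
    steps: List[Tuple[str, str, str]] = field(default_factory=list)
    answer: Optional[str] = None


def run_agent(
    policy: Callable[[str, Trajectory], Action],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_turns: int = 100,
) -> Trajectory:
    """Run a policy against tools until it answers or the turn budget runs out."""
    traj = Trajectory()
    for _ in range(max_turns):
        name, arg = policy(task, traj)
        if name == "final":
            traj.answer = arg
            break
        tool = tools.get(name)
        # Unknown tools yield an error observation so the policy can change strategy.
        obs = tool(arg) if tool else f"error: unknown tool {name!r}"
        traj.steps.append((name, arg, obs))
    return traj


def toy_policy(task: str, traj: Trajectory) -> Action:
    """Toy stand-in for a model: search once, then answer with the observation."""
    if not traj.steps:
        return ("search", task)
    return ("final", traj.steps[-1][2])


toy_tools = {"search": lambda q: f"result for {q}"}
result = run_agent(toy_policy, toy_tools, "capital of France")
```

In this sketch the policy sees the full trajectory on every turn, which is what would let a real model revise its plan after a failed or unhelpful tool call. If the budget is exhausted first, `answer` stays `None`.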

Quick Start & Requirements

Setup starts by launching the AgentDock tool sandbox (docker compose up -d). For evaluation, pull and run the Docker image yuyangfu/agenttoleap-eval:v1.0 (docker pull, then docker run -dit --gpus all ...). To run custom tasks, set the API keys, model details, and AgentDock URL in quickstart.py, then run python quickstart.py. Prerequisites are Docker and GPU access. Links to the models on Hugging Face and ModelScope are available.
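As a rough aid, the documented steps can be collected into a small helper that checks for Docker and prints the commands in order. This is only a sketch: the helper names `quickstart_commands` and `docker_available` are hypothetical, the commands are printed rather than executed, and the `docker run` step is left out because its remaining arguments are not given in the text above.

```python
import shutil
from typing import List

# Image name as given in the text above.
EVAL_IMAGE = "yuyangfu/agenttoleap-eval:v1.0"


def quickstart_commands() -> List[List[str]]:
    """Return the documented commands, in order, as argument lists."""
    return [
        ["docker", "compose", "up", "-d"],  # launch the AgentDock tool sandbox
        ["docker", "pull", EVAL_IMAGE],     # fetch the evaluation image
        ["python", "quickstart.py"],        # run a custom task after editing its config
    ]


def docker_available() -> bool:
    """Check whether a `docker` executable is on PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    if not docker_available():
        print("docker not found on PATH; install Docker first")
    for cmd in quickstart_commands():
        print(" ".join(cmd))
```

Each command would need to be run from the directory that holds the relevant files (the compose file for the sandbox, quickstart.py for the task run); the repository's own instructions take precedence over this sketch.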

Highlighted Details

  • AgentCPM-Explore is the first open-source 4B agent model achieving top performance on eight long-horizon benchmarks (e.g., GAIA, XBench).
  • Demonstrates SOTA performance at 4B scale, matching or surpassing 8B models and rivaling some 30B+ or closed-source LLMs.
  • Features "Deep Exploration" with 100+ turns, multi-source cross-validation, and dynamic strategy adjustment.
  • Provides a complete end-to-end open-source infrastructure for training, evaluation, and community extensions.
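To make "multi-source cross-validation" concrete, one simple reading is to accept an answer only when enough independent sources agree on it. The snippet below is a sketch of that idea using a plain agreement count; it is an assumption about what such a check could look like, not the method AgentCPM-Explore uses, and `cross_validate` and `min_agree` are hypothetical names.

```python
from collections import Counter
from typing import Dict, Optional


def cross_validate(answers: Dict[str, str], min_agree: int = 2) -> Optional[str]:
    """Return the answer that at least `min_agree` sources support, else None.

    `answers` maps a source name to the answer extracted from that source.
    Answers are compared after trimming whitespace and lowercasing.
    """
    normalized = [a.strip().lower() for a in answers.values() if a.strip()]
    if not normalized:
        return None
    best, count = Counter(normalized).most_common(1)[0]
    return best if count >= min_agree else None


agreed = cross_validate({"site_a": "Paris", "site_b": "paris ", "site_c": "Lyon"})
disputed = cross_validate({"site_a": "Paris", "site_b": "Lyon"})
```

Here `agreed` is `"paris"` because two of the three sources match after normalization, while `disputed` is `None`, which an agent could treat as a signal to consult further sources.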

Maintenance & Community

Developed collaboratively by THUNLP, Renmin University of China, ModelBest, and OpenBMB. Neither specific community channels (Discord, Slack) nor a public roadmap is detailed. The "Latest News" date of 2026-01-12 appears to be a future placeholder.

Licensing & Compatibility

Released under the permissive Apache-2.0 license, generally allowing commercial use and integration into closed-source projects without significant restrictions.

Limitations & Caveats

Key components, including the Technical Report and the AgentRL framework, are marked "Coming Soon." By default, the QuickStart script skips automatic scoring and only demonstrates execution. The future-dated "Latest News" entry may indicate outdated or aspirational information.

Health Check

Last Commit: 1 day ago
Responsiveness: Inactive
Pull Requests (30d): 1
Issues (30d): 12

Star History

641 stars in the last 19 days
