agisdk by agi-inc

AI agent toolkit for real-world web environments

Created 6 months ago
399 stars

Top 72.3% on SourcePulse

Project Summary

AGI SDK is a Python toolkit for building, evaluating, and benchmarking AI browser agents against real-world web applications. It targets AI researchers and developers seeking to test agent capabilities in realistic, complex environments, offering a standardized benchmark and a platform for comparing agent performance.

How It Works

The SDK provides high-fidelity, deterministic web application clones (e.g., Amazon, DoorDash) built with modern web stacks. It uses a harness that orchestrates agent interactions with these environments, providing structured observations (DOM, accessibility tree, screenshots) and accepting actions as function calls. This approach allows for reproducible evaluation and direct comparison of agents on standardized tasks.
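The observe-act loop described above can be sketched in a few lines of Python. This is a toy illustration only, not the agisdk API: every name here (`Observation`, `ToyEnvironment`, `parse_action`, `run_episode`, `scripted_agent`) is hypothetical, and the "web app" is a tiny in-memory cart standing in for a cloned site. It shows the general shape: the harness hands the agent a structured observation, the agent replies with an action written as a function call, and the harness dispatches it and scores the final state.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """Hypothetical structured observation; field names are assumptions."""
    dom: str
    axtree: str
    screenshot: bytes = b""  # left empty in this sketch


class ToyEnvironment:
    """Deterministic stand-in for a cloned web app: a cart with one goal item."""

    def __init__(self, goal_item: str) -> None:
        self.goal_item = goal_item
        self.cart: List[str] = []
        self.done = False

    def observe(self) -> Observation:
        items = "".join("<li>" + item + "</li>" for item in self.cart)
        return Observation(
            dom="<ul id='cart'>" + items + "</ul>",
            axtree="list 'cart' items=" + str(len(self.cart)),
        )

    # Actions the agent may request, exposed as plain functions.
    def add_to_cart(self, item: str) -> None:
        self.cart.append(item)

    def checkout(self) -> None:
        self.done = True

    def succeeded(self) -> bool:
        return self.done and self.cart == [self.goal_item]


def parse_action(call: str) -> Tuple[str, List[str]]:
    """Parse "name('arg')" or "name()" into (name, [args])."""
    name, _, rest = call.partition("(")
    arg = rest.rstrip(")").strip().strip("'\"")
    return name.strip(), ([arg] if arg else [])


def run_episode(env: ToyEnvironment, agent, max_steps: int = 10) -> bool:
    """Harness loop: observe, ask the agent for an action, dispatch, repeat."""
    actions = {"add_to_cart": env.add_to_cart, "checkout": env.checkout}
    for _ in range(max_steps):
        if env.done:
            break
        name, args = parse_action(agent(env.observe()))
        if name in actions:  # unknown actions are skipped in this sketch
            actions[name](*args)
    return env.succeeded()


def scripted_agent(obs: Observation) -> str:
    """Trivial rule-based agent; a real agent would call a model here."""
    return "checkout()" if "items=1" in obs.axtree else "add_to_cart('laptop')"


result = run_episode(ToyEnvironment("laptop"), scripted_agent)
```

Because the environment is deterministic and the task has a fixed success check, running the same agent twice yields the same result, which is the property that makes side-by-side comparison of agents meaningful.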

Quick Start & Requirements

Highlighted Details

  • Supports OpenAI, Anthropic, and OpenRouter models.
  • Includes full-stack web replicas of popular applications like Amazon, DoorDash, Airbnb, Gmail, and LinkedIn.
  • Features a leaderboard (REAL Bench) for agent performance comparison.
  • Offers a robust agent API with observations, actions, memory, and error handling.
  • Allows customization of the harness and integration of custom agents.

Maintenance & Community

  • A Discord community is listed as "coming soon".
  • AGI Inc. can be followed on LinkedIn.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

  • Discord community channels are not yet available.
  • The README does not specify licensing details, which may impact commercial use or closed-source integration.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 9
  • Issues (30d): 4

Star History

  • 29 stars in the last 30 days