agisdk by agi-inc

AI agent toolkit for real-world web environments

Created 5 months ago
388 stars

Top 73.9% on SourcePulse

Project Summary

AGI SDK is a Python toolkit for building, evaluating, and benchmarking AI browser agents against real-world web applications. It targets AI researchers and developers seeking to test agent capabilities in realistic, complex environments, offering a standardized benchmark and a platform for comparing agent performance.

How It Works

The SDK provides high-fidelity, deterministic web application clones (e.g., Amazon, DoorDash) built with modern web stacks. It uses a harness that orchestrates agent interactions with these environments, providing structured observations (DOM, accessibility tree, screenshots) and accepting actions as function calls. This approach allows for reproducible evaluation and direct comparison of agents on standardized tasks.
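The observe-act loop described above can be sketched roughly as follows. This is a minimal illustration, not the AGI SDK API: `Observation`, `ToyShopEnv`, `scripted_agent`, and `run_episode` are hypothetical names, and the toy environment stands in for a real web application clone (screenshots are omitted).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Observation:
    """Hypothetical structured observation (a real harness may also attach screenshots)."""
    dom: str
    axtree: List[str]


class ToyShopEnv:
    """Deterministic stand-in for a web-app clone: one page with an add-to-cart button."""

    def __init__(self) -> None:
        self.cart: List[str] = []
        self.done = False

    def reset(self) -> Observation:
        # Resetting always yields the same initial state, so runs are reproducible.
        self.cart = []
        self.done = False
        return self._observe()

    def _observe(self) -> Observation:
        count = len(self.cart)
        dom = f"<button id='add'>Add to cart</button><span id='count'>{count}</span>"
        axtree = ["button 'Add to cart'", f"text '{count}'"]
        return Observation(dom=dom, axtree=axtree)

    def step(self, action: str) -> Observation:
        # Actions arrive as function-call strings (an assumed format for this sketch).
        if action == "click('add')":
            self.cart.append("item")
        elif action == "finish()":
            self.done = True
        else:
            raise ValueError(f"unknown action: {action}")
        return self._observe()


def scripted_agent(obs: Observation) -> str:
    """Trivial policy: add one item while the cart is empty, then finish."""
    return "click('add')" if "text '0'" in obs.axtree else "finish()"


def run_episode(
    env: ToyShopEnv,
    agent: Callable[[Observation], str],
    max_steps: int = 10,
) -> Tuple[bool, List[str]]:
    """Run one episode and return (task success, list of actions taken)."""
    obs = env.reset()
    trace: List[str] = []
    for _ in range(max_steps):
        action = agent(obs)
        trace.append(action)
        obs = env.step(action)
        if env.done:
            break
    # Task definition for this sketch: finish with exactly one item in the cart.
    success = env.done and len(env.cart) == 1
    return success, trace
```

Because the environment is deterministic, calling `run_episode(ToyShopEnv(), scripted_agent)` repeatedly yields the same action trace, which is the property that makes agent-to-agent comparison meaningful.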

Quick Start & Requirements

Highlighted Details

  • Supports OpenAI, Anthropic, and OpenRouter models.
  • Includes full-stack web replicas of popular applications like Amazon, DoorDash, Airbnb, Gmail, and LinkedIn.
  • Features a leaderboard (REAL Bench) for agent performance comparison.
  • Offers a robust agent API with observations, actions, memory, and error handling.
  • Allows customization of the harness and integration of custom agents.
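As a rough illustration of what a custom agent with memory and error handling might look like, the sketch below wraps a policy function, records each step, and falls back to a failure action after repeated errors. All names here (`MemoryAgent`, `get_action`, `make_flaky_policy`, `report_failure()`) are assumptions made for the example and are not taken from the AGI SDK.

```python
from typing import Callable, List

# A policy maps (observation text, memory so far) to an action string.
Policy = Callable[[str, List[str]], str]


class MemoryAgent:
    """Hypothetical custom agent: keeps a step log and retries a failing policy."""

    def __init__(self, policy: Policy, max_retries: int = 2) -> None:
        self.policy = policy
        self.max_retries = max_retries
        self.memory: List[str] = []

    def get_action(self, observation: str) -> str:
        # Try the policy up to max_retries + 1 times, logging every failure.
        for _ in range(self.max_retries + 1):
            try:
                action = self.policy(observation, self.memory)
            except Exception as exc:
                self.memory.append(f"error: {exc}")
                continue
            self.memory.append(f"action: {action}")
            return action
        # Every attempt failed: emit an explicit fallback instead of crashing.
        fallback = "report_failure()"
        self.memory.append(f"action: {fallback}")
        return fallback


def make_flaky_policy() -> Policy:
    """Policy that fails on its first call (e.g. a model timeout), then succeeds."""
    calls = {"n": 0}

    def policy(observation: str, memory: List[str]) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("model timeout")
        return "click('search')"

    return policy
```

Keeping the log on the agent rather than in the harness lets the policy see its own past errors on the next call, which is one plausible way to combine memory with error recovery.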

Maintenance & Community

  • A Discord community is listed as "coming soon".
  • AGI Inc. can be followed on LinkedIn.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

  • Discord community channels are not yet available.
  • The README does not specify licensing details, which may impact commercial use or closed-source integration.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 115 stars in the last 30 days
