WebCanvas by iMeanAI

Web agent framework for online development, training, and evaluation

created 1 year ago
266 stars

Top 96.9% on sourcepulse

Project Summary

WebCanvas is an open-source framework designed for building, training, and evaluating web agents in dynamic, real-time online environments. It addresses the limitations of static or isolated web agent development by providing a comprehensive suite of tools for realistic interaction and assessment, targeting researchers and developers building sophisticated web-based AI agents.

How It Works

WebCanvas annotates web trajectories with "key nodes": intermediate states that a task must pass through regardless of the path the agent takes. Scoring against key nodes allows granular, phase-by-phase assessment of agent performance instead of a single pass/fail result. The framework runs against live web environments for realistic feedback, supports dynamic evaluation functions, and reports cost-aware metrics such as USD efficiency. It is built from plug-and-play modules for planning, observation, memory, reward, action execution, and evaluation, which makes it easier to iterate on LLM-based web agents.
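The key-node idea can be illustrated with a minimal sketch. This is not WebCanvas's actual code or data schema: the `KeyNode` class, the two match rules, and the `score_trajectory` helper are hypothetical names chosen for illustration, and the sketch assumes key nodes are checked in order against the URLs an agent visits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyNode:
    """One required intermediate state (hypothetical schema, not WebCanvas's)."""
    match: str      # "url_exact" or "url_include" (illustrative match rules)
    reference: str  # value the observed URL is compared against


def node_reached(node: KeyNode, url: str) -> bool:
    """Check a single observed URL against a single key node."""
    if node.match == "url_exact":
        return url == node.reference
    if node.match == "url_include":
        return node.reference in url
    raise ValueError(f"unknown match rule: {node.match}")


def score_trajectory(key_nodes, visited_urls):
    """Return (completed, total, task_success), matching key nodes in order."""
    next_idx = 0
    for url in visited_urls:
        # One page may satisfy several consecutive key nodes at once.
        while next_idx < len(key_nodes) and node_reached(key_nodes[next_idx], url):
            next_idx += 1
    total = len(key_nodes)
    return next_idx, total, next_idx == total


key_nodes = [
    KeyNode("url_include", "example.com/search"),
    KeyNode("url_include", "q=laptop"),
    KeyNode("url_exact", "https://example.com/cart"),
]
visited = [
    "https://example.com/",
    "https://example.com/search?q=laptop",
    "https://example.com/item/42",
]
completed, total, success = score_trajectory(key_nodes, visited)
print(f"{completed}/{total} key nodes reached, task success: {success}")
```

In this example the agent reaches two of three key nodes, so it earns partial credit even though the task as a whole fails. That partial credit is the kind of phase-based signal the key-node approach is meant to provide.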

Quick Start & Requirements

  • Install: run conda create -n webcanvas python=3.11, then conda activate webcanvas, then pip install -r requirements.txt.
  • Prerequisites: Node.js, Google API Key and Custom Search Engine ID for search actions, Browserbase API Key for cloud browser integration.
  • Setup: Requires API key configuration and Node.js dependencies.
  • Docs: How-to guide, Data download, Demo video.
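Since several features depend on external credentials, a small pre-flight check can catch missing configuration before a run. The sketch below is an assumption-laden illustration: the environment variable names are hypothetical placeholders for the Google API key, Custom Search Engine ID, and Browserbase API key, and the project's own documentation should be consulted for the names it actually reads.

```python
import os

# Hypothetical variable names; check the WebCanvas docs for the real ones.
REQUIRED_VARS = ("GOOGLE_API_KEY", "GOOGLE_CX", "BROWSERBASE_API_KEY")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing configuration:", ", ".join(missing))
    else:
        print("All required settings are present.")
```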

Highlighted Details

  • Supports multiple LLM providers (OpenAI, Claude, Gemini, together.ai) and OpenAI's o1 models.
  • Introduces a JavaScript event listener-based evaluation system decoupling evaluation from action space.
  • Offers a "USD efficiency score" to quantify agent cost-effectiveness.
  • Provides the Mind2Web-Live dataset with 542 tasks and 2439 intermediate states for benchmarking.
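To make the cost-effectiveness idea concrete, here is a rough sketch of one way such a score could be computed. The formula is an assumption for illustration only (model spend in USD divided by key nodes completed); it is not taken from the WebCanvas source, and the function name, parameters, and per-token prices are all hypothetical.

```python
def usd_per_key_node(prompt_tokens, completion_tokens,
                     usd_per_1k_prompt, usd_per_1k_completion,
                     key_nodes_completed):
    """Assumed metric: dollars of model usage spent per completed key node."""
    cost = (prompt_tokens / 1000) * usd_per_1k_prompt \
         + (completion_tokens / 1000) * usd_per_1k_completion
    if key_nodes_completed == 0:
        # No progress at any cost: treat as infinitely inefficient.
        return float("inf")
    return cost / key_nodes_completed


# Hypothetical run: 120k prompt tokens, 8k completion tokens, 4 key nodes.
score = usd_per_key_node(120_000, 8_000, 0.005, 0.015, 4)
print(f"about ${score:.2f} per key node")
```

Under this assumed definition a lower value is better, and two agents with the same completion rate can be compared by how much they spend to get there.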

Maintenance & Community

  • Active development with recent releases (v0.0.4 in Dec 2024).
  • Community channels include GitHub Discussions and Discord.
  • Paper presented at ICML 2024 and ACL 2024 workshops.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.
  • Open data is available for research use.

Limitations & Caveats

The framework is at an early stage (v0.0.4), with several items still on the TODO list, including batch evaluation, CAPTCHA-solving services, and integration with more benchmark datasets such as WebArena. The README notes that the experimental environment (e.g., Windows servers, US-based servers) can significantly affect agent performance.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 24 stars in the last 90 days

Explore Similar Projects

Starred by Ying Sheng (author of SGLang), Jiayi Pan (author of SWE-Gym; AI researcher at UC Berkeley), and 1 more.

webarena by web-arena-x

1.1%
1k
Web environment for autonomous agent development
created 2 years ago
updated 5 months ago