WebCanvas by iMeanAI

Web agent framework for online development, training, and evaluation

Created 1 year ago
269 stars

Top 95.5% on SourcePulse

Project Summary

WebCanvas is an open-source framework designed for building, training, and evaluating web agents in dynamic, real-time online environments. It addresses the limitations of static or isolated web agent development by providing a comprehensive suite of tools for realistic interaction and assessment, targeting researchers and developers building sophisticated web-based AI agents.

How It Works

WebCanvas employs a "KEY-NODE" based approach for web trajectory annotation, enabling granular, phase-based assessment of agent performance. It integrates live web environments for realistic feedback, supporting dynamic evaluation functions and offering metrics like USD efficiency. The framework is built with plug-and-play modules for planning, observation, memory, reward, action execution, and evaluation, facilitating easy iteration on LLM-based web agents.
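
The key-node idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the KeyNode fields, the three match types, and the rule that key nodes must be reached in order are hypothetical and are not taken from the WebCanvas codebase. The sketch only shows how intermediate states can yield partial credit rather than a single pass/fail result.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class KeyNode:
    """A required intermediate state (hypothetical schema, not WebCanvas's own)."""
    description: str
    match_type: str  # "url_exact", "url_contains", or "host"
    reference: str


def node_matches(node: KeyNode, visited_url: str) -> bool:
    """Check one visited URL against one key node."""
    if node.match_type == "url_exact":
        return visited_url == node.reference
    if node.match_type == "url_contains":
        return node.reference in visited_url
    if node.match_type == "host":
        return urlparse(visited_url).netloc == node.reference
    raise ValueError(f"unknown match type: {node.match_type}")


def score_trajectory(key_nodes, visited_urls):
    """Return (completed, total), assuming key nodes must be reached in order."""
    completed = 0
    for url in visited_urls:
        # A single page may satisfy several consecutive key nodes.
        while completed < len(key_nodes) and node_matches(key_nodes[completed], url):
            completed += 1
    return completed, len(key_nodes)


if __name__ == "__main__":
    nodes = [
        KeyNode("open the shop", "host", "www.example.com"),
        KeyNode("search for laptops", "url_contains", "/search?q=laptop"),
        KeyNode("open the product page", "url_exact", "https://www.example.com/item/42"),
    ]
    visited = [
        "https://www.example.com/",
        "https://www.example.com/search?q=laptop",
        "https://www.example.com/cart",
    ]
    completed, total = score_trajectory(nodes, visited)
    print(f"{completed}/{total} key nodes reached")
```

In this sketch a task counts as fully completed only when every key node has been reached, while the completed/total ratio gives a step-level score for partially successful runs.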

Quick Start & Requirements

  • Install: run conda create -n webcanvas python=3.11, then conda activate webcanvas, then pip install -r requirements.txt.
  • Prerequisites: Node.js, Google API Key and Custom Search Engine ID for search actions, Browserbase API Key for cloud browser integration.
  • Setup: Requires API key configuration and Node.js dependencies.
  • Docs: How-to guide, Data download, Demo video.
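
Before a first run, it may help to confirm that the prerequisites listed above are in place. The following is a small sketch, not part of WebCanvas: the environment variable names (GOOGLE_API_KEY, GOOGLE_CX, BROWSERBASE_API_KEY) are hypothetical placeholders, and the names the project actually reads should be taken from its README.

```python
import os
import shutil

# Hypothetical variable names; check the project's README for the real ones.
REQUIRED_ENV_VARS = ("GOOGLE_API_KEY", "GOOGLE_CX", "BROWSERBASE_API_KEY")


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of prerequisites that could not be found."""
    env = os.environ if env is None else env
    missing = []
    # Node.js is detected by looking for a "node" executable on PATH.
    if which("node") is None:
        missing.append("node (Node.js executable)")
    # A variable that is unset or empty counts as missing.
    missing.extend(name for name in REQUIRED_ENV_VARS if not env.get(name))
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All prerequisites found.")
```

The env and which parameters exist only so the check can be exercised without touching the real environment.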

Highlighted Details

  • Supports multiple LLM providers (OpenAI, Claude, Gemini, together.ai), including OpenAI's o1 models.
  • Introduces a JavaScript event listener-based evaluation system decoupling evaluation from action space.
  • Offers a "USD efficiency score" to quantify agent cost-effectiveness.
  • Provides the Mind2Web-Live dataset with 542 tasks and 2439 intermediate states for benchmarking.
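
To make the cost-effectiveness idea concrete, here is a rough sketch of one plausible reading of a "USD efficiency score": dollars spent per completed key node, where lower is better. This definition is an assumption and may differ from the formula WebCanvas actually uses. The function names and the per-1,000-token prices in the example are likewise hypothetical.

```python
def run_cost_usd(prompt_tokens, completion_tokens, price_in_per_1k, price_out_per_1k):
    """Estimate the LLM cost of a run from token counts and per-1,000-token prices."""
    return (prompt_tokens / 1000) * price_in_per_1k + (completion_tokens / 1000) * price_out_per_1k


def usd_efficiency(completed_key_nodes, total_cost_usd):
    """Dollars spent per completed key node (assumed definition; lower is better)."""
    if total_cost_usd < 0:
        raise ValueError("total_cost_usd must be non-negative")
    if completed_key_nodes <= 0:
        # No progress was made, so no finite cost per key node exists.
        return float("inf")
    return total_cost_usd / completed_key_nodes


if __name__ == "__main__":
    # Illustrative numbers only; the prices are not real provider pricing.
    cost = run_cost_usd(
        prompt_tokens=2000,
        completion_tokens=500,
        price_in_per_1k=0.01,
        price_out_per_1k=0.03,
    )
    print(f"cost: ${cost:.3f}, per key node: ${usd_efficiency(4, cost):.5f}")
```

A metric of this kind lets two agents with similar completion rates be compared on how much they spend to get there.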

Maintenance & Community

  • Active development with recent releases (v0.0.4 in Dec 2024).
  • Community channels include GitHub Discussions and Discord.
  • Paper presented at ICML 2024 and ACL 2024 workshops.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.
  • Open data is available for research use.

Limitations & Caveats

The framework is at an early stage (v0.0.4), and several items remain on the TODO list, including batch evaluation, captcha-solving services, and integration with more benchmark datasets such as WebArena. The README notes that the experimental environment (e.g., a Windows server or US-based servers) can significantly affect agent performance.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
