Open-source framework for web agent development, testing, and benchmarking
AgentLab is an open-source framework for developing, testing, and benchmarking web agents, aimed at researchers and developers working on AI agents. It offers building blocks for agent creation, a unified LLM API, and support for benchmarks such as WebArena and WorkArena, with an emphasis on scalable, reproducible experiments.
How It Works
AgentLab uses BrowserGym for web interaction and task execution, so agents can navigate and act within web environments. It uses Ray to run experiments in parallel at scale, which makes it practical to test multiple agents across many tasks and seeds. A unified LLM API abstracts over providers such as OpenAI, Azure, and OpenRouter, and the framework includes features for reproducibility and result analysis.
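To make the "multiple agents across many tasks and seeds" idea concrete, here is a minimal sketch of fanning out an experiment grid. It is not AgentLab's API: it uses the standard library's ThreadPoolExecutor as a stand-in for Ray, and the names run_episode, run_grid, and the agent and task labels are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_episode(agent: str, task: str, seed: int) -> dict:
    """Hypothetical placeholder for one agent episode.

    A real run would drive a browser environment; here we only
    record which combination was executed.
    """
    return {"agent": agent, "task": task, "seed": seed, "done": True}


def run_grid(agents, tasks, seeds, max_workers=4):
    """Run every (agent, task, seed) combination in parallel."""
    jobs = list(product(agents, tasks, seeds))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `jobs`.
        return list(pool.map(lambda job: run_episode(*job), jobs))


results = run_grid(
    agents=["agent_a", "agent_b"],
    tasks=["task_1", "task_2", "task_3"],
    seeds=[0, 1],
)
print(len(results))
```

The grid size is the product of the three lists, so adding seeds or tasks multiplies the number of episodes; that growth is the reason a parallel executor is useful here.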
Quick Start & Requirements
pip install agentlab
playwright install
Environment variables: OPENAI_API_KEY, OPENROUTER_API_KEY, AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AGENTLAB_EXP_ROOT.
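Before launching an experiment, it can help to check which of these variables are set. The sketch below is a generic standard-library check, not part of AgentLab; the helper name missing_env_vars is hypothetical, and which variables you actually need is assumed to depend on the LLM provider you use.

```python
import os

# Variable names listed in this section.
AGENTLAB_ENV_VARS = [
    "OPENAI_API_KEY",
    "OPENROUTER_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AGENTLAB_EXP_ROOT",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env`.

    `env` defaults to the process environment; passing a dict makes
    the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Example with a placeholder environment (values are made up).
example_env = {
    "OPENAI_API_KEY": "sk-placeholder",
    "AGENTLAB_EXP_ROOT": "/tmp/agentlab_results",
}
missing = missing_env_vars(AGENTLAB_ENV_VARS, example_env)
print("Unset variables:", missing)
```

Calling missing_env_vars(AGENTLAB_ENV_VARS) with no second argument checks the real process environment instead.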
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
AgentLab is presented as a research framework, not a consumer product, and should be used with caution. WebArena and VisualWebArena need roughly 5 minutes to reset their instances for each agent evaluation, and dependencies between tasks can limit parallelism; WorkArena is suggested when smoother parallel execution is needed. Gradio, as used by AgentXray, is noted as potentially unstable.
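The reset cost adds up quickly when several agents are evaluated. The following back-of-the-envelope sketch estimates that overhead. It assumes one reset of about 5 minutes per agent evaluation, and that separate benchmark instances (a hypothetical parallel_instances parameter) can reset at the same time; neither the function nor its parameters come from AgentLab.

```python
import math


def reset_overhead_minutes(num_evaluations, reset_minutes=5.0, parallel_instances=1):
    """Estimate the total wall-clock time spent on instance resets.

    Assumes each evaluation needs one reset and that up to
    `parallel_instances` resets can run at the same time.
    """
    if num_evaluations < 0 or parallel_instances < 1:
        raise ValueError("need num_evaluations >= 0 and parallel_instances >= 1")
    rounds = math.ceil(num_evaluations / parallel_instances)
    return rounds * reset_minutes


overhead_serial = reset_overhead_minutes(12)
overhead_parallel = reset_overhead_minutes(12, parallel_instances=4)
print(overhead_serial, overhead_parallel)  # 60.0 15.0
```

Under these assumptions, twelve evaluations on a single instance spend an hour on resets alone, which is why a benchmark without this reset step parallelizes more smoothly.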