Web agent for autonomous task completion on websites
SeeAct is a generalist web agent system for autonomously carrying out tasks on arbitrary websites, built primarily around Large Multimodal Models (LMMs) such as GPT-4V. It provides a codebase for running web agents on live websites and a framework that uses LMMs to complete tasks. It targets researchers and developers building automated web interaction tools.
How It Works
SeeAct has two components: a Playwright-based tool that interfaces with live websites and an LMM-driven agent framework. The Playwright tool acts as an intermediary, translating the agent's actions into browser events and relaying input from the browser back to the agent. Because the agent interacts directly with live web pages, it can be evaluated and demonstrated under realistic conditions.
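The intermediary role described above can be sketched roughly as follows. This is a minimal illustration, not SeeAct's actual code: the names `AgentAction`, `apply_action`, and `RecordingPage`, and the action vocabulary `CLICK` / `TYPE` / `SELECT`, are assumptions made for the example. The only real API it leans on is the shape of Playwright's page methods `click`, `fill`, and `select_option`, which the stand-in class mimics so the sketch runs without a browser.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class AgentAction:
    """Hypothetical record of one action chosen by the LMM side."""
    kind: str                    # assumed vocabulary: "CLICK", "TYPE", "SELECT"
    selector: str                # CSS selector of the target element
    value: Optional[str] = None  # text to type, or option to select


def apply_action(page: Any, action: AgentAction) -> None:
    """Translate one agent action into a browser event.

    `page` is any object exposing Playwright-style click/fill/select_option
    methods, e.g. a real Playwright page or the stand-in below.
    """
    if action.kind == "CLICK":
        page.click(action.selector)
    elif action.kind == "TYPE":
        if action.value is None:
            raise ValueError("TYPE action needs a value")
        page.fill(action.selector, action.value)
    elif action.kind == "SELECT":
        if action.value is None:
            raise ValueError("SELECT action needs a value")
        page.select_option(action.selector, action.value)
    else:
        raise ValueError(f"unsupported action kind: {action.kind!r}")


class RecordingPage:
    """Stand-in for a browser page that records events instead of performing them."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, ...]] = []

    def click(self, selector: str) -> None:
        self.events.append(("click", selector))

    def fill(self, selector: str, value: str) -> None:
        self.events.append(("fill", selector, value))

    def select_option(self, selector: str, value: str) -> None:
        self.events.append(("select_option", selector, value))
```

With Playwright installed, the same `apply_action` function could presumably be handed a page obtained from `sync_playwright()` and `browser.new_page()` in place of `RecordingPage`. Keeping the translation step separate from the model call is a design choice of this sketch; the source does not say how SeeAct structures it internally.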
Quick Start & Requirements
pip install seeact
playwright install
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
SeeAct is a research, experimental system and should be monitored closely while it runs. The project states that it does not support direct login actions and advises against using it for tasks that require account access, citing safety and legal risks.