Automate web browsing with natural language commands
This project provides a browser automation agent capable of executing natural language instructions. It targets developers and researchers needing to automate web interactions programmatically, offering a way to translate high-level commands into browser actions via large language models. The primary benefit is enabling complex web task automation through simple, human-readable queries.
How It Works
The agent interprets natural language queries using either Google's Gemini Developer API or Vertex AI. It then leverages browser automation libraries, specifically Playwright for local execution or Browserbase for cloud-based control, to interact with web pages. This approach allows for dynamic, intent-driven web navigation and task completion without manual scripting for each step.
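The interplay between the model and the browser can be pictured as a simple observe, plan, act loop. The sketch below is illustrative only and is not taken from the project's code: the Action shape, the Browser protocol, run_agent, FakeBrowser, and scripted_planner are all hypothetical names, the planner stands in for a Gemini or Vertex AI call, and the browser stands in for a Playwright or Browserbase session.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


@dataclass
class Action:
    """One browser step proposed by the model (hypothetical shape)."""
    kind: str           # e.g. "goto", "click", "type", or "done"
    argument: str = ""


class Browser(Protocol):
    """Stand-in for a Playwright- or Browserbase-backed session."""
    def observe(self) -> str: ...
    def perform(self, action: Action) -> None: ...


# The planner stands in for a call to Gemini or Vertex AI: it receives the
# user's query plus the latest page observation and returns the next action.
Planner = Callable[[str, str], Action]


def run_agent(query: str, browser: Browser, plan: Planner,
              max_steps: int = 10) -> list[Action]:
    """Alternate between asking the planner for an action and executing it."""
    history: list[Action] = []
    for _ in range(max_steps):
        action = plan(query, browser.observe())
        history.append(action)
        if action.kind == "done":
            break
        browser.perform(action)
    return history


@dataclass
class FakeBrowser:
    """In-memory browser used only to exercise the loop without a real page."""
    url: str = "about:blank"
    log: list[str] = field(default_factory=list)

    def observe(self) -> str:
        return self.url

    def perform(self, action: Action) -> None:
        self.log.append(f"{action.kind}:{action.argument}")
        if action.kind == "goto":
            self.url = action.argument


def scripted_planner(query: str, observation: str) -> Action:
    """Toy planner: navigate once, then report completion."""
    if observation == "about:blank":
        return Action("goto", "https://example.com")
    return Action("done")


if __name__ == "__main__":
    steps = run_agent("open example.com", FakeBrowser(), scripted_planner)
    print([step.kind for step in steps])
```

Keeping the planner and the browser behind small interfaces is one way to make the model backend and the execution environment independently swappable, which matches the two choices the summary describes. The real agent may be structured quite differently.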
Quick Start & Requirements
Setup: create and activate a virtual environment (python3 -m venv .venv, source .venv/bin/activate), install dependencies (pip install -r requirements.txt), and install Playwright's browser and system dependencies (playwright install-deps chrome, playwright install chrome).
Prerequisites: the requirements.txt dependencies, a Google Gemini API key or a Vertex AI project ID and location, and a Chrome browser.
Configuration: set environment variables (GEMINI_API_KEY or USE_VERTEXAI, VERTEXAI_PROJECT, VERTEXAI_LOCATION) and optionally Browserbase credentials (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID) if using that environment.
Usage: run python main.py --query "Your natural language command" with optional --env (playwright or browserbase) and --initial_url flags.
Highlighted Details
A highlight_mouse option for visual debugging during Playwright execution.
Maintenance & Community
Information regarding maintainers, community channels (like Discord or Slack), sponsorships, or a public roadmap is not detailed in the provided README.
Licensing & Compatibility
The README does not specify a software license. Therefore, licensing terms, restrictions, and compatibility for commercial or closed-source use are undetermined.
Limitations & Caveats
As a "preview" release, the project may be experimental or subject to significant changes. Successful operation is contingent on obtaining and configuring necessary API keys for Gemini/Vertex AI and potentially Browserbase, representing a key adoption hurdle. The lack of explicit licensing information poses a risk for integration into production systems.
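Because missing credentials are the most likely first failure, a small pre-flight check can help. The following is a minimal sketch, not part of the project: missing_settings and TRUTHY are hypothetical names, and the rule that USE_VERTEXAI counts as enabled when set to "1", "true", or "yes" is an assumption. Only the environment variable names come from the Quick Start section.

```python
import os
from typing import Mapping

# Assumption: values that switch the agent to Vertex AI. The project may
# interpret USE_VERTEXAI differently.
TRUTHY = {"1", "true", "yes"}


def missing_settings(env: Mapping[str, str],
                     browser_env: str = "playwright") -> list[str]:
    """Return the names of required variables that are unset or empty."""
    if env.get("USE_VERTEXAI", "").strip().lower() in TRUTHY:
        required = ["VERTEXAI_PROJECT", "VERTEXAI_LOCATION"]
    else:
        required = ["GEMINI_API_KEY"]
    # Browserbase credentials only matter when that environment is selected.
    if browser_env == "browserbase":
        required += ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"]
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(os.environ, browser_env="playwright")
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```

Running a check like this before python main.py would surface configuration gaps early instead of partway through a browsing session.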