RichardHruby
AI agent for universal website login automation
This project provides an AI-powered browser agent that automates website logins, addressing the common problem of brittle, website-specific automation scripts. It targets engineers building browser agents and offers a single, adaptable loop that handles diverse login flows, including multi-step credentials, SSO, MFA, and magic links. This reduces per-site development effort and makes automation more robust to website redesigns.
How It Works
The system employs an LLM (Claude Sonnet 4.5) with vision capabilities to interpret login pages. It navigates to a URL, captures a screenshot, and extracts a stripped-down version of the HTML, focusing on form-relevant elements and traversing Shadow DOM. The LLM then classifies the page into one of six predefined types (e.g., credential_login_form, choice_screen). Based on this classification, the agent either prompts the user for input or directly interacts with the browser using Playwright. A key feature is credential isolation, where the LLM never sees user credentials; they flow directly into the DOM. Furthermore, Playwright locators generated by the LLM are validated against the live DOM, and errors are fed back to the LLM for self-correction and retries.
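The credential isolation and locator self-correction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `PageLike`, `CredentialForm`, `Classifier`, `validateLocators`, `fillLoginForm`, and the default of three attempts are all hypothetical names and values. Plain TypeScript types stand in for the Zod schemas, and the calls are synchronous for brevity, whereas real Playwright locator calls are asynchronous.

```typescript
// Hypothetical sketch; none of these names come from the project itself.

/** Minimal stand-in for a browser page. With Playwright, these would wrap
 *  page.locator(selector).count() and page.locator(selector).fill(value). */
interface PageLike {
  count(selector: string): number;
  fill(selector: string, value: string): void;
}

/** What the model returns for a credential form: field names and selectors, never values. */
interface CredentialForm {
  type: "credential_login_form";
  fields: { name: string; selector: string }[];
}

/** Stand-in for the LLM call. It receives validation errors from the previous attempt. */
type Classifier = (previousErrors: string[]) => CredentialForm;

/** Returns one error message per selector that does not match exactly one element. */
function validateLocators(page: PageLike, form: CredentialForm): string[] {
  const errors: string[] = [];
  for (const field of form.fields) {
    const matches = page.count(field.selector);
    if (matches !== 1) {
      errors.push(`selector "${field.selector}" for "${field.name}" matched ${matches} elements`);
    }
  }
  return errors;
}

/** Classify, validate, and fill. Credentials go straight to the page and are
 *  never passed to the classifier. */
function fillLoginForm(
  page: PageLike,
  classify: Classifier,
  credentials: Record<string, string>,
  maxAttempts = 3, // assumed value, not taken from the source
): boolean {
  let errors: string[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const form = classify(errors); // the model sees errors, not credentials
    errors = validateLocators(page, form);
    if (errors.length > 0) continue; // feed the errors back and retry

    for (const field of form.fields) {
      const value = credentials[field.name];
      if (value === undefined) throw new Error(`no credential supplied for "${field.name}"`);
      page.fill(field.selector, value);
    }
    return true;
  }
  return false;
}
```

The point of the design is that the classifier's input and output types contain no credential values at all, so isolation is enforced by the data flow rather than by prompt instructions.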
Quick Start & Requirements
Copy .env.example to .env.local, fill in API keys, run npm install, then npm run dev.
Open http://localhost:3000. A live demo is also available.
Highlighted Details
Screen types (e.g., credential_login_form, choice_screen, magic_login_link) defined with Zod schemas.
Maintenance & Community
This project is built by @RichardHruby and @jesse-olympus at Anon. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.
Licensing & Compatibility
The project is released under the MIT License, which is permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
The system's effectiveness depends on the LLM's ability to classify screen types accurately and generate valid locators. Handling of loading_screen and blocked_screen types is capped at 12 retry attempts. Setup requires obtaining and configuring multiple API keys for external services.
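The retry cap on transient screens can be pictured with a small bounded loop. This is a sketch under assumptions rather than the project's implementation: `ScreenType` lists only the five type names that appear in this summary (the sixth is not named here), and `waitForActionableScreen` and `isTransient` are hypothetical helpers. The summary does not say whether the limit is shared between the two screen types or whether the agent waits between attempts. The sketch uses one shared budget and no delay.

```typescript
// Hypothetical sketch; only the limit of 12 and the screen type names come from the source.
const MAX_TRANSIENT_RETRIES = 12;

type ScreenType =
  | "credential_login_form"
  | "choice_screen"
  | "magic_login_link"
  | "loading_screen"
  | "blocked_screen";

/** Screens that call for re-checking the page rather than acting on it. */
function isTransient(screen: ScreenType): boolean {
  return screen === "loading_screen" || screen === "blocked_screen";
}

/** Re-classifies until a non-transient screen appears or the retry budget is spent.
 *  Returns null for `screen` when the budget runs out. */
function waitForActionableScreen(
  classify: () => ScreenType,
  maxRetries: number = MAX_TRANSIENT_RETRIES,
): { screen: ScreenType | null; attempts: number } {
  for (let attempts = 1; attempts <= maxRetries; attempts++) {
    const screen = classify();
    if (!isTransient(screen)) return { screen, attempts };
  }
  return { screen: null, attempts: maxRetries };
}
```

A caller would treat a `null` screen as a failed login attempt and surface it to the user instead of looping indefinitely.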