login-machine by RichardHruby

AI agent for universal website login automation

Created 5 months ago

289 stars

Top 90.8% on SourcePulse

1 Expert Loves This Project

Travis Fischer

Founder of Agentic

Project Summary

This project provides an AI-powered browser agent that automates website logins, replacing brittle, site-specific automation scripts. It targets engineers building browser agents and offers a single adaptable loop that handles diverse login flows, including multi-step credentials, SSO, MFA, and magic links. Because the loop interprets each page as it finds it, it is intended to reduce per-site development effort and to keep working after a website redesign.

How It Works

The system employs an LLM (Claude Sonnet 4.5) with vision capabilities to interpret login pages. It navigates to a URL, captures a screenshot, and extracts a stripped-down version of the HTML, focusing on form-relevant elements and traversing Shadow DOM. The LLM then classifies the page into one of six predefined types (e.g., credential_login_form, choice_screen). Based on this classification, the agent either prompts the user for input or directly interacts with the browser using Playwright. A key feature is credential isolation, where the LLM never sees user credentials; they flow directly into the DOM. Furthermore, Playwright locators generated by the LLM are validated against the live DOM, and errors are fed back to the LLM for self-correction and retries.
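The classify-then-act step and the credential isolation described above can be sketched as a dispatch over a discriminated union. This is a minimal illustration, not the project's code: only the screen type names come from this summary (the sixth type is not named here, so it is omitted), while the field shapes, the Browser interface, AskUser, NextStep, and handleScreen are hypothetical. It is written synchronously with a stand-in browser so it runs without Playwright; real Playwright calls would be async.

```typescript
// Hypothetical shapes: only the screen type names are taken from the summary.
type Classification =
  | { type: "credential_login_form"; fields: { name: string; locator: string }[]; submitLocator: string }
  | { type: "choice_screen"; options: { label: string; locator: string }[] }
  | { type: "magic_login_link" }
  | { type: "loading_screen" }
  | { type: "blocked_screen" };

// Stand-in for the browser; a real agent would call Playwright here.
interface Browser {
  fill(locator: string, value: string): void;
  click(locator: string): void;
}

// Supplies a secret typed by the user. The LLM only produced the locators;
// the values returned here are never passed back to it.
type AskUser = (fieldName: string) => string;

type NextStep = "reclassify" | "wait_and_retry" | "await_magic_link";

function handleScreen(
  c: Classification,
  browser: Browser,
  askUser: AskUser,
  choose: (labels: string[]) => number,
): NextStep {
  switch (c.type) {
    case "credential_login_form":
      // Credentials go from the user straight into the page.
      for (const field of c.fields) {
        browser.fill(field.locator, askUser(field.name));
      }
      browser.click(c.submitLocator);
      return "reclassify";
    case "choice_screen": {
      const option = c.options[choose(c.options.map((o) => o.label))];
      if (!option) throw new Error("Chosen option is out of range");
      browser.click(option.locator);
      return "reclassify";
    }
    case "magic_login_link":
      return "await_magic_link";
    case "loading_screen":
    case "blocked_screen":
      return "wait_and_retry";
  }
}
```

Keeping the secret lookup in a callback that only the browser-facing code invokes is one way to make the isolation visible in the type signatures: nothing that builds the LLM prompt ever receives the return value of askUser.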

Quick Start & Requirements

Install/Run: Copy .env.example to .env.local, fill in API keys, run npm install, then npm run dev.
Prerequisites: ANTHROPIC_API_KEY, BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID.
Access: Local development runs at http://localhost:3000. A live demo is also available.
Dependencies: Next.js 16, React 19, Tailwind 4, Claude Sonnet 4.5, Playwright via Browserbase.
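Since the app needs all three keys listed under Prerequisites, a small startup check can catch an incomplete .env.local early. The helper below is a hypothetical sketch and is not part of the project; only the three variable names come from this summary.

```typescript
// Variable names as listed under Prerequisites.
const REQUIRED_KEYS = [
  "ANTHROPIC_API_KEY",
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
] as const;

// Returns the names of required variables that are unset or blank.
// The environment is passed in (e.g. process.env in Node) so the
// function stays easy to test.
function missingEnvKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter((key) => !env[key]?.trim());
}
```

In a Node process this could be called as missingEnvKeys(process.env), failing fast with the list of missing names before any browser session is opened.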

Highlighted Details

Stripped HTML Extraction: Reduces LLM token usage by approximately 10x and minimizes hallucinated locators by recursively walking the DOM and stripping non-form elements.
Credential Isolation: User credentials are never exposed to the LLM, flowing directly from the user input to the browser DOM via Playwright.
Self-Correcting Locators: LLM-generated Playwright locators are validated against the live DOM, with errors fed back to the LLM for context-aware retries (up to 3 attempts).
Screen Type Classification: The LLM classifies each page into one of a fixed set of types (e.g., credential_login_form, choice_screen, magic_login_link) defined with Zod schemas, which lets a single loop handle diverse login scenarios.
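The self-correcting locator behavior listed above can be sketched as a bounded propose-validate loop. This is an illustration under stated assumptions, not the project's implementation: resolveLocator, propose, and countMatches are hypothetical names, and "exactly one matching element" is an assumed validity rule. In a real agent, propose would be the LLM call and countMatches would query the live page (for example with Playwright's locator count), both asynchronously; they are synchronous callbacks here so the sketch runs on its own.

```typescript
interface FailedAttempt {
  locator: string;
  error: string;
}

function resolveLocator(
  // Stand-in for the LLM: receives earlier failures as extra context.
  propose: (previous: FailedAttempt[]) => string,
  // Stand-in for the live-DOM check: how many elements the locator matches.
  countMatches: (locator: string) => number,
  maxAttempts = 3,
): string {
  const failures: FailedAttempt[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const locator = propose(failures);
    const matches = countMatches(locator);
    if (matches === 1) return locator; // assumed rule: unambiguous match
    failures.push({
      locator,
      error:
        matches === 0
          ? "no element matched"
          : `${matches} elements matched, expected exactly 1`,
    });
  }
  const tried = failures.map((f) => f.locator).join(", ");
  throw new Error(`No valid locator after ${maxAttempts} attempts: ${tried}`);
}
```

Passing the accumulated failures back into propose is what makes the retry context-aware: each new suggestion can take into account which locators were already rejected and why.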

Maintenance & Community

This project is built by @RichardHruby and @jesse-olympus at Anon. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.

Licensing & Compatibility

The project is released under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The system's effectiveness depends on the LLM accurately classifying screen types and generating valid locators. Pages classified as loading_screen or blocked_screen are retried at most 12 times. Setup requires obtaining and configuring API keys for two external services, Anthropic and Browserbase.
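The retry cap on transient screens can be pictured as a bounded re-classification loop. The sketch below is hypothetical: classifyUntilActionable, classify, and pause are illustrative names, and it assumes the limit of 12 counts classification attempts, which this summary does not specify. It is synchronous so it runs without a browser; a real agent would await a page check and a delay.

```typescript
// Screen types that mean "nothing to act on yet" (names from the summary).
const TRANSIENT_TYPES = new Set(["loading_screen", "blocked_screen"]);

function classifyUntilActionable(
  classify: () => string, // stand-in for screenshot + LLM classification
  pause: () => void,      // stand-in for a delay between attempts
  maxAttempts = 12,
): string {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const type = classify();
    if (!TRANSIENT_TYPES.has(type)) return type;
    // Only wait if another attempt will follow.
    if (attempt < maxAttempts) pause();
  }
  throw new Error(
    `Still on a loading or blocked screen after ${maxAttempts} attempts`,
  );
}
```

A hard cap like this keeps the agent from looping indefinitely on a page that never finishes loading or that is blocking automated access.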

Health Check

Last Commit: 4 months ago
Responsiveness: Inactive

Star History: 2 stars in the last 30 days