AI agent for browser automation using goal-driven planning
Cerebellum is an AI-driven browser automation system designed for users who need to accomplish complex tasks on webpages. It leverages a planning agent powered by a Large Language Model (LLM) to navigate websites, interact with elements, and achieve user-defined goals, simplifying automated web interactions.
How It Works
Cerebellum models web browsing as a directed graph, where each page is a node and user actions are edges. An LLM, currently Claude 3.5 Sonnet, analyzes the current page state and past actions to determine the next optimal action (e.g., click, type). This action is executed, and the resulting new state is fed back to the LLM, creating an iterative planning loop until the goal is met or deemed unachievable. This approach allows for dynamic strategy adjustment and goal-oriented navigation.
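The planning loop described above can be sketched in a few lines. This is a minimal illustration, not Cerebellum's actual API: the names `Action`, `PageState`, `run_goal`, `toy_plan`, and `toy_execute` are hypothetical, the `SITE` dictionary is a made-up three-page graph, and a hard-coded planner function stands in for the LLM call so the example runs without a browser or an API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Action:
    """One edge in the browsing graph (hypothetical type, for illustration)."""
    kind: str          # "click", "type", "success", or "failure"
    target: str = ""   # element the action applies to
    text: str = ""     # text to enter for "type" actions


@dataclass(frozen=True)
class PageState:
    """One node in the browsing graph: what the planner observes."""
    url: str
    description: str


Planner = Callable[[str, PageState, List[Action]], Action]
Executor = Callable[[PageState, Action], PageState]


def run_goal(goal: str, start: PageState, plan: Planner, execute: Executor,
             max_steps: int = 20) -> Tuple[str, List[Action]]:
    """Iterate plan -> act -> observe until the planner reports an outcome."""
    state = start
    history: List[Action] = []
    for _ in range(max_steps):
        # The planner sees the goal, the current page, and all past actions.
        action = plan(goal, state, history)
        history.append(action)
        if action.kind in ("success", "failure"):
            return action.kind, history
        # Execute the action and feed the resulting state back into the loop.
        state = execute(state, action)
    return "failure", history  # step budget exhausted


# Made-up site graph: (current url, clicked element) -> next url.
SITE: Dict[Tuple[str, str], str] = {
    ("home", "search-box"): "results",
    ("results", "first-link"): "article",
}


def toy_execute(state: PageState, action: Action) -> PageState:
    """Stand-in for the browser: follow an edge if one exists."""
    next_url = SITE.get((state.url, action.target), state.url)
    return PageState(url=next_url, description=f"page {next_url}")


def toy_plan(goal: str, state: PageState, history: List[Action]) -> Action:
    """Stand-in for the LLM: stop at the goal page, else click any outgoing edge."""
    if state.url == goal:
        return Action("success")
    for (url, target) in SITE:
        if url == state.url:
            return Action("click", target=target)
    return Action("failure")  # dead end: goal deemed unachievable
```

Calling `run_goal("article", PageState("home", "page home"), toy_plan, toy_execute)` walks the two edges and then reports success. In a real system the planner would be a model call that receives the page state and action history, and the executor would drive a browser; keeping both behind plain callables is one way to make the loop itself testable in isolation.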
Maintenance & Community
Maintained by Han Wang, with collaborators Darwin Lo, Michael Shuffett, and Shane Moran. Contributions are welcome.
Licensing & Compatibility
Cerebellum is licensed under the MIT License, which permits commercial use and integration with closed-source projects.
Limitations & Caveats
Cerebellum currently runs into safety refusals from Claude 3.5 Sonnet, which prevent it from solving CAPTCHAs or navigating pages with political content. Data extraction and horizontal scrolling are not yet implemented and are marked as TODOs.