SDK for browser-controlling agents
This Python framework enables the creation of browser-controlling agents with minimal code, targeting developers and researchers looking to automate web interactions. It simplifies complex browser automation tasks by leveraging large language models (LLMs) to interpret goals and generate browser actions.
How It Works
Sentient relies on an LLM to parse a user-defined goal and translate it into a sequence of browser actions. It controls a Chrome or Brave browser instance over the browser's remote debugging protocol. The framework supports multiple LLM providers (OpenAI, Anthropic, Ollama, Groq, Together AI, OpenRouter, and custom OpenAI-compatible servers), and agent behavior can be customized through task-specific instructions.
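The goal-to-actions loop described above can be illustrated with a small, self-contained sketch. This is not Sentient's actual implementation or API: the `Action` fields, `fake_llm`, and `run_agent` are hypothetical names chosen for illustration, and the LLM is replaced by a scripted stand-in so the example runs without network access or a browser.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """Hypothetical action schema; the real framework's schema may differ."""
    kind: str          # e.g. "goto", "click", "type", "done"
    target: str = ""   # URL or element selector
    text: str = ""     # text to type, if any


def fake_llm(goal: str, history: list) -> str:
    """Stand-in for a real LLM call: returns the next action as a JSON string."""
    script = [
        {"kind": "goto", "target": "https://example.com"},
        {"kind": "click", "target": "a"},
        {"kind": "done"},
    ]
    return json.dumps(script[min(len(history), len(script) - 1)])


def run_agent(
    goal: str,
    llm: Callable[[str, list], str],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> list:
    """Ask the LLM for one action at a time until it reports 'done'."""
    history = []
    for _ in range(max_steps):
        action = Action(**json.loads(llm(goal, history)))
        history.append(action)
        if action.kind == "done":
            break
        execute(action)  # in a real agent this would drive the browser
    return history


if __name__ == "__main__":
    for step in run_agent("open the first link on example.com", fake_llm, print):
        print("planned:", step.kind, step.target)
```

In a real agent, `execute` would send commands to the browser over the remote debugging connection, and `llm` would call whichever provider is configured.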
Quick Start & Requirements
Install: pip install sentient
Set the OPENAI_API_KEY environment variable.
Start Chrome with remote debugging enabled on port 9222:
macOS: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
Linux: google-chrome --remote-debugging-port=9222
Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
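Before running an agent, it can help to confirm that the browser is actually listening on the debugging port. The sketch below queries the `/json/version` HTTP endpoint that Chrome exposes when started with `--remote-debugging-port`. The helper names `version_url` and `debugger_info` are made up for this example and are not part of the framework.

```python
import http.client
import json
import urllib.request


def version_url(port: int = 9222, host: str = "127.0.0.1") -> str:
    """Build the URL of the browser's version endpoint."""
    return f"http://{host}:{port}/json/version"


def debugger_info(port: int = 9222, host: str = "127.0.0.1", timeout: float = 2.0):
    """Return the browser's version info as a dict, or None if unreachable."""
    try:
        with urllib.request.urlopen(version_url(port, host), timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections and timeouts (URLError is a subclass);
        # ValueError covers a response body that is not valid JSON.
        return None


if __name__ == "__main__":
    info = debugger_info()
    if info is None:
        print("No browser found on port 9222; start it with --remote-debugging-port=9222")
    else:
        print("Connected to:", info.get("Browser"))
```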
Highlighted Details
Uses litellm for broad LLM provider compatibility.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The reliability of the agent is highly dependent on the LLM's ability to produce consistent JSON outputs, with smaller local models being less reliable. Groq provider support is noted as experimental with potential reliability issues. The project is in beta, indicating potential for breaking changes.
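One common way to cope with inconsistent JSON output is to validate each reply and re-prompt on failure. The following is a minimal sketch of that general technique, not a description of how Sentient handles it. The required `kind` key and the helper names `parse_action` and `ask_with_retry` are assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical schema: every action must at least say what kind it is.
REQUIRED_KEYS = {"kind"}


def parse_action(raw: str) -> dict:
    """Extract and validate the JSON object embedded in a model reply."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def ask_with_retry(llm: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    """Call the model until it returns a valid action or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_action(llm(prompt))
        except ValueError as exc:
            last_error = exc
            # Feed the error back so the model can correct itself.
            prompt = (
                f"{prompt}\n\nYour previous reply was invalid ({exc}). "
                "Reply with a single JSON object only."
            )
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error
```

Extracting the outermost `{...}` span tolerates replies that wrap the JSON in extra prose, which smaller models tend to do; it does not rescue replies whose JSON is itself malformed, which is what the retry is for.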