sentient by sentient-engineering

SDK for browser-controlling agents

Created 1 year ago

565 stars

Top 56.9% on SourcePulse

Project Summary

This Python framework enables the creation of browser-controlling agents with minimal code, targeting developers and researchers looking to automate web interactions. It simplifies complex browser automation tasks by leveraging large language models (LLMs) to interpret goals and generate browser actions.

How It Works

Sentient relies on an LLM to parse a user-defined goal and translate it into a sequence of browser actions. It then drives a Chrome or Brave browser instance through the browser's remote debugging protocol. The framework supports multiple LLM providers (OpenAI, Anthropic, Ollama, Groq, Together AI, OpenRouter, and custom OpenAI-compatible servers) and lets you customize agent behavior through task-specific instructions.
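The goal-to-actions step described above can be sketched as follows. This is a minimal illustration, not Sentient's implementation: the action vocabulary (goto, click, type), the JSON field names, and the names BrowserAction and parse_plan are all hypothetical, since the summary does not describe the real action schema.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary, chosen for illustration only.
# The real Sentient action set is not described in this summary.
ALLOWED_ACTIONS = {"goto", "click", "type"}


@dataclass(frozen=True)
class BrowserAction:
    kind: str       # one of ALLOWED_ACTIONS
    target: str     # a URL for "goto", an element selector otherwise
    text: str = ""  # text to enter, used only by "type"


def parse_plan(llm_output: str) -> list[BrowserAction]:
    """Turn an LLM reply (assumed to be a JSON array) into validated actions."""
    raw = json.loads(llm_output)
    if not isinstance(raw, list):
        raise ValueError("expected a JSON array of actions")

    actions = []
    for index, item in enumerate(raw):
        if not isinstance(item, dict):
            raise ValueError(f"step {index}: expected a JSON object")
        kind = item.get("action")
        if kind not in ALLOWED_ACTIONS:
            raise ValueError(f"step {index}: unknown action {kind!r}")
        target = item.get("target")
        if not isinstance(target, str) or not target:
            raise ValueError(f"step {index}: missing target")
        actions.append(BrowserAction(kind, target, str(item.get("text", ""))))
    return actions


# Example reply a model might produce for the goal "search for cats".
reply = """[
  {"action": "goto", "target": "https://example.com"},
  {"action": "type", "target": "input[name=q]", "text": "cats"},
  {"action": "click", "target": "button[type=submit]"}
]"""
plan = parse_plan(reply)
```

Validating every step before anything touches the browser means a malformed reply fails early instead of halfway through a session.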

Quick Start & Requirements

Install: pip install sentient
Prerequisites:
- An OpenAI API key set via the OPENAI_API_KEY environment variable, or the equivalent key for another provider.
- Chrome or Brave browser must be running with remote debugging enabled on port 9222.
  - macOS: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
  - Linux: google-chrome --remote-debugging-port=9222
  - Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
Setup time: Minimal once the browser is running with remote debugging enabled.
Documentation: Cookbook
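Before running an agent, it can help to confirm that the browser is actually listening on port 9222. The sketch below queries the /json/version endpoint that Chrome exposes when remote debugging is enabled. It is not part of Sentient; the helper names version_url, parse_version, and check_debugger are illustrative assumptions.

```python
import http.client
import json
import urllib.error
import urllib.request


def version_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the URL of the browser's remote-debugging version endpoint."""
    return f"http://{host}:{port}/json/version"


def parse_version(payload: str) -> dict:
    """Extract the browser name and WebSocket debugger URL from the reply."""
    data = json.loads(payload)
    return {
        "browser": data.get("Browser"),
        "websocket_url": data.get("webSocketDebuggerUrl"),
    }


def check_debugger(host: str = "127.0.0.1", port: int = 9222, timeout: float = 2.0):
    """Return endpoint details, or None if no debuggable browser answers."""
    try:
        with urllib.request.urlopen(version_url(host, port), timeout=timeout) as resp:
            return parse_version(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed HTTP, or non-JSON reply.
        return None
```

Calling check_debugger() returns a dictionary when the browser was started with --remote-debugging-port=9222 and None otherwise, which makes for a cheap preflight check before any LLM calls are made.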

Highlighted Details

Achieves browser automation in as few as 3 lines of Python code.
Supports adjusting agent behavior with custom natural language instructions.
Leverages litellm for broad LLM provider compatibility.
Recommends GPT-4o models for reliability, because they produce consistent JSON output.

Maintenance & Community

Beta status.
Community chat available via Discord: https://discord.gg/umgnyQU2K8

Licensing & Compatibility

License: Not explicitly stated in the README.
Compatibility: Designed for Python. Commercial use implications are unclear due to the unstated license.

Limitations & Caveats

Agent reliability depends heavily on the LLM's ability to produce consistent JSON output, and smaller local models are less reliable in this respect. Groq provider support is described as experimental and may be unreliable. The project is in beta, so breaking changes are possible.
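One common way to soften the JSON-reliability problem is to re-ask the model when its reply does not parse. The sketch below shows that pattern in isolation. It is an assumption about how a caller might guard against bad output, not a description of what Sentient does internally; call_with_json_retry and the ask_model callable are hypothetical names.

```python
import json
from typing import Callable


def call_with_json_retry(
    ask_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Ask the model until it returns a JSON object or attempts run out.

    ask_model is any function that takes a prompt and returns the model's
    raw text reply (hypothetical; wire it to whichever provider is in use).
    """
    last_error = None
    current_prompt = prompt
    for _ in range(max_attempts):
        reply = ask_model(current_prompt)
        try:
            parsed = json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = exc
            problem = f"it was not valid JSON ({exc.msg})"
        else:
            if isinstance(parsed, dict):
                return parsed
            last_error = ValueError("reply was valid JSON but not an object")
            problem = "it was not a JSON object"
        # Restate the original prompt and tell the model what went wrong.
        current_prompt = (
            f"{prompt}\n\nYour previous reply was rejected because {problem}. "
            "Reply with a single JSON object and nothing else."
        )
    raise ValueError(
        f"no valid JSON object after {max_attempts} attempts"
    ) from last_error


# Stand-in model that fails once, then answers correctly.
replies = iter(["Sure! Here is the plan:", '{"action": "goto", "target": "https://example.com"}'])
prompts_seen = []


def fake_model(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(replies)


result = call_with_json_retry(fake_model, "Plan the next browser step.")
```

A retry loop like this trades extra latency and token cost for robustness, so it helps most with the smaller local models mentioned above.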
