sentient by sentient-engineering

SDK for browser-controlling agents

created 11 months ago
555 stars

Top 58.6% on sourcepulse

Project Summary

This Python framework enables the creation of browser-controlling agents with minimal code, targeting developers and researchers looking to automate web interactions. It simplifies complex browser automation tasks by leveraging large language models (LLMs) to interpret goals and generate browser actions.

How It Works

Sentient uses an LLM to parse a user-defined goal and translate it into a sequence of browser actions. It controls a Chrome or Brave browser instance through the browser's remote debugging protocol. The framework supports multiple LLM providers (OpenAI, Anthropic, Ollama, Groq, Together AI, OpenRouter, and custom OpenAI-compatible servers), and agent behavior can be customized with task-specific instructions.
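The goal-to-actions flow described above can be sketched in plain Python. This is an illustration only, not Sentient's code: the `BrowserAction` schema, `plan_actions`, `run_plan`, and the canned `stub_llm` reply are all hypothetical names invented for this example, and no real model or browser is involved.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BrowserAction:
    """One step the agent wants the browser to take (hypothetical schema)."""
    kind: str          # e.g. "goto", "type", "click"
    target: str        # URL or element selector
    text: str = ""     # text to type, if any


def plan_actions(goal: str, llm: Callable[[str], str]) -> List[BrowserAction]:
    """Ask the model for a JSON list of actions and decode it."""
    prompt = (
        f"Goal: {goal}\n"
        'Reply with a JSON list of objects with keys "kind", "target", "text".'
    )
    return [BrowserAction(**item) for item in json.loads(llm(prompt))]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same canned plan."""
    return json.dumps([
        {"kind": "goto", "target": "https://example.com"},
        {"kind": "type", "target": "input[name=q]", "text": "cats"},
        {"kind": "click", "target": "button[type=submit]"},
    ])


def run_plan(actions: List[BrowserAction],
             execute: Callable[[BrowserAction], None]) -> None:
    """Hand each action to an executor, which would drive the browser."""
    for action in actions:
        execute(action)


# Record the actions instead of sending them to a browser.
log: List[str] = []
plan = plan_actions("search example.com for cats", stub_llm)
run_plan(plan, lambda a: log.append(f"{a.kind} {a.target}"))
```

Passing the model and the executor in as callables keeps the planning step testable without a live browser or an API key.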

Quick Start & Requirements

  • Install: pip install sentient
  • Prerequisites:
    • An OpenAI API key set via the OPENAI_API_KEY environment variable (or the corresponding key for another provider).
    • Chrome or Brave browser must be running with remote debugging enabled on port 9222.
      • macOS: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
      • Linux: google-chrome --remote-debugging-port=9222
      • Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
  • Setup time: Minimal once the browser is running with remote debugging enabled.
  • Documentation: Cookbook
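
Before starting an agent, it can help to confirm that the browser is actually listening on port 9222. The sketch below is not part of Sentient; it relies on the `/json/version` HTTP endpoint that Chromium-based browsers expose when remote debugging is enabled. The helper names `devtools_version_url` and `browser_is_listening` are hypothetical, and only the Python standard library is used.

```python
import json
import urllib.request


def devtools_version_url(port: int = 9222, host: str = "127.0.0.1") -> str:
    """Build the URL of the browser's DevTools version endpoint."""
    return f"http://{host}:{port}/json/version"


def browser_is_listening(port: int = 9222, host: str = "127.0.0.1",
                         timeout: float = 2.0) -> bool:
    """Return True if a DevTools endpoint answers on the given port."""
    try:
        with urllib.request.urlopen(devtools_version_url(port, host),
                                    timeout=timeout) as resp:
            info = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts;
        # ValueError covers a reply that is not valid JSON.
        return False
    # A debuggable browser reports a WebSocket URL for its debugger.
    return "webSocketDebuggerUrl" in info


if __name__ == "__main__":
    if browser_is_listening():
        print("Browser is reachable on port 9222.")
    else:
        print("No browser found; start it with --remote-debugging-port=9222.")
```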

Highlighted Details

  • Achieves browser automation in as few as 3 lines of Python code.
  • Supports fine-tuning agent behavior with custom natural language instructions.
  • Leverages litellm for broad LLM provider compatibility.
  • Recommends GPT-4o models for reliability, because they produce consistent JSON output.

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Designed for Python. Commercial use implications are unclear due to the unstated license.

Limitations & Caveats

The reliability of the agent is highly dependent on the LLM's ability to produce consistent JSON outputs, with smaller local models being less reliable. Groq provider support is noted as experimental with potential reliability issues. The project is in beta, indicating potential for breaking changes.
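
To make the JSON dependency concrete, here is a minimal sketch of validating a model reply and retrying when it is malformed. It is an assumption about how such a guard could look, not Sentient's actual handling: `parse_action_json`, `ask_until_valid`, the required `"action"` key, and the flaky stand-in model are all hypothetical.

```python
import json
from typing import Callable, Optional, Tuple


def parse_action_json(raw: str) -> Optional[dict]:
    """Return the decoded reply if it is a JSON object with a string
    "action" field (a hypothetical schema), otherwise None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and isinstance(data.get("action"), str):
        return data
    return None


def ask_until_valid(llm: Callable[[str], str], prompt: str,
                    max_attempts: int = 3) -> Tuple[dict, int]:
    """Call the model until it returns a valid reply; report attempts used."""
    for attempt in range(1, max_attempts + 1):
        parsed = parse_action_json(llm(prompt))
        if parsed is not None:
            return parsed, attempt
    raise ValueError(f"no valid JSON after {max_attempts} attempts")


def make_flaky_llm() -> Callable[[str], str]:
    """Stand-in model: one free-text reply, one wrong-shape reply, then valid."""
    replies = iter([
        "Sure! I will click the button.",           # not JSON at all
        '["click"]',                                # JSON, but not an object
        '{"action": "click", "target": "#submit"}',
    ])
    return lambda prompt: next(replies)


result, attempts = ask_until_valid(make_flaky_llm(), "Click submit.")
```

A model that often needs several attempts costs extra latency and tokens on every step, which is one way the reliability gap between models shows up in practice.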

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 7 stars in the last 90 days
