sentient  by sentient-engineering

SDK for browser-controlling agents

Created 1 year ago
558 stars

Top 57.5% on SourcePulse

Project Summary

This Python framework enables the creation of browser-controlling agents with minimal code, targeting developers and researchers looking to automate web interactions. It simplifies complex browser automation tasks by leveraging large language models (LLMs) to interpret goals and generate browser actions.

How It Works

Sentient uses an LLM to parse a user-defined goal and translate it into a sequence of browser actions. It drives a Chrome or Brave browser instance through the browser's remote debugging port. The framework supports multiple LLM providers (OpenAI, Anthropic, Ollama, Groq, Together AI, OpenRouter, and custom OpenAI-compatible servers) and lets you customize agent behavior through task-specific instructions.
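The goal-to-action flow described above can be pictured with a small sketch. This is not Sentient's actual implementation: the `Action` type, the JSON shape, and the `plan_actions` and `run_goal` helpers are all hypothetical, and the LLM is replaced by a plain callable so the example runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    """One browser step. Hypothetical shape, not Sentient's real schema."""
    kind: str       # e.g. "goto" or "type"
    target: str     # URL or element selector
    text: str = ""  # only used by "type"


def plan_actions(goal: str, llm: Callable[[str], str]) -> List[Action]:
    """Ask the (stubbed) LLM for a JSON list of steps and parse it."""
    raw = llm(f"Return a JSON list of browser actions for: {goal}")
    steps = json.loads(raw)
    return [Action(s["kind"], s["target"], s.get("text", "")) for s in steps]


def run_goal(goal: str,
             llm: Callable[[str], str],
             handlers: Dict[str, Callable[[Action], None]]) -> int:
    """Dispatch each planned action to a handler; return the step count."""
    actions = plan_actions(goal, llm)
    for action in actions:
        handlers[action.kind](action)
    return len(actions)


# Stand-in for a real model: always returns the same two-step plan.
def fake_llm(prompt: str) -> str:
    return json.dumps([
        {"kind": "goto", "target": "https://example.com"},
        {"kind": "type", "target": "#search", "text": "sentient"},
    ])


# Stand-ins for a real browser driver: record what would have happened.
log: List[str] = []
handlers = {
    "goto": lambda a: log.append(f"goto {a.target}"),
    "type": lambda a: log.append(f"type {a.text!r} into {a.target}"),
}

steps_run = run_goal("search example.com for sentient", fake_llm, handlers)
```

In a real agent the handlers would issue commands to the browser over the remote debugging connection, and the plan would typically be refined step by step rather than produced in one shot.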

Quick Start & Requirements

  • Install: pip install sentient
  • Prerequisites:
    • OpenAI API key (or other provider's key) set via OPENAI_API_KEY environment variable.
    • Chrome or Brave browser must be running with remote debugging enabled on port 9222.
      • macOS: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
      • Linux: google-chrome --remote-debugging-port=9222
      • Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
  • Setup time: Minimal; the main step is launching the browser with remote debugging enabled.
  • Documentation: Cookbook
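
Before running an agent, it can help to confirm that the browser is actually listening on the debugging port. The sketch below is an illustration rather than part of Sentient: the `debugger_info` helper is a hypothetical name. It queries the `/json/version` HTTP endpoint that Chrome-based browsers expose when started with `--remote-debugging-port`.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def debugger_info(port: int = 9222,
                  host: str = "127.0.0.1",
                  timeout: float = 2.0) -> Optional[dict]:
    """Return the browser's version info, or None if nothing answers.

    Hypothetical helper for illustration; not part of the Sentient package.
    """
    url = f"http://{host}:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply.
        return None


if __name__ == "__main__":
    info = debugger_info()
    if info is None:
        print("No browser found on port 9222; start it with "
              "--remote-debugging-port=9222")
    else:
        print("Connected to", info.get("Browser", "unknown browser"))
```

A `None` result usually means the browser was started without the flag, or that an already-running instance ignored it.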

Highlighted Details

  • Achieves browser automation in as few as 3 lines of Python code.
  • Supports fine-tuning agent behavior with custom natural language instructions.
  • Leverages litellm for broad LLM provider compatibility.
  • Recommends GPT-4o models for optimal reliability due to JSON output quality.

Maintenance & Community

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Designed for Python. Commercial use implications are unclear due to the unstated license.

Limitations & Caveats

The reliability of the agent is highly dependent on the LLM's ability to produce consistent JSON outputs, with smaller local models being less reliable. Groq provider support is noted as experimental with potential reliability issues. The project is in beta, indicating potential for breaking changes.
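
One common way to soften the JSON-consistency problem is to validate each reply and ask again when parsing fails. The sketch below shows that general pattern only; `call_with_json_retry` is a hypothetical helper, the model is a stubbed callable, and nothing here is taken from Sentient's code.

```python
import json
from typing import Any, Callable


def call_with_json_retry(llm: Callable[[str], str],
                         prompt: str,
                         max_attempts: int = 3) -> Any:
    """Call the model until it returns parseable JSON or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        raw = llm(prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            # Tell the model what went wrong before the next attempt.
            prompt = (f"{prompt}\n\nYour previous reply was not valid JSON "
                      f"({exc.msg}). Reply with JSON only.")
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error


# Stubbed model: the first reply is chatty and invalid, the second is valid JSON.
replies = iter([
    "Sure! Here is the action: click play",
    '{"action": "click", "target": "#play"}',
])
result = call_with_json_retry(lambda prompt: next(replies), "Next action?")
```

Retries add latency and cost, so they complement rather than replace choosing a model that emits well-formed JSON in the first place.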

Health Check

  • Last Commit: 11 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Kevin Hou (Head of Product Engineering at Windsurf), Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research), and 29 more.

browser-use by browser-use

0.6%
70k
SDK for AI agent browser control
Created 10 months ago
Updated 1 day ago