agent-sdk by browser-use

Minimalist agent framework for LLM-powered applications

Created 1 month ago

653 stars

Top 51.2% on SourcePulse

Project Summary

<2-3 sentences summarising what the project addresses and solves, the target audience, and the benefit.> browser-use/agent-sdk provides a minimalist framework for building LLM-powered agents, emphasizing a simple "for-loop" execution model with minimal abstractions. It aims to empower developers by giving LLMs maximum freedom within a defined action space, making it ideal for researchers and engineers seeking direct control over agent behavior and leveraging the capabilities of advanced RL'd models. The framework powers BU.app and offers a direct path to agent development.

How It Works

The core design revolves around a straightforward for loop executing tool calls, eschewing complex abstractions. It operates on the principle that LLMs themselves hold the primary value, requiring only a complete action space, an explicit exit mechanism (via the done tool), and effective context management. This approach prioritizes simplicity and direct LLM interaction over framework-imposed magic.

Quick Start & Requirements

Primary install / run command: uv add bu-agent-sdk
Non-default prerequisites and dependencies: Python environment, access to LLM APIs (e.g., Anthropic, OpenAI, Google).
Links: Comprehensive examples are available in the bu_agent_sdk/examples/ directory, including claude_code.py (sandboxed coding assistant) and dependency_injection.py.

Highlighted Details

Done Tool Pattern: Enforces explicit task completion via a TaskComplete exception, preventing premature agent termination.
Ephemeral Messages: Manages context size by retaining only the last N messages for large outputs like browser states or screenshots.
Simple LLM Primitives: Offers a unified interface for multiple LLM providers (Anthropic, OpenAI, Google) with minimal provider-specific code (~300 lines).
Context Compaction: Automatically summarizes conversation history when approaching context limits to maintain efficiency.
Dependency Injection: Integrates a FastAPI-style system for type-safe dependency management within tools.
Streaming Events: Supports real-time event streaming (ToolCallEvent, ToolResultEvent, FinalResponseEvent) for dynamic agent feedback.
Sandboxed Execution: Includes examples demonstrating secure execution environments for code generation and file operations.

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or roadmap were found in the provided README.

Licensing & Compatibility

The project is released under the MIT License, which is highly permissive and generally compatible with commercial use and closed-source applications.

Limitations & Caveats

The framework's minimalist philosophy, while powerful, places a greater onus on the developer to implement sophisticated agent behaviors and "vibe-restriction" through evaluation, rather than relying on extensive built-in guardrails. Complex agent orchestration might require significant custom development.

agent-sdk by browser-use

Explore Similar Projects

alphora by opencmit

self_improving_coding_agent by MaximeRobeyns

agentscope-runtime by agentscope-ai

software-agent-sdk by OpenHands

AgentStack by agentstack-ai

HelloAgents by jjyaoao

TaskingAI by TaskingAI

voltagent by VoltAgent

AIOS by agiresearch

PocketFlow by The-Pocket

Qwen-Agent by QwenLM

letta by letta-ai