cuga-agent by cuga-project

Enterprise agent for complex web and API task execution

Created 5 months ago

680 stars

Top 49.8% on SourcePulse


1 Expert Loves This Project

Gabriel Almeida

Cofounder of Langflow

Project Summary

CUGA is an open-source generalist agent for enterprises, simplifying complex web/API task execution. It targets engineers and power users by offering a configurable, composable architecture for faster deployment of domain-specific agents with reduced complexity.

How It Works

CUGA combines the planner-executor and code-act patterns, adding structured planning and variable management to improve reliability. It offers configurable reasoning modes (from fast to accurate) and flexible tool integration (OpenAPI, MCP, LangChain). Its composable design lets CUGA act as a tool inside multi-agent systems. Policy-aware execution and save-and-reuse are available as experimental features.
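To make the planner-executor idea with variable management concrete, here is a minimal Python sketch. It is not CUGA's implementation: the `Step`, `VariableStore`, and `execute_plan` names, the `$name` reference syntax, and the two toy tools are all assumptions made for illustration. The plan is hard-coded, whereas a real agent would typically have a planner (for example an LLM) produce it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Step:
    """One planned action: call `tool` with `args`, store the result in `output_var`."""
    tool: str
    args: dict[str, Any]
    output_var: str


@dataclass
class VariableStore:
    """Holds named intermediate results so later steps can reference them."""
    values: dict[str, Any] = field(default_factory=dict)

    def resolve(self, value: Any) -> Any:
        # Strings of the form "$name" are replaced by the stored variable.
        if isinstance(value, str) and value.startswith("$"):
            return self.values[value[1:]]
        return value


def execute_plan(plan: list[Step], tools: dict[str, Callable[..., Any]]) -> VariableStore:
    """Run each step in order, resolving variable references before each call."""
    store = VariableStore()
    for step in plan:
        kwargs = {name: store.resolve(arg) for name, arg in step.args.items()}
        store.values[step.output_var] = tools[step.tool](**kwargs)
    return store


# Hypothetical tools standing in for real API calls.
tools = {
    "list_orders": lambda customer: [{"id": 1, "total": 40.0}, {"id": 2, "total": 60.0}],
    "sum_totals": lambda orders: sum(o["total"] for o in orders),
}

# A fixed plan; in a real agent a planner would generate these steps.
plan = [
    Step("list_orders", {"customer": "acme"}, "orders"),
    Step("sum_totals", {"orders": "$orders"}, "total"),
]

result = execute_plan(plan, tools)
print(result.values["total"])
```

Keeping intermediate results in a named store, rather than passing raw text between steps, is one plausible way to get the reliability benefit the summary attributes to variable management: each step reads exactly the value an earlier step produced.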

Quick Start & Requirements

Primary Install: Clone the repo, create a Python 3.12 environment with uv, run uv sync, then configure .env with API keys.
Run Command: cuga start demo.
Prerequisites: Python 3.12+, uv. Optional: Docker/Podman (sandbox), Playwright (hybrid).
Links: Docs: https://docs.cuga.dev, Discord: https://discord.gg/aH6rAEEW, Hugging Face: https://huggingface.co/spaces/ibm-research/cuga-agent.
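Before running the install steps, it can help to confirm the prerequisites listed above. The following is a small, hypothetical preflight script, not part of the project: the `preflight` function name and the assumption that `.env` lives in the project root are illustrative only. It uses only the Python standard library.

```python
import shutil
import sys
from pathlib import Path


def preflight(project_dir: str = ".") -> dict[str, bool]:
    """Report which of the listed prerequisites appear to be satisfied."""
    return {
        # The summary lists Python 3.12+ as a prerequisite.
        "python>=3.12": sys.version_info >= (3, 12),
        # uv must be installed and discoverable on PATH.
        "uv on PATH": shutil.which("uv") is not None,
        # Assumes .env sits in the project root (an assumption of this sketch).
        ".env present": (Path(project_dir) / ".env").is_file(),
        # Optional: only needed for the sandboxed execution mode.
        "docker or podman (optional)": any(
            shutil.which(binary) is not None for binary in ("docker", "podman")
        ),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'ok' if ok else 'missing':8} {name}")
```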

Highlighted Details

Benchmark Performance: #1 on AppWorld (750 tasks) and top-tier on WebArena.
Key Features: High-performance generalist agent, configurable reasoning modes, and OpenAPI/MCP/LangChain tool integration.
Composable Architecture: CUGA can be a tool for other agents, enabling nested reasoning.
Task Modes: API-only, Web-only (extension), and Hybrid modes for versatile workflows.
Security: Optional secure code execution sandbox via Docker/Podman.
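The composable-architecture point above, where one agent is exposed as a tool to another, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not CUGA's API: `SimpleAgent`, `as_tool`, and the keyword-based routing are all hypothetical stand-ins for real planning and tool selection.

```python
from typing import Callable

# A tool is modeled here as any function from a task string to a result string.
Tool = Callable[[str], str]


class SimpleAgent:
    """Toy agent that routes a task to the first tool whose name appears in it."""

    def __init__(self, name: str, tools: dict[str, Tool]) -> None:
        self.name = name
        self.tools = tools

    def run(self, task: str) -> str:
        for tool_name, tool in self.tools.items():
            if tool_name in task:
                return tool(task)
        return f"{self.name}: no tool for {task!r}"

    def as_tool(self) -> Tool:
        # Exposing `run` as a plain callable lets another agent treat this
        # agent like any other tool, which is what enables nesting.
        return self.run


inner = SimpleAgent("inner", {"weather": lambda task: "sunny"})
outer = SimpleAgent("outer", {"research": inner.as_tool()})

print(outer.run("research the weather"))  # prints "sunny"
```

Here the outer agent matches the "research" tool, which is the inner agent; the inner agent then matches its own "weather" tool and returns the answer.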

Maintenance & Community

The project encourages community contributions (use cases, features, bugs) via GitHub Issues. A Discord server is available for engagement. The roadmap includes policy support and performance enhancements.

Licensing & Compatibility

The README content provided does not state a specific open-source license, and it does not say whether commercial use is permitted.

Limitations & Caveats

Features like policy-aware instructions, human-in-the-loop, and save-and-reuse are experimental. Default local Python execution is faster but less secure than the optional Docker/Podman sandbox.

Health Check

Last Commit

1 week ago

Responsiveness

Inactive

Star History

32 stars in the last 30 days