sutando  by sonichi

Autonomous AI assistant for real-time interaction and self-improvement

Created 1 month ago
313 stars

Top 86.1% on SourcePulse

Project Summary

Sutando is an open-source AI agent designed to act as a personal "Stand": it learns user patterns and performs tasks autonomously across communication channels and applications. It targets technically savvy users who want a deeply integrated, self-improving assistant that runs locally, with real-time voice, vision, and interaction capabilities and few external dependencies beyond a Claude Code subscription and a Gemini API key. Its main benefit is an assistant that not only helps with tasks but also evolves, writing its own code based on user interaction and observed patterns.

How It Works

Sutando uses a multi-process architecture with three agents: a Voice agent (Gemini Live) for real-time interaction in the browser, a Conversation server (Gemini Live) for phone calls, and a Core agent (Claude Code CLI) that executes tasks with full system access. The agent that receives an input queues a task in tasks/; the Core agent executes it and writes the result to results/, and the result is then relayed back through the appropriate channel. A cron job triggers an autonomous build loop that monitors Sutando's own performance, detects user work patterns, discovers new skills, and generates code improvements, with the goal of continuous self-enhancement.
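
The tasks/ and results/ hand-off described above can be pictured as a file-based queue. The sketch below is illustrative only and is not taken from the Sutando codebase: the JSON file layout, the function names (enqueue_task, run_pending, read_result), and the handler callback standing in for the Core agent are all assumptions made for this example.

```python
import json
import tempfile
import uuid
from pathlib import Path


def enqueue_task(root: Path, channel: str, text: str) -> str:
    """Write one task file into tasks/ and return its id (hypothetical layout)."""
    tasks = root / "tasks"
    tasks.mkdir(parents=True, exist_ok=True)
    task_id = uuid.uuid4().hex
    payload = {"id": task_id, "channel": channel, "text": text}
    (tasks / f"{task_id}.json").write_text(json.dumps(payload))
    return task_id


def run_pending(root: Path, handler) -> int:
    """Stand-in for the Core agent: handle each queued task, write a result file."""
    tasks = root / "tasks"
    results = root / "results"
    tasks.mkdir(parents=True, exist_ok=True)
    results.mkdir(parents=True, exist_ok=True)
    done = 0
    for task_file in sorted(tasks.glob("*.json")):
        task = json.loads(task_file.read_text())
        result = {
            "id": task["id"],
            "channel": task["channel"],  # lets the caller route the reply
            "output": handler(task["text"]),
        }
        (results / task_file.name).write_text(json.dumps(result))
        task_file.unlink()  # drop the task once its result exists
        done += 1
    return done


def read_result(root: Path, task_id: str):
    """Return the result dict for a task, or None if it is not ready yet."""
    path = root / "results" / f"{task_id}.json"
    return json.loads(path.read_text()) if path.exists() else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        tid = enqueue_task(root, "voice", "summarize my inbox")
        run_pending(root, handler=str.upper)  # str.upper replaces real work
        print(read_result(root, tid))
```

Keeping the channel name in both files is one simple way for a front-end agent to know where a finished result should be delivered; the real project may track this differently.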

Quick Start & Requirements

To get started, clone the repository (git clone https://github.com/sonichi/sutando.git), navigate into the directory, copy the example environment file (cp .env.example .env), add your GEMINI_API_KEY to .env, and run the startup script (bash src/startup.sh).

  • Primary Prerequisites: macOS 15+, Claude Code (run once to authenticate), Node.js 22+ (brew install node), fswatch (brew install fswatch), and a Gemini API key.
  • Optional Prerequisites: Twilio account and ngrok for phone call capabilities, ffmpeg for video/audio processing.
  • Documentation: Official app preview available at sutando.ai.
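
Before running the startup script, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of Sutando: it assumes the Claude Code CLI is installed under the binary name claude, that .env uses plain KEY=value lines, and that the helper names (missing_tools, node_major, env_has_key) are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Binary names are assumptions; "claude" is assumed to be the Claude Code CLI.
REQUIRED_TOOLS = ["node", "fswatch", "claude"]
OPTIONAL_TOOLS = ["ffmpeg", "ngrok"]  # only needed for media and phone calls


def missing_tools(tools, which=shutil.which):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v22.3.0'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


def env_has_key(env_text: str, key: str = "GEMINI_API_KEY") -> bool:
    """True if the .env text contains a non-empty KEY=value entry for key."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key and value.strip().strip("\"'"):
            return True
    return False


if __name__ == "__main__":
    env_file = Path(".env")
    env_text = env_file.read_text() if env_file.exists() else ""
    print("missing required tools:", missing_tools(REQUIRED_TOOLS))
    print("missing optional tools:", missing_tools(OPTIONAL_TOOLS))
    print("GEMINI_API_KEY set in .env:", env_has_key(env_text))
    if shutil.which("node"):
        out = subprocess.run(
            ["node", "--version"], capture_output=True, text=True
        ).stdout
        if out.strip():
            print("Node.js 22+:", node_major(out) >= 22)
```

The script only reports what it finds; it does not check the macOS version or whether Claude Code has been authenticated, both of which still need to be verified by hand.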

Highlighted Details

  • Autonomous Code Generation: A significant portion of Sutando's codebase was generated autonomously by its own build loop.
  • Deep System Integration: Interacts via voice, screen capture/analysis, calendar, email (Gmail integration), phone calls, meeting joining (Zoom/Meet), and can scale across multiple Macs.
  • Cost-Effective: Leverages existing Claude Code subscriptions ($20-$200/month) and free tiers of services like Gemini, with paid components (e.g., Twilio) being optional.
  • Security Focus: Runs locally with Claude Code's permission prompts bypassed (--dangerously-skip-permissions); safeguards include STIR/SHAKEN caller verification for phone calls, 3-tier access control, and auditable action logs.

Maintenance & Community

Sutando is currently in Alpha status, with a call for contributors to test and harden its capabilities. Community discussion and support are available via a Discord channel.

Licensing & Compatibility

The project is licensed under the MIT License, permitting broad use and modification. Its primary compatibility is with macOS 15+ due to specific system permission handling requirements.

Limitations & Caveats

As an alpha project, Sutando is in early development. It has a strict dependency on macOS 15+ for full functionality, particularly concerning system permission management (Screen Recording, Accessibility, Input Monitoring). Users must be comfortable granting these deep system access levels, understanding the associated security implications, and relying on a paid Claude Code subscription for core operation.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 417
  • Issues (30d): 100
  • Star History: 199 stars in the last 30 days
