opensquilla by opensquilla

AI agent for token-efficient intelligence and cost optimization

Created 2 months ago

5,712 stars

Top 8.7% on SourcePulse

Project Summary

OpenSquilla is a token-efficient AI agent designed to maximize capability within a fixed budget. It targets developers and power users who want capable AI interactions at lower cost, and it combines per-turn model routing, persistent memory, and integration with more than 20 LLM providers.

How It Works

OpenSquilla routes each turn through its local SquillaRouter, which combines LightGBM, an ONNX BGE classifier, and semantic embeddings to select the most cost-effective LLM capable of handling that turn. Two further mechanisms reduce token waste: adaptive reasoning, so reasoning tokens are billed only when a turn needs deep thought, and on-demand skills, so only the skills a turn needs are loaded. The architecture also includes persistent four-tier cognitive memory and a layered security sandbox.
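The "cheapest capable model" idea can be sketched in a few lines. This is only an illustration, not SquillaRouter's code: the `Model` class, the model names, the prices, and the length-based `estimate_tier` heuristic are all hypothetical stand-ins (the real router is described as using LightGBM, an ONNX BGE classifier, and semantic embeddings rather than word counts).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical label, not a real provider model
    tier: int                  # 1 (lightest) .. 4 (most capable)
    cost_per_1k_tokens: float  # made-up price, for illustration only


# Hypothetical catalogue with one model per tier.
MODELS = [
    Model("small-local", 1, 0.0),
    Model("mid-hosted", 2, 0.5),
    Model("large-hosted", 3, 3.0),
    Model("frontier", 4, 15.0),
]


def estimate_tier(prompt: str) -> int:
    """Stand-in for the classifier: a crude word-count heuristic."""
    words = len(prompt.split())
    if words < 20:
        return 1
    if words < 100:
        return 2
    if words < 400:
        return 3
    return 4


def route(required_tier: int, models: list[Model]) -> Model:
    """Return the cheapest model whose tier is at least required_tier."""
    capable = [m for m in models if m.tier >= required_tier]
    if not capable:
        raise ValueError(f"no model available for tier {required_tier}")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


if __name__ == "__main__":
    prompt = "Summarise this paragraph in one sentence."
    print(route(estimate_tier(prompt), MODELS).name)
```

Separating tier estimation from model selection keeps the cost rule trivial: whatever produces the tier, the selection step filters to capable models and takes the minimum price.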

Quick Start & Requirements

Users can opt for a recommended preview release package (Windows portable zip) or install from source.

Prerequisites: Git and Git LFS (for source installs), uv (recommended installer, falls back to pip), Python 3.12+ (optional for portable, required for pip fallback/development), and Windows Visual C++ runtime (for bundled router on Windows).
Installation: Download and extract the preview package, or clone the repo and run install.sh/install.ps1.
Configuration: Use opensquilla onboard for interactive setup, or non-interactive methods for automation, specifying LLM providers and API keys.
Running: Start the gateway with opensquilla gateway run and access the Web UI at http://127.0.0.1:18790/control/.
Docs: The README links to installation guides for Git, Git LFS, and uv.
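Before a source install, it can help to confirm the prerequisites listed above are present. The following is a minimal sketch using only the Python standard library; the `check_prerequisites` helper and its report keys are hypothetical and not part of OpenSquilla. It assumes the tools are exposed on PATH under their usual executable names (`git`, `git-lfs`, `uv`).

```python
import shutil
import sys

# Needed for source installs according to the prerequisites above.
REQUIRED_TOOLS = ("git", "git-lfs")
# Recommended installer; the install falls back to pip when it is absent.
OPTIONAL_TOOLS = ("uv",)


def check_prerequisites(which=shutil.which, version=sys.version_info[:2]):
    """Report which prerequisites are available.

    `which` and `version` are injectable so the check can be exercised
    without depending on the local machine.
    """
    report = {
        tool: which(tool) is not None
        for tool in REQUIRED_TOOLS + OPTIONAL_TOOLS
    }
    # Python 3.12+ is required for the pip fallback and for development.
    report["python>=3.12"] = tuple(version) >= (3, 12)
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A missing `uv` is not fatal, since the installer is described as falling back to pip, but in that case the Python 3.12+ requirement applies.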

Highlighted Details

Token-Efficient Routing: Local SquillaRouter routes turns across four tiers using hybrid features and on-device classification, selecting the cheapest capable model.
Adaptive Reasoning & Skills: Reasoning-token billing is demand-driven, and only necessary skills are loaded, preventing token waste.
Four-Tier Cognitive Memory: Supports working, episodic, semantic, and raw memory tiers with hybrid keyword and semantic search using local ONNX embeddings.
Layered Security Sandbox: Offers Standard, Strict, and Locked policy tiers with Bubblewrap (Linux) or Seatbelt (macOS) for isolated code execution, plus denial ledgers and input sanitization.
Broad LLM Support: Integrates with over 20 LLM providers including OpenAI, Anthropic, Gemini, Ollama, and Groq via a pluggable provider layer.
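The hybrid keyword and semantic search mentioned for the memory tiers can be illustrated with a small scoring sketch. Everything here is an assumption made for illustration: the `embed` function is a toy hashed-trigram vector standing in for the local ONNX embeddings, and the `alpha` weighting is one common way to blend the two scores, not necessarily how OpenSquilla does it.

```python
import math
import zlib

DIM = 256  # size of the toy embedding vector


def embed(text: str) -> list[float]:
    """Toy embedding: hashed character trigrams, L2-normalised.

    A stand-in for a real embedding model; crc32 keeps it deterministic.
    """
    vec = [0.0] * DIM
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        vec[zlib.crc32(lowered[i:i + 3].encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def keyword_score(query: str, doc: str) -> float:
    """Fraction of query words that appear verbatim in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def semantic_score(query: str, doc: str) -> float:
    """Cosine similarity of the (already normalised) toy embeddings."""
    return sum(a * b for a, b in zip(embed(query), embed(doc)))


def hybrid_search(query: str, docs: list[str], alpha: float = 0.5, top_k: int = 3):
    """Rank docs by alpha * keyword score + (1 - alpha) * semantic score."""
    scored = [
        (alpha * keyword_score(query, d) + (1 - alpha) * semantic_score(query, d), d)
        for d in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    memories = [
        "user prefers dark mode in the editor",
        "deploy script lives in tools/deploy.sh",
        "meeting notes from tuesday standup",
    ]
    for score, text in hybrid_search("dark mode preference", memories):
        print(f"{score:.3f}  {text}")
```

Blending the two signals lets exact term matches and looser similarity both contribute: in the example query, "preference" has no verbatim match in any memory, yet it still overlaps with "prefers" at the trigram level.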

Maintenance & Community

The project welcomes contributions via GitHub issues and pull requests. The README does not name maintainers, sponsors, or dedicated community channels such as Discord or Slack.

Licensing & Compatibility

The README does not specify a software license. This absence is a significant factor for potential adopters, as it leaves licensing terms and commercial use compatibility undefined.

Limitations & Caveats

The macOS security sandbox backend currently only renders SBPL profiles; process execution is still pending. On Windows, onnxruntime may require the Visual C++ Redistributable to be installed manually before the bundled router works correctly. The lack of a stated license is a primary adoption blocker.

Health Check

Last Commit: 1 day ago
Responsiveness: Inactive
Pull Requests (30d): 255

Issues (30d)

Star History

2,221 stars in the last 30 days