NadirClaw by NadirRouter

LLM routing and AI cost optimization proxy

Created 1 month ago
363 stars

Top 77.5% on SourcePulse

Project Summary

NadirClaw is an open-source LLM router and AI cost optimizer designed to significantly reduce API expenses by intelligently routing prompts. It acts as a local, OpenAI-compatible proxy, directing simple requests to cheaper or local models and complex ones to premium models, saving users 40-70% on AI API costs without requiring code changes.

How It Works

NadirClaw functions as a self-hosted, OpenAI-compatible proxy that intercepts LLM requests. It employs a lightweight classifier using sentence embeddings (all-MiniLM-L6-v2) and pre-computed centroids to determine prompt complexity in approximately 10ms. Based on this classification, it routes requests to either a designated simple/cheap model or a complex/premium model, ensuring optimal cost-efficiency and performance. The system also incorporates advanced routing modifiers for agentic tasks, reasoning, session persistence, and context window management.
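The nearest-centroid idea can be sketched as follows. This is an illustrative toy, not NadirClaw's actual internals: the embedding vectors, centroid values, and model names are made up (the real router embeds prompts with all-MiniLM-L6-v2 into a 384-dimensional space and compares against centroids pre-computed from labeled prompts).

```python
import math

# Illustrative centroids in a toy 3-dimensional embedding space;
# NadirClaw's real centroids are pre-computed in the 384-dim
# all-MiniLM-L6-v2 space.
CENTROIDS = {
    "simple": [0.9, 0.1, 0.0],
    "complex": [0.1, 0.8, 0.6],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def route(embedding, simple_model="ollama/llama3", premium_model="gpt-4o"):
    """Pick a backend model by nearest-centroid classification.

    `embedding` stands in for the sentence embedding of the prompt;
    the model names are placeholders for whatever the user configured.
    """
    label = max(CENTROIDS, key=lambda k: cosine(embedding, CENTROIDS[k]))
    return simple_model if label == "simple" else premium_model
```

Because classification is a handful of dot products against fixed centroids rather than an LLM call, the per-prompt overhead stays in the ~10ms range.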

Quick Start & Requirements

  • Installation: pip install nadirclaw (recommended) or via a shell script (curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh). Docker Compose is also available for a local setup with Ollama.
  • Prerequisites: Python 3.10+, Git. Requires API keys for cloud providers (OpenAI, Anthropic, Google Gemini) or a local LLM setup (e.g., Ollama). OAuth login is supported for major providers.
  • Setup: An interactive setup wizard (nadirclaw setup) guides users through provider configuration, API key entry, and model selection.
  • Running: nadirclaw serve --verbose starts the proxy.
  • Links: Installation script URL: https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh.
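Because the proxy speaks the OpenAI API, existing clients only need to point their base URL at it. A minimal standard-library sketch is below; the host, port, and path are assumptions (take the actual address from your `nadirclaw serve` output), and `"auto"` is a placeholder model name since the router chooses the real backend.

```python
import json
import urllib.request

# Assumed endpoint -- adjust host/port to match `nadirclaw serve` output.
NADIRCLAW_URL = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request aimed at the proxy."""
    payload = {
        "model": model,  # the router picks the actual backend model
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        NADIRCLAW_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Summarize this paragraph in one sentence.")
# Send with urllib.request.urlopen(req) once the proxy is running.
```

Any OpenAI SDK works the same way by setting its base URL to the proxy address, which is what makes the router a drop-in change.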

Highlighted Details

  • Cost Savings: Achieves 40-70% reduction in AI API costs.
  • Low Latency: ~10ms classification overhead per prompt.
  • Drop-in Proxy: OpenAI-compatible API works with existing tools without code changes.
  • Local Execution: All API keys and data remain on the user's machine.
  • Advanced Routing: Detects agentic tasks, reasoning, session persistence, and context window needs.
  • Fallback Chains: Automatic failover for model unavailability or rate limits.
  • Cost Tracking: Built-in dashboard, reports, and budget alerts.
  • Multi-Provider Support: Integrates with Gemini, OpenAI, Anthropic, Ollama, and LiteLLM-supported providers.
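The fallback-chain behavior follows a common pattern: try each backend in order and move on when one is unavailable or rate-limited. The sketch below illustrates the idea only; the exception type and backend callables are hypothetical stand-ins, not NadirClaw's API.

```python
class BackendError(Exception):
    """Stand-in for provider failures such as rate limits or outages."""

def with_fallback(chain, prompt):
    """Call each backend in `chain` until one succeeds.

    `chain` is an ordered list of callables, e.g. premium model first,
    cheaper or local models as fallbacks.
    """
    last_err = None
    for backend in chain:
        try:
            return backend(prompt)
        except BackendError as err:
            last_err = err  # remember why this backend failed, try the next
    raise RuntimeError("all backends in the chain failed") from last_err
```

Ordering the chain from preferred to last-resort means rate limits degrade quality gracefully instead of failing the request outright.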

Maintenance & Community

No specific details on active maintenance, notable contributors, sponsorships, or community channels (like Discord/Slack) are present in the provided README.

Licensing & Compatibility

Licensed under MIT, which permits commercial use and imposes no copyleft restrictions. Compatible with any tool that speaks the OpenAI API, and runs entirely on the user's machine.

Limitations & Caveats

Loading the sentence embedding model introduces a ~2-3 second delay on the very first request. Intelligent routing also does not remove the need to manage API keys or local model deployments yourself.

Health Check

  • Last Commit: 6 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 22
  • Issues (30d): 16
  • Star History: 143 stars in the last 30 days
