CodexSaver by fendouai

Cost-aware LLM routing for coding

Created 3 weeks ago

546 stars

Top 57.9% on SourcePulse

Project Summary

CodexSaver addresses the high cost of using a powerful LLM such as Codex for routine coding tasks. It runs as an MCP (Model Context Protocol) tool that routes low-risk development work (e.g., tests, docs, search) to cheaper worker LLMs while keeping high-risk judgment and final review with Codex. The aim is to cut costs without lowering the quality of critical decisions.

How It Works

The system employs a Router to classify tasks by risk. Low-risk tasks are delegated via a Provider Client to a configured worker LLM (defaulting to DeepSeek), with context managed by a Context Packer. A Verifier validates outputs, and a Cost Estimator tracks savings. High-risk, ambiguous, or sensitive tasks (architecture, security, final review) remain with Codex. This approach ensures expensive models handle judgment, while cheaper models manage volume.
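The routing step described above can be illustrated with a minimal rule-based sketch. It is not CodexSaver's actual code: the rule sets, the route function, and the label values "worker" and "codex" are assumptions made for illustration. Only the field names delegated_execution and route_label are taken from the project's description of its interaction block.

```python
from dataclasses import dataclass

# Hypothetical rule sets; CodexSaver's real rules are not shown in this summary.
LOW_RISK_KINDS = {"tests", "docs", "search"}
GUARD_KEYWORDS = ("architecture", "security", "final review", "production logic")


@dataclass
class Decision:
    route_label: str           # "worker" or "codex" (illustrative values)
    delegated_execution: bool  # True when a cheaper worker LLM handles the task
    reason: str


def route(task_kind: str, description: str) -> Decision:
    """Classify a task by risk and decide which model handles it."""
    text = description.lower()
    # Guards are checked first: sensitive work stays with the expensive model.
    for keyword in GUARD_KEYWORDS:
        if keyword in text:
            return Decision("codex", False, f"guard keyword: {keyword}")
    if task_kind in LOW_RISK_KINDS:
        return Decision("worker", True, f"low-risk kind: {task_kind}")
    # Unknown or ambiguous work defaults to the expensive model.
    return Decision("codex", False, "ambiguous task")


print(route("tests", "add unit tests for the date parser"))
print(route("docs", "document the security model"))
```

Checking guards before the low-risk list means a nominally cheap task (a docs change that touches security, say) is still retained, which matches the stated rule that sensitive work remains with Codex.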

Quick Start & Requirements

Installation involves cloning the repository, setting an API key for the chosen provider (e.g., DeepSeek), and running python cli.py install, which creates a global MCP entry in ~/.codex/config.toml. Project-local installation is also supported via python cli.py install --project. Python and Git are required, and most hosted providers need an API key. Setup is quick: a "60-Second Demo" is provided, and a doctor command verifies the installation.

Highlighted Details

  • Cost Savings: Benchmarks show an average estimated saving of 48.4%, reducing the cost index from 1.00 to 0.52 across typical low-risk development tasks.
  • Transparent Routing: LLM responses include an interaction block detailing the routing decision (e.g., delegated_execution, route_label), making the tool's activity visible.
  • Flexible Provider Support: Integrates with DeepSeek, OpenAI, Anthropic, Gemini, Qwen, Ollama, LM Studio, and custom OpenAI-compatible endpoints.
  • End-to-End Verification: Features a robust verification flow and a doctor command to confirm setup readiness and provider configuration.
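The two cost figures above are related by simple arithmetic, sketched here. The function names are illustrative and not part of CodexSaver. Computing from the rounded index 0.52 gives a 48.0% saving, while the reported 48.4% corresponds to an index of 0.516, which rounds to 0.52, so the two numbers are consistent.

```python
def saving_percent(baseline_index: float, routed_index: float) -> float:
    """Percentage saved when the cost index drops from baseline to routed."""
    return (baseline_index - routed_index) / baseline_index * 100.0


def index_after_saving(saving_pct: float, baseline_index: float = 1.0) -> float:
    """Cost index that remains after a given percentage saving."""
    return baseline_index * (1.0 - saving_pct / 100.0)


print(round(saving_percent(1.00, 0.52), 1))   # saving implied by the rounded index
print(round(index_after_saving(48.4), 3))     # index implied by the reported saving
```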

Maintenance & Community

The provided README does not detail specific contributors, sponsorships, or community channels (e.g., Discord, Slack).

Licensing & Compatibility

The repository's license is not specified in the README. Licensing terms should be clarified before commercial use or integration into closed-source projects.

Limitations & Caveats

Current routing is primarily rule-based; advanced "cost-aware dynamic routing" and "cost-aware provider selection" are listed as future roadmap items. The system includes explicit keyword guards (e.g., "production logic") to retain tasks with Codex, indicating a deliberate, rather than fully adaptive, risk assessment. The lack of a stated license is a significant adoption blocker.

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive
Pull Requests (30d): 1
Issues (30d): 1
Star History: 550 stars in the last 21 days
