OrcaRouter-Lite  by Continuum-AI-Corp

Self-hosted LLM router with intelligent model selection

Created 3 weeks ago

New!

683 stars

Top 49.3% on SourcePulse

GitHubView on GitHub
Project Summary

A self-hosted, OpenAI-compatible LLM router, OrcaRouter Lite addresses the need for efficient and cost-effective LLM API management. It targets developers and power users by offering automatic model selection based on capability and cost, a managed hosted fallback, and local analytics, simplifying integration and optimizing spend across various LLM providers.

How It Works

OrcaRouter Lite operates as a local server proxying LLM requests. Its key innovation is model="auto", which intelligently routes requests to the cheapest model capable of fulfilling the user's requirements (e.g., tools, vision, JSON mode) across configured providers. It supports streaming responses and provides a built-in dashboard for configuration and analytics, functioning without mandatory external databases like Postgres or Redis.

Quick Start & Requirements

  • Self-hosted (Path A): Clone the repository, copy .env.example to .env, add provider API keys (e.g., OPENAI_API_KEY), and run docker compose up. The API base URL is http://localhost:8000/v1.
  • Hosted (Path B): Register at https://www.orcarouter.ai, obtain an API key, and use https://api.orcarouter.ai/v1 as the base URL.
  • Prerequisites: Docker is required for the self-hosted option.
  • Documentation: Further details are available at docs.orcarouter.ai/introduction.

Highlighted Details

  • model="auto" Routing: Dynamically selects the cheapest capable LLM provider based on real-time pricing and feature requirements, eliminating manual routing logic.
  • OpenAI Compatibility: Seamlessly integrates with any tool or SDK designed for OpenAI's Chat Completions API.
  • Streaming Support: Provides drop-in compatibility for streaming Server-Sent Events (SSE) responses.
  • BYOK & Hosted Fallback: Supports Bring Your Own Keys for local providers and integrates with api.orcarouter.ai as a managed upstream provider.
  • Cross-Provider Prompt Caching: Caches deterministic requests across different LLM providers for zero-cost, instant retrieval.
  • Local Analytics Dashboard: Offers insights into spend, latency, savings, and model reachability without sending telemetry.
  • Extensive Model Catalog: Dynamically loads and manages over 100 chat models from LiteLLM's pricing database.

Maintenance & Community

The project has a clear roadmap with many features already implemented. No specific community channels (e.g., Discord, Slack) are listed in the README.

Licensing & Compatibility

The project is released under the MIT License, permitting commercial use and integration within closed-source applications.

Limitations & Caveats

This "single-workspace edition" deliberately omits features such as multi-tenancy, role-based access control (RBAC), single sign-on (SSO), billing management, and advanced administrative consoles. These capabilities are reserved for the hosted product or a forthcoming Teams edition.

Health Check
Last Commit

2 weeks ago

Responsiveness

Inactive

Pull Requests (30d)
36
Issues (30d)
1
Star History
715 stars in the last 24 days

Explore Similar Projects

Starred by Eric Zhang Eric Zhang(Founding Engineer at Modal) and Yineng Zhang Yineng Zhang(Inference Lead at SGLang; Research Scientist at Together AI).

smg by lightseekorg

5.6%
284
High-performance LLM gateway for diverse inference backends
Created 6 months ago
Updated 11 hours ago
Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems") and David Cramer David Cramer(Cofounder of Sentry).

llmgateway by theopenco

1.2%
1k
LLM API gateway for unified provider access
Created 1 year ago
Updated 10 hours ago
Feedback? Help us improve.