llm-router by kcolemangt

Reverse proxy for routing chat/completions API requests to OpenAI-compatible LLMs

Created 1 year ago

369 stars

Top 76.5% on SourcePulse

View on GitHub

2 Experts Love This Project

Chris Van Pelt

Cofounder of Weights & Biases

Ed Huang

Cofounder of PingCAP

Project Summary

This project provides a reverse proxy for routing chat and completions API requests to multiple LLM providers, including OpenAI, Groq, and local Ollama instances. It addresses the limitations of tools like Cursor, which struggle with integrating local models or easily switching between external providers. The primary benefit is enabling seamless access to a diverse range of LLMs through a single, configurable interface.

How It Works

LLM-router acts as a reverse proxy, intercepting API requests and forwarding them to different LLM backends based on configured model prefixes. It supports features like model aliasing (mapping client-recognized model names to specific backend models) and role rewrites (translating custom message roles to backend-compatible ones). This approach allows for optimized prompting strategies and ensures compatibility across various LLM providers and client applications.

Quick Start & Requirements

Install/Run: Download pre-compiled binaries or build from source. Run ./llm-router-<os>-<arch>.
Prerequisites:
- ngrok for creating a public HTTPS endpoint for local models.
- API keys for services like OpenAI and Groq (via environment variables or .env file).
- macOS users may need to adjust file permissions or use spctl to bypass Gatekeeper.
Setup: Requires configuration of config.json to define backends and routing rules.
Links: GitHub Repository

Highlighted Details

Routes requests to OpenAI, Groq, and local Ollama (or any OpenAI-compatible backend).
Supports streaming responses.
Enables model aliasing for custom model name mapping.
Allows role rewrites for message role compatibility.
Can strip unsupported parameters from requests.

Maintenance & Community

Project maintained by kcolemangt.
Contact: X (Twitter) @kcolemangt, LinkedIn Keith.

Licensing & Compatibility

License: Not explicitly stated in the README.
Compatibility: Designed to work with clients like Cursor by overriding their base URL settings.

Limitations & Caveats

The project is distributed as pre-compiled binaries, but the license is not specified, which may impact commercial use. macOS users may encounter Gatekeeper issues requiring manual intervention. The effectiveness of "optimized reasoning prompts" relies on the client's implementation and the specific models used.

Health Check

Last Commit

9 months ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

3 stars in the last 30 days