modelrelay by ellipticmarketing

Local router for optimizing AI coding model selection

Created 2 months ago
279 stars

Top 93.0% on SourcePulse

View on GitHub
Project Summary

Summary

Modelrelay is an OpenAI-compatible local router that benchmarks and dynamically selects the best-performing free coding AI models across providers. It offers developers a cost-effective solution by automatically routing requests to the fastest, most capable LLM, eliminating direct API payments and simplifying integration.

How It Works

This project acts as a local API gateway, continuously benchmarking free AI coding models from providers such as NVIDIA, Groq, and Ollama. Upon receiving a request, modelrelay selects the optimal backend model based on speed and capability and forwards the query to it. Its OpenAI-compatible interface makes it a drop-in replacement for direct API calls, so existing applications integrate without changes.
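The speed-based selection described above can be sketched roughly as follows. This is an illustrative toy, not modelrelay's actual routing code; the provider names come from the summary, but the latency figures and function name are assumptions.

```python
# Toy sketch of latency-based routing: given benchmark results for
# providers serving the same model, pick the fastest healthy backend.
# Illustrative only -- not modelrelay's actual implementation.

def pick_fastest(benchmarks: dict) -> str:
    """benchmarks maps provider name -> measured latency in seconds
    (None means the provider failed its last benchmark)."""
    healthy = {p: t for p, t in benchmarks.items() if t is not None}
    if not healthy:
        raise RuntimeError("no healthy provider available")
    return min(healthy, key=healthy.get)

# Hypothetical benchmark results for one model:
results = {"nvidia": 0.42, "groq": 0.18, "ollama": None}
print(pick_fastest(results))  # -> groq
```

In practice such a router would also re-benchmark periodically and fall back to the next-fastest provider on failure, which is what continuous evaluation buys over a static preference list.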

Quick Start & Requirements

Install via npm (npm install -g modelrelay) or Docker. With npm, run modelrelay after installation; the service listens at http://localhost:7352/. With Docker, fetch the repository and run docker compose up -d --build (requires Docker Engine with Compose). Both methods expose an OpenAI-compatible endpoint at http://127.0.0.1:7352/v1.
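Because the endpoint is OpenAI-compatible, a standard chat-completions payload works unchanged. A minimal sketch using only the Python standard library; the model ID is one of the grouped IDs this summary lists, the prompt is illustrative, and actually sending the request assumes the service is running locally:

```python
import json
import urllib.request

# Standard OpenAI-style chat-completions request, pointed at the
# local router instead of a hosted provider.
url = "http://127.0.0.1:7352/v1/chat/completions"
payload = {
    "model": "minimax-m2.5",  # grouped model ID listed in the summary
    "messages": [
        {"role": "user", "content": "Write a binary search in Python."}
    ],
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Sending requires the modelrelay service to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```

The same drop-in substitution works with any OpenAI SDK by setting its base URL to http://127.0.0.1:7352/v1.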

Highlighted Details

  • Supports 80+ models from 11+ providers (e.g., Kimi K2.5, Minimax M2.5, Deepseek V3.2).
  • Features auto-fastest routing and grouped model IDs (e.g., minimax-m2.5) for provider-specific QoS selection.
  • Offers seamless integration with OpenClaw and OpenCode via modelrelay onboard.
  • Supports streaming and non-streaming API requests.
  • Includes an automatic update mechanism for the CLI tool, enabled by default.
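Streaming and non-streaming requests differ only in the standard "stream" flag of the OpenAI-style payload. A small sketch; the model ID and prompt are illustrative:

```python
# The same OpenAI-style payload serves both modes; streaming is the
# standard "stream" flag, which makes the server emit incremental
# SSE chunks instead of one JSON response body.
base = {
    "model": "minimax-m2.5",
    "messages": [{"role": "user", "content": "Explain quicksort briefly."}],
}
non_streaming = dict(base)            # single JSON response
streaming = {**base, "stream": True}  # incremental SSE chunks
print(streaming["stream"])
```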

Maintenance & Community

Community support, discussions, and feature requests are managed via a dedicated Discord server. The project provides robust CLI commands for updates (modelrelay update, modelrelay autoupdate), configuration management, and service status, indicating active development.

Licensing & Compatibility

The repository's README does not explicitly state a software license. This lack of clear licensing information is a significant adoption blocker, especially for commercial use. Its OpenAI-compatible API facilitates broad compatibility with existing AI tooling.

Limitations & Caveats

The primary limitation is the absence of a declared software license, leaving usage rights ambiguous. The project relies on free tiers from external providers, whose terms are subject to change. The README offers limited insight into known bugs or performance bottlenecks beyond the dynamic routing capabilities.

Health Check

  • Last Commit: 19 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 8
  • Issues (30d): 14
  • Star History: 84 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Johannes Hagemann (Cofounder of Prime Intellect), and 3 more.

minions by HazyResearch

0.2% · 1k stars
Communication protocol for cost-efficient LLM collaboration
Created 1 year ago · Updated 1 month ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Jeff Hammerbacher (Cofounder of Cloudera).

LLMRouter by ulab-uiuc

2.8% · 2k stars
Optimize LLM inference with intelligent routing
Created 6 months ago · Updated 1 month ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 10 more.

RouteLLM by lm-sys

0.4% · 5k stars
Framework for LLM routing and cost reduction (research paper)
Created 1 year ago · Updated 1 year ago