LLMRouter by ulab-uiuc

Optimize LLM inference with intelligent routing

Created 3 months ago
1,030 stars

Top 36.4% on SourcePulse

Project Summary

LLMRouter is an open-source library that dynamically routes each query to the most suitable Large Language Model (LLM), balancing inference cost against performance. It targets researchers and developers who manage deployments spanning multiple LLMs. The library offers a unified command-line interface (CLI) and a data generation pipeline that together cover training, deploying, and managing diverse LLM routing strategies.

How It Works

LLMRouter selects an LLM for each query based on task complexity, cost constraints, and performance requirements. It supports more than 16 routing models in four categories: single-round, multi-round, agentic, and personalized. The underlying techniques include K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Multi-Layer Perceptrons (MLP), Matrix Factorization, Elo Rating, graph-based methods, and BERT-based routing, which gives users a range of options for different use cases.
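To make the KNN idea concrete, here is a minimal sketch of nearest-neighbor routing. It is not LLMRouter's implementation: the function names, the toy two-dimensional embeddings, and the model names "small-model" and "large-model" are all hypothetical. It assumes each past query has already been labeled with the model that handled it best.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_route(query_embedding, history, k=3):
    """Pick a model by majority vote among the k most similar past queries.

    `history` is a list of (embedding, best_model_name) pairs (hypothetical format).
    Ties go to the model that appears first among the nearest neighbors.
    """
    if not history:
        raise ValueError("history must contain at least one labeled query")
    ranked = sorted(
        history,
        key=lambda item: cosine_similarity(query_embedding, item[0]),
        reverse=True,
    )
    votes = Counter(model for _, model in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy labeled history: hypothetical embeddings and model names.
HISTORY = [
    ([1.0, 0.0], "small-model"),
    ([0.9, 0.1], "small-model"),
    ([0.0, 1.0], "large-model"),
    ([0.1, 0.9], "large-model"),
    ([0.2, 0.8], "large-model"),
]

if __name__ == "__main__":
    print(knn_route([1.0, 0.05], HISTORY, k=3))
```

A real router would replace the toy vectors with embeddings from a sentence encoder and would typically weigh cost alongside the vote.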

Quick Start & Requirements

Installation is available via PyPI (pip install llmrouter-lib) or from source for editable installs. Source installation requires Python 3.10; optional features such as router-r1 may additionally need specific versions of PyTorch (e.g., 2.4.0) and vLLM (e.g., 0.6.3). Inference, chat, and data generation all require the API_KEYS environment variable to be set with valid LLM API keys. Configuration is managed through YAML files, which allow API endpoints to be specified per model or at the router level.
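Because most functionality depends on API_KEYS, a small pre-flight check can catch a missing variable before a long run. The sketch below is a hypothetical helper, not part of LLMRouter. This summary does not specify the expected format of the value, so the check only verifies that the variable is present and non-empty.

```python
import os


def require_api_keys(environ=os.environ):
    """Return the raw API_KEYS value, or raise if it is missing or blank.

    Hypothetical helper: the value is not parsed, because its expected
    format is not described in this summary.
    """
    value = environ.get("API_KEYS", "").strip()
    if not value:
        raise RuntimeError(
            "API_KEYS is not set; export it before running inference, "
            "chat, or data generation."
        )
    return value


if __name__ == "__main__":
    try:
        require_api_keys()
        print("API_KEYS is set")
    except RuntimeError as err:
        print(err)
```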

Highlighted Details

  • Supports 16+ diverse routing models across four categories: single-round, multi-round, agentic, and personalized.
  • Features a unified CLI for training, inference, and interactive chat via a Gradio-based UI.
  • Includes a complete data generation pipeline utilizing 11 benchmark datasets for creating training data with embeddings and performance metrics.
  • Offers an extensible plugin system for easily adding custom router implementations and task definitions.
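
As a rough illustration of what a router plugin system can look like, the sketch below uses a decorator-based registry. This is a generic pattern, not LLMRouter's actual plugin API: the registry, the decorator, the router signature, and the 20-word threshold are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a router name to a routing function.
RouterFn = Callable[[str, List[str]], str]
ROUTER_REGISTRY: Dict[str, RouterFn] = {}


def register_router(name: str) -> Callable[[RouterFn], RouterFn]:
    """Decorator that registers a routing function under `name`."""
    def decorator(func: RouterFn) -> RouterFn:
        if name in ROUTER_REGISTRY:
            raise ValueError(f"router {name!r} is already registered")
        ROUTER_REGISTRY[name] = func
        return func
    return decorator


@register_router("length_threshold")
def length_threshold_router(query: str, models: List[str]) -> str:
    """Toy custom router: short queries go to the first model in the list,
    longer ones to the last (threshold of 20 words is arbitrary)."""
    return models[0] if len(query.split()) < 20 else models[-1]


def route(router_name: str, query: str, models: List[str]) -> str:
    """Look up a registered router by name and apply it to the query."""
    return ROUTER_REGISTRY[router_name](query, models)


if __name__ == "__main__":
    print(route("length_threshold", "What is 2 + 2?", ["small-model", "large-model"]))
```

Registering by name keeps the dispatch code unchanged when a new strategy is added, which is the main appeal of this style of extension point.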

Maintenance & Community

The project is presented as a "living, extensible research framework" actively welcoming community contributions, including new routing strategies and training paradigms. While specific community channels like Discord or Slack are not detailed, the repository encourages pull requests for integration. Several research papers are acknowledged as inspirations for the router implementations.

Licensing & Compatibility

The provided README does not explicitly state the software license. This omission requires further investigation before adoption, particularly concerning commercial use or integration into closed-source projects.

Limitations & Caveats

Future development areas identified in the TODO list include improving personalized routers, integrating multimodal routing capabilities, and adding continual/online learning for routers. The necessity of configuring API keys for most functionalities is a practical consideration for deployment.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 60
  • Issues (30d): 12
  • Star History: 1,053 stars in the last 30 days
