anyscale/llm-router: LLM router framework for optimized responses
This project introduces a framework for training LLM routers that dynamically route queries to either high-quality closed-source LLMs or cost-effective open-source LLMs. It targets developers building LLM-powered applications who need to balance response quality with operational costs. The primary benefit is significant cost reduction (up to 70%) on benchmarks like MT Bench, while maintaining high response quality.
How It Works
The core approach involves training a causal LLM classifier, specifically fine-tuning Llama3-8B, to predict the quality of a potential response from a cost-effective model (Mixtral-8x7B) relative to a high-quality reference (GPT-4). Queries are routed to Mixtral-8x7B if the predicted quality score is high (>=4), and to GPT-4 otherwise. Data labeling is performed using an LLM-as-a-judge methodology, where GPT-4 evaluates Mixtral's responses against its own, assigning a 1-5 star rating. This method allows for scalable, high-quality synthetic data generation.
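The routing rule described above can be sketched in a few lines. This is a hypothetical illustration, not the project's actual code: predict_quality stands in for the fine-tuned Llama3-8B classifier, and the model names and threshold follow the description (route to Mixtral-8x7B when the predicted score is at least 4 out of 5, otherwise fall back to GPT-4).

```python
# Sketch of the quality-threshold routing rule (illustrative only).
STRONG_MODEL = "gpt-4"          # high-quality, higher-cost reference model
WEAK_MODEL = "mixtral-8x7b"     # cost-effective open-source model
THRESHOLD = 4                   # minimum predicted 1-5 score to use the weak model

def route(query: str, predict_quality) -> str:
    """Return the model that should answer `query`.

    `predict_quality` is a stand-in for the trained causal LLM classifier;
    it maps a query to the predicted 1-5 quality of the weak model's answer.
    """
    score = predict_quality(query)
    return WEAK_MODEL if score >= THRESHOLD else STRONG_MODEL

# Example with stand-in scorers:
print(route("What is 2+2?", lambda q: 5))               # routes to mixtral-8x7b
print(route("Draft a formal legal brief", lambda q: 2)) # routes to gpt-4
```

In the full pipeline, the classifier itself is trained on GPT-4-judged 1-5 star labels, so the same score scale is used at both labeling and routing time.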
Quick Start & Requirements
- Install dependencies: pip install -r requirements.txt
- Required credentials: ANYSCALE_API_KEY, OPENAI_API_KEY, and LLAMA2_HF_TOKEN (for evaluation).
- GPU resources are recommended for training (e.g., 8x A10 GPUs).
- Evaluation code and resources: https://github.com/lm-sys/RouteLLM/
Maintenance & Community
This project is developed in collaboration with the Berkeley LMSys group. The primary community and evaluation resources live in the lm-sys/RouteLLM GitHub repository. No dedicated community channels (e.g., Discord, Slack) are mentioned.
Licensing & Compatibility
The license for the anyscale/llm-router code itself is not explicitly stated in the README. Usage is dependent on API access and terms of service for Anyscale, OpenAI (GPT-4), and Hugging Face (Llama3-8B). Commercial use may be restricted by the licensing of these underlying models and services.
Limitations & Caveats
The framework requires API keys for Anyscale and OpenAI, and access to Llama3-8B for evaluation. The tutorial focuses on a specific routing strategy (causal LLM classifier) and a defined set of models (GPT-4, Mixtral-8x7B, Llama3-8B), which may not be universally applicable. The license for the router code is unspecified.