lightseekorg
High-performance LLM gateway for diverse inference backends
Top 92.0% on SourcePulse
Shepherd Model Gateway (SMG) is a high-performance, engine-agnostic LLM gateway built in Rust. It addresses the complexity of managing large-scale LLM deployments by centralizing worker lifecycle management and traffic balancing across diverse HTTP, gRPC, and OpenAI-compatible backends. SMG offers enterprise-ready control over history storage, privacy, and custom logic, which benefits teams that want efficient, unified, and observable LLM infrastructure.
How It Works
SMG is written in native Rust for speed and features a gRPC pipeline and sub-millisecond routing decisions. Its core differentiator is "cache-aware routing," which takes into account the KV cache state of the inference engines (SGLang, vLLM, TensorRT-LLM) so that requests can reuse previously computed prefixes, maximizing GPU utilization and reducing redundant work. It exposes a single, unified API endpoint that routes requests to self-hosted models or to various cloud providers, which simplifies integration and hides backend diversity from clients.
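The idea behind cache-aware routing can be illustrated with a small sketch. This is not SMG's implementation: the names `Worker`, `CacheAwareRouter`, and `min_match` are hypothetical, the per-worker list of token prefixes only stands in for whatever cache state a real gateway tracks, and the round-robin fallback is an assumption made for the example. The sketch sends each request to the worker whose recorded prefixes share the longest leading run of tokens with the incoming prompt.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Worker:
    """Hypothetical backend; cached_prefixes approximates its KV cache contents."""
    name: str
    cached_prefixes: list = field(default_factory=list)


def shared_prefix_len(a, b):
    """Number of leading tokens that two token sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class CacheAwareRouter:
    """Illustrative router: prefer the worker with the longest cached prefix match."""

    def __init__(self, workers, min_match=2):
        self.workers = workers
        self.min_match = min_match  # assumed threshold below which a match is ignored
        self._rr = count()          # counter for the round-robin fallback

    def route(self, tokens):
        best, best_len = None, 0
        for worker in self.workers:
            for prefix in worker.cached_prefixes:
                m = shared_prefix_len(tokens, prefix)
                if m > best_len:
                    best, best_len = worker, m
        if best is None or best_len < self.min_match:
            # No useful cache hit: fall back to round robin.
            best = self.workers[next(self._rr) % len(self.workers)]
        # Record the prompt so later requests with the same prefix land here.
        best.cached_prefixes.append(list(tokens))
        return best.name


router = CacheAwareRouter([Worker("a"), Worker("b")])
first = router.route([1, 2, 3, 4])   # nothing cached yet, round robin picks "a"
second = router.route([9, 9, 9])     # no shared prefix, round robin picks "b"
third = router.route([1, 2, 3, 7])   # shares 3 tokens with "a"
fourth = router.route([9, 9, 5])     # shares 2 tokens with "b"
```

A linear scan over stored prefixes keeps the example short. A production router would more plausibly index prefixes in a tree structure and evict entries as the engines evict cache blocks.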
Quick Start & Requirements
Install via Docker (docker pull lightseekorg/smg:latest), Python (pip install smg), or Rust (cargo install smg).
Highlighted Details
Routing policies include cache_aware for KV cache optimization, prefix_hash, consistent_hashing, and round_robin.
Maintenance & Community
The project welcomes contributions, with a reference to a "Contributing Guide." No specific community channels (e.g., Discord, Slack) or details on core maintainers, sponsorships, or roadmap are present in the provided text.
Licensing & Compatibility
The README does not specify the project's license or any compatibility notes for commercial use or closed-source linking.
Limitations & Caveats
The provided README does not detail specific limitations, known bugs, alpha status, or unsupported platforms. The complexity of configuring and managing diverse LLM backends and enterprise features may present a practical adoption hurdle.