aibrix  by vllm-project

Cloud-native infrastructure for scalable GenAI inference

Created 1 year ago
4,331 stars

Top 11.3% on SourcePulse

Project Summary

AIBrix provides cloud-native infrastructure components for scalable GenAI inference, targeting enterprises needing to deploy, manage, and scale LLMs. It offers cost-efficient, pluggable building blocks for high-density LoRA management, LLM gateway routing, and app-tailored autoscaling.

How It Works

AIBrix employs a unified AI runtime sidecar for metric standardization and model management, coupled with distributed inference capabilities. Its architecture supports distributed KV cache for high-capacity reuse and heterogeneous serving across mixed GPU configurations to reduce costs while maintaining SLO guarantees. GPU hardware failure detection is also integrated.

Quick Start & Requirements

  • Install: Clone the repository and apply the manifests with kubectl create -k, choosing either the nightly or the stable release (v0.2.1 is the version mentioned).
  • Prerequisites: Kubernetes cluster, kubectl.
  • Documentation: https://aibrix.io/docs
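
The install step above can be sketched as a short shell script. This is only an illustration under stated assumptions: the repository URL is inferred from the project and owner names, and the kustomize directory (config/default) is a guess that should be checked against the documentation. The DRY_RUN switch and the run and install_aibrix helpers are hypothetical names introduced here. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only. Nothing below is confirmed by the summary except the use of
# "kubectl create -k"; paths and URL are assumptions to verify in the docs.
set -euo pipefail

REPO_URL="https://github.com/vllm-project/aibrix.git"  # assumed from "aibrix by vllm-project"
KUSTOMIZE_DIR="${KUSTOMIZE_DIR:-config/default}"        # hypothetical kustomize directory
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Clone the repository, then create the resources described by the kustomization.
install_aibrix() {
  run git clone "$REPO_URL" aibrix
  run kubectl create -k "aibrix/$KUSTOMIZE_DIR"
}

install_aibrix
```

Running it with DRY_RUN=0 against a configured cluster would execute the two commands for real. For a pinned release, the documentation should be consulted for the matching manifests rather than relying on this path.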

Highlighted Details

  • High-Density LoRA Management
  • LLM Gateway and Routing
  • App-Tailored Autoscaler
  • Distributed KV Cache for high-capacity reuse
  • Cost-efficient Heterogeneous Serving (mixed GPU)

Maintenance & Community

  • Active development with recent releases (v0.2.1 on 2025-03-09).
  • Community support via Slack channel: #aibrix.
  • Contributing guidelines available.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The project describes itself as an "initiative", and its quick start assumes a Kubernetes cluster. It therefore targets orchestrated environments and may present a steeper learning curve for users unfamiliar with Kubernetes.

Health Check

  • Last commit: 17 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 63
  • Issues (30d): 56

Star History

  • 69 stars in the last 30 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

serve by pytorch

0.0%
4k
Serve, optimize, and scale PyTorch models in production
Created 6 years ago
Updated 2 months ago