matrix by facebookresearch

Engine for scalable LLM-powered data generation and inference

Created 9 months ago
253 stars

Top 99.4% on SourcePulse

Project Summary

Matrix is a versatile engine for multi-agent conversational data generation, LLM inference, model benchmarking, and data processing. It targets engineers and researchers seeking a fast, scalable, and easy-to-use solution for complex LLM workflows, offering high throughput and concurrent task execution.

How It Works

Matrix operates on a Ray cluster for scalability, leveraging Slurm or local resources via submitit. It integrates seamlessly with Hugging Face LLMs through vLLM and SGLang, and supports proprietary models via proxy servers. Key features include robust data pipelines with code execution (bubblewrap) and quality checks, alongside a novel peer-to-peer multi-agent orchestration system designed for high throughput and concurrent workflows. This architecture enables efficient LLM inference and data generation tasks.
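The peer-to-peer multi-agent orchestration described above can be sketched in miniature with plain asyncio. This is an illustrative toy, not Matrix's actual API: the agent and queue names are invented, the "LLM call" is a string format, and a real deployment would run these workers concurrently on a Ray cluster.

```python
import asyncio


async def agent(name: str, inbox: asyncio.Queue, outbox: asyncio.Queue, turns: int):
    """A toy conversational agent; in Matrix this would be an LLM-backed worker."""
    transcript = []
    for _ in range(turns):
        msg = await inbox.get()                        # wait for the peer's message
        transcript.append(msg)
        await outbox.put(f"{name} replies to: {msg}")  # stand-in for an LLM call
    return transcript


async def run_dialogue(turns: int = 2):
    # Two queues give the agents a direct peer-to-peer channel in each direction.
    a_to_b, b_to_a = asyncio.Queue(), asyncio.Queue()
    await b_to_a.put("seed prompt")                    # kick off the conversation
    # Both agents run concurrently, exchanging messages until `turns` is reached.
    return await asyncio.gather(
        agent("A", b_to_a, a_to_b, turns),
        agent("B", a_to_b, b_to_a, turns),
    )
```

Because each agent only blocks on its own inbox, many such dialogues can run concurrently in one event loop, which is the same property that gives the real system its high throughput.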

Quick Start & Requirements

Installation requires Python 3.10+ (the example uses 3.11) and can be managed via Conda. Primary installation is via pip install "fair-matrix[vllm_0112]" (quote the extras so the shell does not expand the brackets). A Ray cluster is essential, with resource acquisition configurable for Slurm or local environments. Deployment involves starting a Ray cluster and then deploying LLM applications with commands such as matrix deploy_applications. Docker support is available for execution within a containerized environment. The repository links to "Getting Started" and "Advanced Deployment" guides.
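A minimal setup sketch of the steps above. The environment name and Python version are illustrative; the only commands taken from the summary are the pip package name and matrix deploy_applications, so check the repo's Getting Started guide for the exact cluster-start subcommands and flags.

```shell
# Illustrative setup; exact subcommands/flags may differ — see the repo docs.
conda create -n matrix python=3.11 -y
conda activate matrix

# Quote the extras so the shell does not glob the square brackets.
pip install "fair-matrix[vllm_0112]"

# Start a Ray cluster (on Slurm or locally, per the README), then deploy:
matrix deploy_applications
```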

Highlighted Details

  • Supports large-scale inference for open-source (via vLLM/SGLang) and proprietary LLMs (Azure OpenAI, SageMaker, Gemini).
  • Features high-throughput data generation with concurrent task execution.
Health Check

  • Last Commit: 20 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 6
  • Issues (30d): 0
  • Star History: 15 stars in the last 30 days

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 9 more.

Explore Similar Projects

LightLLM by ModelTC

0.3% · 4k stars
Python framework for LLM inference and serving
Created 2 years ago · Updated 1 day ago