matrix by facebookresearch

Engine for scalable LLM-powered data generation and inference

Created 11 months ago

264 stars

Top 96.7% on SourcePulse

View on GitHub

1 Expert Loves This Project

Elvis Saravia

Founder of DAIR.AI

Project Summary

Summary

Matrix is a versatile engine for multi-agent conversational data generation, LLM inference, model benchmarking, and data processing. It targets engineers and researchers seeking a fast, scalable, and easy-to-use solution for complex LLM workflows, offering high throughput and concurrent task execution.

How It Works

Matrix operates on a Ray cluster for scalability, leveraging Slurm or local resources via submitit. It integrates seamlessly with Hugging Face LLMs through vLLM and SGLang, and supports proprietary models via proxy servers. Key features include robust data pipelines with code execution (bubblewrap) and quality checks, alongside a novel peer-to-peer multi-agent orchestration system designed for high throughput and concurrent workflows. This architecture enables efficient LLM inference and data generation tasks.

Quick Start & Requirements

Installation requires Python 3.10+ (example uses 3.11) and can be managed via Conda. Primary installation involves pip install fair-matrix[vllm_0112]. A Ray cluster is essential, with resource acquisition configurable for Slurm or local environments. Deployment involves starting a Ray cluster and then deploying LLM applications using commands like matrix deploy_applications. Docker support is available for execution within a containerized environment. Links to "Getting Started" and "Advanced Deployment" are provided.

matrix by facebookresearch

Explore Similar Projects

ThinkMesh by martianlantern

llm-swarm by huggingface

ai-infra-learning by cr7258

MARTI by TsinghuaC3I

xLAM by SalesforceAIResearch

llama-cpp-agent by Maximilian-Winter

mellea by generative-computing

parallax by GradientHQ

LLM-VM by anarchy-ai

distributed-llama by b4rtaz

AgentBench by THUDM

LightLLM by ModelTC