exo by exo-explore

AI cluster for running models on diverse devices

created 1 year ago
29,133 stars

Top 1.3% on sourcepulse

Project Summary

exo enables users to create distributed AI inference clusters using everyday devices, including smartphones, Macs, and Raspberry Pis. It targets individuals and businesses looking to run large language models locally, offering a unified, peer-to-peer approach to distributed computing without a master-worker architecture. The primary benefit is leveraging existing hardware for powerful AI inference, with a ChatGPT-compatible API for easy integration.

How It Works

exo employs dynamic model partitioning, intelligently splitting AI models across available devices based on network topology and individual device resources. This allows for the execution of models larger than any single device could handle. The system uses automatic device discovery and a peer-to-peer (P2P) networking model, ensuring any connected device can contribute to the cluster. The default partitioning strategy is ring memory weighted partitioning, where each device processes layers proportional to its memory capacity.
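The memory-weighted idea can be sketched in a few lines: give each device a contiguous slice of the model's layers sized in proportion to its memory. This is a simplified illustration, not exo's actual partitioning code; the function name and device labels are hypothetical.

```python
def partition_layers(num_layers: int, device_memory: dict) -> dict:
    """Assign each device a contiguous layer range proportional to its memory.

    Simplified sketch of ring memory weighted partitioning: devices are
    visited in order, and the last device absorbs any rounding remainder.
    """
    total_mem = sum(device_memory.values())
    partitions = {}
    start = 0
    items = list(device_memory.items())
    for i, (device, mem) in enumerate(items):
        if i == len(items) - 1:
            end = num_layers  # last device takes whatever layers remain
        else:
            end = start + round(num_layers * mem / total_mem)
        partitions[device] = (start, end)
        start = end
    return partitions
```

For a 32-layer model split across a 16 GB Mac and two 8 GB devices, the Mac would host half the layers and each smaller device a quarter, mirroring the proportional scheme described above.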

Quick Start & Requirements

  • Installation: Install from source: git clone https://github.com/exo-explore/exo.git && cd exo && pip install -e ., or run source install.sh.
  • Prerequisites: Python >= 3.12.0. For Linux with an NVIDIA GPU: the NVIDIA driver, CUDA toolkit, and cuDNN library.
  • Hardware: Sufficient total memory across all devices to fit the model (e.g., 16GB for Llama 3.1 8B fp16). Supports heterogeneous devices (GPU, CPU).
  • Docs: Example Usage on Multiple macOS Devices
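The 16 GB figure for Llama 3.1 8B follows from simple arithmetic: fp16 stores each parameter in 2 bytes, so 8 billion parameters need about 16 GB for the weights alone. A minimal estimator (a rough lower bound, ignoring KV cache and activation overhead):

```python
def fp16_weight_gb(num_params_billion: float) -> float:
    """Approximate fp16 weight memory: 2 bytes per parameter.

    This counts weights only; real deployments need extra headroom for the
    KV cache, activations, and runtime overhead.
    """
    return num_params_billion * 2.0
```

By this estimate, an 8B model needs roughly 16 GB of total cluster memory, and a 70B model roughly 140 GB, which is why combining the memory of several everyday devices matters.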

Highlighted Details

  • Supports multiple model families: LLaMA (via MLX and tinygrad), Mistral, LLaVA, Qwen, and DeepSeek.
  • ChatGPT-compatible API for seamless application integration.
  • Peer-to-peer (P2P) device connectivity, avoiding master-worker bottlenecks.
  • Dynamic model partitioning optimizes resource utilization across heterogeneous hardware.
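Because the API is ChatGPT-compatible, any OpenAI-style client can talk to a running cluster. A stdlib-only sketch, assuming the standard /v1/chat/completions route; the port (52415) and model name are assumptions here, so check your exo node's startup output for the actual address:

```python
import json
from urllib import request

def chat_payload(model: str, prompt: str) -> dict:
    # Standard OpenAI-style chat completion request body.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def query_cluster(payload: dict,
                  url: str = "http://localhost:52415/v1/chat/completions") -> dict:
    # The host/port above is an assumption; point this at your exo node's
    # advertised API address.
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

Existing applications written against the OpenAI chat API can typically be pointed at the cluster just by changing the base URL.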

Maintenance & Community

  • Maintained by exo labs.
  • Community channels: Discord, Telegram, X.
  • Actively hiring and seeking business partnerships.

Licensing & Compatibility

  • License: GPL-3.0.
  • Compatibility: GPL-3.0 is a strong copyleft license, requiring derivative works to also be open-sourced under GPL-3.0. Commercial use or linking with closed-source applications may require careful consideration or a separate license.

Limitations & Caveats

The project is experimental software, and bugs are expected. The iOS implementation currently lags the main codebase and requires a manual access request. A PyTorch backend and radio/Bluetooth discovery modules are listed as still under development.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 3
  • Issues (30d): 4
  • Star History: 1,359 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 2 more.

gpustack by gpustack

  • GPU cluster manager for AI model deployment
  • Top 1.6%, 3k stars
  • created 1 year ago, updated 2 days ago