petals by bigscience-workshop

Run LLMs at home, BitTorrent-style

Created 3 years ago

9,870 stars

Top 5.1% on SourcePulse


13 Experts Love This Project

Omar Sanseviero

DevRel at Google DeepMind

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Vincent Weisser

Cofounder of Prime Intellect

Ishaan Jaffer

Cofounder of LiteLLM

and 9 more!

Project Summary

Petals lets users run and fine-tune large language models (LLMs) such as Llama 3.1 (405B) and Mixtral (8x22B) on consumer hardware by splitting model layers across a peer-to-peer network. The project reports inference and fine-tuning up to 10x faster than offloading, which puts large LLMs within reach of desktop users and researchers without high-end infrastructure.

How It Works

Petals spreads a model's layers across a decentralized network of participants, much as BitTorrent spreads pieces of a file across peers. A client loads only a small part of the model locally and sends intermediate activations through a chain of servers, each of which hosts a consecutive range of layers. Together the machines can run inference and fine-tuning on models far larger than any single one of them could hold, at the cost of network communication on every hop.
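The idea of chaining peers that each hold a slice of the model can be sketched in a few lines. This is a toy simulation only, not Petals' actual routing or networking code: the `Peer` class, the greedy `plan_route` helper, and the integer "layers" are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Layer = Callable[[int], int]  # stand-in for a transformer block


@dataclass
class Peer:
    """A hypothetical server hosting layers [start, end) of the model."""
    name: str
    start: int
    end: int
    layers: List[Layer]

    def forward(self, hidden: int, first: int, last: int) -> int:
        """Apply hosted layers first..last-1 to the incoming hidden state."""
        for layer in self.layers[first - self.start:last - self.start]:
            hidden = layer(hidden)
        return hidden


def make_layers(n: int) -> List[Layer]:
    # Toy layers: cheap integer functions instead of real transformer blocks.
    return [lambda h, i=i: 2 * h + i for i in range(n)]


def build_swarm(layers: List[Layer], spans: List[Tuple[int, int]]) -> List[Peer]:
    # Each span (s, e) says which slice of the model a peer volunteers to host.
    return [Peer(f"peer{k}", s, e, layers[s:e]) for k, (s, e) in enumerate(spans)]


def plan_route(peers: List[Peer], num_layers: int) -> List[Tuple[Peer, int, int]]:
    """Greedy assumption: at each step pick the peer that reaches furthest."""
    route, pos = [], 0
    while pos < num_layers:
        candidates = [p for p in peers if p.start <= pos < p.end]
        if not candidates:
            raise RuntimeError(f"no peer serves layer {pos}")
        best = max(candidates, key=lambda p: p.end)
        route.append((best, pos, best.end))
        pos = best.end
    return route


def run_inference(peers: List[Peer], num_layers: int, hidden: int) -> int:
    # The client only forwards the hidden state; peers do the layer compute.
    for peer, first, last in plan_route(peers, num_layers):
        hidden = peer.forward(hidden, first, last)
    return hidden


layers = make_layers(6)
swarm = build_swarm(layers, [(0, 2), (2, 5), (4, 6), (1, 3)])
result = run_inference(swarm, len(layers), 1)
```

Overlapping spans are deliberate: several peers may serve the same layers, and the client needs only one complete chain. If no peer covers some layer, the route cannot be built, which is the toy version of a swarm with too few participants.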

Quick Start & Requirements

Install: pip install git+https://github.com/bigscience-workshop/petals
Prerequisites: Python 3.8+, PyTorch with CUDA 11.7+ for NVIDIA GPUs (AMD GPUs are supported via separate instructions). On macOS, Python is installed through Homebrew. On Windows, WSL is recommended.
Setup: Basic setup is quick, but running larger models may require significant RAM and VRAM.
Links: Colab Demo, Wiki, Discord
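Once installed, generating text follows the usual 🤗 Transformers pattern, with the model class swapped for the Petals one. The sketch below wraps that pattern in a hypothetical helper, `generate_with_petals`; the default model name is only an example, and whether a given model is being served in the public swarm at any moment is an assumption you would need to check.

```python
def generate_with_petals(prompt: str,
                         model_name: str = "petals-team/StableBeluga2",
                         max_new_tokens: int = 5) -> str:
    """Generate a short continuation using layers served by the swarm."""
    # Imported lazily so this file can be loaded without petals installed.
    from transformers import AutoTokenizer
    from petals import AutoDistributedModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Connects to the swarm; transformer blocks stay on remote servers.
    model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0])
```

Calling `generate_with_petals("A cat sat")` needs network access and a swarm that currently hosts the chosen model, so expect the first call to take a while.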

Highlighted Details

Supports inference and fine-tuning for models up to 405B parameters.
Reaches up to 6 tokens/sec for Llama 2 (70B) and up to 4 tokens/sec for Falcon (180B) in single-batch inference.
Keeps the flexibility of PyTorch and 🤗 Transformers: custom fine-tuning and sampling methods, custom paths through the model, and access to hidden states.
Enables private swarms for sensitive data processing.
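On the client side, joining a private swarm comes down to pointing the model loader at your own bootstrap peers instead of the public ones. The sketch below assumes the `initial_peers` argument described in the project's private-swarm instructions; the helper name `load_from_private_swarm` and the sanity checks are hypothetical additions, and the bootstrap peers themselves must already be running.

```python
from typing import List


def load_from_private_swarm(model_name: str, initial_peers: List[str]):
    """Load a distributed model using only the given bootstrap peers."""
    if not initial_peers:
        raise ValueError("a private swarm needs at least one bootstrap peer")
    for addr in initial_peers:
        # Bootstrap peers are given as multiaddr strings, which start with "/".
        if not addr.startswith("/"):
            raise ValueError(f"not a multiaddr: {addr!r}")

    # Imported lazily so the checks above work without petals installed.
    from petals import AutoDistributedModelForCausalLM

    return AutoDistributedModelForCausalLM.from_pretrained(
        model_name, initial_peers=initial_peers
    )
```

Validating the peer list before importing anything heavy means a misconfigured client fails fast, before it tries to reach any swarm.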

Maintenance & Community

Petals is a community-run project that grew out of the BigScience research workshop. Its community gathers on Discord. Development has slowed: the Health Check below reports the last commit as a year old.

Licensing & Compatibility

The project is licensed under the MIT license, which permits commercial use and integration with closed-source applications.

Limitations & Caveats

Performance depends on network connectivity and on how many participants are serving model layers at the time. In a public swarm, inputs and intermediate activations pass through other participants' machines, so sensitive data is better processed in a private swarm.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

31 stars in the last 30 days