Run LLMs at home, BitTorrent-style
Petals enables users to run and fine-tune large language models (LLMs) like Llama 3.1 (405B) and Mixtral (8x22B) on consumer hardware by distributing model layers across a peer-to-peer network. This approach significantly speeds up inference and fine-tuning compared to traditional offloading methods, making powerful LLMs accessible for desktop users and researchers without high-end infrastructure.
How It Works
Petals utilizes a BitTorrent-like protocol to distribute LLM layers across a decentralized network of participants. When a user runs a model, their device downloads and executes specific layers, then passes the intermediate results to other participants who host subsequent layers. This collaborative execution allows for the inference and fine-tuning of models far larger than what a single machine could handle, with communication managed efficiently to maintain performance.
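The flow described above can be illustrated with a toy sketch. This is not the real Petals protocol or API; it is a minimal stand-in in which each peer hosts a contiguous block of "layers" (here, a fixed affine transform instead of a transformer layer) and a client chains peer calls in layer order, forwarding the intermediate activations. All names (`Peer`, `run_inference`, the swarm members) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Peer:
    """A participant serving a contiguous range of model layers."""
    name: str
    layers: range  # indices of the layers this peer hosts

    def forward(self, activations: list[float]) -> list[float]:
        # Stand-in for running real transformer layers: each hosted
        # "layer" just applies a fixed affine transform.
        for _ in self.layers:
            activations = [2 * a + 1 for a in activations]
        return activations

def run_inference(peers: list[Peer], inputs: list[float]) -> list[float]:
    """Route activations through peers in layer order, pipeline-style."""
    acts = inputs
    for peer in sorted(peers, key=lambda p: p.layers.start):
        acts = peer.forward(acts)  # in Petals this hop is a network call
    return acts

# An 8-layer "model" split across three peers.
swarm = [
    Peer("alice", range(0, 3)),
    Peer("bob", range(3, 6)),
    Peer("carol", range(6, 8)),
]
print(run_inference(swarm, [1.0]))  # → [511.0]
```

In the real system the hop between peers is a network round-trip, so Petals also handles peer discovery, failures, and rerouting when a participant serving some layers goes offline.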
Quick Start & Requirements
pip install git+https://github.com/bigscience-workshop/petals
Highlighted Details
Maintenance & Community
Petals is a community-driven project originating from the BigScience research workshop. It has active development and a supportive Discord community.
Licensing & Compatibility
The project is licensed under the Apache 2.0 license, permitting commercial use and integration with closed-source applications.
Limitations & Caveats
Performance depends on network connectivity and the number of active participants serving model layers. Because peers hosting layers process your intermediate activations, the public swarm should not be used for highly sensitive data; running a private swarm among trusted machines mitigates this risk.