alpa by alpa-projects

Auto-parallelization framework for large-scale neural network training and serving

Created 4 years ago
3,161 stars

Top 15.1% on SourcePulse

Project Summary

Alpa is a system for automating the training and serving of large-scale neural networks. Models with billions of parameters, such as GPT-3, normally require complicated distributed-systems techniques to train and serve; Alpa aims to automate those techniques. It targets researchers and engineers working with such models and lets them scale computations with minimal code changes.

How It Works

Alpa employs automatic parallelization: it transforms code written for a single device into a distributed program that runs across a cluster. It supports data, operator, and pipeline parallelism and integrates tightly with JAX and XLA for high-performance execution. The approach aims for linear scaling on very large models while hiding the details of distributed execution from the user.
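The three kinds of parallelism named above can be illustrated with a toy example. The sketch below is not Alpa's API and makes no Alpa or JAX calls; it uses plain Python lists and hypothetical helper names (`two_layer`, `data_parallel`, `operator_parallel`, `pipeline_parallel`) to show that each way of splitting a two-layer linear model gives the same result as the single-device computation. Each loop iteration stands in for work that a real system would place on a separate device.

```python
# Toy illustration only: plain Python lists, no Alpa or JAX calls.
# All function names here are hypothetical and exist only for this sketch.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def two_layer(x, w1, w2):
    """Single-device reference: two linear layers applied in sequence."""
    return matmul(matmul(x, w1), w2)

def data_parallel(x, w1, w2, n_dev=2):
    """Split the batch (rows of x); every device holds the full weights."""
    step = (len(x) + n_dev - 1) // n_dev
    out = []
    for i in range(0, len(x), step):      # one iteration ~ one device
        out.extend(two_layer(x[i:i + step], w1, w2))
    return out

def operator_parallel(x, w1, w2, n_dev=2):
    """Split w1 by columns and w2 by rows, then sum the partial outputs.

    The final sum plays the role of an all-reduce across devices.
    """
    hidden = len(w2)
    step = (hidden + n_dev - 1) // n_dev
    total = None
    for start in range(0, hidden, step):  # one iteration ~ one device
        w1_cols = [row[start:start + step] for row in w1]
        w2_rows = w2[start:start + step]
        partial = matmul(matmul(x, w1_cols), w2_rows)
        if total is None:
            total = partial
        else:
            total = [[a + b for a, b in zip(r1, r2)]
                     for r1, r2 in zip(total, partial)]
    return total

def pipeline_parallel(x, w1, w2, n_micro=2):
    """Put each layer on its own stage and stream micro-batches through."""
    step = (len(x) + n_micro - 1) // n_micro
    out = []
    for i in range(0, len(x), step):
        hidden = matmul(x[i:i + step], w1)   # stage 0 holds w1
        out.extend(matmul(hidden, w2))       # stage 1 holds w2
    return out

# Small integer inputs so the comparisons are exact.
x = [[1, 2], [3, 4], [5, 6], [7, 8]]          # batch of 4, 2 features
w1 = [[1, 0, 2, 1], [0, 1, 1, 3]]             # 2 x 4
w2 = [[1, 2], [0, 1], [2, 0], [1, 1]]         # 4 x 2

reference = two_layer(x, w1, w2)
assert data_parallel(x, w1, w2) == reference
assert operator_parallel(x, w1, w2) == reference
assert pipeline_parallel(x, w1, w2) == reference
```

In this sketch the split is chosen by hand. The point of a system like Alpa is to choose and combine such splits automatically for a given model and cluster.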

Quick Start & Requirements

  • Installation and usage details are available in the documentation and examples folder.
  • Requires JAX and XLA.
  • Supports distributed clusters.

Highlighted Details

  • Automates data, operator, and pipeline parallelism for large-scale neural networks.
  • Achieves linear scaling on models with billions of parameters.
  • Integrates with JAX, XLA, and Ray for ecosystem compatibility.
  • Provides an interface for serving large models using Hugging Face Transformers.
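
"Linear scaling" in the list above means that throughput grows in proportion to the number of devices. A minimal sketch of how that could be checked is shown below. The function name `scaling_efficiency` and the throughput numbers are hypothetical and invented purely for illustration; they are not measurements of Alpa.

```python
def scaling_efficiency(throughputs):
    """Return {num_devices: efficiency}, where 1.0 means perfectly linear.

    throughputs maps a device count to a measured throughput. The smallest
    device count is used as the baseline.
    """
    base_n = min(throughputs)
    per_device_base = throughputs[base_n] / base_n
    return {n: t / (n * per_device_base)
            for n, t in sorted(throughputs.items())}

# Hypothetical numbers (samples per second), NOT real Alpa results.
measured = {1: 100.0, 2: 196.0, 4: 380.0, 8: 720.0}
efficiency = scaling_efficiency(measured)

for n, e in efficiency.items():
    print(f"{n} device(s): {e:.0%} of linear")
```

An efficiency that stays close to 1.0 as the device count grows is what a linear-scaling claim implies.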

Maintenance & Community

Alpa is not actively maintained as a standalone project; its core algorithms have been merged into XLA. Resources for engagement include documentation and a Slack channel.

Licensing & Compatibility

Licensed under the Apache-2.0 license, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The project is currently available as a research artifact and is not actively maintained. Users seeking the latest advancements in auto-sharding should refer to the XLA project.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
