alpa by alpa-projects

Auto-parallelization framework for large-scale neural network training and serving

created 4 years ago
3,142 stars

Top 15.7% on sourcepulse

Project Summary

Alpa is a system that automates the training and serving of large-scale neural networks. Scaling models to hundreds of billions of parameters has enabled breakthroughs such as GPT-3, but training and serving at that size requires complicated distributed-system techniques; Alpa aims to automate them. It targets researchers and engineers working with models of billions of parameters or more, and lets them scale computation with minimal code changes.

How It Works

Alpa parallelizes automatically: it takes code written for a single device and executes it across a distributed cluster. It supports data, operator, and pipeline parallelism and integrates tightly with JAX and XLA for high-performance execution. The goal is linear scaling for very large models without requiring users to manage distributed-system details by hand.
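To make the data-parallel case concrete, here is a minimal conceptual sketch in plain Python. It does not use Alpa, JAX, or XLA, and it is not how Alpa is implemented; the "devices" are simulated as list slices, and every function name (grad_mse, split, data_parallel_grad) is hypothetical. It only illustrates the property that automatic data parallelism relies on: averaging per-shard gradients reproduces the single-device gradient.

```python
# Conceptual sketch only: pure Python, no Alpa or JAX calls.
# "Devices" are simulated as list slices; all names are hypothetical.

def grad_mse(w, xs, ys):
    """Gradient of mean squared error with respect to w for the model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def split(items, num_devices):
    """Split a batch into equal contiguous shards, one per simulated device."""
    assert len(items) % num_devices == 0, "batch must divide evenly"
    size = len(items) // num_devices
    return [items[i * size:(i + 1) * size] for i in range(num_devices)]

def data_parallel_grad(w, xs, ys, num_devices):
    """Each simulated device computes a gradient on its own shard; the results
    are then averaged, which is the role an all-reduce plays on a real cluster."""
    x_shards = split(xs, num_devices)
    y_shards = split(ys, num_devices)
    local_grads = [grad_mse(w, sx, sy) for sx, sy in zip(x_shards, y_shards)]
    return sum(local_grads) / num_devices

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

single = grad_mse(w, xs, ys)                            # one device, full batch
sharded = data_parallel_grad(w, xs, ys, num_devices=2)  # two simulated devices
print(single, sharded)
```

Because the shards are the same size, the average of the shard gradients equals the full-batch gradient, so the single-device training step can be distributed without changing its result.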

Quick Start & Requirements

  • Installation and usage details are available in the documentation and examples folder.
  • Requires JAX and XLA.
  • Supports distributed clusters.

Highlighted Details

  • Automates data, operator, and pipeline parallelism for large-scale neural networks.
  • Achieves linear scaling on models with billions of parameters.
  • Integrates with JAX, XLA, and Ray for ecosystem compatibility.
  • Provides an interface for serving large models using Hugging Face Transformers.
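Operator parallelism, mentioned in the list above, can be pictured with a similar plain-Python sketch. This is an illustration under stated assumptions, not Alpa's algorithm: a weight matrix is split column-wise across simulated devices, each device computes its slice of the output, and the slices are concatenated. The helper names (matmul_vec, column_shards, operator_parallel_matmul) are hypothetical.

```python
# Conceptual sketch only: pure Python, no Alpa or JAX calls.
# A single matrix multiply is partitioned across simulated devices.

def matmul_vec(x, w):
    """Compute x @ w for a vector x of length k and a k-by-m matrix w (list of rows)."""
    num_cols = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(num_cols)]

def column_shards(w, num_devices):
    """Split w column-wise into equal blocks, one block per simulated device."""
    num_cols = len(w[0])
    assert num_cols % num_devices == 0, "columns must divide evenly"
    size = num_cols // num_devices
    return [[row[d * size:(d + 1) * size] for row in w] for d in range(num_devices)]

def operator_parallel_matmul(x, w, num_devices):
    """Each device multiplies x by its own column block; concatenating the
    partial outputs (an all-gather on a real cluster) gives the full result."""
    out = []
    for shard in column_shards(w, num_devices):
        out.extend(matmul_vec(x, shard))
    return out

x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]

full = matmul_vec(x, w)                                       # one device
sharded_out = operator_parallel_matmul(x, w, num_devices=2)   # two simulated devices
print(full, sharded_out)
```

Each simulated device holds only its column block of the weights, which is why this style of partitioning helps when a single layer is too large for one device's memory.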

Maintenance & Community

Alpa is not actively maintained as a standalone project; its core algorithms have been merged into XLA. Community resources include the documentation and a Slack channel.

Licensing & Compatibility

Licensed under the Apache-2.0 license, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The project is currently available as a research artifact and is not actively maintained. Users seeking the latest advancements in auto-sharding should refer to the XLA project.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 25 stars in the last 90 days
