Auto-parallelization framework for large-scale neural network training and serving
Alpa is a system for automating the training and serving of large-scale neural networks, such as models at the scale of GPT-3, by automating the complex distributed-systems techniques those models require. It targets researchers and engineers working with models of billions of parameters or more, offering a way to scale computations to clusters with minimal code changes.
How It Works
Alpa automatically parallelizes single-device code for distributed execution across a cluster, combining data, operator, and pipeline parallelism. It integrates tightly with JAX and XLA for high-performance compilation and execution, aiming for near-linear scaling on massive models while abstracting away the intricacies of distributed systems.
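A minimal sketch of this workflow using Alpa's decorator-style API (the toy model, shapes, and learning rate are illustrative, not taken from the project's docs):

```python
import alpa
import jax
import jax.numpy as jnp

# Connect Alpa to a running Ray cluster for multi-device execution
# (assumes Ray has already been started on the cluster).
alpa.init(cluster="ray")

@alpa.parallelize  # Alpa plans data/operator/pipeline parallelism automatically
def train_step(params, batch):
    def loss_fn(p):
        # Toy single-layer model; any JAX-traceable model works here.
        pred = jnp.tanh(batch["x"] @ p["w"] + p["b"])
        return jnp.mean((pred - batch["y"]) ** 2)

    grads = jax.grad(loss_fn)(params)
    # Plain SGD update; Alpa decides how params and grads are sharded.
    return jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)

params = {"w": jnp.zeros((128, 1)), "b": jnp.zeros((1,))}
batch = {"x": jnp.ones((64, 128)), "y": jnp.ones((64, 1))}
params = train_step(params, batch)  # runs distributed across the cluster
```

Because `train_step` is an ordinary JAX function, the same code runs on one GPU or a whole cluster; Alpa picks the parallelization plan at compile time.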
Quick Start & Requirements
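As a rough guide based on the project's documentation while it was active (versions and wheels may have changed since the project was archived): Alpa targeted Linux machines with CUDA-capable NVIDIA GPUs, was installed with pip alongside a CUDA-matched jaxlib build from the Alpa wheel index, and used Ray to coordinate multi-node clusters. Consult the documentation in the alpa-projects/alpa repository for exact version pins.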
Highlighted Details
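- Combines data, operator, and pipeline parallelism under a single automatic planner.
- Builds on JAX and XLA for compilation; multi-node execution is coordinated through Ray.
- Described in the OSDI 2022 paper "Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning".
- Core auto-sharding algorithms were upstreamed into XLA, where development continues.
- Apache-2.0 licensed.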
Maintenance & Community
Alpa is no longer actively maintained as a standalone project; its core auto-sharding algorithms have been merged into XLA. Documentation and a community Slack channel remain available.
Licensing & Compatibility
Licensed under the Apache-2.0 license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The project is currently available as a research artifact and is not actively maintained. Users seeking the latest advancements in auto-sharding should refer to the XLA project.