maxtext by AI-Hypercomputer

JAX LLM for high-performance, scalable training and inference on TPUs and GPUs

created 2 years ago
1,848 stars

Top 23.9% on sourcepulse

View on GitHub
Project Summary

MaxText is a high-performance, scalable LLM framework built in pure JAX, designed for training and inference on Google Cloud TPUs and GPUs. It targets researchers and production engineers seeking a performant, flexible starting point for large-scale LLM projects, offering high Model FLOPs Utilization (MFU) and ease of customization through forking.

How It Works

MaxText leverages JAX and the XLA compiler to achieve high performance and scalability without manual kernel optimization. Rather than hand-written kernels, it relies on XLA's automatic optimizations to run efficiently across large TPU and GPU clusters, which keeps the codebase simple and easy to modify. The framework supports a range of LLM architectures and adds features such as ahead-of-time (AOT) compilation for faster startup and debugging tools for cluster issues.
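As a rough illustration of this compile-rather-than-hand-tune approach (plain JAX, not MaxText code; the toy layer and shapes below are invented for the example):

```python
import jax
import jax.numpy as jnp

# Toy "layer" (illustrative only): a matmul followed by a nonlinearity.
# jax.jit hands the whole function to XLA, which fuses and optimizes it
# for the TPU/GPU backend, so no hand-written kernel is needed.
@jax.jit
def layer(x, w):
    return jax.nn.relu(x @ w)

x = jnp.ones((8, 128))
w = jnp.ones((128, 256))
y = layer(x, w)    # first call triggers XLA compilation; later calls reuse it
print(y.shape)     # (8, 256)
```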

Quick Start & Requirements

  • Run training: python3 -m MaxText.train (after setting up the environment and dependencies).
  • Prerequisites: Python and JAX with TPU support (jax[tpu]); the provided setup.sh script installs dependencies. A quick device-visibility check is sketched after this list.
  • Resources: requires access to Google Cloud TPUs or GPUs; AOT compilation can be performed on a single machine.
  • Docs: Getting Started, Gemma Guide, Llama2 Guide, Mixtral Guide, DeepSeek Guide.
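Once the environment is set up, a generic JAX check (not a MaxText command) confirms the accelerators are visible before launching training:

```python
# Generic JAX sanity check: verifies that the installed jax[tpu] (or GPU)
# build can see the local accelerators before starting a MaxText run.
import jax

print(jax.devices())        # e.g. [TpuDevice(id=0, ...), ...]
print(jax.device_count())   # number of chips visible to this process
```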

Highlighted Details

  • Supports Llama 2/3/4, Mistral/Mixtral, Gemma 1-3, and DeepSeek families.
  • Achieves high MFU (e.g., 60-70% on TPU v5p) and scales to thousands of chips.
  • Features Ahead-of-Time (AOT) compilation for optimized training runs (a plain-JAX sketch of the idea follows this list).
  • Includes stack trace collection for debugging distributed training issues.
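JAX's ahead-of-time compilation API gives a flavor of what the AOT feature builds on; a minimal sketch (plain JAX, not MaxText's own AOT tooling; the train_step below is a made-up stand-in):

```python
import jax
import jax.numpy as jnp

def train_step(params, batch):
    # Stand-in for a real training step; shapes and update rule are
    # illustrative only.
    return params - 0.01 * jnp.mean(batch) * params

params = jnp.ones((1024, 1024))
batch = jnp.ones((32, 1024))

# Lower and compile ahead of time: shapes are checked and the XLA
# executable is built before any training loop starts, so problems
# (e.g. shape mismatches) surface early and startup is faster.
lowered = jax.jit(train_step).lower(params, batch)
compiled = lowered.compile()
print(compiled.cost_analysis())  # backend-dependent FLOP/memory estimates (may be None)
out = compiled(params, batch)
```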

Maintenance & Community

  • Actively updated with new model support (e.g., Llama 4, Gemma 3).
  • Modular import structure introduced in April 2025.
  • No explicit community links (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • Currently supports text-only models; multi-modal support is in development.
  • Context length is limited to 8k for some models, with ongoing optimization efforts.
  • AOT compilation requires matching compilation and execution environments for predictable behavior.
Health Check

  • Last commit: 19 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 157
  • Issues (30d): 6

Star History

  • 146 stars in the last 90 days

Explore Similar Projects

HALOs by ContextualAI
Library for aligning LLMs using human-aware loss functions

  • 0.2% · 873 stars · created 1 year ago · updated 2 weeks ago
  • Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

veScale by volcengine
PyTorch-native framework for LLM training

  • 0.1% · 839 stars · created 1 year ago · updated 3 weeks ago
  • Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake) and Zhiqiang Xie (Author of SGLang).

levanter by stanford-crfm
Framework for training foundation models with JAX

  • 0.5% · 628 stars · created 3 years ago · updated 21 hours ago
  • Starred by Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), Thomas Wolf (Cofounder of Hugging Face), and 3 more.

TinyLlama by jzhang38
Tiny pretraining project for a 1.1B Llama model

  • 0.3% · 9k stars · created 1 year ago · updated 1 year ago
  • Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; Author of CS 231n), George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), and 10 more.