maxtext by AI-Hypercomputer

JAX LLM for high-performance, scalable training and inference on TPUs and GPUs

created 2 years ago
1,848 stars

Top 23.9% on sourcepulse

View on GitHub
Project Summary

MaxText is a high-performance, scalable LLM framework built in pure JAX, designed for training and inference on Google Cloud TPUs and GPUs. It targets researchers and production engineers seeking a performant, flexible starting point for large-scale LLM projects, offering high Model FLOPs Utilization (MFU) and ease of customization through forking.

How It Works

MaxText leverages JAX and the XLA compiler to achieve high performance and scalability without manual kernel optimization. Rather than hand-written kernels, it relies on XLA's automatic optimizations to run efficiently across large TPU and GPU clusters, which keeps the codebase simple and easy to modify. The framework supports a range of LLM architectures and adds features such as ahead-of-time (AOT) compilation for faster startup and debugging tools for cluster issues.
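As a rough illustration of this compile-rather-than-hand-tune approach (plain JAX, not MaxText code; the toy layer and shapes below are invented for the example):

```python
import jax
import jax.numpy as jnp

# Toy "layer" (illustrative only): a matmul followed by a nonlinearity.
# jax.jit hands the whole function to XLA, which fuses and optimizes it
# for the TPU/GPU backend, so no hand-written kernel is needed.
@jax.jit
def layer(x, w):
    return jax.nn.relu(x @ w)

x = jnp.ones((8, 128))
w = jnp.ones((128, 256))
y = layer(x, w)    # first call triggers XLA compilation; later calls reuse it
print(y.shape)     # (8, 256)
```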

Quick Start & Requirements

  • Run training: python3 -m MaxText.train (after setting up the environment and dependencies).
  • Prerequisites: Python and JAX with TPU support (jax[tpu]); the provided setup.sh script installs dependencies. A quick device-visibility check is sketched after this list.
  • Resources: requires access to Google Cloud TPUs or GPUs; AOT compilation can be performed on a single machine.
  • Docs: Getting Started, Gemma Guide, Llama2 Guide, Mixtral Guide, DeepSeek Guide.
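Once the environment is set up, a generic JAX check (not a MaxText command) confirms the accelerators are visible before launching training:

```python
# Generic JAX sanity check: verifies that the installed jax[tpu] (or GPU)
# build can see the local accelerators before starting a MaxText run.
import jax

print(jax.devices())        # e.g. [TpuDevice(id=0, ...), ...]
print(jax.device_count())   # number of chips visible to this process
```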

Highlighted Details

  • Supports Llama 2/3/4, Mistral/Mixtral, Gemma 1-3, and DeepSeek families.
  • Achieves high MFU (e.g., 60-70% on TPU v5p) and scales to thousands of chips.
  • Features Ahead-of-Time (AOT) compilation for optimized training runs (a plain-JAX sketch of the idea follows this list).
  • Includes stack trace collection for debugging distributed training issues.
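JAX's ahead-of-time compilation API gives a flavor of what the AOT feature builds on; a minimal sketch (plain JAX, not MaxText's own AOT tooling; the train_step below is a made-up stand-in):

```python
import jax
import jax.numpy as jnp

def train_step(params, batch):
    # Stand-in for a real training step; shapes and update rule are
    # illustrative only.
    return params - 0.01 * jnp.mean(batch) * params

params = jnp.ones((1024, 1024))
batch = jnp.ones((32, 1024))

# Lower and compile ahead of time: shapes are checked and the XLA
# executable is built before any training loop starts, so problems
# (e.g. shape mismatches) surface early and startup is faster.
lowered = jax.jit(train_step).lower(params, batch)
compiled = lowered.compile()
print(compiled.cost_analysis())  # backend-dependent FLOP/memory estimates (may be None)
out = compiled(params, batch)
```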

Maintenance & Community

  • Actively updated with new model support (e.g., Llama 4, Gemma 3).
  • Modular import structure introduced in April 2025.
  • No explicit community links (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • Currently supports text-only models; multi-modal support is in development.
  • Context length is limited to 8k for some models, with ongoing optimization efforts.
  • AOT compilation requires matching compilation and execution environments for predictable behavior.
Health Check

  • Last commit: 19 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 157
  • Issues (30d): 6

Star History

  • 146 stars in the last 90 days

Explore Similar Projects

HALOs by ContextualAI
Library for aligning LLMs using human-aware loss functions

  • 0.2% · 873 stars · created 1 year ago · updated 2 weeks ago
  • Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

veScale by volcengine
PyTorch-native framework for LLM training

  • 0.1% · 839 stars · created 1 year ago · updated 3 weeks ago
  • Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake) and Zhiqiang Xie (Author of SGLang).

levanter by stanford-crfm
Framework for training foundation models with JAX

  • 0.5% · 628 stars · created 3 years ago · updated 21 hours ago
  • Starred by Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), Thomas Wolf (Cofounder of Hugging Face), and 3 more.

TinyLlama by jzhang38
Tiny pretraining project for a 1.1B Llama model

  • 0.3% · 9k stars · created 1 year ago · updated 1 year ago
  • Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; Author of CS 231n), George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), and 10 more.