levanter by marin-community

Framework for training foundation models with JAX

Created 3 years ago
689 stars

Top 49.5% on SourcePulse

Project Summary

Levanter is a JAX-based framework for training large foundation models, prioritizing legibility, scalability, and reproducibility. It targets researchers and engineers building and experimenting with LLMs, offering a high-performance, deterministic training environment.

How It Works

Levanter builds on JAX for JIT compilation and automatic vectorization. It uses the named-tensor library Haliax to keep deep learning code composable and readable, replacing positional axis bookkeeping with named axes. This design supports distributed training across GPUs and TPUs, with techniques such as Fully Sharded Data Parallelism (FSDP) and tensor parallelism.
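
The two JAX features named above can be illustrated with a minimal sketch in plain JAX. It is not taken from Levanter or Haliax; the function name predict and the array shapes are made up for illustration. jax.vmap turns a single-example function into a batched one, and jax.jit compiles the result.

```python
import jax
import jax.numpy as jnp


def predict(w, x):
    # Single-example computation: a linear map followed by tanh.
    # (Hypothetical toy function, not part of Levanter.)
    return jnp.tanh(w @ x)


# vmap maps predict over the leading (batch) axis of x while sharing w;
# jit compiles the batched function with XLA.
batched_predict = jax.jit(jax.vmap(predict, in_axes=(None, 0)))

key_w, key_x = jax.random.split(jax.random.PRNGKey(0))
w = jax.random.normal(key_w, (3, 5))   # (hidden, features)
x = jax.random.normal(key_x, (8, 5))   # (batch, features)

out = batched_predict(w, x)            # shape (batch, hidden)

# The same contraction written with labeled einsum subscripts, a rough
# stand-in for the readability that named axes aim for.
out_einsum = jnp.tanh(jnp.einsum("bf,hf->bh", x, w))
```

Haliax attaches names to array axes directly, so it goes further than the einsum labels here; see its documentation for the actual API.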

Quick Start & Requirements

  • Install: pip install levanter, or clone the repository and run pip install -e . from its root.
  • Prerequisites: JAX installed and configured for your platform (GPU or TPU). CUDA (GPU) support is still in progress.
  • Example: python -m levanter.main.train_lm --config_path config/gpt2_nano.yaml
  • Docs: levanter.readthedocs.io, haliax.readthedocs.io

Highlighted Details

  • Supports distributed training on TPUs and GPUs with FSDP and tensor parallelism.
  • Compatible with Hugging Face ecosystem for model and tokenizer import/export via SafeTensors.
  • Offers bitwise deterministic training on TPUs for reproducibility.
  • Includes the Sophia optimizer, which may offer up to a 2x speedup over Adam.
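
As a rough illustration of the sharding idea behind FSDP, the sketch below uses JAX's own jax.sharding API to split a parameter array and a batch across all available devices. It is an assumption-laden toy, not Levanter's implementation: the mesh axis name "data", the array shapes, and the forward function are chosen only for this example. On a single-device machine it still runs, with a mesh of size one.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over whatever devices are available.
devices = np.array(jax.devices())
n = devices.size
mesh = Mesh(devices, axis_names=("data",))

# Toy parameters and batch; leading dimensions are multiples of the
# device count so they divide evenly across the mesh.
params = jnp.ones((4 * n, 16))
batch = jnp.ones((2 * n, 16))

# Shard the first axis of both arrays across the "data" mesh axis,
# so each device holds only a slice of the parameters and the batch.
row_sharded = NamedSharding(mesh, P("data", None))
params_sharded = jax.device_put(params, row_sharded)
batch_sharded = jax.device_put(batch, row_sharded)


@jax.jit
def forward(p, b):
    # XLA inserts whatever communication is needed to compute this
    # matmul from the sharded inputs.
    return b @ p.T


out = forward(params_sharded, batch_sharded)  # shape (2n, 4n)
```

A real FSDP setup also shards gradients and optimizer state and gathers parameters just before they are used; this sketch shows only how arrays are placed.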

Maintenance & Community

  • Developed by Stanford's Center for Research on Foundation Models (CRFM).
  • Community channel: #levanter on the unofficial Jax LLM Discord.

Licensing & Compatibility

  • Licensed under the Apache License, Version 2.0, a permissive license that allows commercial use and linking from closed-source code.

Limitations & Caveats

GPU support is still in progress. Resuming training on a different number of hosts currently breaks reproducibility.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
