HighPerfLLMs2024 by rwitten

A JAX course on building high-performance LLMs

Created 1 year ago · 518 stars · Top 61.5% on sourcepulse

Project Summary

This repository provides a comprehensive curriculum for building high-performance Large Language Models (LLMs) from scratch using JAX. It targets engineers and researchers aiming to understand and optimize LLM training and inference, covering topics from single-chip roofline analysis to distributed sharding and fused attention. The goal is to equip participants with the skills to design HPC systems that approach physical performance limits.
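
To make the roofline framing concrete: a kernel is compute-bound only when its arithmetic intensity (FLOPs per byte moved) exceeds the chip's ridge point, i.e. peak FLOP/s divided by memory bandwidth. The sketch below illustrates this for a matmul; the hardware figures are placeholder numbers, not ones taken from the course.

```python
# Back-of-the-envelope roofline check. A kernel is compute-bound when its
# arithmetic intensity (FLOPs per byte) exceeds the chip's ridge point
# (peak FLOP/s divided by memory bandwidth). Hardware numbers below are
# placeholders, not figures from the course.
def matmul_intensity(m, k, n, bytes_per_elem=2):  # bf16 = 2 bytes
    flops = 2 * m * k * n                                   # multiply-adds
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)  # read A, B; write C
    return flops / bytes_moved

PEAK_FLOPS = 200e12   # placeholder: ~200 TFLOP/s bf16
PEAK_BW = 800e9       # placeholder: ~800 GB/s HBM bandwidth
ridge = PEAK_FLOPS / PEAK_BW  # = 250 FLOPs/byte

for mkn in [(8, 4096, 4096), (4096, 4096, 4096)]:
    ai = matmul_intensity(*mkn)
    verdict = "compute-bound" if ai > ridge else "memory-bound"
    print(mkn, f"intensity ~{ai:.0f} FLOPs/byte -> {verdict}")
```

The small-batch case lands far below the ridge point and is therefore bandwidth-bound, the typical regime for inference decoding; the square matmul sits well above it.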

How It Works

The project guides users through implementing LLMs in JAX, focusing on performance optimization techniques. It delves into roofline analysis for single-chip performance, distributed computing via sharding, and the intricacies of attention mechanisms like fused and FlashAttention. The curriculum emphasizes understanding the underlying mechanics of LLM training and inference to achieve near-peak hardware utilization.
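
As a minimal taste of the sharding material, here is a sketch against the public jax.sharding API (not code from the course; the axis name "model" and the 1-D mesh are arbitrary choices for illustration):

```python
# Minimal jax.sharding sketch: column-shard a weight matrix across all local
# devices and let jit partition the matmul. Assumes the sharded dimension is
# divisible by the device count.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

mesh = Mesh(np.array(jax.devices()), ("model",))  # 1-D mesh over all devices

x = jnp.ones((8, 1024))
w = jnp.ones((1024, 1024))
# Place the weight with its columns split across the "model" axis.
w = jax.device_put(w, NamedSharding(mesh, P(None, "model")))

@jax.jit
def forward(x, w):
    return x @ w  # jit propagates the sharding and inserts any collectives

y = forward(x, w)
jax.debug.visualize_array_sharding(y)  # output inherits the column sharding
```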

Quick Start & Requirements

  • Install/Run: No specific installation commands are provided; the content is delivered via recorded sessions, slides, and exercises.
  • Prerequisites: Familiarity with JAX is recommended. Access to computational resources (TPUs/GPUs) is implied for practical exercises.
  • Resources: Links to YouTube recordings, slides, and take-home exercises are available for each session.

Highlighted Details

  • Covers end-to-end LLM implementation in JAX for both training and inference.
  • Detailed analysis of single-chip and multi-chip performance, including roofline modeling.
  • Deep dives into attention mechanisms, including fused attention schedules, softmax, and FlashAttention (see the online-softmax sketch after this list).
  • Introduction to Pallas for low-level kernel optimization (a minimal kernel sketch also follows below).
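
The FlashAttention bullet refers to the online-softmax recurrence. Below is a minimal pure-JAX sketch of that idea for a single query (illustrative only, not the course's implementation): the key/value sequence is streamed in chunks while a running max and normalizer are maintained, so the full score vector is never materialized at once.

```python
# Online-softmax attention for one query vector, streamed over k/v chunks.
import jax
import jax.numpy as jnp

def online_softmax_attention(q, k, v, chunk=128):
    """q: [d], k/v: [n, d]. Processes k/v `chunk` rows at a time."""
    d = q.shape[-1]
    m = -jnp.inf          # running max of the logits seen so far
    l = 0.0               # running softmax normalizer
    acc = jnp.zeros(d)    # running weighted sum of values
    for start in range(0, k.shape[0], chunk):
        kc, vc = k[start:start + chunk], v[start:start + chunk]
        s = kc @ q / jnp.sqrt(d)          # logits for this chunk
        m_new = jnp.maximum(m, s.max())
        correction = jnp.exp(m - m_new)   # rescale old stats to the new max
        p = jnp.exp(s - m_new)
        l = l * correction + p.sum()
        acc = acc * correction + p @ vc
        m = m_new
    return acc / l
```

Up to floating-point error this matches jax.nn.softmax(k @ q / jnp.sqrt(d)) @ v; FlashAttention applies the same recurrence per tile inside a fused kernel so the full n×n score matrix never hits HBM.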
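
For the Pallas bullet, a minimal kernel in the standard jax.experimental.pallas style (the canonical element-wise example, not course code):

```python
# An element-wise add written as an explicit Pallas kernel over memory refs.
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
    # Refs point at blocks in fast on-chip memory; [...] reads/writes a block.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add(x, y):
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
    )(x, y)

x = jnp.arange(8.0)
print(add(x, x))  # [ 0.  2.  4.  6.  8. 10. 12. 14.]
```

Real kernels add a grid and block specs to tile large arrays, but the structure is the same: a function over memory references that pallas_call maps onto the hardware.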

Maintenance & Community

  • The project is led by Rafi Witten, a tech lead on Google's Cloud TPU/GPU Multipod team, known for his work on MaxText and for pioneering "Accurate Quantized Training."
  • A Discord server is available for community interaction and support: https://discord.gg/2AWcVatVAw.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README.

Limitations & Caveats

  • The content is presented as a series of recorded sessions and exercises, not a directly executable codebase.
  • The project appears to be a past course offering, with the last session dated May 29, 2024.
Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 76 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), Nathan Lambert (AI Researcher at AI2), and 4 more.

large_language_model_training_playbook by huggingface

Tips for training large language models
478 stars · created 2 years ago · updated 2 years ago
Starred by Omar Sanseviero (DevRel at Google DeepMind) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

llm_training_handbook by huggingface

Handbook for large language model training methodologies
506 stars · created 2 years ago · updated 1 year ago