torchtune by pytorch

PyTorch library for LLM post-training and experimentation

Created 1 year ago
5,492 stars

Top 9.2% on SourcePulse

Project Summary

torchtune is a PyTorch-native library for post-training LLMs, offering hackable recipes for SFT, knowledge distillation, RLHF, and QAT. It supports popular models such as Llama, Gemma, and Mistral; training is configured through YAML files, and the library prioritizes memory efficiency and performance through integration with PyTorch's latest APIs. It is aimed at researchers and engineers who want to fine-tune and experiment with LLMs efficiently.

How It Works

torchtune employs a modular, recipe-driven approach, allowing users to configure training, evaluation, quantization, or inference via YAML files. It leverages PyTorch's advanced features like FSDP2 for distributed training, torchao for quantization, and torch.compile for performance gains. The library emphasizes memory efficiency through techniques like activation offloading, packed datasets, and fused optimizers, enabling larger models and batch sizes on limited hardware.
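To make the recipe/config split concrete, the sketch below copies a packaged config and launches its recipe with the tune CLI. The recipe and config names are real examples shipped with the library, but the excerpted YAML fields are a pared-down illustration, not a complete config:

    # List built-in recipes and the configs packaged with them.
    tune ls

    # Copy a packaged config locally to inspect or edit (pick any
    # name that `tune ls` reports for your installed version).
    tune cp llama3_1/8B_lora_single_device my_config.yaml

    # Inside the YAML, components are addressed by dotted `_component_`
    # paths with their constructor args alongside. An illustrative excerpt:
    #   model:
    #     _component_: torchtune.models.llama3_1.lora_llama3_1_8b
    #     lora_rank: 8
    #   optimizer:
    #     _component_: torch.optim.AdamW
    #     lr: 3e-4

    # Run the matching recipe against the edited config.
    tune run lora_finetune_single_device --config my_config.yaml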

Quick Start & Requirements

  • Install: pip install torchtune (stable) or pip install --pre --upgrade torchtune --extra-index-url https://download.pytorch.org/whl/nightly/cpu (nightly).
  • Prerequisites: PyTorch (latest stable or nightly), torchvision, torchao. CUDA 12.x recommended for GPU acceleration. Hugging Face Hub token required for downloading model weights.
  • Get Started: First Finetune Tutorial, End-to-End Workflow Tutorial.
  • CLI: tune --help lists available commands; see the workflow sketch after this list.
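The workflow below is a minimal sketch: the model and config names are examples taken from the library's documentation, and the weight download assumes a Hugging Face Hub token with access to the gated repo:

    # Download model weights from the Hugging Face Hub.
    tune download meta-llama/Meta-Llama-3.1-8B-Instruct \
      --output-dir /tmp/Meta-Llama-3.1-8B-Instruct \
      --hf-token <HF_TOKEN>

    # Fine-tune with LoRA on a single GPU.
    tune run lora_finetune_single_device \
      --config llama3_1/8B_lora_single_device

    # Or run a full fine-tune across 4 GPUs on one node.
    tune run --nproc_per_node 4 full_finetune_distributed \
      --config llama3_1/8B_full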

Highlighted Details

  • Supports a wide range of LLMs including Llama 4, Llama 3.3/3.2/3.1, Gemma 2, Mistral, Phi, and Qwen.
  • Offers comprehensive post-training methods: SFT, Knowledge Distillation, DPO, PPO, GRPO, and QAT.
  • Demonstrates significant memory and speed improvements via optimization flags (e.g., QLoRA on Llama 3.1 405B uses 44.8GB on 8x A100); see the override sketch after this list.
  • Integrates with ecosystem tools like Hugging Face Hub, LM Eval Harness, Weights & Biases, and ExecuTorch.
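The optimization flags behind numbers like these are ordinary config fields, so they can be toggled as key=value overrides on the command line. A sketch, with the caveat that flag availability varies by recipe (check a copied config for what your recipe actually exposes):

    # Enable memory/speed optimizations via config overrides.
    tune run lora_finetune_single_device \
      --config llama3_1/8B_lora_single_device \
      compile=True \
      enable_activation_checkpointing=True \
      enable_activation_offloading=True \
      dataset.packed=True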

Maintenance & Community

  • Actively developed with recent updates adding support for Llama 4, Llama 3.3/3.2, Gemma 2, and multi-node training.
  • Community contributions are highlighted, including PPO, Qwen2, Gemma 2, and DPO implementations.
  • Integrates with Hugging Face, EleutherAI, and Weights & Biases.

Licensing & Compatibility

  • Released under the BSD 3-Clause license.
  • Compatible with commercial use, though third-party model weights remain subject to their own licenses and terms of service.

Limitations & Caveats

  • Knowledge Distillation does not support full weight updates across multiple devices or nodes.
  • PPO and GRPO have limited multi-device/multi-node support for full weight updates.
  • QAT is not supported on single devices.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 4
  • Issues (30d): 10
  • Star history: 79 stars in the last 30 days

Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 6 more.

Explore Similar Projects

lingua by facebookresearch

  • Top 0.1%, 5k stars
  • LLM research codebase for training and inference
  • Created 11 months ago; updated 2 months ago
  • Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 4 more.

Awesome-pytorch-list by bharathgs

  • Top 0.1%, 16k stars
  • Curated list of PyTorch content on GitHub
  • Created 8 years ago; updated 1 year ago