pytorch-lightning  by Lightning-AI

Deep learning framework for pretraining, finetuning, and deploying AI models

created 6 years ago
29,887 stars

Top 1.3% on sourcepulse

Project Summary

PyTorch Lightning is a framework designed to streamline the training, finetuning, and deployment of AI models, particularly for large-scale or multi-device setups. It targets AI researchers and engineers by abstracting away boilerplate code, allowing them to focus on model architecture and scientific experimentation while maintaining flexibility and control.

How It Works

PyTorch Lightning organizes PyTorch code by separating the "science" (model definition, loss calculation) from the "engineering" (training loops, hardware acceleration, distributed training). It achieves this through LightningModule and Trainer classes. LightningModule encapsulates the model, optimizer configuration, and training/validation steps, while Trainer handles the execution logic, including device placement, mixed precision, and scaling strategies. This approach simplifies complex training setups and promotes code readability and reproducibility.
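The split described above can be illustrated with a minimal plain-Python sketch (hypothetical ToyModule/ToyTrainer names, not Lightning's real classes): the "science" lives in a module object that defines the loss, while a reusable trainer owns the loop mechanics.

```python
# Plain-Python sketch of the science/engineering split (illustrative only;
# Lightning's real LightningModule/Trainer handle devices, precision, etc.).

class ToyModule:
    """Science: model state, loss computation, optimizer settings."""
    def __init__(self, weight=0.0, lr=0.1):
        self.weight = weight
        self.lr = lr

    def training_step(self, batch):
        # Squared error of a one-parameter linear model y = w * x.
        x, y = batch
        pred = self.weight * x
        loss = (pred - y) ** 2
        grad = 2 * (pred - y) * x  # d(loss)/d(weight)
        return loss, grad


class ToyTrainer:
    """Engineering: the training loop, reusable across any module."""
    def __init__(self, max_epochs=100):
        self.max_epochs = max_epochs

    def fit(self, module, dataloader):
        for _ in range(self.max_epochs):
            for batch in dataloader:
                _, grad = module.training_step(batch)
                module.weight -= module.lr * grad  # plain SGD update
        return module


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
model = ToyTrainer().fit(ToyModule(), data)
print(round(model.weight, 2))  # converges toward 2.0
```

In the real library, `Trainer.fit(module, dataloader)` plays the same role, but the trainer additionally manages device placement, precision, and scaling without the module needing to change.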

Quick Start & Requirements

Install from PyPI with pip install lightning (the older pytorch-lightning package name is also published); PyTorch is the core dependency.

Highlighted Details

  • Supports seamless scaling across multiple GPUs, TPUs, and nodes with zero code changes.
  • Offers advanced features like mixed-precision training, experiment tracking (TensorBoard, W&B, etc.), checkpointing, and early stopping.
  • Includes Lightning Fabric for expert control over training loops and scaling strategies, suitable for complex models like LLMs and diffusion models.
  • Provides utilities for exporting models to TorchScript and ONNX for production deployment.
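Early stopping, one of the features listed above, amounts to halting training once a monitored metric stops improving. A generic patience-based check can be sketched in plain Python (illustrative; not Lightning's actual EarlyStopping callback):

```python
# Generic patience-based early stopping (illustrative sketch, not
# Lightning's EarlyStopping callback). Training halts once the monitored
# loss has not improved by at least `min_delta` for `patience` checks.

class EarlyStopper:
    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss   # new best: reset the counter
            self.bad_checks = 0
        else:
            self.bad_checks += 1   # no improvement this check
        return self.bad_checks >= self.patience


stopper = EarlyStopper(patience=2)
losses = [0.9, 0.7, 0.71, 0.72, 0.5]  # validation loss stalls after epoch 1
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at)  # stops at epoch 3, never reaching the final loss
```

Lightning wires the equivalent logic into the Trainer as a callback, so the monitored metric and patience are configuration rather than hand-written loop code.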

Maintenance & Community

Maintained by a core team of 10+ contributors and over 800 community contributors. Active Discord community for support and discussions.

Licensing & Compatibility

Licensed under Apache 2.0, which is permissive for commercial use and closed-source linking.

Limitations & Caveats

While designed for flexibility, the abstraction layer might introduce a slight overhead (around 300ms per epoch compared to pure PyTorch). The extensive feature set can also lead to a steeper learning curve for users unfamiliar with distributed training concepts.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 38
  • Issues (30d): 29
  • Star History: 584 stars in the last 90 days

Explore Similar Projects

Starred by Logan Kilpatrick (Product Lead on Google AI Studio), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 3 more.

catalyst by catalyst-team

0% · 3k stars
PyTorch framework for accelerated deep learning R&D
created 7 years ago · updated 1 month ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Zhuohan Li (Author of vLLM), and 6 more.

torchtitan by pytorch

0.9% · 4k stars
PyTorch platform for generative AI model training research
created 1 year ago · updated 18 hours ago
Starred by Tobi Lutke (Cofounder of Shopify), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 13 more.

axolotl by axolotl-ai-cloud

0.6% · 10k stars
CLI tool for streamlined post-training of AI models
created 2 years ago · updated 21 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Stefan van der Walt (Core Contributor to scientific Python ecosystem), and 8 more.

litgpt by Lightning-AI

0.2% · 13k stars
LLM SDK for pretraining, finetuning, and deploying 20+ high-performance LLMs
created 2 years ago · updated 1 week ago