lightning-hydra-template by ashleve

ML experimentation template using PyTorch Lightning + Hydra

created 4 years ago
4,813 stars

Top 10.5% on sourcepulse

Project Summary

This template provides a robust and user-friendly structure for deep learning projects, leveraging PyTorch Lightning and Hydra for efficient experimentation and configuration management. It's designed for researchers and engineers who need to quickly set up, manage, and scale their ML experiments, offering a clean boilerplate and MLOps best practices.

How It Works

The core of the template relies on Hydra for flexible configuration management, allowing dynamic composition and overriding of settings via YAML files and the command line. PyTorch Lightning handles the training loop, device management, and logging, abstracting away much of the boilerplate. Modules (models, datasets, callbacks) are dynamically instantiated using hydra.utils.instantiate based on paths defined in the configuration files, enabling easy swapping and iteration.
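
A minimal sketch of what such a Hydra-driven entry point looks like; the config group names (model, data, trainer) and the module path in the comment are illustrative assumptions, not necessarily the template's exact layout:

    # Illustrative sketch of a Hydra-driven training entry point.
    # Config group names and paths are assumptions for the example.
    import hydra
    from hydra.utils import instantiate
    from omegaconf import DictConfig


    @hydra.main(version_base="1.3", config_path="configs", config_name="train")
    def main(cfg: DictConfig) -> None:
        # cfg.model might contain, e.g.:
        #   _target_: src.models.mnist_module.MNISTLitModule
        #   lr: 0.001
        model = instantiate(cfg.model)        # builds the LightningModule
        datamodule = instantiate(cfg.data)    # builds the LightningDataModule
        trainer = instantiate(cfg.trainer)    # builds the Lightning Trainer
        trainer.fit(model=model, datamodule=datamodule)


    if __name__ == "__main__":
        main()

Because each object is built from its _target_ path in the config, swapping a model or datamodule is just a matter of pointing the config at a different class.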

Quick Start & Requirements

  • Install: Clone the repository and install the dependencies with pip install -r requirements.txt. Install PyTorch separately by following the official installation instructions for your platform.
  • Prerequisites: Python 3.8-3.10, PyTorch 2.0+, PyTorch Lightning 2.0+ (see the environment-check sketch after this list).
  • Resources: A basic MNIST example is provided. For larger models and datasets, GPU acceleration is recommended.
  • Docs: PyTorch Lightning, Hydra
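
As a quick sanity check of the prerequisites above, a small hedged snippet (the lightning import name assumes Lightning 2.x packaging; this helper is not part of the template):

    # Quick sanity check of the prerequisites listed above (a sketch, not part of the template).
    import sys

    import torch
    import lightning  # Lightning 2.x package name; older setups import pytorch_lightning


    def at_least(version: str, major: int, minor: int = 0) -> bool:
        """Return True if an 'X.Y.Z' (optionally 'X.Y.Z+local') version is >= major.minor."""
        x, y = version.split("+")[0].split(".")[:2]
        return (int(x), int(y)) >= (major, minor)


    assert (3, 8) <= sys.version_info[:2] <= (3, 10), "Python 3.8-3.10 expected"
    assert at_least(torch.__version__, 2), "PyTorch 2.0+ expected"
    assert at_least(lightning.__version__, 2), "PyTorch Lightning 2.0+ expected"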

Highlighted Details

  • Dynamic instantiation of PyTorch Lightning modules via Hydra configs.
  • Extensive command-line interface for controlling training, debugging, and hyperparameter sweeps (see the sketch after this list).
  • Support for multiple experiment trackers (Tensorboard, W&B, Neptune, etc.).
  • Integrated testing framework with pytest and @RunIf decorator.
  • Built-in CI workflows for testing and code quality.
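
As an illustration of the override mechanism behind that command-line interface, a hedged sketch using Hydra's compose API; the config path, config name, and override keys are assumptions for the example, and on the command line the same overrides are simply passed as arguments to the training script:

    # Sketch of composing the config with overrides programmatically, e.g. in a notebook.
    # The config path/name and override keys are assumptions, not the template's exact ones.
    from hydra import compose, initialize
    from hydra.utils import instantiate

    with initialize(version_base="1.3", config_path="configs"):
        cfg = compose(
            config_name="train",
            overrides=["trainer.max_epochs=3", "model.lr=0.01", "logger=wandb"],
        )
        model = instantiate(cfg.model)  # swap models, loggers, etc. by changing overrides only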

Maintenance & Community

This is an unofficial community project. Contributions are welcome via issues and pull requests. Links to community channels are not explicitly provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The README notes that, because Lightning and Hydra are both evolving quickly, integrating new releases can occasionally introduce breaking changes. It also notes that the configuration setup targets standard Lightning training and may need adjustments for more complex use cases such as Lightning Fabric. Resuming Hydra multiruns or hyperparameter searches is not supported.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

206 stars in the last 90 days

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake) and Travis Fischer (Founder of Agentic).

Explore Similar Projects

lingua by facebookresearch

0.1% · 5k stars
LLM research codebase for training and inference
created 9 months ago, updated 2 weeks ago
Starred by Logan Kilpatrick (Product Lead on Google AI Studio), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 3 more.

catalyst by catalyst-team

0% · 3k stars
PyTorch framework for accelerated deep learning R&D
created 7 years ago, updated 1 month ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 10 more.

open-r1 by huggingface

0.2% · 25k stars
SDK for reproducing DeepSeek-R1
created 6 months ago, updated 3 days ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Yang Song (Professor at Caltech; Research Scientist at OpenAI), and 16 more.

pytorch-lightning by Lightning-AI

0.1% · 30k stars
Deep learning framework for pretraining, finetuning, and deploying AI models
created 6 years ago, updated 2 days ago