PyTorchTricks by lartpang

Collection of PyTorch performance optimization tricks

created 5 years ago
1,194 stars

Top 33.5% on sourcepulse

Project Summary

This repository compiles practical techniques and optimizations for PyTorch users, focusing on accelerating training and reducing memory consumption. It targets researchers and engineers working with deep learning models who aim to improve efficiency and performance. The collection offers actionable advice, code snippets, and links to relevant resources for faster iteration and resource management.

How It Works

The project aggregates strategies across data loading, model design, training procedures, and code-level optimizations. It covers techniques like prefetching data, using efficient image processing libraries (OpenCV, DALI), consolidating data into single files (LMDB, TFRecord), and leveraging mixed-precision training (FP16, AMP). It also details memory-saving methods such as gradient accumulation, gradient checkpointing, and in-place operations.
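
For illustration, below is a minimal sketch of the mixed-precision pattern the collection points to, using the torch.cuda.amp API with a placeholder model and random data; it is not code from the repository itself.

    import torch
    from torch import nn

    # Minimal AMP training-step sketch; model, optimizer, and data are placeholders.
    model = nn.Linear(512, 10).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid FP16 underflow

    for step in range(10):
        inputs = torch.randn(32, 512, device="cuda")
        targets = torch.randint(0, 10, (32,), device="cuda")

        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():   # forward pass runs in mixed precision
            loss = loss_fn(model(inputs), targets)
        scaler.scale(loss).backward()     # backward on the scaled loss
        scaler.step(optimizer)            # unscales grads, then optimizer.step()
        scaler.update()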

Quick Start & Requirements

There is nothing to install or build: the repository is a set of notes rather than a runnable library. Browse the README on GitHub and fold the relevant snippets and settings into your own PyTorch training code.

Highlighted Details

  • Data Loading Acceleration: Strategies include pre-processing, GPU-based augmentation (DALI), faster image decoding (jpeg4py), and data consolidation (LMDB, Tar).
  • Memory Optimization: Techniques like gradient accumulation, gradient checkpointing, torch.no_grad(), set_to_none=True for zero_grad, and in-place operations are detailed (a gradient-accumulation sketch follows this list).
  • Training Speed-ups: Recommendations cover torch.backends.cudnn.benchmark = True, pin_memory=True, DistributedDataParallel, mixed-precision training, and tuning batch sizes (a DataLoader/cuDNN sketch follows this list).
  • Model Design: Insights into ShuffleNetV2 and Vision Transformer design principles for efficiency are provided.
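
As referenced in the Memory Optimization bullet above, here is a minimal, self-contained sketch of gradient accumulation combined with zero_grad(set_to_none=True); the model, data, and step counts are placeholders.

    import torch
    from torch import nn

    # Gradient-accumulation sketch: effective batch = micro-batch * accum_steps,
    # while only one micro-batch resides in memory at a time.
    model = nn.Linear(512, 10)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    accum_steps = 4

    optimizer.zero_grad(set_to_none=True)  # frees .grad buffers instead of zero-filling
    for step in range(100):
        inputs = torch.randn(8, 512)
        targets = torch.randint(0, 10, (8,))
        loss = loss_fn(model(inputs), targets) / accum_steps  # average over the window
        loss.backward()                                       # grads accumulate in .grad
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad(set_to_none=True)

Dividing the loss by accum_steps keeps gradient magnitudes comparable to a single large batch; BatchNorm statistics, however, are still computed per micro-batch (see Limitations & Caveats below).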

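For the Data Loading and Training Speed-ups bullets, this sketch combines common DataLoader settings with cudnn.benchmark; the dataset is a random-tensor placeholder standing in for real images.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    torch.backends.cudnn.benchmark = True  # let cuDNN auto-tune convolution kernels
                                           # (helps when input shapes stay constant)

    def main():
        device = "cuda" if torch.cuda.is_available() else "cpu"
        dataset = TensorDataset(torch.randn(256, 3, 64, 64),
                                torch.randint(0, 10, (256,)))
        loader = DataLoader(
            dataset,
            batch_size=64,
            shuffle=True,
            num_workers=4,            # load/augment in parallel worker processes
            pin_memory=True,          # page-locked memory enables async H2D copies
            prefetch_factor=2,        # batches queued ahead per worker
            persistent_workers=True,  # keep workers alive across epochs
        )
        for images, labels in loader:
            images = images.to(device, non_blocking=True)  # overlap copy with compute
            labels = labels.to(device, non_blocking=True)
            # ... forward / backward ...

    if __name__ == "__main__":  # worker processes require this guard on some platforms
        main()
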
Maintenance & Community

The repository appears to be a personal collection with updates spanning from 2019 to 2024. The author encourages community suggestions. Links to Zhihu are provided for discussion.

Licensing & Compatibility

The repository does not explicitly state a license, so formal reuse rights are undefined and default copyright applies. In practice, the content is a compilation of shared knowledge and links to external resources, presented for learning and adaptation.

Limitations & Caveats

This is a curated collection of tips rather than a runnable library, so users must integrate the advice into their own projects. Some techniques, like gradient accumulation, can change the behavior of batch-size-dependent layers (e.g., BatchNorm), whose statistics are computed per micro-batch rather than over the full effective batch. The effectiveness of individual optimizations is workload-dependent and should be measured rather than assumed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

8 stars in the last 90 days

Explore Similar Projects

Starred by Philipp Schmid (DevRel at Google DeepMind), Stas Bekman (author of the Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 5 more.

the-incredible-pytorch by ritchieng

Curated list of PyTorch resources

Top 0.2% on sourcepulse
12k stars
created 8 years ago · updated 1 week ago