PyTorch model profiling tutorial
This repository provides a tutorial on profiling PyTorch models, focusing on identifying performance bottlenecks and improving GPU efficiency. It is aimed at researchers and engineers working with large language models or other deep learning architectures who need to optimize training loops. The tutorial demonstrates how to use the PyTorch profiler and how to interpret its output to decide which optimizations to apply.
How It Works
The tutorial guides users through a standard PyTorch training loop for a transformer model. It leverages the built-in PyTorch profiler to capture detailed performance metrics, including CPU and GPU execution times, memory usage, and GPU utilization. The approach emphasizes a step-by-step analysis, starting with a high-level overview from the PyTorch profiler and then delving into lower-level GPU metrics like SM efficiency and achieved occupancy to pinpoint inefficiencies.
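As a rough illustration of the workflow described above, the following sketch profiles a few steps of a toy training loop with torch.profiler and writes a trace for TensorBoard. It is not the tutorial's torch_prof.py: the function name profile_training, the tiny encoder layer, the tensor sizes, and the schedule values are all assumptions chosen to keep the example small, and it falls back to CPU-only profiling when no GPU is present.

```python
import torch
import torch.nn as nn
from torch.profiler import (
    ProfilerActivity,
    profile,
    schedule,
    tensorboard_trace_handler,
)


def profile_training(log_dir="./log", steps=6):
    """Profile a toy training loop; return one summary table per captured trace."""
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Hypothetical stand-in model, far smaller than a real transformer.
    model = nn.TransformerEncoderLayer(
        d_model=64, nhead=4, dim_feedforward=128, batch_first=True
    ).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    activities = [ProfilerActivity.CPU]
    if device == "cuda":
        activities.append(ProfilerActivity.CUDA)

    write_trace = tensorboard_trace_handler(log_dir)
    summaries = []

    def on_trace_ready(prof):
        # Save the trace for TensorBoard and keep a text summary of the top ops.
        write_trace(prof)
        summaries.append(
            prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10)
        )

    with profile(
        activities=activities,
        # Skip 1 step, warm up for 1 step, then record 3 steps, once.
        schedule=schedule(wait=1, warmup=1, active=3, repeat=1),
        on_trace_ready=on_trace_ready,
        record_shapes=True,
        profile_memory=True,
    ) as prof:
        for _ in range(steps):
            x = torch.randn(8, 16, 64, device=device)
            loss = model(x).pow(2).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            prof.step()  # mark the end of one training step for the schedule

    return summaries
```

Calling profile_training("./log") and printing the returned tables gives the high-level per-operator overview; the files written to the log directory are what the TensorBoard trace view reads.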
Quick Start & Requirements
Clone the repository: git clone https://github.com/Quentin-Anthony/torch-profiler-tutorial.git
Start the container: docker run --privileged --shm-size=1000gb --gpus all -it --rm -v ~/torch-profiler-tutorial:/torch-profiler-tutorial nvcr.io/nvidia/pytorch:23.10-py3
Run python torch_prof.py within the container.
With the ./log directory available locally, run tensorboard --logdir=./log.
Highlighted Details
torch.cuda.amp.autocast
Maintenance & Community
The repository is maintained by Quentin Anthony. There are no explicit community channels or roadmap links provided in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the README.
Limitations & Caveats
The tutorial notes that the TensorBoard trace viewer can be RAM-intensive for large traces and suggests alternatives such as ui.perfetto.dev. It also notes that low-level GPU profilers (NVIDIA NSYS, AMD Rocprof) are marked as "TODO" and are not yet covered. The setup runs a privileged Docker container, which may be a security concern in some environments.
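When a trace is too heavy to open comfortably in a viewer, a short script can still show where the time goes. The sketch below is not part of the tutorial. It assumes the trace is uncompressed JSON in the Chrome trace event format (the format the PyTorch profiler exports), and the function name summarize_trace is hypothetical. It loads the whole file into memory, so very large traces may still need a streaming parser.

```python
import json
from collections import defaultdict


def summarize_trace(path, top=10):
    """Return (name, total_duration_us, count) for the costliest event names."""
    with open(path, "r", encoding="utf-8") as f:
        trace = json.load(f)

    # Chrome trace files are either an object with "traceEvents" or a bare list.
    events = trace["traceEvents"] if isinstance(trace, dict) else trace

    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in events:
        # "X" marks a complete event, which carries its own duration in "dur".
        if event.get("ph") == "X" and "dur" in event:
            totals[event["name"]] += event["dur"]
            counts[event["name"]] += 1

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]
    return [(name, total, counts[name]) for name, total in ranked]
```

For example, summarize_trace("./log/some_trace.json", top=5) would list the five event names with the largest summed duration; the path here is a placeholder for whichever trace file the profiler wrote.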