transformers-benchmarks  by mli

Transformer training benchmark for GPUs

created 3 years ago
911 stars

Top 40.7% on sourcepulse

Project Summary

This repository benchmarks the TeraFLOPS actually achieved when training Transformer models on various NVIDIA GPUs, including multi-GPU and multi-node setups. It targets researchers and engineers who need to estimate training times for large-scale models, and provides both measured performance data and tools for running the benchmarks on their own hardware.
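As a rough illustration of how a measured TeraFLOPS figure can feed a training-time estimate, here is a minimal sketch. It assumes the widely used approximation of about 6 FLOPs per parameter per training token for dense Transformers, and perfect scaling across GPUs; neither assumption comes from this repository, and the function name is hypothetical.

```python
def estimate_training_days(num_params, num_tokens, achieved_tflops, num_gpus=1):
    """Back-of-the-envelope training time in days.

    Assumptions (not taken from the repository):
      * total training FLOPs ~= 6 * parameters * tokens (dense Transformer)
      * every GPU sustains `achieved_tflops` with no communication overhead
    """
    total_flops = 6 * num_params * num_tokens
    flops_per_second = achieved_tflops * 1e12 * num_gpus
    seconds = total_flops / flops_per_second
    return seconds / 86400  # seconds per day


# Example: a 1B-parameter model, 1B tokens, one GPU sustaining 100 TFLOPS
days = estimate_training_days(1e9, 1e9, achieved_tflops=100)
print(f"{days:.2f} days")
```

Real multi-GPU and multi-node runs usually scale less than linearly, so an estimate like this is best read as a lower bound.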

How It Works

The project measures TeraFLOPS by running micro-benchmarks (such as matrix multiplication) and full Transformer layer forward and backward passes for models like BERT, GPT-2, and T5. It compares the achieved performance against theoretical hardware limits, showing how precision (TF32/FP16), batch size, and GPU architecture affect actual throughput.
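The matrix-multiplication micro-benchmark idea can be sketched as follows. This is an illustrative sketch, not the repository's own code: the function names are hypothetical, and `benchmark_matmul` assumes a CUDA-enabled PyTorch install. Multiplying an (m x k) matrix by a (k x n) matrix costs about 2*m*n*k floating-point operations.

```python
import time


def matmul_flops(m, n, k):
    # One multiply and one add per (m, n, k) triple.
    return 2 * m * n * k


def to_tflops(flops, seconds):
    # Convert a FLOP count and a wall-clock time into TeraFLOPS.
    return flops / seconds / 1e12


def benchmark_matmul(n=4096, dtype_name="float16", warmup=5, iters=20):
    """Time an n x n matmul on the GPU and return achieved TFLOPS.

    Requires CUDA-enabled PyTorch; imported lazily so the helpers above
    stay usable without it.
    """
    import torch

    dtype = getattr(torch, dtype_name)
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)

    for _ in range(warmup):  # warm up kernels and caches
        a @ b
    torch.cuda.synchronize()  # GPU work is asynchronous; wait before timing

    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    elapsed = (time.perf_counter() - start) / iters

    return to_tflops(matmul_flops(n, n, n), elapsed)
```

On a machine with a GPU, `benchmark_matmul(8192, "float16")` returns a number that can be compared against the vendor's quoted peak for that precision.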

Quick Start & Requirements

  • Install/Run: Use the provided NVIDIA PyTorch Docker image (nvcr.io/nvidia/pytorch:22.07-py3).
  • Prerequisites: CUDA-enabled PyTorch, NVIDIA Docker.
  • Setup: Launch the Docker container, then run Jupyter Notebook within it.
  • Links: PyTorch Docker Image

Highlighted Details

  • Benchmarks real TeraFLOPS for Transformer training on A100, A6000, V100, 3090 Ti, and 4090 GPUs.
  • Compares theoretical vs. actual performance for matrix multiplication and full Transformer layers.
  • Includes performance data for both forward and forward+backward passes.
  • Provides Jupyter notebooks for users to run their own benchmarks.
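
To compare a full Transformer layer against the hardware's theoretical peak, the layer's FLOP count has to be estimated first. The sketch below uses a common textbook estimate that counts only the matrix multiplications (QKV projections, attention scores, attention-weighted values, output projection, and a two-layer feed-forward block) and treats the backward pass as roughly twice the forward cost. These are assumptions for illustration, with hypothetical function names, and may differ from how the repository's notebooks count FLOPs.

```python
def transformer_layer_flops(batch, seq_len, hidden, ffn_mult=4, backward=False):
    """Approximate matmul FLOPs for one Transformer layer.

    Ignores layer norm, softmax, biases, and dropout, which are small
    next to the matrix multiplications.
    """
    tokens = batch * seq_len
    qkv = 3 * 2 * tokens * hidden * hidden             # Q, K, V projections
    scores = 2 * batch * seq_len * seq_len * hidden    # Q @ K^T
    context = 2 * batch * seq_len * seq_len * hidden   # softmax(scores) @ V
    out_proj = 2 * tokens * hidden * hidden            # attention output projection
    ffn = 2 * 2 * tokens * hidden * (ffn_mult * hidden)  # two feed-forward matmuls
    forward = qkv + scores + context + out_proj + ffn
    # Assumption: backward costs about 2x forward, so forward+backward is 3x.
    return 3 * forward if backward else forward


def utilization(achieved_tflops, peak_tflops):
    # Fraction of the theoretical peak actually reached.
    return achieved_tflops / peak_tflops


# With ffn_mult=4 the forward cost reduces to 24*b*s*h^2 + 4*b*s^2*h.
flops = transformer_layer_flops(batch=1, seq_len=2, hidden=4)
print(flops)  # 832
```

Dividing this FLOP count by the measured time per step gives the achieved TeraFLOPS for the layer, which `utilization` then relates to the quoted peak.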

Maintenance & Community

No specific community channels or contributor details are listed in the README.

Licensing & Compatibility

The repository's license is not specified in the README.

Limitations & Caveats

Performance figures are specific to the hardware and configurations tested by the authors. Results may vary significantly depending on the user's environment, CUDA version, and model implementation.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 16 stars in the last 90 days
